I noticed that simutrans has several functions declared as inline, which makes perfect sense: some functions, like decode_uint16, are called something like 100 million times during loading, and min(int, int) is called tens of millions of times during gameplay.
However, when compiling with GCC (with -O3 and the usual optimization flags), GCC does not inline those functions.
This presumably happens because GCC cannot tell how many calls are going to be made to those functions (probably because that depends on the actual savegame being read).
Bottom line is: there are several "inline" functions that I believe are not being inlined, degrading performance when called millions of times.
I tried to force inlining [by using __attribute__((always_inline)) ] on a few functions, just to test whether they got inlined and whether that translated into faster code. They did get inlined, and it improved the total time spent in decode_uint16 by 20%. I believe smaller functions would see a bigger improvement.
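For reference, here is a minimal sketch of what I mean. The macro name FORCE_INLINE and both functions are made up for illustration (they are stand-ins, not the actual simutrans code), and I am assuming a little-endian byte layout just to have something concrete:

```cpp
#include <cstdint>

// Hypothetical helper macro (not part of simutrans): uses the GCC attribute
// where available and falls back to a plain "inline" hint elsewhere.
#if defined(__GNUC__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#else
#  define FORCE_INLINE inline
#endif

// Illustrative stand-in for a small, hot decoding routine: reads a
// little-endian 16-bit value and advances the read pointer by two bytes.
FORCE_INLINE uint16_t decode_uint16_sketch(const uint8_t *&p)
{
    const uint16_t v = static_cast<uint16_t>(p[0] | (p[1] << 8));
    p += 2;
    return v;
}

// Illustrative stand-in for a tiny helper such as min(int, int).
FORCE_INLINE int min_sketch(int a, int b)
{
    return a < b ? a : b;
}
```

Wrapping the attribute in a macro would keep the change in one place and leave other compilers unaffected. GCC also has a -Winline warning that reports functions declared inline that it decided not to inline, which might help to find other candidates.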
I stress that I tried to make GCC inline those functions using different optimization flags, but I could not find a way to get them inlined other than by using the attribute.
Could you please check whether the deployed version actually has those functions inlined? Without debugging symbols, and without the makefile config used for deployment, it's hard to check that.
If not, I would suggest adding this attribute only to the most performance-critical functions (rather than forcing the compiler to inline every single "inline" function). If you plan to add this attribute to inline functions, here is a list of the ones that are called most often:
I profiled simutrans during pak/savegame loading, and during 20 minutes of gameplay (so that loading-related calls tend to be "negligible" - you could always subtract those, though).
EDIT: Forgot to mention (perhaps of little importance): I am referring to the 120.0.1 nightly.