The International Simutrans Forum

 

Author Topic: Performance analysis  (Read 304 times)


Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 19975
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Performance analysis
« on: July 22, 2020, 12:34:08 PM »
I notice that some people on the server were discussing performance issues. A brief profiling run gives a breakdown of where the most computational effort is being spent (bear in mind that multi-threading in some but not all parts of the code may distort how this affects real world performance).
Some highlights from the findings (using the current Bridgewater-Brunel saved game in single player mode; performance may be subtly different in multi-player mode) are:
  • graphics (main_view_t::display_region), which are multi-threaded, take a total of 31.79% of CPU time;
  • passenger and mail generation (step_passengers_and_mail_threaded_) takes a total of 16.82% of CPU time, of which 10.01% (of overall CPU time, not 10.01% of the 16.82%) is spent finding a path between halts by querying the hashtable (haltestelle_t::find_route);
  • vehicle physics calculations (convoi_t::calc_acceleration) take 10.89% of CPU time;
  • the main path explorer method (path_explorer_t::compartment_t::get_path_between), which runs in its own thread, takes 5.04% of CPU time;
  • pedestrians (pedestrian_t::sync_step) take 1.70% of CPU time;
  • convoy routing, including ships (convoi_t::drive_to), which is multi-threaded, takes 1.67% of CPU time;
  • convoy loading at stops (haltestelle_t::request_loading) takes 1.14% of CPU time.
These results are with my computer on this particular saved game; different results with different saved games are likely.
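
The distinction between "percent of overall CPU time" and "percent of the parent function" in the list above can be made concrete with a little arithmetic. The following is only a sketch: the two figures are copied from the list, while the helper name share_of_parent and the function print_route_share are made up for illustration and are not part of Simutrans.

```cpp
#include <cstdio>

// Percentages of overall CPU time, copied from the list above.
constexpr double passenger_mail_overall = 16.82; // passenger and mail generation
constexpr double find_route_overall     = 10.01; // haltestelle_t::find_route, part of the above

// Hypothetical helper: a child's share of its parent's time, in percent,
// when both inputs are given as percentages of overall CPU time.
constexpr double share_of_parent(double child_overall, double parent_overall)
{
    return 100.0 * child_overall / parent_overall;
}

// Print the figures derived from the two numbers above.
void print_route_share()
{
    const double share = share_of_parent(find_route_overall, passenger_mail_overall);
    std::printf("find_route is %.1f%% of passenger/mail generation\n", share);
    std::printf("other passenger/mail work: %.2f%% of overall CPU time\n",
                passenger_mail_overall - find_route_overall);
}
```

On these figures, route finding comes to roughly 59.5% of the passenger and mail generation time, leaving about 6.81% of overall CPU time for the rest of that step.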

Offline Freahk

  • Devotee
  • *
  • Posts: 1172
  • Languages: DE, EN
Re: Performance analysis
« Reply #1 on: July 22, 2020, 12:52:39 PM »
Good to know, but I am still quite sure CPU load by itself is not the main issue here.
Most of the time, none of my cores is even close to 50% computational load.
I rather expect the bottleneck to be located elsewhere, e.g. memory bandwidth, pipeline stalls or control hazards, though I have not yet found out how to profile this properly.

Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2344
Re: Performance analysis
« Reply #2 on: July 22, 2020, 12:55:07 PM »
Quote from: Freahk
Good to know, but I am still quite sure CPU load by itself is not the main issue here.
Most of the time, none of my cores is even close to 50% computational load.
I rather expect the bottleneck to be located elsewhere, e.g. memory bandwidth, pipeline stalls or control hazards, though I have not yet found out how to profile this properly.

Memory bandwidth issues (including issues relating to processor cache size) can be identified if performance increases from more threads tails off. It's very dependent on the system being used though.
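
One way to try the thread-scaling check described above is to time a deliberately memory-bound loop at several thread counts and see whether the gains tail off. The code below is a rough sketch of that idea, not anything taken from Simutrans: timed_parallel_sum and parallel_efficiency are hypothetical names, and the loop (summing a large array) merely stands in for real game code. Tailing off can also come from other causes, such as thread start-up cost or lock contention, so the result is a hint rather than proof.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Sum `data` with `n_threads` threads (n_threads >= 1), each reading one
// contiguous slice. Returns the total and stores the elapsed wall-clock
// time, in seconds, in `seconds`.
std::uint64_t timed_parallel_sum(const std::vector<std::uint32_t>& data,
                                 unsigned n_threads, double& seconds)
{
    std::vector<std::uint64_t> partial(n_threads, 0);
    std::vector<std::thread> workers;
    const std::size_t chunk = data.size() / n_threads;

    const auto start = std::chrono::steady_clock::now();
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t begin = t * chunk;
        // The last thread also takes the remainder of the division.
        const std::size_t end = (t + 1 == n_threads) ? data.size() : begin + chunk;
        workers.emplace_back([&data, &partial, t, begin, end] {
            std::uint64_t local = 0;
            for (std::size_t i = begin; i < end; ++i) {
                local += data[i];
            }
            partial[t] = local;
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    const auto stop = std::chrono::steady_clock::now();

    seconds = std::chrono::duration<double>(stop - start).count();
    return std::accumulate(partial.begin(), partial.end(), std::uint64_t{0});
}

// Speed-up over the single-threaded time t1, divided by the thread count n.
// Values well below 1.0 at higher thread counts mean scaling is tailing off.
double parallel_efficiency(double t1, double tn, unsigned n)
{
    return t1 / (tn * static_cast<double>(n));
}
```

To use it, fill a vector much larger than the processor cache, call timed_parallel_sum with 1, 2, 4, ... threads, and feed the measured times into parallel_efficiency. As noted above, the outcome depends heavily on the system.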

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 19975
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Performance analysis
« Reply #3 on: July 22, 2020, 12:56:00 PM »
Quote from: Freahk
Good to know, but I am still quite sure CPU load by itself is not the main issue here.
Most of the time, none of my cores is even close to 50% computational load.
I rather expect the bottleneck to be located elsewhere, e.g. memory bandwidth, pipeline stalls or control hazards, though I have not yet found out how to profile this properly.

I suspect that memory bandwidth is a major issue, especially for graphics and the routing algorithms for passengers/mail/goods.

As to profiling, are you using Visual Studio? If so, it is quite straightforward: use the optimised debug build and go to Debug > Performance Profiler. In GCC, it is also fairly straightforward to profile, although I cannot remember the exact steps.

Edit: Disabling water animation can significantly reduce graphics load on the system.

Offline Freahk

  • Devotee
  • *
  • Posts: 1172
  • Languages: DE, EN
Re: Performance analysis
« Reply #4 on: July 22, 2020, 03:12:10 PM »
Quote
Edit: Disabling water animation can significantly reduce graphics load on the system.
I'll try this offline; if it helps, it would be very kind of you to disable this on Bridgewater-Brunel.

I'm using CLion. Visual Studio under Linux is a mess, just like any Microsoft product. I could install Windows, but I disagree with that company on a great many points, so I prefer not to use their products whenever possible.

Anyway, CLion has a lot of debugging and profiling features too; I'll figure it out...

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 19975
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Performance analysis
« Reply #5 on: July 22, 2020, 05:16:43 PM »
For Linux, I recommend GCC for profiling; it does have some fairly standard ways of doing this. I am not familiar with CLion - is that a compiler or an IDE?

So far as water animation is concerned, you can disable this locally for online games without disabling it on the server: simply set water_animation_ms=0 in the base simuconf.tab. This is one of the settings that is not synchronised between client and server.

Offline Freahk

  • Devotee
  • *
  • Posts: 1172
  • Languages: DE, EN
Re: Performance analysis
« Reply #6 on: July 22, 2020, 07:51:54 PM »
CLion is an IDE using CMake, GCC (or Clang if you choose to), GDB and all these standard GNU tools under the hood.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 19975
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Performance analysis
« Reply #7 on: July 22, 2020, 08:29:29 PM »
Quote from: Freahk
CLion is an IDE using CMake, GCC (or Clang if you choose to), GDB and all these standard GNU tools under the hood.

Splendid - you should be able to use the GDB profiler with that, in that case.

Offline Matthew

  • *
  • Posts: 353
    • Japan Railway Journal
  • Languages: EN, some ZH, DE & SQ
Re: Performance analysis
« Reply #8 on: August 08, 2020, 04:36:22 PM »
From the markers,  etc. thread:

Quote from: jamespetts
Thank you for your thoughts on this. We should really move discussion of the performance to the thread specific to performance; would you mind posting your results in that thread so that I can gauge better whether to reduce the framerate and also get feedback on this from other players?

I am not sure exactly what data you want here, so here is a first attempt.

I have continued playing around with settings to understand the UI lag. When playing Bridgewater-Brunel offline in 1845, the in-game Display GUI now normally indicates a Frame Time of ~30-40ms and 10-18 fps. This worsens to ~60ms and 8-10fps when the Path Explorer is processing passenger classes 3 and 4 (the goods classes are no trouble). The Idle value is always 0ms (except when it jumps to a very high figure that I guess might be due to an integer wraparound), but I don't know how significant that is in the days of multicore CPUs.

The fact that the performance is worst when the Path Explorer runs certain passenger classes makes me wonder whether it might be worth reducing path_explorer_time_midpoint in preference to hurting other players' graphical performance. I tried setting it to 48 and loaded the B-B savefile offline: Frame Time did not seem to be affected, but FPS went up to about 22 normally; during heavy passenger classes the values were ~42ms and 15-20fps; I also got sporadic idle time.

As a side note, that fact seems to support Freahk's idea that the issue is memory bandwidth, since there seems to be plenty of spare capacity in the CPUs and RAM. I accept that the Path Explorer is inevitably going to be computationally demanding in a game played on a large map and that nobody can alter the Path Explorer in the near future; just noting the point for future reference.


Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 19975
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Performance analysis
« Reply #9 on: August 08, 2020, 06:21:28 PM »
Thank you for the feedback - that is helpful. It is better to change one parameter at a time for testing. I have modified the framerate to reduce to 15fps on the server, and I should be grateful for feedback (from to-morrow when this change will take effect) as to how this affects lag.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10032
  • Languages: De,EN,JP
Re: Performance analysis
« Reply #10 on: Today at 01:07:48 AM »
CPU load is not really a good indicator, since Simutrans has many places where things must be processed in a certain order to prevent desyncs. This necessarily involves waiting, even with multithreading, and thus the CPU load will usually be even less than 50%, depending on the number of threads available.

Unless the Microsoft profiler has made tremendous progress, in my experience its numbers are rather off, since it counts any library or OS call as part of the instrumented code. One of the biggest numbers for a threaded server should be the percentage of time spent waiting at barriers. That is not time spent in the code, but if most real time is wasted at barriers, then total time increases with multithreading even though the profile tells you otherwise.

In this regard, the best results on total time probably come from profiling single-threaded and multithreaded runs and comparing them.
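
The gap between time spent in code and total real time can be seen by measuring both for the same piece of work. The following is a minimal sketch under stated assumptions: run_timing and measure are hypothetical names, not Simutrans code, and std::clock is assumed to report processor time used by the process, as the C++ standard describes (some runtimes, notably Microsoft's, are known to report elapsed time instead, in which case the two figures will not differ).

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Wall-clock time and processor time consumed by one piece of work.
struct run_timing {
    double wall_seconds; // real elapsed time, including any waiting
    double cpu_seconds;  // processor time used by the whole process
};

// Run `work` once and measure it with both clocks.
template <class F>
run_timing measure(F&& work)
{
    const auto wall_start = std::chrono::steady_clock::now();
    const std::clock_t cpu_start = std::clock();

    work();

    const std::clock_t cpu_stop = std::clock();
    const auto wall_stop = std::chrono::steady_clock::now();

    run_timing result;
    result.wall_seconds = std::chrono::duration<double>(wall_stop - wall_start).count();
    result.cpu_seconds = static_cast<double>(cpu_stop - cpu_start) / CLOCKS_PER_SEC;
    return result;
}
```

Work that mostly waits, for example a call to std::this_thread::sleep_for standing in for a thread blocked at a barrier, shows a wall time far larger than its processor time. Calling measure around the same workload in single-threaded and multithreaded builds and comparing wall_seconds gives the total-time comparison suggested above.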