Performance Problem since r7373 / 120.0.1

Started by quirinus, December 23, 2014, 07:03:05 AM

quirinus

Dear Boardmembers and devs,

this is my first post although i have played this game for years now, so first i want to say thank you very much to the devs and the community for this great game.

about one week ago i returned to play simutrans again and i was very excited when i saw there is a new stable version (120.0.1 or 7373). i never had (serious) problems with game performance, but now i encounter big fps drops, several times during one ingame year. fps go down to 2-5, and the game is unplayable for some seconds.

i did a little testing and i can say:

1. fps drops happen when snow texture is changing. i saw, that snow melts in steps (has this always been like this?!) and does not go up from winter to summer snowline in one day. every time snow goes up a little, the fps drop. so most fps drops happen during spring/fall, there are only a few drops in summer.

2. curious: the same problem happens as well if i set the landscape to no snow or set the snowline not to change. so i guess it has nothing to do with graphic changes.

3. it doesn't matter if i set the mountains to high or low.

4. i use pak128 (newest, 2.5.2 i guess), but i also tested pak128.britain, japan and german, same problem. so i guess it is not a pakset problem.

5. on small maps, this problem is not so big, fps drops are smaller. normally i play maps like 2000x1500.

i went through the settings (simuconf.tab, etc) but i didn't find any setting for climate changes, though it has to be written somewhere when the snow goes up. i would not have any problem if the snow went up in one step.

thank you for help, please tell me if you need further information.

i use windows 7 ult. 64bit
intel core i5 760 2.8ghz
geforce gtx 760
8gb ram
samsung 840 evo ssd

up to date drivers for everything

edit: i tested nightly 7436, no difference..

Ters

Hi

Quote from: quirinus on December 23, 2014, 07:03:05 AM
1. fps drops happen when snow texture is changing. i saw, that snow melts in steps (has this always been like this?!) and does not go up from winter to summer snowline in one day. every time snow goes up a little, the fps drop. so most fps drops happen during spring/fall, there are only a few drops in summer.

It has been that way for years. First of all, the snow line moves up and down one height level every month, similar to how it is in real life. Secondly, the entire map isn't updated at the same time, except for small maps, because changing climate takes a toll for big maps.

DrSuperGood

Build your own MSVC build. The performance is about 30% better than the Linux based GCC nightly/release builds you can download from this site thanks to either better compiler optimization or the ability to use SSE2. Stick with 32bit as the 64bit will perform worse and could be buggy.

Quote
5. on small maps, this problem is not so big, fps drops are smaller. normally i play maps like 2000x1500.
To be expected with maps that size, unfortunately. The 7000*5000 experimental server locks up for 1-3 seconds at the end of every month. The issue is probably that the monthly operations all run at the month borders instead of being properly staggered over the entire month.

gauthier

I also encounter FPS drops regularly ... but only when I use fast forward. It is likely to be the same problem. However, I find it odd that you always have this problem whereas I get it only in fast forward, especially knowing that my computer is about five years old now and completely outdated compared to yours.

Ters

Quote from: DrSuperGood on December 23, 2014, 02:50:29 PM
Build your own MSVC build.

That's expecting quite a lot from a new poster. They're clearly not computer illiterate, but still.

DrSuperGood

Quote
That's expecting quite a lot from a new poster. They're clearly not computer illiterate, but still.
I know that; however, it was only a suggestion which could help him immediately.

For a developer solution one would have to change the monthly update mechanics to be staggered over the entire month instead of just at the start. I am not sure how easy that would be.

TurfIt

Quote from: Ters on December 23, 2014, 10:26:35 AM
It has been that way for years. First of all, the snow line moves up and down one height level every month, similar to how it is in real life. Secondly, the entire map isn't updated at the same time, except for small maps, because changing climate takes a toll for big maps.
No, the snowline moves smoothly up/down one height level between the winter and summer heights in a roughly sinusoidal curve over the entire year. Months are not directly involved (they simply define a point on the curve, but the curve is interpolated from the monthly values). Currently the map is processed in chunks of 1/16 the map size, or 16384 tiles, whichever is smaller. Spreading out the update like this sounds like a good idea, but in practice it results in a bad interaction with the frame timing algorithm and causes an unnecessary several second hiccup. In my testing, increasing the amount processed at once improves the smoothness, but increase it to what? My system can easily handle 2^20 tiles at once. Slower systems would choke even worse than the current hiccup if set to that.
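The chunk rule described above can be sketched roughly like this. It is only an illustration, not the actual Simutrans code: `season_chunk_size` and `season_update_steps` are hypothetical names, and "map size" is assumed to mean the total tile count.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: tiles handled per update step, following the rule
// "1/16 of the map, or 16384 tiles, whichever is smaller".
std::uint32_t season_chunk_size(std::uint32_t width, std::uint32_t height)
{
    const std::uint64_t total = static_cast<std::uint64_t>(width) * height;
    // Never return 0, so that tiny maps still make progress.
    const std::uint64_t sixteenth = std::max<std::uint64_t>(total / 16, 1);
    return static_cast<std::uint32_t>(std::min<std::uint64_t>(sixteenth, 16384));
}

// Hypothetical helper: number of steps needed to touch every tile once.
std::uint32_t season_update_steps(std::uint32_t width, std::uint32_t height)
{
    const std::uint64_t total = static_cast<std::uint64_t>(width) * height;
    const std::uint32_t chunk = season_chunk_size(width, height);
    return static_cast<std::uint32_t>((total + chunk - 1) / chunk);  // round up
}
```

Under these assumptions a 2000x1500 map hits the 16384 tile cap and needs 184 steps for one full pass, while a 256x256 map is done in 16 steps of 4096 tiles.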


Quote from: DrSuperGood on December 23, 2014, 02:50:29 PM
Build your own MSVC build. The performance is about 30% better than the Linux based GCC nightly/release builds you can download from this site thanks to either better compiler optimization or the ability to use SSE2. Stick with 32bit as the 64bit will perform worse and could be buggy.
Do note the official builds are optimized debug builds so that assertions are enabled. This results in a rather substantial performance penalty (there's got to be a better way...), and is likely why you're seeing MSVC perform better, assuming you're not building in debug. Using MSVC is not recommended because it disables the inline asm, which gives a substantial performance boost to the graphics rendering in 32bit GCC builds. There is also the lack of proper structure packing/alignment, which is set up for GCC only.


Quote from: DrSuperGood on December 23, 2014, 02:50:29 PM
To be expected with maps that size, unfortunately. The 7000*5000 experimental server locks up for 1-3 seconds at the end of every month. The issue is probably that the monthly operations all run at the month borders instead of being properly staggered over the entire month.
Experimental is a completely different animal. It has several very time consuming functions tied to month end (needlessly IMHO...). At month end, Standard is only doing financial roll overs, and some basic housekeeping w.r.t available vehicles, etc. Nothing overly computation intensive like Experimental.

That said, a month change that also corresponds to a season change does trigger the iteration over the whole map even when the snowline doesn't change. Not sure why...


Quote from: gauthier on December 23, 2014, 03:10:29 PM
I also encounter FPS drop regularly ... only when I use fast forward. It is likely to be the same problem. However I find odd that you always have this problem whereas I get it only in fast forward, especially knowing that my computer is about five years old now and completely outdated compared to yours.
Fast forward runs at a fixed 10 fps. If you're managing fps drops here, your system is hopelessly too slow to run at the zoom out / screen resolution you're trying to. Normally when a system is too slow to run a map (i.e. the map is too complex), fast forward mode simply results in running at less than realtime (it turns into slow forward!).


Quote from: quirinus on December 23, 2014, 07:03:05 AM
1. fps drops happen when snow texture is changing. i saw, that snow melts in steps (has this always been like this?!) and does not go up from winter to summer snowline in one day. every time snow goes up a little, the fps drop. so most fps drops happen during spring/fall, there are only a few drops in summer.
A fps drop here is expected, however your system specs would indicate it should be but a blip. Certainly not dropping to unplayable. The new stable release is the first with the double heights / per tile climate feature. There is a performance hit for this feature, especially during the snowline change, but again it really shouldn't be dropping to unplayable. So I'm at a bit of a loss to explain your symptoms.


Quote from: quirinus on December 23, 2014, 07:03:05 AM
2. curious: the same problem happens as well if i set the landscape to no snow or set the snowline not to change. so i guess it has nothing to do with graphic changes.
How are you doing this? I'm not aware of any 'no snow' setting. Best you can do is set the winter snowline equal to the summer. That will minimize number of times the game updates the climates across the entire map, however it will still occur when the season changes (even if the graphics don't change).


Quote from: quirinus on December 23, 2014, 07:03:05 AM
i went through the settings (simuconf.tab, etc) but i didn't find any setting for climate changes, though it has to be written somewhere when the snow goes up. i would not have any problem if the snow went up in one step.
Setting the winter snowline in the landscape settings dialog at map creation is the only available setting to control this.

Markohs

Quote from: TurfIt on December 24, 2014, 12:03:00 AM
Do note the official builds are optimized debug builds so that assertions are enabled. This results in a rather substantial performance penalty (there's got to be a better way...), and is likely why you're seeing MSVC perform better, assuming you're not building in debug. Using MSVC is not recommended because it disables the inline asm, which gives a substantial performance boost to the graphics rendering in 32bit GCC builds. There is also the lack of proper structure packing/alignment, which is set up for GCC only.

Has anybody actually benchmarked this? I wouldn't be surprised if msvc builds were way faster than gcc ones, even without compiler-specific optimizations in the code, like we have for gcc, and even without the assembler. VS is a compiler made by an enterprise, and targets just one platform, while gcc is designed to be multi-platform, and programmed by software enthusiasts (and contributions from some enterprises). Anyway, gcc is obviously the main target of simutrans, since it's a project that needs to run on both linux and windows.

EDIT: If someone points me to where I can get details of which compiler version + settings are used in the official build, I volunteer to compare it with a VS2013 build.
EDIT2: Though I have to say that years ago I did some basic comparisons and found gcc builds were faster because of that asm.

TurfIt

Quote from: TurfIt on December 24, 2014, 12:03:00 AM
That said, a month change that also corresponds to a season change does trigger the iteration over the whole map even when the snowline doesn't change. Not sure why...
Tree aging! doh. 

@quirinus - how does a map with no trees perform?
I have noticed the tree generator going berserk occasionally and covering 90% of a map with trees. Maintaining forests does suck an inordinate amount of processing power...


Quote from: Markohs on December 24, 2014, 01:19:41 AM
Has anybody actually benchmarked this? I wouldn't be surprised if msvc builds were way faster than gcc ones, even without compiler-specific optimizations in the code, like we have for gcc, and even without the assembler. VS is a compiler made by an enterprise, and targets just one platform, while gcc is designed to be multi-platform, and programmed by software enthusiasts (and contributions from some enterprises). Anyway, gcc is obviously the main target of simutrans, since it's a project that needs to run on both linux and windows.

EDIT: If someone points me to where I can get details of which compiler version + settings are used in the official build, I volunteer to compare it with a VS2013 build.
EDIT2: Though I have to say that years ago I did some basic comparisons and found gcc builds were faster because of that asm.
It's not really compiler specific optimizations, more just compiler specific syntax to provide some hints to the compiler. MSVC could be made to work just as well I'm sure, but somebody who uses it has to take the time to rewrite the GCC specific syntax into whatever MSVC is expecting, such as AT&T asm to Intel syntax for MSVC. The optimizations are mainly processor/architecture specific (x86) - alignments and packing, nothing to do with platforms (OS).

When it comes to made by an enterprise vs enthusiasts, I'd certainly lean to the enthusiasts  8).

I think only Prissi can answer what compiler/settings are used for the releases, but as mentioned they are still debug builds. And I'm certain they're done with an old compiler, so find VS2005 or so for a 'fair' comparison! I'm using GCC4.7.2 personally as that's the latest I could graft into my MinGW install and get to work. At this point, it's largely impossible for someone to obtain a new MinGW from scratch, and successfully get Simutrans to work. I could maybe send you my install if you're truly interested. MSVC (or any MS development crap) is still banned from getting within 10 miles of any of my systems - once burned, never again.

DrSuperGood

Quote
It's not really compiler specific optimizations, more just compiler specific syntax to provide some hints to the compiler. MSVC could be made to work just as well I'm sure, but somebody who uses it has to take the time to rewrite the GCC specific syntax into whatever MSVC is expecting, such as AT&T asm to Intel syntax for MSVC. The optimizations are mainly processor/architecture specific (x86) - alignments and packing, nothing to do with platforms (OS).
There are exactly 3 pieces of assembly as far as I can tell. Each could be easily ported to MSVC if required; however, it should be possible to get similar output assembly from C code (it does nothing that clever). Most of the performance differences will likely come from standard library implementations and structure packing.

Ters

I've seen from benchmarking tests that MSVC does/did output faster machine code than GCC, but not by much. (Intel's compiler is/was even better.) However, even building Simutrans oneself can yield a faster running Simutrans than the nightly build simply by telling the compiler to target a more modern CPU. Letting the compiler use SSE2 on the unoptimized C code might be faster than the handwritten assembly on computers with SSE2 support. (I think SSE2 might be required, not just SSE. It's been a while, so my memory might fail me.)

However, since this is tied to the months, the issue here is most likely tied to having to process the entire map for texture updates and/or tree aging. I doubt there is much the compiler can do with this on its own.

DrSuperGood

The only real solution is to stagger as much as possible. However if you have 3,000,000 tiles of trees needing to be updated every month maybe even that will not suffice.

Since trees are cosmetic, maybe their aging could be variable period? By this I mean the more trees there are on the map, the longer the period between the "ageing" tick with each tree aging more per tick. The number of such tree ticks per game tick could then be a constant.
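A minimal sketch of that variable period idea, under stated assumptions: `Tree`, `TreeAger` and the per-tick `budget` are hypothetical names for illustration, not anything from the Simutrans source. Each game tick touches at most `budget` trees, and every touched tree ages by the number of ticks a full pass over all trees takes.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical tree record; age is counted in game ticks.
struct Tree {
    std::uint32_t age = 0;
};

// Ages at most `budget` trees per game tick, so the cost per tick stays
// constant no matter how many trees exist on the map.
class TreeAger {
public:
    explicit TreeAger(std::size_t budget) : budget_(budget) {}

    void step(std::vector<Tree>& trees)
    {
        if (trees.empty() || budget_ == 0) {
            return;
        }
        if (cursor_ >= trees.size()) {
            cursor_ = 0;  // the tree list shrank since the last step
        }
        // Ticks needed for one full pass; each visit ages a tree by this much.
        const std::uint32_t period =
            static_cast<std::uint32_t>((trees.size() + budget_ - 1) / budget_);
        const std::size_t end = std::min(cursor_ + budget_, trees.size());
        for (std::size_t i = cursor_; i < end; ++i) {
            trees[i].age += period;
        }
        cursor_ = (end == trees.size()) ? 0 : end;
    }

private:
    std::size_t budget_;
    std::size_t cursor_ = 0;
};
```

With 10 trees and a budget of 4, a full pass takes 3 ticks and each tree ages by 3 per visit, so the average aging rate stays at one per tick. The trade-off is that on a crowded map the visible growth happens in bigger, rarer jumps.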

Markohs

It's normal that gcc generates slower code than MSVC; gcc is designed to be multi-platform, and it targets more languages. Gcc performs worse than MSVC, that's a fact, but it's a normal thing to expect, since professional products usually outperform free ones. Don't get me wrong, I love free software, and I think the world needs more free software, and it's vital for our society. But it's software made by hobbyists, mainly. It's like comparing Solaris or Mac OS X to Linux.

MSVC performs better than gcc (maybe not on all situations, ofc):

http://www.willus.com/ccomp_benchmark2.shtml?p8

EDIT: I remember one of the subjects in the CS degree I did was about program performance tuning. What I remember is that most of the optimizing habits programmers have today are inherited from the times when compilers were really simple and did almost no optimization. We did multiple exercises where we aligned data structures, unrolled loops, packed data, reduced memory access patterns... What I learned was that almost all the optimizations we made were only useful with gcc -O1, and when we used -O3 they were even counterproductive, because the compiler found an even better solution to the problem, and our attempts to optimize kept the compiler from optimizing further, because our code was telling it about dependencies (or restrictions) that it had to honor. The result was that most of the time we got the fastest executable with just simple code and -O3.

Compilers nowadays do lots of code optimization, lots, and they do it way better than human programmers. Just focus on giving the compiler a clear algorithm and data structures, use the language naturally, in a way that lets the compiler understand what you are doing, and it'll find a better and faster way of doing it. Packing structures is only useful for reducing memory usage, and most of the time you'll end up with slower code, even if memory usage is lower.

DrSuperGood

Quote
It's normal that gcc generates slower code than MSVC; gcc is designed to be multi-platform, and it targets more languages. Gcc performs worse than MSVC, that's a fact, but it's a normal thing to expect, since professional products usually outperform free ones. Don't get me wrong, I love free software, and I think the world needs more free software, and it's vital for our society. But it's software made by hobbyists, mainly. It's like comparing Solaris or Mac OS X to Linux.
One would imagine otherwise actually seeing how many professional bodies use GCC. It is actually kind of stupid that this is the case since companies like Oracle invest heavily in Linux attributing a significant number of commits to the official source code.

It is possible that much of MSVC's speed comes from better standard libraries. Since it is written by Microsoft, they probably use the OS more efficiently than third party standard library solutions, resulting in higher performing code. Silly things like fewer OS calls could contribute to speedups.

Quote
http://www.willus.com/ccomp_benchmark2.shtml?p8
Does not apply to simutrans. Whereas that simulation is heavy on double precision floats, simutrans hardly ever uses floats (with the exception of some local output), mostly using 32 and 64 bit precision integer mathematics. One can assume it was done using an Intel processor, seeing how the Intel compiler performed best (it probably understands floating point behaviour better than the other compilers).
Quote
EDIT: I remember one of the subjects in the CS degree I did was about program performance tuning. What I remember is that most of the optimizing habits programmers have today are inherited from the times when compilers were really simple and did almost no optimization. We did multiple exercises where we aligned data structures, unrolled loops, packed data, reduced memory access patterns... What I learned was that almost all the optimizations we made were only useful with gcc -O1, and when we used -O3 they were even counterproductive, because the compiler found an even better solution to the problem, and our attempts to optimize kept the compiler from optimizing further, because our code was telling it about dependencies (or restrictions) that it had to honor. The result was that most of the time we got the fastest executable with just simple code and -O3.
It all depends on how you try to optimize something. As soon as you try to go low level in C/C++, chances are it can mess up the compiler. It is best to have clean high level code with minimized dependencies that explicitly stores complex intermediates.

Most large optimizations come algorithmically. An example in simutrans would be for cross-connect having separate lists for different production types so that you do not iterate through every factory on the map to find all factories that produce a specific good for each factory on the map (O(n^2) to O(n)). Only in very tight code could you possibly find a programming optimization but that would generally come by simplifying the mathematics involved. For example in Simutrans to display factory production before it performed the following...
32bit ints -> double floats -> some operations -> double float -> 32bit int.
However with my JIT2 system this was changed to simply...
32bit ints -> 64bit operations -> 32bit int
That is an optimization a compiler cannot pick up because it would change the code's functional behaviour. In this case it makes no difference since it was hardly performance critical; however, I did read that the guys doing the open source 0 A.D. game suffered from this issue in their core engine, which wasted non-trivial time.
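To make the two pipelines concrete, here is a small sketch. It is not the actual factory display or JIT2 code; `scale_via_double` and `scale_via_int64` are hypothetical helpers that scale a 32bit value by a percentage, once through doubles and once through 64bit integers.

```cpp
#include <cstdint>

// Hypothetical: 32bit ints -> double floats -> operations -> 32bit int.
std::int32_t scale_via_double(std::int32_t base, std::int32_t percent)
{
    const double scaled = static_cast<double>(base) * percent / 100.0;
    return static_cast<std::int32_t>(scaled);  // truncates toward zero
}

// Hypothetical: 32bit ints -> 64bit integer operations -> 32bit int.
// The 64bit intermediate keeps base * percent from overflowing 32 bits.
std::int32_t scale_via_int64(std::int32_t base, std::int32_t percent)
{
    const std::int64_t scaled = static_cast<std::int64_t>(base) * percent / 100;
    return static_cast<std::int32_t>(scaled);  // integer division truncates too
}
```

For everyday values both give the same answer, e.g. 1000000 at 150% is 1500000 either way. The compiler still may not swap one for the other on its own, because a product of two 32bit values can exceed 2^53, where the double path starts rounding and the integer path does not.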

Quote
Packing structures is only useful for reducing memory usage, and most of the time you'll end up with slower code, even if memory usage is lower.
Actually compilers still have problems with packing structures. I think it is because they purposely lay out member variables in the order of declaration as opposed to the best optimized order. The people doing the Dolphin Wii/Gamecube emulator managed to obtain 5-10% speedups simply by re-ordering their structures. Unlike Simutrans, their project only targets high end systems due to the complexity of the platform being emulated. They did however gain non-trivial performance (it worked better) using a 64bit build last I heard, but that might be because the PowerPC chips in the Gamecube/Wii might have been 64bit.
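The effect of member order on layout can be shown with a small sketch. `TileLoose` and `TileTight` are made-up structures for illustration, not Simutrans or Dolphin types; they hold the same four members in a different declaration order.

```cpp
#include <cstdint>

// Hypothetical tile record with members declared in a "natural" order.
// The compiler keeps this order and pads around the 32bit member.
struct TileLoose {
    std::uint8_t  climate;
    std::uint32_t height;
    std::uint8_t  flags;
    std::uint16_t owner;
};

// Same members, sorted from largest to smallest alignment requirement,
// which leaves no room for padding between them on common ABIs.
struct TileTight {
    std::uint32_t height;
    std::uint16_t owner;
    std::uint8_t  climate;
    std::uint8_t  flags;
};

// Reordering should never make the structure bigger.
static_assert(sizeof(TileTight) <= sizeof(TileLoose),
              "reordered layout should not be larger");
```

On typical x86 ABIs the first layout takes 12 bytes and the second 8, so more tiles fit in each cache line. Whether that turns into a measurable speedup depends on the access pattern, so treat it as something to benchmark rather than a guaranteed win.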

Ters

Quote from: Markohs on December 24, 2014, 07:24:22 PM
It's normal that gcc generates slower code than MSVC; gcc is designed to be multi-platform, and it targets more languages. Gcc performs worse than MSVC, that's a fact, but it's a normal thing to expect, since professional products usually outperform free ones. Don't get me wrong, I love free software, and I think the world needs more free software, and it's vital for our society. But it's software made by hobbyists, mainly.

Development of much central open source stuff is done by commercial enterprises, either companies that have found open source useful (Oracle, Google), or that were founded on it (Red Hat). GPL is much more suitable for big business than hobbyists. However, that people are actually employed to write and maintain open source software does not mean that quality is good, if the companies paying them don't pay for enough development time. But that is true for closed source as well. It seems to me that if things are sold with a support agreement, quality will be better. The Linux kernel gets lots of professional contributions because it is used in many commercially sold products, whether just a Linux distro, or something running Linux. GCC will likely benefit somewhat from this, as it is needed to build the Linux kernel. More obscure open source software may not get much of the money flow, which OpenSSL suffered from. (Although OpenSSL was/is maintained by a full-time employee, not just hobbyists. But that's the thing, it was just one.)

TurfIt

Back on topic (after scaring away the OP...), r7442 might help. I'm seeing a ~350% speedup of the season change.

Ters