The International Simutrans Forum


Author Topic: SDL2 performance regression  (Read 2273 times)

0 Members and 1 Guest are viewing this topic.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1309
SDL2 performance regression
« on: January 14, 2018, 11:56:03 PM »
Recently comparing the performance of the backends in Extended, I noticed SDL2 performing much worse than previously. Switching to Standard instead, the same problem exists.
With dirty tile updates, frame times increase by 33%! But after changing to the direct3d renderer instead, using dirty tiles decreases frame times by 25%, as expected and as before.

The debugging switch to disable dirty tiles, "-use_hw", is still active. Can anyone duplicate? Especially on Linux / OSX platforms where switching away from opengl is not possible...



Offline captain crunch

  • *
  • Posts: 97
Re: SDL2 performance regression
« Reply #1 on: January 15, 2018, 12:59:01 AM »
On Linux 3.16, amd64, with SDL2 2.0.2, SDL Driver: x11:
  • "sim -objects pak -use_hw": ca. 38% CPU usage.
  • "sim -objects pak": ca. 27% CPU usage.
HTH

(Edit: added display stats).

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: SDL2 performance regression
« Reply #2 on: January 15, 2018, 07:12:07 AM »
Could it be Spectre or Meltdown related? If there are a lot of system calls and you've got the patch for them, a 30% performance hit has been suggested.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: SDL2 performance regression
« Reply #3 on: January 15, 2018, 08:55:59 AM »
Meltdown only affects Intel CPUs from what I read; it slightly increases the mode change overhead by forcing the translation lookaside buffer to dump all kernel pages when returning to application mode. AMD apparently does not need to do this, hence they leaked its existence to the world.

Spectre affects all pipelined CPUs, but the patches should either cause practically no performance loss (microcode) or cause performance loss only for specific sensitive tasks (application patches). Basically, they have to make sure that processing sensitive data due to pipeline predictions does not leave measurable traces inside the processor pipeline. An example of a microcode fix would be forcing an exact cycle delay in response to an illegal memory access, so that the measurable delay of the pipeline is invariant with respect to the address accessed.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: SDL2 performance regression
« Reply #4 on: January 15, 2018, 04:56:22 PM »
Meltdown affects ARM as well, but I don't think that is relevant in this case. Wikipedia cites a PC World article saying that Spectre patches also cause performance drops, especially on older processors, but up to 14% even on what I understand are the newest CPUs.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: SDL2 performance regression
« Reply #5 on: January 16, 2018, 03:24:33 AM »
Yes, but the SDL memory copying accesses a lot of memory that is likely not cached anyway, so we are already doing worst-case slow memory access here ... so no slowdown due to those patches, I think. Use_HW has never really been a good idea with SDL, and the documentation even says so.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1309
Re: SDL2 performance regression
« Reply #6 on: January 16, 2018, 03:57:12 AM »
I rather doubt the Meltdown/Spectre patches are involved - the computer in question is an older Win7 unit that hasn't been updated since these came out.

I've now tried it on two other systems. Neither shows the large slowdown of the first, but one has SDL2 performing badly, period, and all 3 have terrible GDI performance.
1) i7-3700k, AMD 7970, Win7. SDL2 ok in directx mode, but not opengl unless dirty tiles are turned off (-use_hw). SDL1 ok - slightly faster. GDI bad.
2) i7-6700k, Nvidia 980 Ti, Win10. SDL2 ok (directx 10% faster than gl, but gl still ok). SDL1 same times as SDL2. GDI completely unusable zoomed out - >100ms frame time!
3) i5-3210M based laptop, Intel graphics, Win10. SDL2 slow across the board, but the same for directx and opengl. SDL1 ok (50-100% faster zoomed in, but the same when zoomed out). GDI 20% slower than SDL2, but remains usable.

Too many variables. WAG - bad ATI driver update... and Intel drivers are always bad for performance.

Disabling the forced opengl mode 'fixes' the issue on system 1. I'd planned to commit that anyway since Dwach's SDL2 crash fixes seemed to fix the directx crashes too, but was/am waiting until after the release so it can have some wider testing...

GDI as a backend choice should just be removed. It's terrible, and just getting worse. >100ms frame times when zooming all the way out now - lol. SDL1 does 17ms displaying exactly the same scene.


Use_HW has never really been a good idea with SDL, and the documentation even says so.
Note: the SDL1 -use_hw switch was hijacked by the SDL2 backend to do something very different for testing reasons, and was never removed. In SDL2 it disables the per-dirty-tile calls to SDL_UpdateTexture() and instead calls it once, updating the entire screen. On system 1 with opengl, at any more than ~80 update calls it is faster to have one call doing the whole screen.
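That break-even can be sketched in plain C++ (the names and the default threshold here are illustrative assumptions, not Simutrans code - ~80 is just the figure measured on system 1 with opengl):

```cpp
#include <cstddef>
#include <vector>

struct Rect { int x, y, w, h; };

// Past a certain number of per-tile update calls, the fixed per-call
// overhead outweighs the savings of updating only dirty regions, and a
// single whole-screen update becomes cheaper.
std::vector<Rect> plan_updates(const std::vector<Rect>& dirty,
                               int screen_w, int screen_h,
                               std::size_t call_threshold = 80) {
    if (dirty.size() > call_threshold) {
        // Too many small calls: one full-screen update wins.
        return { Rect{0, 0, screen_w, screen_h} };
    }
    return dirty;  // Few rects: update only the dirty regions.
}
```

Each rect returned would then correspond to one SDL_UpdateTexture() call in the backend.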

On Linux 3.16, amd64, with SDL2 2.0.2, SDL Driver: x11:
Thanks for trying, but I'm confused... SDL Driver: x11 is a printout from the SDL1 backend... SDL2 doesn't show such??  EDIT: actually it does.
Also, SDL2 2.0.2 is quite old. Perhaps try with the current version, 2.0.7, if possible?
And unless you're printing out the raw frame timing, you won't see the actual times. The GUI-displayed times include waiting time.

One way to test without a custom executable is to simply zoom out far enough (while running at a high enough resolution) to overload your computer so it can't keep up with the selected fps. That's how I noticed it in the first place - zoomed out on a computer that could previously easily handle 30 fps and ended up at 20 instead...
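For reference, raw frame timing of the kind I mean is just wrapping the draw call itself - this is an illustrative sketch, not the actual Simutrans timing code:

```cpp
#include <chrono>

// Time only the drawing call itself. The in-game display shows the whole
// frame period, which also includes the frame-rate limiter's waiting time,
// so it hides the real cost until the machine can no longer keep up.
template <typename DrawFn>
double raw_frame_ms(DrawFn draw) {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    draw();  // e.g. the backend's screen update
    const auto t1 = clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```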
« Last Edit: January 16, 2018, 08:30:08 PM by TurfIt »

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: SDL2 performance regression
« Reply #7 on: January 16, 2018, 04:26:22 AM »
All modern OSes run most of GDI (except a few select features) in software emulation, as it is incompatible with newer, more efficient display models. GDI used to be a lot faster, as it would write directly to output buffers and had many of its draw operations hardware accelerated.

Starting with Windows Vista, most GDI hardware acceleration support was dropped, as it conflicted with the new window management system. Instead, GDI draw calls are applied to an internal window buffer using software routines. When it comes time to display the results, the window management system pushes this buffer out as a texture to the graphics subsystem to be displayed. This was required because modern window management systems are hardware accelerated and responsible for drawing all windows, unlike the GDI approach where each application was responsible for drawing its window when requested. The cost of this was a massive GDI performance regression, as it was no longer possible to hardware accelerate GDI calls.

GDI was replaced by Direct2D in Windows Vista and newer OSes, which is pretty much fully hardware accelerated.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: SDL2 performance regression
« Reply #8 on: January 16, 2018, 05:53:43 AM »
I have used the GDI backend for years now, mostly to avoid having to set up SDL as a dependency, and Simutrans runs just fine. I have never had any need for this threaded rendering stuff either. My computer is getting rather old, but I run Windows 10. Then again, I never zoom in or out (intentionally).

The lack of hardware acceleration in GDI might mean nothing for Simutrans, as Simutrans only uses a single GDI function every frame: StretchDIBits. That one may or may not still map pretty straight through to hardware, at least when no special effects are applied. The same is true for all the other backends: Simutrans doesn't use them for drawing, just for the final upload of pixels to the screen. However, since copying these pixels to the screen may involve switching to kernel mode, doing a lot of small dr_textur calls may be less efficient these days, since switching to kernel mode has become more expensive with the patches for Meltdown.

It is possible that GDI is slow, or slower than the others, when DPI scaling is enabled. I have never tried that, as I only have an HD monitor.
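The kernel-mode argument above amounts to a simple cost model - the numbers here are made-up assumptions for illustration, not measurements:

```cpp
// Each copy call pays a fixed overhead (including any user/kernel
// transition) plus a per-pixel copy cost. Many small updates lose once
// the fixed overhead dominates - and the Meltdown patches raise exactly
// that fixed per-call cost.
double update_cost_us(int n_calls, long long total_pixels,
                      double per_call_us = 5.0,
                      double per_pixel_us = 0.001) {
    return n_calls * per_call_us + total_pixels * per_pixel_us;
}
```

With these assumed constants, 100 small calls copying 10,000 pixels in total cost far more than one call copying the same pixels, which is why the per-call count matters more than the pixel count for many tiny dirty tiles.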

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: SDL2 performance regression
« Reply #9 on: January 16, 2018, 11:33:45 PM »
Odd - with SDL2 in Extended (release build on Windows, 64-bit), I get a slightly higher frame rate with -use_hw off (15-16 fps) than with it on (12-13 fps). This is zoomed all the way out in the current Bridgewater-Brunel server game on a 4k monitor (without DPI scaling).

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1309
Re: SDL2 performance regression
« Reply #10 on: January 17, 2018, 01:23:41 AM »
That sentence is a little awkward... but if by -use_hw off you mean not passing it as an argument, then a lower frame rate when using it is the expected behaviour when things are working properly. It disables the dirty tile updates in SDL2, updating the entire screen every frame instead. You might also want to try directx instead and see if that's even faster for you... (and that assumes you're actually using opengl currently - unfortunately the helpful diagnostic messages got hidden behind some nasty debug macros so they don't show anymore, but as long as you have 'proper' drivers installed, it should be using opengl as opposed to the final software renderer fallback).

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: SDL2 performance regression
« Reply #11 on: January 17, 2018, 07:03:00 AM »
Whatever the cause, it sounds like there is increased overhead in OpenGL calls. There was a rule of thumb back in the day to make as few calls to APIs like OpenGL and Direct3D as possible, each with as much data as possible (ideally data that was already in VRAM). That was the terminal bottleneck in my attempt to use OpenGL to hardware accelerate simgraph16.cc.
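That rule of thumb - few calls, lots of data - can be sketched as dirty-rect coalescing (illustrative C++, not Simutrans code): collapse many dirty rects into one bounding box, trading a few extra pixels copied for far fewer API calls.

```cpp
#include <algorithm>
#include <vector>

struct Rect { int x0, y0, x1, y1; };  // half-open pixel bounds

// Merge all dirty rects into their single bounding box, so one large
// upload replaces many small ones. Assumes rects is non-empty.
Rect bounding_box(const std::vector<Rect>& rects) {
    Rect bb = rects.front();
    for (const Rect& r : rects) {
        bb.x0 = std::min(bb.x0, r.x0);
        bb.y0 = std::min(bb.y0, r.y0);
        bb.x1 = std::max(bb.x1, r.x1);
        bb.y1 = std::max(bb.y1, r.y1);
    }
    return bb;
}
```

Whether the extra pixels in the box are worth it depends on how high the per-call overhead is - which is exactly what seems to have changed here.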

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: SDL2 performance regression
« Reply #12 on: January 20, 2018, 03:19:47 PM »
First, the backend drawing speed should not depend on the zoom level. As said, it just copies the already drawn bitmap to the buffer. This buffer may be memory mapped or not, and the copy is usually hardware accelerated (because this call is used for all icons etc. internally).

The main problem with OpenGL SDL2 is that it crashes on all four laptops and two desktops with ancient GeForce, and built-in Intel and AMD graphics. None of them can run the Steam version. Unless this is fixed, it means that 50% of users cannot use the OpenGL SDL2 rendering.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: SDL2 performance regression
« Reply #13 on: January 20, 2018, 03:52:56 PM »
The main problem with OpenGL SDL2 is that it crashes on all four laptops and two desktops with ancient GeForce, and built-in Intel and AMD graphics. None of them can run the Steam version. Unless this is fixed, it means that 50% of users cannot use the OpenGL SDL2 rendering.

I am able to run a native Linux SDL2 build of Extended on an Intel i5 (Skylake) NUC, which uses Intel integrated graphics - albeit the Linux drivers might well be different from the Windows ones.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: SDL2 performance regression
« Reply #14 on: January 21, 2018, 01:55:54 PM »
Sorry, I should have said Windows, because on Linux there is no GDI ... But the SDL2 builds which did try OpenGL first crashed for me every time on all my Windows computers.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: SDL2 performance regression
« Reply #15 on: January 21, 2018, 01:59:26 PM »
Hmm - is this a known issue with SDL2?

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: SDL2 performance regression
« Reply #16 on: January 22, 2018, 03:07:29 AM »
I never get many answers to my complaints, since it mostly affects built-in graphics adapters.