Performance - GDI vs SDL ( Windows XP vs OS X 10.8.2)

Started by meme, February 04, 2013, 06:44:53 PM

meme

Hi,
I would like to report a huge performance difference between the OS X and Windows versions:


Hardware configuration of Mac(book Pro):
Core 2 Duo 2,53GHz
8 GB RAM
nVidia Geforce 9600M GT
Screen: 22" external 1920*1080
OS X 10.8.2
----------------------
Configuration of virtual machine:
1 CPU core (that doesn't make a difference, Simutrans is single-threaded, isn't it?)
3GB RAM
256 MB of shared GPU VRAM
Win XP
---------------------


The performance difference is enormous.
OS X Simutrans is unusable: 4 FPS at Full HD, and the GUI lags by about 5 seconds.
Windows Simutrans runs fine even though it's a virtual PC: at Full HD I get 17 FPS!


I tried the OS X native version, which unfortunately worked only with OS X Lion. With that, FPS was similar to the Windows GDI version (give or take).


I would like to ask, what's causing this speed difference? Is SDL that slow?


Thank you







DirrrtyDirk

Quote from: meme on February 04, 2013, 06:44:53 PM
I would like to ask, what's causing this speed difference? Is SDL that slow?

Speaking as a pure Windows user here, I've never had any performance problems with the SDL version (and that's what I usually play with, since in the old days, the GDI version used to have some rather annoying habits, and I never changed back since...)

Have you tried the Windows SDL version on your virtual PC as well? If SDL itself was the problem, it should perform worse there as well I guess... or is it just the SDL-implementation for Mac that is so slow...?
  
***** PAK128 Dev Team - semi-retired*****

transporter

The only time I've had performance issues is the map refresh for like seasons and stuff. Even then, only with maps larger than 1028x1028. That's with OS X 10.6, 2.8 GHz Duo, a 9400M, and 4 GB RAM

prissi

Actually Simutrans uses multiple cores for displaying (and for some things like map rotation). Did you try a Mac executable from the nightlies? Those are also SDL, but I have never heard complaints yet.

meme

I have updated SDL from 1.2.14 to 1.2.15 and installed the newest nightly, but it's still the same :(


Edit: On a 2008 iMac (2,66GHz CPU, 3GB RAM, HD2600 Pro) Simutrans runs much better. I don't understand why.


Ters

Maybe this is another case of the hypothetical throttling effect. The virtualization, like Wine in the Linux case, adds enough overhead so that the computer doesn't reduce its speed to save power, while the slower computer gets a higher relative CPU load or doesn't do this kind of throttling at all.

meme

That would be weird, because other games are running fine... (e.g. C&C 3 Tiberium Wars)



And one note: WINE = Wine Is Not an Emulator ;) --> It's only a set of Windows libraries plus an API translation layer, it isn't a virtual OS :) (Can Simutrans run in Wine? Just as a test :) )



Edit: It does, and better than the native version :-X


Markohs

I was thinking about the CPU throttling effect too. Can you somehow disable CPU throttling to test whether it makes a difference? On PCs you can do this in the BIOS setup.

meme

Hm... I think I could, by deleting the *.kext(s) that control Intel SpeedStep, but I'd rather not do that...

And why would the CPU throttle when other things are running fine? (And if I run some single-core intensive task, like my USB TV tuner, it still runs the same, only with a bit lower FPS, which is perfectly fine, as it is a heavy task for the GPU.)


Ters

I know Wine is not an emulator, but the mapping between the Windows API and the host API will add some overhead, as will the hypervisor when doing virtualization. The uncertainty regarding whether this overhead is significant enough is the reason I used the word hypothetical.

According to the throttling theory, other games might run fine because they either have a higher CPU load, which keeps it from being throttled down, or lower I/O demands, which means they run fine with the lower bus speed in power saving mode. (GPU load might also be a factor.)

meme


if it's too small: http://postimage.org/image/dzx1uqr2j/

Simutrans running in wine.
Simutrans running in native environment http://s11.postimage.org/tm7jevmgz/Sn_mek_obrazovky_2013_02_05_v_19_51_37.png


Markohs

We are just asking you to disable SpeedStep to diagnose whether the problem is there. There might be a bug in Simutrans, one that has already shown up on Ubuntu machines, that makes it perform poorly on CPUs that scale their speed. But this is not confirmed.

Ters

I seem to remember that CPU load was lower in at least one of the two other cases, but I couldn't find the numbers at a glance. Unless OS X shows CPU usage as a percentage of the throttled CPU speed, 20 % could be high enough to prevent throttling.

How big is the map? The 20 % is about what I get on my single-threaded Simutrans on Windows with about the same CPU speed (but more cores) and display size. So what does Simutrans do the other 60 %? Unoptimized build?

meme

Map is 448^2 (448*448) tiles. 


Another screen: http://postimage.org/image/9qfq5n7fz/ (It's native version)



Edit: Looks like I'll have to talk with Apple ( They're either hiding EIST or it doesn't work at all)
hw.cpufrequency_min: 2530000000
hw.cpufrequency_max: 2530000000
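
The two values above look like `sysctl` output, so here is a small sketch of how one might read them. It assumes the lines come in the `key: value` form shown; the helper names are made up for illustration. Equal minimum and maximum values only mean that no scaling range is reported, not that the CPU never throttles.

```python
def parse_sysctl(text):
    """Parse 'key: value' lines (as printed by sysctl) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            values[key.strip()] = int(value.strip())
    return values


def reports_frequency_scaling(values):
    """True if the reported minimum and maximum CPU frequencies differ."""
    return values["hw.cpufrequency_min"] != values["hw.cpufrequency_max"]


# The two lines quoted in the post above.
sample = """hw.cpufrequency_min: 2530000000
hw.cpufrequency_max: 2530000000"""

freqs = parse_sysctl(sample)
print(freqs["hw.cpufrequency_max"] / 1e9, "GHz")
print("scaling range reported:", reports_frequency_scaling(freqs))
```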


Ters

That doesn't sound right. My 2.3 GHz computer does bigger maps with much less CPU than that, though it's a newer and more powerful CPU/computer which might have something to say. Since I build Simutrans myself, it can also utilize newer features than the general releases.

meme

And could you "lend" me your release? I'd try it and we (you) would see the difference, if there is any.


Ters

My build won't work on a Core 2 Duo. Besides, it's not the Windows version that you're having problems with.

Markohs

Quote from: Ters on February 05, 2013, 07:22:49 PM
I seem to remember that CPU load was lower in at least one of the two other cases, but I couldn't find the numbers at a glance. Unless OS X shows CPU usage as a percentage of the throttled CPU speed, 20 % could be high enough to prevent throttling.

How big is the map? The 20 % is about what I get on my single-threaded Simutrans on Windows with about the same CPU speed (but more cores) and display size. So what does Simutrans do the other 60 %? Unoptimized build?

I'd say on all systems CPU % is the percentage of time the CPU has been busy over a certain period, regardless of its frequency or potential computing power.
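
Under that definition, the same amount of work shows up as a higher percentage when the clock is lower. A minimal sketch with made-up numbers: the workload of 0.5 billion cycles per second and the 1.6 GHz throttled clock are assumptions for illustration, not measurements from this thread.

```python
def busy_seconds(cycles_needed, clock_hz):
    """Time the CPU is busy doing a fixed number of cycles at a given clock."""
    return cycles_needed / clock_hz


def cpu_percent(busy, interval):
    """CPU usage as the share of the interval during which the CPU was busy."""
    return 100.0 * busy / interval


work_cycles = 0.5e9       # hypothetical work per second of wall-clock time
full_clock = 2.53e9       # nominal clock of the Core 2 Duo from the first post
throttled_clock = 1.6e9   # hypothetical power-saving clock

full = cpu_percent(busy_seconds(work_cycles, full_clock), 1.0)
throttled = cpu_percent(busy_seconds(work_cycles, throttled_clock), 1.0)
print(f"at full speed: {full:.1f} %, throttled: {throttled:.1f} %")
```

So a reading of around 20 % cannot by itself tell which clock the CPU was running at.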


prissi

I highly suspect SDL, which might have an issue with 16 bit color depth. That is seldom used nowadays, and support for it has become more and more crappy. Try SDL version 1.2.12 or 1.2.13.

meme

You're right, with 1.2.12 I get 13 FPS instead of 4! :) It's still less than with GDI, but it is playable ;)

Thank you

Link for other Mac users with this problem: http://www.libsdl.org/release/SDL-1.2.12.dmg


meme

I've probably found what's causing the bad performance:

Feb  7 16:34:38 MBP.local simutrans[53717] <Error>: The function `CGSFlushWindow' is obsolete and will be removed in an upcoming update. Unfortunately, this application, or a library it uses, is using this obsolete function, and is thereby contributing to an overall degradation of system performance. Please use `CGSFlushWindowContentRegion' instead.



- It should be using CGSFlushWindowContentRegion instead of CGSFlushWindow


prissi

That should rather be done by the SDL people, as I have exactly zero control over this function :(

meme

But this appears with version 1.2.12, which is the last version of SDL that works normally with Simutrans... 1.2.13 behaves as badly as the latest one.


Ters

I can't really say I see any obvious culprit in the SDL source code for Mac between 1.2.12 and 1.2.13, but then the Quartz code is rather cryptic to me. It could also be a change that isn't specific to Mac.

prissi

I really suspect that 16 bit support was changed, probably from hardware to software support, i.e. emulation. That seems most likely. But I am Mac-illiterate, so this is only a guess.

One could try compiling with 15 bit support and see what happens. This will almost certainly require software emulation. However, slowing things down was not the intent.
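
To illustrate what software emulation of a 16 or 15 bit surface would mean per pixel, here is a minimal sketch. It assumes the common RGB565 and RGB555 layouts; the thread does not say which layout SDL or Simutrans actually uses on the Mac, and the function names are made up for this example.

```python
def rgb565_to_rgb888(pixel):
    """Expand a 16 bit RGB565 pixel to three 8 bit channels."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Copy the top bits into the low bits so that full scale maps to 255.
    return (r5 << 3) | (r5 >> 2), (g6 << 2) | (g6 >> 4), (b5 << 3) | (b5 >> 2)


def rgb555_to_rgb888(pixel):
    """Expand a 15 bit RGB555 pixel (top bit unused) to three 8 bit channels."""
    r5 = (pixel >> 10) & 0x1F
    g5 = (pixel >> 5) & 0x1F
    b5 = pixel & 0x1F
    return tuple((c << 3) | (c >> 2) for c in (r5, g5, b5))


print(rgb565_to_rgb888(0xF800))  # pure red in RGB565
print(rgb555_to_rgb888(0x7C00))  # pure red in RGB555
```

If a conversion like this has to run in software for every pixel of every frame, the cost grows with the window size, which would fit the symptoms described in this thread.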

Sorrento

Quote from: Ters on February 07, 2013, 07:07:33 PM
I can't really say I see any obvious culprit in the SDL source code for Mac between 1.2.12 and 1.2.13, but then the Quartz code is rather cryptic to me. It could also be a change that isn't specific to Mac.


I'm using 1.2.15 on a 15" rMBP, and I get low FPS, roughly 3~4 FPS. It seems to support the Retina display.

I have tried 1.2.12. It was smooth, except that it doesn't support the Retina display: three quarters of the screen were white and empty, and there was no mouse either.

prissi

A larger display (i.e. Retina) needs four times the computing power (and in reality even about 10 times). Thus there is not much that can be done but zoom in to have fewer objects on the screen. Does the frame rate increase significantly when zooming in?

Sorrento

Quote from: prissi on March 12, 2013, 04:00:51 PM
A larger display (i.e. Retina) needs four times the computing power (and in reality even about 10 times). Thus there is not much that can be done but zoom in to have fewer objects on the screen. Does the frame rate increase significantly when zooming in?

Zooming in had no effect in my case, but it seems the window size of the game in OS X does affect it. If I enlarge the window to near full screen, the frame rate drops from 10 fps to 4 fps, but if I use the default window size, the frame rate drops from 10 fps to 6 fps.

The update from OS X 10.8.2 to OS X 10.8.3 did not help.

I now have to run the game in a virtual PC to get a solid 10 fps at 1680 x 1050 in the virtual machine.

Hopefully SDL or GDI will get better Retina support in the future...

Ters

It might be that it's not computing power, but all the individual pixels having to move from system memory to the graphics card. When running in a virtual PC, it only needs to do the 1680x1050 pixels, which then might be scaled up when compositing on the graphics card. It doesn't explain why the default window size is slower, though, unless the default window size is somehow scaled up to avoid being tiny.
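
To put rough numbers on the amount of pixel data involved, here is a small sketch. It assumes 2 bytes per pixel (the 16 bit depth mentioned earlier in the thread) and a native panel of 2880x1800 for the 15" Retina MacBook Pro; neither figure is stated in the post above, and the sketch says nothing about how the copy is actually performed.

```python
def frame_bytes(width, height, bytes_per_pixel=2):
    """Size of one full frame in bytes."""
    return width * height * bytes_per_pixel


def megabytes_per_second(width, height, fps, bytes_per_pixel=2):
    """Data moved per second if every frame is copied in full."""
    return frame_bytes(width, height, bytes_per_pixel) * fps / 1e6


vm = frame_bytes(1680, 1050)       # resolution used inside the virtual PC
retina = frame_bytes(2880, 1800)   # assumed native Retina panel resolution

print(f"virtual PC frame: {vm / 1e6:.2f} MB")
print(f"Retina frame:     {retina / 1e6:.2f} MB")
print(f"ratio:            {retina / vm:.2f}x")
print(f"Retina at 10 fps: {megabytes_per_second(2880, 1800, 10):.1f} MB/s")
```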

10 fps is still low on a modern computer (less than five years old, maybe a bit more).

Have all (or the one) Mac developer(s) left?

ArthurDenture

I spent some time investigating the performance of Simutrans on Mac, since I'm in a similar situation. I have a 15" Macbook Retina, OS X 10.8.3, running at 1920x1200. I have compiled Simutrans from source using SDL 1.2.15. When I maximize Simutrans, its framerate drops from 25fps to 5fps. Here are the observations that I've made:


- SDL runs in software mode when running windowed.
- SDL_Flip() takes about 150ms to execute. In software mode, that's equivalent to SDL_UpdateRect(screen, 0, 0, 0, 0), which is not fast. That accounts for almost the entire frame drawing time.
- Incidentally, I spent some time getting multithreaded mode to work on Mac. The main barrier is lack of pthread barrier support. But once I found that the multithreaded portions of simutrans were not the bottleneck, I abandoned this work.
- I get decent performance with hardware acceleration when running fullscreen, with the catch that I have to turn off automatic graphics switching in the system preferences, otherwise the game renders wrong. (It seems like only the bottom left quarter of the game is visible. No idea what's going on there.)
- I tried various flags to SDL_SetVideoMode, with no effect. I also tried setting the color depth there to 32 instead of 16, producing trippy colors but no effect on performance.
- I tried disabling USE_HW in dr_flush and dr_textur, such that SDL_UpdateRect would be called on dirty tiles and SDL_Fill would never be called, matching the behavior of other platforms. This performed way worse, taking >1s to render each frame.
- http://sdl.beuc.net/sdl.wiki/FAQ_MacOS_X_Windowed_Mode_is_slow seems *extremely* relevant :-)
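
As a rough check on the numbers in the list above: if a single call takes about 150 ms, that call alone caps the frame rate. The sketch below only does the arithmetic with the rounded figures quoted in this post (150 ms, 5 fps, 25 fps); it is not a measurement.

```python
def max_fps(frame_time_ms):
    """Highest frame rate possible if every frame costs frame_time_ms."""
    return 1000.0 / frame_time_ms


def frame_time_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


flip_ms = 150.0       # approximate SDL_Flip() time reported above
maximized_fps = 5.0   # reported frame rate when maximized
windowed_fps = 25.0   # reported frame rate before maximizing

print(f"cap from the flip alone: {max_fps(flip_ms):.1f} fps")
print(f"frame time at 5 fps:     {frame_time_ms(maximized_fps):.0f} ms")
print(f"flip share of the frame: {flip_ms / frame_time_ms(maximized_fps):.0%}")
print(f"frame time at 25 fps:    {frame_time_ms(windowed_fps):.0f} ms")
```

With these figures the flip alone limits the game to under 7 fps, which is in line with the roughly 5 fps observed.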


A few resulting questions:
- What's with USE_HW? It's only defined on mac, and it appears that the other platforms just use software rendering (along with only updating dirty tiles instead of calling SDL_Fill). Does SDL software rendering simply perform better on the other platforms? (I certainly get fine performance on linux, though on an admittedly slightly lower-resolution monitor.)
- What's the status of the OpenGL backend? I was able to get it to compile with a handful of Makefile tweaks (locating the glew library with pkg-config; adding "-framework OpenGL"), and it seemed to work ok. (The news ticker had an awful flicker, but that went away if I forced pbo_able = false. Probably a straightforward bug in that branch of code.) It runs in hardware-accelerated mode, even windowed, and I get 25fps and 25ms idle time with it. Is there a backstory as to why it's not the default? Using OpenGL even for 2D rendering seems to be the right way to get good performance.
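
For readers unfamiliar with the "dirty tiles" approach mentioned in the first question, here is a minimal sketch of the idea. It is not the Simutrans implementation: the class name, the tile size of 16 and the merging strategy are all assumptions made for illustration.

```python
class DirtyTiles:
    """Tracks which fixed-size screen tiles changed since the last flush."""

    def __init__(self, width, height, tile=16):
        self.width = width
        self.height = height
        self.tile = tile
        self.cols = (width + tile - 1) // tile
        self.rows = (height + tile - 1) // tile
        self.dirty = set()

    def mark(self, x, y, w, h):
        """Mark every tile touched by the pixel rectangle (x, y, w, h)."""
        if w <= 0 or h <= 0:
            return
        first_col = max(x // self.tile, 0)
        last_col = min((x + w - 1) // self.tile, self.cols - 1)
        first_row = max(y // self.tile, 0)
        last_row = min((y + h - 1) // self.tile, self.rows - 1)
        for row in range(first_row, last_row + 1):
            for col in range(first_col, last_col + 1):
                self.dirty.add((row, col))

    def flush(self):
        """Return (x, y, w, h) rectangles covering the dirty tiles, then reset.

        Horizontally adjacent dirty tiles are merged into one rectangle, and
        rectangles are clipped to the screen size.
        """
        rects = []
        for row in range(self.rows):
            col = 0
            while col < self.cols:
                if (row, col) not in self.dirty:
                    col += 1
                    continue
                start = col
                while col < self.cols and (row, col) in self.dirty:
                    col += 1
                x, y = start * self.tile, row * self.tile
                w = min(col * self.tile, self.width) - x
                h = min(y + self.tile, self.height) - y
                rects.append((x, y, w, h))
        self.dirty.clear()
        return rects


tiles = DirtyTiles(64, 32, tile=16)
tiles.mark(10, 10, 20, 4)   # a small change near the top-left corner
rects = tiles.flush()       # in SDL 1.2 each rect could go to SDL_UpdateRect
updated = sum(w * h for _, _, w, h in rects)
print(rects, f"{updated} of {64 * 32} pixels")
```

In this toy case only a quarter of the screen would be pushed to the display. Whether that is a win depends on the per-call cost of the update, which on the Mac was apparently high, given that disabling USE_HW made things slower.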

Ters

The OpenGL backend doesn't do 2D rendering. At least it didn't the last time I saw it. Simutrans just uses OpenGL to shuffle the data after rendering it the normal way.

The only advantage I can imagine that the SDL+OpenGL backend has over normal SDL is that the graphics completely bypass the window manager. Apart from that, it is actually a detour, and a bit hackish one at that.

prissi

@ArthurDenture You might want to try the Allegro backend. If this works better on the Mac, maybe we should use it as the default. But I think the main reason for not using OpenGL is the cross compiling. Not sure though.

ArthurDenture

@prissi Just went and tried that but couldn't get Allegro 4 to compile. https://www.allegro.cc/forums/thread/608825 suggests that Allegro 4 uses deprecated APIs that have been removed on Lion. (Which seems accurate: the compilation error was about an unreferenced variable "useLocalHdwrMem", which apparently is defined by older Quickdraw libraries.)

I'd suspect that cross-compiling the OpenGL backend is as easy as the SDL backend (since it's really the same except for one extra library). I'll be happy to try it if there are docs.

@Ters I guess that's what I meant by 2D rendering. It uses OpenGL just for copying the bitmap created by Simutrans onto the screen. The advantage is indeed purely that the hardware acceleration support is much better. Not sure what makes it a hack. (I mean, simsys_opengl.cc is very hackish in the sense of being a copy-paste from simsys_s.cc, in a manner that means fixes to the latter didn't always get applied to the former. But that can be cleaned up.)

Ters

It's a hack in the sense that it doesn't check capabilities, except perhaps in a single case or two, nor does it handle potential errors in any significant way. But the most hackish part is simply doing all this OpenGL stuff just to blit a bitmap to the screen. It's almost like buying a car each time one needs to go to the grocery store, only to discard it when getting home, when the grocery shop is just as far away as the car shop.

I, and later Markohs, have been looking into making better use of this "car" since we have it anyway, but the internals of Simutrans and the best use of OpenGL agree in almost no way.

Writing a native backend for Mac is probably the best solution. It might be that native backends are the best solution on all platforms in order to make use of touch devices and gestures. Platform-independent libraries seem to neglect input the most. Unfortunately, I haven't seen any updates from the native Mac project in a while.