Was there an SDL 2.0 project?

Started by neroden, July 02, 2013, 07:34:11 PM

neroden

I'm wondering about this because SDL_mixer 1.2 is giving me errors in Valgrind, and obviously I can't fix those.  I seem to remember that someone was working on porting simutrans to SDL 2.0.  How did that project go?

TurfIt


ArthurDenture

That was me, in a weekend project a while back. The port was essentially working, missing only (iirc) porting SDL_mixer support forward to 2.0. I abandoned it for a few reasons:

  • SDL 2.0 wasn't officially released yet, and I imagine it would be quite the inconvenience if Simutrans depended on a compile-yourself version of SDL. (It looks like SDL 2.0 has now had its first (edit: third) release candidate... still not official, but at least there are now binaries.)
  • After I finished the initial port, there was a big commit in the main tree that multithreaded the rendering logic in a way that wasn't compatible with how I was doing the SDL 2.0 port. I'd probably have to discuss on the forum how to handle that -- the 2.0 version would likely want to rip out that logic.
  • There was a lukewarm response on this forum to the port. It fixed game performance on modern mac computers (i.e. going from terrible to fine) and was a mixed bag on other platforms. I personally consider that a big win, but I didn't really hear much feedback that I'd be able to get the patches accepted. (I'd be happy to hear otherwise, though.)

I could try to dust it off and port it forward if people think it has a reasonable chance of landing in the tree. Simutrans already has frankly too many graphics platforms (simsys_*.cc) with even more compile-time permutations (tile-based vs. full-screen blitting, hardware vs. software, multi vs. single-threaded), all in varying states of maintenance -- I wouldn't want to add another one unless it actually had the potential to replace the libsdl1.2 port and simplify / clean up the codebase overall.

(I don't want to hold up anyone else from attempting this, though. If you do, my patches are probably still useful to crib off of or to attempt to merge forward.)

kierongreen

As you say, until SDL 2.0 is officially released simutrans should stay with the existing SDL routines. However, both SDL 1 and 2 platforms could be supported until then. As the routines are largely separate from the rest of the code, maintaining both shouldn't be too much of a burden. What was the issue with multithreading?

Ters

I agree that there is a problem that lots of little things add up, or even multiply up. Introducing a new feature at the system level means changing n backends, some of which one may not know or have. When working on the OpenGL backend, I just ripped out everything to do with multithreading that I encountered. It got in my way. I never understood where the parallelism occurred either, not that I bothered doing much investigating.

prissi

One could easily add an SDL2 backend. It would probably be used on Mac and in self-compiled builds.

sdog

Quote from: kierongreen on July 06, 2013, 07:17:55 PM
As you say, until SDL 2.0 is officially released simutrans should stay with the existing SDL routines. However, both SDL 1 and 2 platforms could be supported until then. As the routines are largely separate from the rest of the code, maintaining both shouldn't be too much of a burden. What was the issue with multithreading?

http://lists.libsdl.org/pipermail/sdl-libsdl.org/2013-August/089768.html
On Saturday: "We have the final SDL 2.0 RC builds up for testing over this weekend, and we'll be releasing next week!"

Ters

From the looks of their download page, it has been released.

TurfIt

It has. Now, does it still halve performance vs 1.2?

Ters

I'm trying to figure out how to install this library. There seem to be some installer scripts targeting mingw64, but I have yet to find the binary release of mingw64 for Windows. I guess I have to manually set it up with lots of hacks, as usual. Apparently that's the price to pay for using free software on Windows.

TurfIt

I've not got around to trying the release yet. For the beta, I just did the usual './configure ; make ; make install' for MinGW32.

Ters

I downloaded the binary release. The installation script did install files to places that seem suitable, but neither msys nor mingw seems to contain support for pkg-config files. I therefore need to tinker with C_INCLUDE_PATH and friends manually, like for everything else.

Ters

The new SDL looks a lot like talking directly to OpenGL, so the new simsys_s.cc will be a lot like my first draft for simsys_opengl.cc. All the stuff added to the latter afterwards is likely handled internally in SDL (apart from perhaps the tiled framebuffer). So I think an SDL2 backend will render simsys_opengl.cc redundant, if the SDL guys didn't screw up somewhere.

TurfIt

Finally got around to trying with the release SDL2-2.0.0. Initially, performance was still terrible compared with SDL1; But changing to RGB565 fixed things! Now, performance is identical between them on Win7. At least once I restored the dirty tile handling to SDL2... I also note the performance hit to blit the whole screen has dropped from 40% with SDL1 to 7% in SDL2. Completely ripping out the dirty tile stuff might end with SDL2 faster!

One potential show stopper is the removal of per character unicode input in SDL2. ASCII still works fine, but unicode apparently now requires SDL to handle the text input itself. I've no idea how to incorporate such into Simutrans...

@ArthurDenture - can you try RGB565 on your Mac and see if performance is still ok? Otherwise we'll have to figure out what's busted with RGB555 (psychedelic transition ground textures).

Yona-TYT

Will this help with hardware acceleration?

Ters

Quote from: Yona-TYT on August 24, 2013, 04:33:19 AM
Will this help with hardware acceleration?
I think it more or less IS hardware acceleration, but only for displaying the finished result to the screen.

kierongreen

But given that it is hardware acceleration it means with retina displays we can halve the resolution in simutrans then use the hardware acceleration to scale it back up to full size?

Ters


TurfIt

The size of the rendering surface presented to Simutrans can now easily be different from the actual screen resolution, and the hardware scaler does do the hard work. Of course anything but integer scale factors looks like crap.
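To make the integer-scale point concrete, here is a minimal sketch of picking the scale factor. The helper name `integer_scale` is made up for illustration and is not from the Simutrans source or from SDL; it only does the arithmetic, and the actual scaling would still be left to the renderer.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper (not from Simutrans or SDL): the largest whole-number
// factor by which a logical surface fits inside the physical screen.
// Returns at least 1, so a too-large logical surface is simply not scaled.
int integer_scale(int screen_w, int screen_h, int logical_w, int logical_h)
{
    assert(logical_w > 0 && logical_h > 0);
    const int sx = screen_w / logical_w; // integer division drops the fraction
    const int sy = screen_h / logical_h;
    return std::max(1, std::min(sx, sy));
}
```

For example, a 2880x1800 Retina panel with a 1440x900 rendering surface gives a factor of 2, which is the "halve the resolution and let the hardware scale it back up" case asked about above.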

Sometime during my testing last night, the dirty tile speedup stopped working. Now, blitting only the dirty, or the entire screen results in the exact same timings (7% slower than SDL1). I cannot explain this at all. The 7% wouldn't be so bad, but I expect that penalty to be much worse on slower hardware. I'd test if I was going to continue with SDL2 but...

There's a humungous memory leak upon resizing the window. The SDL_Surface keeps getting allocated, and Simutrans can't free it as SDL crashes on trying. I've checked all through the simutrans code and everything appears fine there, the crash seems internal to SDL2. I've tried deferring the free, and many other things, but it always crashes. 10 resizes bloats Simutrans from 100MB to 1000MB!, and then insults by crashing anyways. I've no time for buggy libraries, so I think I'm done - maybe SDL2.1

meme

Will there be an SDL2(.1) version for OS X? Performance of native simutrans with SDL 1.2 is horrible and I'm not happy that I have to use wine for playing simutrans. But it's still better than nothing.


Ters

Quote from: meme on August 25, 2013, 10:08:06 AM
Will there be an SDL2(.1) version for OS X? Performance of native simutrans with SDL 1.2 is horrible and I'm not happy that I have to use wine for playing simutrans. But it's still better than nothing.

Apparently, the alternatives are SDL 1.2, which is slow, and SDL 2, which crashes.

meme

Okay, I'll stay with the GDI version using wine ...


ArthurDenture

Quote
@ArthurDenture - can you try RGB565 on your Mac and see if performance is still ok? Otherwise we'll have to figure out what's busted with RGB555 (psychedelic transition ground textures).

Just tried it -- RGB565 works just as well as RGB555. The new rgb bitmasks are 0xf800 (i.e. 0x1f << 11), 0x07e0 (i.e. 0x3f << 5), and 0x001f.
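For anyone wanting to double-check the mask arithmetic, here is a small self-contained sketch. The names `channel_mask` and `pack_rgb565` are made up for this example and are not Simutrans or SDL identifiers; the layout assumed is the usual 5-6-5 one, with red in the top bits.

```cpp
#include <cassert>
#include <cstdint>

// Build a mask of `bits` one-bits shifted left by `shift`.
constexpr std::uint32_t channel_mask(unsigned bits, unsigned shift)
{
    return ((1u << bits) - 1u) << shift;
}

// RGB565: 5 bits red, 6 bits green, 5 bits blue.
constexpr std::uint32_t RGB565_R = channel_mask(5, 11);
constexpr std::uint32_t RGB565_G = channel_mask(6, 5);
constexpr std::uint32_t RGB565_B = channel_mask(5, 0);

static_assert(RGB565_R == 0xf800u, "red mask");
static_assert(RGB565_G == 0x07e0u, "green mask");
static_assert(RGB565_B == 0x001fu, "blue mask");

// Pack 8-bit channels into one RGB565 pixel by dropping the low bits.
constexpr std::uint16_t pack_rgb565(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return static_cast<std::uint16_t>(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

// Usage sketch: pure red and pure green land exactly on their masks.
void check_rgb565()
{
    assert(pack_rgb565(255, 0, 0) == RGB565_R);
    assert(pack_rgb565(0, 255, 0) == RGB565_G);
}
```

The only difference from RGB555 is the extra green bit, which shifts red up by one position.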

Update: TIL about SDL_PixelFormatEnumToMasks, so I just pushed a change to https://github.com/artdent/simutrans/tree/sdl2 to get those constants from SDL instead of hardcoding them. I also added mousewheel support, which had been a TODO. I've noticed, though, that right-click map dragging is bugged in this port -- it drags the map slooowly. My branch is now hopelessly behind HEAD, of course.

I can't reproduce that memory leak in my sdl2 mac build (the one on github.com/artdent/simutrans), btw.

OT: @TurfIt, nice pthreads patch! That looks like a reasonably sane way to get threading working on more platforms.

TurfIt

I did give it one more try... The huge memory leak/crashing appears due to the Direct3d renderer driver (the default on windows). Forcing to use opengl instead seems to work, and is the default for OSX.

The FreeSurface crash / smaller memory leak appears to be due to using the Lock/UnlockTexture calls. A workaround is to ensure the texture is locked before freeing the surface. That seems entirely wrong, and going by the limited documentation the Unlock will then be streaming from the now freed memory, which is rather bad, but it doesn't crash. Alternatively, just never ever call lock in the first place. There doesn't appear to be any performance difference between the R/W and W only texture access.
@ArthurDenture - another thing to check on the Mac - get rid of the lock/unlock and just stick a "SDL_UpdateTexture( screen_tx, NULL, screen->pixels, screen->pitch );" into dr_flush() instead.

Also, does your sdl2 branch compile? If so, you must be using a beta SDL2. The unicode member of keysym has been removed...
I'd already changed masks and got the mouse wheel working (leaving X1 and X2 mapped as well, as it made sense at least with my mouse).

One more annoyance: they've totally broken any sensible keyboard support. Supposedly there are now scancodes and keycodes, but they're the same for me. The problem is that everything is returned as the raw unmodified key, i.e. 'A' and 'a' now return the same keycode. Worse, '3' and '#' are now the same, so Simutrans would have to do its own keyboard mapping for shifted keys rather than the OS! Sigh. I'll make it work for the standard US qwerty, but if you have something different, you're outta luck.
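A rough sketch of what such a hand-rolled US QWERTY mapping could look like. This is only an illustration of the workaround described above, not the code from any posted patch; `apply_shift_en_us` is a hypothetical name, and the table is deliberately limited to the US layout, which is exactly why the approach is fragile.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: map an unshifted ASCII key to its shifted character
// on a US QWERTY layout only. Other layouts will get wrong results.
char apply_shift_en_us(char key, bool shift)
{
    if (!shift) {
        return key;
    }
    if (key >= 'a' && key <= 'z') {
        return static_cast<char>(key - 'a' + 'A');
    }
    static const char plain[]   = "`1234567890-=[]\\;',./";
    static const char shifted[] = "~!@#$%^&*()_+{}|:\"<>?";
    static_assert(sizeof(plain) == sizeof(shifted), "tables must line up");
    const char *p = std::strchr(plain, key);
    // strchr also matches the terminating NUL, so check *p as well.
    return (p != nullptr && *p != '\0') ? shifted[p - plain] : key;
}

// Usage sketch: Shift+3 gives '#' and Shift+2 gives '@' on a US layout.
void example_shift()
{
    assert(apply_shift_en_us('3', true) == '#');
    assert(apply_shift_en_us('2', true) == '@');
}
```

Every other keyboard layout would need its own table, so this only papers over the problem for one locale.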


Ters

Keyboard input is difficult, especially dead keys and similar. Even Microsoft has failed to get it right in IE9 and IE10. (Why these programs have rolled their own low level keyboard input is beyond me.) There were clearly problems with the old SDL character input model as well, as Simutrans with SDL doesn't handle dead keys well. I haven't checked how well Simutrans with GDI handles them, though.

ArthurDenture

Quote
Also, does your sdl2 branch compile? If so, you must be using a beta SDL2. The unicode member of keysym has been removed...

Oops, it was indeed removed back in June: http://hg.libsdl.org/SDL/rev/b36811d7db33. I've pushed a fix.

http://wiki.libsdl.org/MigrationGuide#Input describes how to do Unicode input now: "you were lucky if the old method ever worked for anyone, so we've ripped it out. Use SDL_TEXTINPUT instead." So you might be able to hack something together for en_US by manually testing shift keys, but that doesn't sound like a good idea compared to using their new text input methods.

If you've got something compiling and halfway working, then do you feel like pushing it somewhere? :-) Honestly, the biggest barrier towards me hacking on this further is the thought of having to rebase my changes all the way to HEAD. If you're in better shape in that regard, then I'd sooner start from your work than my own.

TurfIt

SDL_TextInput events would be fine if one is doing text input. But what about hotkeys / commands? Sometimes you actually want just a key. I'd love to know what to call in this new system so I'd get SDLK_AT when I type '@' instead of SDLK_2 and KMOD_SHIFT as separate... As mentioned, I can hack something for en_US, but en_GB might expect '"' from shift-2 instead of '@'; And of course the 4725 other keyboard layouts out there. The OS should handle this crap...

I've attached my current state of things.

ArthurDenture

I figured it out: just always do text input :-). You can just call SDL_StartTextInput() on startup. I stole the technique from a demo program that comes with SDL: http://hg.libsdl.org/SDL/file/tip/test/checkkeys.c

SDL gives you the entered text as a UTF-8 string. I'm just naively converting it to UTF-16 and using the first character as the event code. I'm not positive that gui_textinput.cc does the right thing with the resulting character, since it has some strange handling of characters > 128 and I'm not sure what it's really up to.
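The "naive" conversion described here can be sketched as follows. This is an illustration under assumptions, not the code from the branch: the function names are made up, the decoder does not reject overlong encodings or surrogates, and anything outside the Basic Multilingual Plane is simply dropped because it does not fit in a single UTF-16 unit.

```cpp
#include <cassert>
#include <cstdint>

// Decode the first code point of a NUL-terminated UTF-8 string.
// Returns 0 for an empty or malformed string. Not a validating decoder.
std::uint32_t first_code_point(const char *s)
{
    assert(s != nullptr);
    const unsigned char *p = reinterpret_cast<const unsigned char *>(s);
    std::uint32_t cp = 0;
    int extra = 0; // number of continuation bytes expected
    if (p[0] < 0x80) { cp = p[0]; extra = 0; }
    else if ((p[0] & 0xE0) == 0xC0) { cp = p[0] & 0x1F; extra = 1; }
    else if ((p[0] & 0xF0) == 0xE0) { cp = p[0] & 0x0F; extra = 2; }
    else if ((p[0] & 0xF8) == 0xF0) { cp = p[0] & 0x07; extra = 3; }
    else { return 0; }
    for (int i = 1; i <= extra; ++i) {
        // A premature NUL fails this check too, so we never read past the end.
        if ((p[i] & 0xC0) != 0x80) { return 0; }
        cp = (cp << 6) | (p[i] & 0x3Fu);
    }
    return cp;
}

// Use the first character as a 16-bit event code; 0 means "no usable code".
std::uint16_t first_utf16_unit(const char *s)
{
    const std::uint32_t cp = first_code_point(s);
    return cp <= 0xFFFF ? static_cast<std::uint16_t>(cp) : 0;
}
```

With this, the two-byte UTF-8 sequence for 'ä' (0xC3 0xA4) comes out as the single code 0x00E4, which is what a hotkey table keyed on 16-bit characters would expect.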

Anyhow, I've pushed a new commit to the usual spot.

Ters

Quote from: TurfIt on August 26, 2013, 02:16:57 AM
SDL_TextInput events would be fine if one is doing text input. But what about hotkeys / commands? Sometimes you actually want just a key. I'd love to know what to call in this new system so I'd get SDLK_AT when I type '@' instead of SDLK_2 and KMOD_SHIFT as separate... As mentioned, I can hack something for en_US, but en_GB might expect '"' from shift-2 instead of '@'; And of course the 4725 other keyboard layouts out there. The OS should handle this crap...

What if a hotkey is tied to 'ä'? That's a key on a Swedish keyboard, but requires two keypresses on mine. The first press can't yield a character code, or input would become '¨ä'. However, if the second keypress isn't 'a', but 'b', the input should be '¨b'. Now there are two characters generated for a single keypress event, as the event for the '¨' has already been sent and processed. As such, SDL did the right thing in separating keyboard events from text input. This is how the Win32 API works as well.

I've long since commented that using characters rather than keycodes for binding hotkeys is not as good an idea as it might seem, mostly due to dead keys. There might be issues with binding to keycodes as well, depending on how they work, and certainly with binding to scancodes. In that case, all keys should work, but documenting which are which in a layout-independent fashion is problematic. (Back in the days, I used the drawings of different keyboard layouts in the back of the MS-DOS manual, but almost nobody has these anymore.)

TurfIt

Quote from: Ters on August 26, 2013, 05:11:29 AM
What if a hotkey is tied to 'ä'? That's a key on a Swedish keyboard, but requires two keypresses on mine.
I'd expect the application to simply receive 'ä' from the OS. If multiple keypresses are required to compose the character, better that the OS handle that than each application reinvent the wheel. Thankfully this is all moot...


Quote from: ArthurDenture on August 26, 2013, 03:48:13 AM
I figured it out: just always do text input :-).
Well, who'd a thunk? SDL documentation sure had me thinking the text input needed to be started/stopped for each entry field.

I think we now have a fully functional SDL2 backend. Thanks!
It does still come with a small (7%) performance hit over SDL1 on my systems, so for now it makes sense to just add it as yet another backend option.
Still to finalize is exactly how to update the screen. Updating dirty rectangles with a locked texture is marginally quickest for me, but I'd rather not lock the texture due to the ugly hack of needing to free the surface while the texture is locked. Dirty rectangles also require many calls to UpdateTexture, which is fine for me, but I remember how many SDL1 calls to UpdateRect killed the Mac. So, the options:
1) dirty tiles - UpdateTexture in dr_texture, dr_flush only does RenderCopy and Present, texture not locked
2) dirty tiles - UpdateTexture in dr_texture, dr_flush only does RenderCopy and Present, texture locked
3) whole screen - dr_texture does nothing, dr_flush does UpdateTexture, texture not locked
4) whole screen - dr_texture does nothing, dr_flush does UpdateTexture, texture locked
5) whole screen - dr_texture does nothing, dr_flush does UnLock/Lock to perform the update.
Listed in order of my preference. Can you let me know how the Mac performs with these options? If any cause the Mac trouble, we'll not do that. The timings on Win7 are so close it doesn't really matter which, and I expect to stick with SDL1 for those platforms where it works.

meme

Performance issues on OS X may be caused by 10.8 WindowServer double buffering. It was bad on Lion, but due to this it's even worse.


Ters

Quote from: TurfIt on August 27, 2013, 01:47:23 AM
I'd expect the application to simply receive 'ä' from the OS.

And they do, but as a text input event, not as a key event. It can (for the most part) only be used as a hotkey if it's a separate key, though. Multistroke (as opposed to multipress) hotkeys don't make much sense anyway, but by using characters rather than keycodes plus modifiers, one obscures the difference. Newer parts of the Java API even require mnemonics (the underlined character in menus and labels) to be specified using keycodes, but that makes no sense to me.

Quote from: TurfIt on August 27, 2013, 01:47:23 AM
Well, who'd a thunk? SDL documentation sure had me thinking the text input needed to be started/stopped for each entry field.

I got that impression as well, and I must say that the SDL documentation is rather bad.

TurfIt

Current state:


EDIT:  Still has keyboard issues. Control hotkeys don't work...
EDIT2: attachment removed. newer below.

ArthurDenture

I'll have to dig a little more to determine the difference between those various options for rendering. I wouldn't be surprised if some of them turn out to do the same operations under the hood. I should have the time to do that this weekend. But a few quick notes:

- For choosing the rendering driver, you might be interested in http://wiki.libsdl.org/SDL_HINT_RENDER_DRIVER and http://wiki.libsdl.org/SDL_SetHint. That looks like a cleaner way to tell SDL to prefer opengl over direct3d.

- Control keys will probably have to go through the keydown event and not the textinput event. i.e. if the control key is held down and the key is in the ascii range, generate a simutrans event. Ugly, but hopefully won't lead to the same keypress causing two events within simutrans.
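A minimal sketch of that routing rule, written without SDL so that it stands on its own. The enum and function names are hypothetical, and the sketch deliberately ignores special keys (arrows, function keys, Enter and so on), which would also have to come through the key-down path in a real backend.

```cpp
#include <cassert>
#include <cstdint>

// Where an incoming key code came from (stand-ins for SDL's key-down and
// text-input events; these names are made up for the example).
enum class key_source { key_down, text_input };

// Returns the code to forward as a simutrans key event, or 0 for "no event".
// Rule: with Ctrl held, only the key-down event counts (ASCII range only);
// without Ctrl, only the text-input event counts. This keeps one physical
// keypress from producing two events.
std::uint16_t route_key(key_source src, std::uint16_t code, bool ctrl_held)
{
    if (src == key_source::key_down) {
        return (ctrl_held && code > 0 && code < 128) ? code : 0;
    }
    return ctrl_held ? 0 : code;
}

// Usage sketch: Ctrl+S arrives via key-down, a plain 's' via text input.
void example_route()
{
    assert(route_key(key_source::key_down, 's', true) == 's');
    assert(route_key(key_source::text_input, 's', false) == 's');
}
```

The rule relies on the Ctrl state being sampled when the event is handled, so a very quick release of Ctrl could still misroute a keypress.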

Ters

Control keys shouldn't cause events. The only possible problem might be that while the control key was pressed when the other key was pressed, it may no longer be pressed by the time the character event arrives.

TurfIt

Seems hints are treated as suggestions, not commands. Set the hint for opengl, it still gives d3d. Hence the text comparison loop method.
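For readers wondering what the "text comparison loop" amounts to, here is a sketch of the idea with the SDL calls replaced by a plain list of names, so it compiles without SDL. In a real backend the names would presumably come from SDL_GetNumRenderDrivers() and SDL_GetRenderDriverInfo(), and the resulting index would be passed to SDL_CreateRenderer(), which accepts -1 to mean "first driver that supports the requested flags". The function name `find_render_driver` is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Return the index of the driver named `wanted`, or -1 if it is not present.
// -1 is chosen so the result can be handed straight to SDL_CreateRenderer,
// where it means "let SDL pick".
int find_render_driver(const std::vector<std::string> &driver_names,
                       const std::string &wanted)
{
    assert(!wanted.empty());
    for (std::size_t i = 0; i < driver_names.size(); ++i) {
        if (driver_names[i] == wanted) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Unlike a hint, an explicit index leaves SDL no room to fall back to direct3d silently; if opengl is missing, the -1 fallback is at least a visible decision in the calling code.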

Processing the keydown event with control pressed seems to work ok. There's a lot of keyboard events being passed into simutrans with code=0, but so far it's not causing issues.

What SDL2 (mis)feature shall we butt heads with next...

Ters

Quote from: TurfIt on August 29, 2013, 04:27:19 PM
Seems hints are treated as suggestions, not commands. Set the hint for opengl, it still gives d3d.
That's what hints are. I've spent some time trying to give an Oracle database hints on how to best perform a query, only to have it rush into a hopelessly inefficient execution plan.

Quote from: TurfIt on August 29, 2013, 04:27:19 PM
What SDL2 (mis)feature shall we butt heads with next...
I'm fully occupied with Hibernate misfeatures for the time being, so I'll pass for now.

Markohs

Correct me if I'm wrong, but if I understand correctly what you've said so far, this port works more or less like the OpenGL backend, no? You update a texture and render it again. I wonder if there is another gain apart from using a newer library (the new keyboard handling is something that we'd have to change some day, and from what you describe, it looks similar to http://www.ogre3d.org/tikiwiki/tiki-index.php?page=OIS , no?).

If updating the texture is the way it works, it's normal that the performance is close (7% less looks close enough), because most of the time is spent sending the data over the PCI bus to the video card. That bottleneck will be there forever unless we change lots of code to use a texture atlas and a brand new drawing algorithm.

ArthurDenture

Quote
I wonder if there is another gain apart from using a newer library

Yes: it allows using hardware rendering on a Mac, bringing the framerate on Retina displays from <5fps at 100% cpu usage to 25fps at 40% cpu usage. This changes simutrans from "barely playable" to "playable" on that platform. (In theory, it also allows hardware rendering in windowed mode on Linux, which previously would've been software rendering -- but in practice I haven't seen that translate to an actual speedup, for whatever reason.)

Markohs

Did you try a simutrans opengl backend build? It might give similar or better results.

TurfIt

Last I tried, the opengl backend was ~30% slower than the SDL1 (not on a mac though). I've now got SDL2 down to ~4% slower than SDL1 which is comparable to the slowdown with GDI. Still, if your platform handles it, SDL1 gives the best performance, and for platforms that don't - Mac, SDL2 is looking promising.

EDIT:  current attached. simply adds the CTRL-key handling.

ArthurDenture

I had a chance to return to this work, and the latest version of the SDL2 backend can be seen at https://github.com/artdent/simutrans/compare/sdl2-v2. It's also attached as a diff against HEAD. Notable fixes include sound stuckness and fullscreen mode, but see the commit messages on Github for the full details. It's also a separate backend option this time instead of trying to upgrade simsys_s.cc in place, since it's clear that's what people prefer.

I think it's now free of obvious bugs, though I'm sure there are some lurking.

Switching to calling SDL_UpdateTexture on only the dirty rects, with the texture locked for the whole program duration, did provide a nice performance boost over updating the entire screen, so I went that route. It's basically TurfIt's option #2.

Ters

Quote from: ArthurDenture on September 07, 2013, 06:04:55 PM
Switching to calling SDL_UpdateTexture on only the dirty rects, with the texture locked for the whole program duration, did provide a nice performance boost over updating the entire screen, so I went that route. It's basically TurfIt's option #2.

Strange. I would have thought that just uploading dirty rectangles would get very slow with more than just a few rectangles, but then again, maybe there seldom are more than a handful dirty rectangles per frame.

TurfIt

Dirty rects vs fullscreen update just shows how slow it still is writing to video memory / copying unnecessary stuff. When redoing the dirty logic to merge the rectangles a bit, I tried ignoring single tile non dirties surrounded by dirty and just copying one larger rect, ended up slower than just doing two rects. ie. copying an extra 16x16 pixel area is slower. There's typically 300-700 dirty rects per frame in my testing, obviously less when zoomed in.
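To illustrate the kind of merging being discussed, here is a minimal sketch that joins horizontally adjacent dirty tiles into one rectangle per run. It is not the logic that was committed; the names `rect` and `merge_dirty_runs` and the row-major flag layout are assumptions for the example. Clean tiles are never bridged, in line with the observation that copying an extra 16x16 area was slower than issuing two updates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct rect { int x, y, w, h; }; // pixel coordinates

// dirty: row-major flags for a tiles_x * tiles_y grid of square tiles.
// Returns one rectangle per horizontal run of dirty tiles.
std::vector<rect> merge_dirty_runs(const std::vector<bool> &dirty,
                                   int tiles_x, int tiles_y, int tile_size)
{
    assert(dirty.size() == static_cast<std::size_t>(tiles_x) * tiles_y);
    std::vector<rect> out;
    for (int ty = 0; ty < tiles_y; ++ty) {
        int tx = 0;
        while (tx < tiles_x) {
            if (!dirty[ty * tiles_x + tx]) {
                ++tx;
                continue;
            }
            const int start = tx; // first dirty tile of this run
            while (tx < tiles_x && dirty[ty * tiles_x + tx]) {
                ++tx;
            }
            const rect r = { start * tile_size, ty * tile_size,
                             (tx - start) * tile_size, tile_size };
            out.push_back(r);
        }
    }
    return out;
}
```

Each returned rectangle would then correspond to one texture update call, so the rectangle count is a rough proxy for per-frame call overhead.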

Stuck sounds - I never even thought of that. Sound in Simutrans is sooo horrible, I have it on perma mute!

FULLSCREEN_DESKTOP flag - I had issues with the mouse being misaligned to the gui when I tried this. Will require more work to decouple render area from screen area.

It seems stable/functional enough now. So committed r6688. Maybe mac nightlies should be switched to this?

Ters

Quote from: TurfIt on September 08, 2013, 05:50:06 PM
Dirty rects vs fullscreen update just shows how slow it still is writing to video memory / copying unnecessary stuff. When redoing the dirty logic to merge the rectangles a bit, I tried ignoring single tile non dirties surrounded by dirty and just copying one larger rect, ended up slower than just doing two rects. ie. copying an extra 16x16 pixel area is slower. There's typically 300-700 dirty rects per frame in my testing, obviously less when zoomed in.

It's just that with these modern cards, it is said and written that initiating an operation is very costly. So writing one 32x32 area should be much faster than writing four 16x16 areas. I also thought that when a texture is locked in its entirety and partially updated by just writing to random parts of the locked memory region, the driver doesn't know which parts have been written to, and uploads the entire texture from system memory to video memory. Maybe the driver cooperates with the memory manager to know which pages are dirty. I guess time will tell once this has been tested on a number of different machines. Unfortunately, I don't have time to test it for the time being.

ArthurDenture

Yay! Thanks for doing the merge. There's a minor oops where it segfaults at startup - see the attached diff for the trivial fix.

I think switching the mac nightlies to sdl2 is a good idea. I've also added sdl2 as a configuration over at http://ec2-54-242-171-11.compute-1.amazonaws.com/jenkins/job/. (And, yes, I think that the "run the built binary for 10 seconds" phase of the build would have caught the segfault, were not the build broken just now by an unrelated commit.)

I'll have to look into the fullscreen thing when I have more spare time. Right now I actually see the mouse position bug in either fullscreen-desktop or regular fullscreen mode, depending on the platform.

kierongreen

Committed the startup fix (I had come up with same solution last night but hadn't got round to committing then!).

Well done for getting this working. I think there's a huge boost here for fullscreen with lots of vehicles moving, on my computer at least.

captain crunch

Thank you very much, good Sir.
I noticed that the keyboard might get scanned differently than in the SDL-1.2 front-end when I used the [[CONTROL]]-click, as it did not work as expected. I have remapped [[CONTROL]] to [[CAPS LOCK]] which works in SDL-1.2, but in the SDL-2.0 front-end it is not mapped there, but to the original position.

ArthurDenture

Quote
I have remapped [[CONTROL]] to [[CAPS LOCK]] which works in SDL-1.2, but in the SDL-2.0 front-end it is not mapped there, but to the original position.

Sigh, confirmed. I do the same thing, and for me it works on Mac but not on Linux. If I run the checkkeys test program from the SDL2 source tree, it prints when I press caps lock: "INFO: Key pressed: scancode 57 = CapsLock, keycode 0x400000E0 = Left Ctrl  modifiers: CAPS". If I do the same from SDL1.2, it prints "Key pressed:  306-left ctrl  modifiers: LCTRL NUM". I guess it's time to file a bug with SDL...

(Fwiw, this is probably the same behavioral change that required us to explicitly treat Cmd as a control key on macs in SDL2. In SDL 1.2 it got mapped to Ctrl out of the box.)

Update: filed https://bugzilla.libsdl.org/show_bug.cgi?id=2096

captain crunch

Another quirk on Linux/X11 is that when I press the SPACE key two spaces are printed. Easily recognisable because I have to press BACKSPACE twice to delete the previous SPACE character.

ArthurDenture

Quote from: captain crunch on October 22, 2013, 08:15:49 PM
Another quirk on Linux/X11 is that when I press the SPACE key two spaces are printed. Easily recognisable because I have to press BACKSPACE twice to delete the previous SPACE character.

Thanks for the report - I posted a quick patch to fix that.

Is it perhaps time for a mod to close this thread? Reports of problems in the SDL2 backend can now just go in the normal bug reports forum.

Ters