Compilation under Linux (x86-64)

Started by Wheart, December 29, 2015, 01:52:32 PM

Wheart

Hi!

I've compiled Simutrans 120.1.1 from the zip archive provided on SourceForge.
OS: Ubuntu 14.04.3 LTS (using KDE)
Arch: x86_64

Two problems:
- crash (with core dump) when maximizing the window if compiled against SDL2 (stretching the window to full size by hand works OK)

- won't compile against OpenGL unless the Makefile is corrected (wrong -l flag for the linker, patch below). After compilation it works OK, but the "message bar" (with scrolling text) is corrupted (you can still get notifications via message windows :) )

As far as I have tested (on an outdated Ubuntu 12.04), the linker flag -lGLEW should also be used on "classic" x86 (non-64-bit) Linux distributions.

Compilation options (opengl version):

BACKEND = opengl
COLOUR_DEPTH = 16
OSTYPE = linux
OPTIMISE = 1 # Add umpteen optimisation flags
WITH_REVISION = 1 # adds the revision from svn; required for networkgames
MULTI_THREAD = 1 # Enable multithreading


Diff for Makefile:

--- Makefile.orig       2015-12-29 13:00:20.027673134 +0100
+++ Makefile    2015-12-29 13:58:46.063890083 +0100
@@ -561,7 +561,11 @@
     SDL_LDFLAGS := $(shell $(SDL_CONFIG) --libs)
   endif
   CFLAGS += $(SDL_CFLAGS)
-  LIBS   += $(SDL_LDFLAGS) -lglew32
+  ifeq ($(OSTYPE),linux)
+    LIBS   += $(SDL_LDFLAGS) -lGLEW
+  else
+    LIBS   += $(SDL_LDFLAGS) -lglew32
+  endif
   ifeq ($(OSTYPE),mingw)
     LIBS += -lopengl32
   else

Ters

The OpenGL thing is pretty much just a hack we tried out. I would not recommend using it unless SDL or SDL2 gives serious performance problems. We have only seen this with SDL on Macs, but SDL2 fixed that, making the OpenGL hack obsolete. Regular SDL should be the best choice for Linux, but trying to fix SDL2 can still be useful.

It should also be noted that Simutrans is made to run as a 32-bit program. Compiling it as 64-bit will in theory degrade performance, which may or may not be noticeable in practice.

Wheart

The OpenGL thing works much better for me than both SDL and SDL2. On both SDL and SDL2, using "transparent instead of hidden" in addition to "Smart hide objects" leaves some artefacts (shaded building elements), and the SDL2 version crashes on window maximize (SDL maximizes OK), even if it's the first thing done after starting Simutrans. In addition, it seems that OpenGL works "smoother", especially when "following" a vehicle. So it is a very good hack :) The only problem with OpenGL is the scrolling text on the "message bar" (the bar seems to be "flashing" and the text appears shifted by several letters, making it unreadable). Since there is an option to disable messages on that bar, and this setting is saved on game quit, it's acceptable to me.

I'm avoiding the precompiled 32-bit binary for Linux since it is linked against 32-bit libraries, with all their mess, conflicting versions (for other "proprietary" software) etc.
I'm not sure how it works today, but old releases (pre 112) distributed as binaries and compiled against 32-bit libraries worked much "harder" on x86-64 (in terms of performance) than natively compiled ones, especially with big maps (1024x1024 and more).

There is also one more issue when building from the source zip file: since the flag "WITH_REVISION" is set to 1, it seems that the build tries to use svn/cvs/etc. to get the revision number and fails, resulting in "rNiewersjonowany katalog" (with English locale settings at compile time it would be something like "rUnversioned directory").

Ters

Quote from: Wheart on December 30, 2015, 08:14:02 AM
The OpenGL thing works for me much better than both SDL and SDL2. On both SDL and SDL2 using "transparent instead of hidden" in addition to "Smart hide objects" leaves some artefacts (shaded building elements)

This doesn't make any sense. The drawing is independent of the backend. The backend only provides access to video memory into which the final result is copied. That is, unless there is a bug somewhere, or an oversight due to the hackish way the OpenGL backend operates, that causes the entire "backbuffer" to be copied over rather than just the parts that have changed. Copying everything should in theory put a higher load on the system bus, although it might be that on some systems, the overhead of many small copy operations outweighs the savings the dirty rectangle system is supposed to give us. Maybe the SDL backend also has some concurrency issues if multi-threading is enabled (I've never built Simutrans multi-threaded). I also see that batch copying is disabled when rendering is multi-threaded. The OpenGL backend turns a blind eye to the issue of multi-threading.
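
To make the dirty-rectangle idea concrete, here is a minimal Python sketch. It is not Simutrans code: the tile size, the class name DirtyTiles and its methods are assumptions made up for illustration. It only shows why copying the changed tiles moves far fewer pixels than copying the whole back buffer.

```python
TILE = 16  # hypothetical tile edge in pixels, not Simutrans' real value


class DirtyTiles:
    """Tracks which fixed-size screen tiles changed since the last flush."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.dirty = set()

    def mark(self, x, y, w, h):
        """Mark every tile touched by the rectangle (x, y, w, h) as dirty."""
        x0 = max(x, 0) // TILE
        y0 = max(y, 0) // TILE
        x1 = min(x + w - 1, self.width - 1) // TILE
        y1 = min(y + h - 1, self.height - 1) // TILE
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                self.dirty.add((tx, ty))

    def flush(self):
        """Return the tiles to copy this frame and reset the dirty set."""
        tiles = sorted(self.dirty)
        self.dirty.clear()
        return tiles

    def pixels_copied(self, tiles):
        """Pixels moved when only the given tiles are copied."""
        return len(tiles) * TILE * TILE


screen = DirtyTiles(64, 48)
screen.mark(10, 10, 10, 10)          # a small change on screen
changed = screen.flush()
print(len(changed), "tiles,", screen.pixels_copied(changed), "pixels")
print("full copy:", 64 * 48, "pixels")
```

For a 64x48 buffer, a 10x10 change at (10, 10) touches four tiles, i.e. 1024 pixels instead of the 3072 of a full copy. A backend that ignores the dirty set always pays for the full copy, but it also hides bugs where something forgot to call mark().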

Quote from: Wheart on December 30, 2015, 08:14:02 AM
I'm avoiding using precompiled 32 bit binary for Linux since it is linked against 32bit libraries - with all their mess, conflicting versions (for other "proprietary" software) etc.
I'm not sure, how it works today, but old releases (pre 112) distributed as binary and compiled against 32bit libraries works much "harder" on x86-64 (in terms of performance) than compiled "natively" - especially with big maps (1024x1024 and more).

I was the one, or one of a few, that helped get Simutrans to compile and run in native 64-bit on Linux years ago, so I know it is easier. But Simutrans is tuned under the assumption that pointers are 32-bit, or four bytes. When pointers suddenly double in size, some things will get misaligned, which will result in slower logical memory access (physical memory access doesn't get slower, but there is more of it). On the other hand, allowing a modern compiler to use SSE and stuff results in what I think is faster machine code, although Simutrans will complain that it is using the slowest of the three copy algorithms (the hand-optimized assembly only works with 32-bit GCC, and the middle one confuses GCC when vector optimizations are turned on).
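
The effect of pointer size on record layout can be illustrated with a small Python sketch using ctypes. The two record types are hypothetical stand-ins, not real Simutrans structures, and the 64-byte cache line is an assumption that holds for common x86 CPUs. Fixed-width integers play the role of the pointer so that the comparison does not depend on how the Python interpreter itself was built.

```python
import ctypes


class Record32(ctypes.Structure):
    """Stand-in for a record holding one 4-byte pointer plus two small fields."""
    _fields_ = [
        ("next", ctypes.c_uint32),   # plays the role of a 32-bit pointer
        ("kind", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
    ]


class Record64(ctypes.Structure):
    """The same record once the pointer grows to 8 bytes."""
    _fields_ = [
        ("next", ctypes.c_uint64),   # plays the role of a 64-bit pointer
        ("kind", ctypes.c_uint16),
        ("flags", ctypes.c_uint16),
    ]


def records_per_cache_line(record_type, line_bytes=64):
    """How many whole records fit into one cache line of line_bytes bytes."""
    return line_bytes // ctypes.sizeof(record_type)


for rec in (Record32, Record64):
    print(rec.__name__, ctypes.sizeof(rec), "bytes,",
          records_per_cache_line(rec), "per 64-byte cache line")
```

On a typical x86-64 Linux system the second record should come out at twice the size of the first, because the compiler-style alignment rules pad it up to a multiple of 8 bytes. Fewer records per cache line means more memory traffic for the same amount of game data.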

Wheart

I've just sent a bug report ;) In my opinion, if one can compile Simutrans against OpenGL on Linux (regardless of "bitness") without errors, it should be possible without digging through Stack Exchange for the valid flags of a library that is more or less portable. Whether it works worse or better is a matter of taste ;)
I remember some SDL rendering problems (artefacts, roughness of motion, etc.) for as long as I have known Simutrans, and finding that OpenGL works nearly "out of the box" is a VERY nice thing ;)

I've found one more bug: native (Polish) city names from a savegame made in 120.0.1 are broken when loaded in 120.1.1 (missing Polish characters, but in dialog boxes everything is OK). Maybe something with the fonts in the distribution zip? (same problem regardless of backend)


Some off-topic, just for fun:
I'm more of a sysadmin than a programmer and do not know much about Simutrans internals, but I made a kind of experiment. I ran 4 instances of Simutrans:
- 32bit binary (from SF) as a server, with a 1024x1024 map
- same binary as client - "simutrans"
- 64bit binary with SDL backend (client) - "sim-sdl"
- 64bit binary with OpenGL backend (client) - "sim"

The listing below is the output of the "top" command. The "simutrans" with the lowest PID is the server instance; the "simutrans" client and sim-sdl have no vehicles yet.
All windows have the same size and are centred on a city of size ~7.5k, with pedestrians and buildings shown.


  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
10315 wheart    20   0  315620 136684  12440 S  65,8  1,1  78:44.49 simutrans
19992 wheart    20   0  301196 114564  11932 R  53,2  0,9   2:04.14 simutrans
19758 wheart    20   0  893296 162584  11988 S  28,9  1,3   1:44.95 sim-sdl
11437 wheart    20   0  989920 238836  38144 S  25,6  1,9  39:14.37 sim


The machine is a Core i7 960 (4 cores, 8 threads, 3.2GHz) with 12GB RAM; overall load (with candy-looking KDE, several browser windows and all the system stuff) is about 30%.

Looking at the process table, I would say that the 64bit versions consume much more memory (about three times the virtual memory) but less CPU. Looking closer at physical memory usage, the 64bit SDL binary uses about 1.5 times as much as its 32bit counterpart, and the 64bit OpenGL version about twice as much as the 32bit SDL one. That could be acceptable, especially if one has plenty of memory, so that no swapping to disk occurs.
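
The ratios quoted here can be recomputed from the top listing. The short Python sketch below does only that arithmetic; the VIRT and RES figures are copied from the listing (top reports them in KiB by default), and the 32bit client process is taken as the baseline, which is an assumption about which process was meant as the "counterpart".

```python
# (VIRT, RES) in KiB, copied from the top listing above.
TOP = {
    "simutrans (32-bit SDL client)": (301196, 114564),
    "sim-sdl (64-bit SDL)": (893296, 162584),
    "sim (64-bit OpenGL)": (989920, 238836),
}


def ratios(baseline="simutrans (32-bit SDL client)"):
    """Return (VIRT ratio, RES ratio) of each process relative to the baseline."""
    base_virt, base_res = TOP[baseline]
    return {
        name: (round(virt / base_virt, 2), round(res / base_res, 2))
        for name, (virt, res) in TOP.items()
    }


for name, (virt_ratio, res_ratio) in ratios().items():
    print(f"{name}: VIRT x{virt_ratio}, RES x{res_ratio}")
```

With these numbers the 64bit SDL client comes out at roughly 3.0 times the virtual and 1.4 times the resident memory of the 32bit client, and the OpenGL client at roughly 3.3 and 2.1 times. Note that the clients do not show identical scenes with identical vehicle counts, so this is only a rough comparison.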

As you said, the code has been optimized many times (as I remember, a few major versions ago such a map with 300 vehicles ran in "step by step" mode), but changing the CPU architecture to 64bit while forcing manual 32bit optimizations seems like digging with only half a shovel, regardless of all the optimizations and extensions made at compiler or CPU level. All registers are 64bit and usually cannot be treated as "doubled". The same applies to the system bus: at a very low level, a burst transfer of a whole memory region (as far as I remember from computer architecture lectures, it is done nearly entirely via the MMU) will be much faster than calculating which bytes changed and should be copied (using CPU time and initiating a bunch of small transfers), and I think that's why "native" builds seem to work with less effort. As for 64bit SDL vs. 64bit OpenGL, maybe there is a way to improve rendering in the SDL version, but that's too far into the code and coding for me.

TurfIt

Quote from: Wheart on December 30, 2015, 08:14:02 AM
The OpenGL thing works for me much better than both SDL and SDL2. On both SDL and SDL2 using "transparent instead of hidden" in addition to "Smart hide objects" leaves some artefacts (shaded building elements)
Confirmed. The 'smart hider' is not marking things dirty correctly. As Ters alluded to, Simutrans only copies to the screen things that have changed. This provides a good speed boost in all cases I've seen, i.e. I've not seen the hypothetical system that chokes on the smaller copies.

The only reason OpenGL isn't doing this is the 'hackish' GL backend doesn't use the dirty system, just copies everything every frame. Hence it is much (50%+) slower than the other backends.


Quote from: Wheart on December 30, 2015, 08:14:02 AM
and SDL2 version crashes on window maximize (SDL maximizes OK), even if it's first thing after running Simutrans.
I vaguely remember such from developing the SDL2 backend. IIRC it was solved by updating the version of the SDL2 library. What version are you using?

Also, SDL2 was created to solve the performance issues with OSX and SDL1. It's not really tested on platforms other than mac.


Quote from: Wheart on December 30, 2015, 08:14:02 AM
In addition - it seems, that OpenGL works "smoother" - especially when "following" a vehicle. So it is very good hack :) The only problem with OpenGL is a scrolled text on "message bar" (bar seems to be "flashing" and text appears as "moved" several letters making it unreadable). Since there is an option to disable messages on that bar - and this setting is saved on game quit - it's acceptable to me.
vsync?  In general, Simutrans isn't very smooth. Out of the box it's set for 25fps which doesn't play nice with 60Hz screen refreshes. Of course unless you're in a place with 50Hz... But even then vehicle movement is jerky due to the way their map positions are translated to the screen. Hence the slinky accordion effect on trains too.
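
Why 25fps does not sit well on a 60Hz screen can be shown with a little arithmetic. The Python sketch below is a simplified model, assuming vsync is on and each game frame becomes visible at the first screen refresh at or after the moment it is ready; the function names are made up for illustration.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for non-negative a and positive b."""
    return -(-a // b)


def refreshes_per_frame(fps, refresh_hz, frames):
    """For each of the first `frames` game frames, count the refreshes that show it."""
    counts = []
    for i in range(frames):
        first = ceil_div(i * refresh_hz, fps)        # refresh where frame i appears
        nxt = ceil_div((i + 1) * refresh_hz, fps)    # refresh where frame i+1 appears
        counts.append(nxt - first)
    return counts


print("25 fps on 60 Hz:", refreshes_per_frame(25, 60, 5))
print("25 fps on 50 Hz:", refreshes_per_frame(25, 50, 5))
```

At 60Hz each frame would need 2.4 refreshes, so frames alternate between 3 and 2 refreshes on screen ([3, 2, 3, 2, 2]), which the eye sees as uneven motion. At 50Hz every frame is shown for exactly 2 refreshes.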


Quote from: Wheart on December 30, 2015, 08:14:02 AM
There is also one more issue when building from source provided via zip file - since flag "WITH_REVISION" is set to 1, it seems that during compilation it tries to use svn/cvs/etc to get revision number and it fails, giving in result "rNiewersjonowany katalog" (with English locale settings on compilation time it could be something like "rUnversioned directory").
No flags are set in the source .zip.  config.template is provided with all options commented out. You have to create config.default yourself, and uncomment the relevant lines. If you try to use WITH_REVISION when compiling from a non-SVN working directory, then you'll get this error.
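
The "rNiewersjonowany katalog" symptom is what happens when whatever text the revision tool printed is pasted into the version string unchecked. The Python sketch below shows one way a build script could guard against that. It is not how the Simutrans Makefile works: the function names and the -DREVISION define are hypothetical and only illustrate the check.

```python
import re


def parse_revision(text):
    """Return the leading revision number in `text`, or None if there is none."""
    match = re.match(r"\s*(\d+)", text)
    return int(match.group(1)) if match else None


def revision_define(text):
    """Build a (hypothetical) compiler define only when a real number was found."""
    rev = parse_revision(text)
    return f"-DREVISION={rev}" if rev is not None else None


# Sample tool outputs: a clean number, a number with a modifier suffix,
# and the localized message from the post above.
for sample in ("7664", "7664M", "Niewersjonowany katalog"):
    print(repr(sample), "->", revision_define(sample))
```

With a check like this, a build from an unversioned source tree would simply omit the revision instead of embedding a localized error message.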


Quote from: Ters on December 30, 2015, 11:04:14 AM
Maybe the SDL backend also has some concurrency issues if multi-threading is enabled (I've never built Simutrans multi-threaded). I also see that batch copying is disabled when rendering is multi-threaded. The OpenGL backend turns a blind eye to the issue of multi-threading.
Not that I know of...
What batch copy disabled?


Quote from: Ters on December 30, 2015, 11:04:14 AM
allowing a modern compiler to use SSE and stuff, results in what I think is faster machine code, although Simutrans will complain that it is using the slowest of the three copy algorithms (the hand-optimized assembly only works with 32-bit GCC, and the middle one confuses GCC when vector optimizations is turned on).
SSE and stuff works in 32bit too... The current compiler target is supposed to be 'pentium4' which enables the first SSE instructions. Benchmarking with the cpu target set newer (and hence able to use newer SSE instructions) resulted in minimal gains. That was gcc 4.5 though, perhaps newer versions have finally got better optimizing for the new instructions.

No compiler optimization will ever beat the brute force RLE blitting the assembly does. No sane compiler programmer would ever write a compiler that puts out such an ugly but very effective string of instructions.

Since people insist on building 64 bit, the slow copy path should be replaced by a simple while loop rather than that memcpy call which just kills gcc. Or even better, just enable this section of assembly. I can't find anything wrong with it in 64 bit mode. There is another section of assembly which is indeed not 64 bit safe.
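
For readers unfamiliar with RLE blitting, here is a rough Python sketch of the idea with a plain loop as the copy step. The run encoding used here (pairs of "columns to skip" and "pixels to copy") is a simplified, hypothetical format chosen for illustration; it is not claimed to match the actual Simutrans image format or its assembly routine.

```python
def blit_rle_row(dest, x, runs):
    """Draw one RLE-encoded sprite row into dest, starting at column x.

    runs is a list of (skip, pixels) pairs: skip that many transparent
    columns, then copy the opaque pixels that follow.
    """
    for skip, pixels in runs:
        x += skip                      # transparent run: nothing is written
        i = 0
        while i < len(pixels):         # plain loop instead of a bulk copy call
            dest[x + i] = pixels[i]
            i += 1
        x += len(pixels)
    return dest


row = [0] * 10
blit_rle_row(row, 1, [(2, [7, 7]), (1, [9])])
print(row)
```

The point of the encoding is that transparent pixels cost nothing to draw: they are skipped in one step rather than tested one by one. The opaque runs are usually short, which is why the per-call overhead of a general-purpose copy routine can hurt.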


Quote from: Wheart on December 30, 2015, 02:27:50 PM
I've just sent bug report ;) In my opinion, if one can compile Simutrans against OpenGL on Linux (regardless of "bitnes") without errors, it should do it without digging stackexchange for valid flags for library that is more or less portable. If it works worse or better - it's kind of taste ;)
The OpenGL backend should be considered abandoned.
What library flags?


Quote from: Wheart on December 30, 2015, 02:27:50 PM
I'm rather sysadmin than programmer and do not know too much about Simutrans internals, but I made some kind of an experiment:
- I run 4 instances of simutrans:
- 32bit binary (from SF) as a server, with 1024x1024 map
- same binary as client - "simutrans"
- 64bit binary with SDL backend (client) - "sim-sdl"
- 64bit binary with openGL backend (client) - "sim"
Apples and oranges, I think... Do note the official binaries are compiled in debug mode - that roughly halves performance.
Please self-compile the 32bit version using the exact settings you used for the 64bit one. Also compile all libraries with the same settings.
Then you'll find the 32bit significantly faster than the 64bit, especially at drawing the screen. Try running 2560x1600 or greater, single threaded, and with pak64 zoomed out. That'll really highlight the difference.


Quote from: Wheart on December 30, 2015, 02:27:50 PM
Looking at the process table, I would say that the 64bit versions consume much more memory (about three times the virtual memory) but less CPU. Looking closer at physical memory usage, the 64bit SDL binary uses about 1.5 times as much as its 32bit counterpart, and the 64bit OpenGL version about twice as much as the 32bit SDL one. That could be acceptable, especially if one has plenty of memory, so that no swapping to disk occurs.
The problem isn't not having enough memory, it's being able to access it. RAM is slow. Very very slow compared to modern processors. The Simutrans working set far blows away even the biggest L3 caches, hence it has to read almost everything from RAM every single frame. I have unfinished work that mucks about with the memory layout of the game's objects and yields 20-50% faster performance, but it's far from ready.


Quote from: Wheart on December 30, 2015, 02:27:50 PM
"doubled". Same occurs on system bus - in very low level burst transfer of some memory region (As far as I remember from computer architecture lectures it's done nearly entirely via MMU) will be much faster than calculating which bytes changed and should be copied (using CPU time and initiating bunch of small transfers)
You can actually do an awful lot of calculations in the time it takes to wait for a cache miss to be fulfilled. Packing things tightly together and using CPU to break them apart when needed is far better than having them separate in memory and wasting bits due to alignment.
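
As an illustration of "packing things tightly and using CPU to break them apart", here is a small Python sketch that stores three small values in one 16-bit word using shifts and masks. The field names and bit widths are invented for the example and do not describe any real Simutrans object.

```python
import struct

# Hypothetical layout of one 16-bit word: 5 bits height, 3 bits slope, 8 bits owner.


def pack_fields(height, slope, owner):
    """Combine three small values into a single 16-bit integer."""
    assert 0 <= height < 32 and 0 <= slope < 8 and 0 <= owner < 256
    return (height << 11) | (slope << 8) | owner


def unpack_fields(word):
    """Split the 16-bit integer back into (height, slope, owner)."""
    return (word >> 11) & 0x1F, (word >> 8) & 0x07, word & 0xFF


word = pack_fields(17, 5, 200)
print(hex(word), unpack_fields(word))

# Size of the packed form versus three separate 4-byte integers.
print(struct.calcsize("<H"), "bytes packed,", struct.calcsize("<3I"), "bytes unpacked")
```

The unpacking costs a few shifts and masks per access, which is cheap next to a cache miss. In this toy case the packed form takes 2 bytes where three separate 32-bit integers would take 12.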

Ters

Quote from: Wheart on December 30, 2015, 02:27:50 PM
I've just sent bug report ;) In my opinion, if one can compile Simutrans against OpenGL on Linux (regardless of "bitnes") without errors, it should do it without digging stackexchange for valid flags for library that is more or less portable

I'd rather see that Simutrans didn't give the option at all. The OpenGL backend is silly, and that it outperforms anything means that something is wrong with the straightforward approaches. It was an experiment to try to figure out where the problem is, or rather isn't, and maybe its nature. As such, I'm more interested in how SDL and SDL2 perform.

Quote from: TurfIt on December 30, 2015, 08:14:02 PM
The only reason OpenGL isn't doing this is the 'hackish' GL backend doesn't use the dirty system, just copies everything every frame.

Actually, the OpenGL backend tries to honor the dirty system. dr_textur() uses the values passed from simgraph16. But if the pointer into the PBO gives direct access to the texture in VRAM or pseudo-VRAM, the dirty rectangle stuff effectively gets ignored (glTexSubImage2D is a no-op).

Wheart

Thanks for your explanation. Since I'm "a weekend programmer" and my ASM knowledge vanished many years ago, I must admit it looks reasonable.
A simple run of Simutrans under perf while zooming out shows a higher rate of stalled cycles in the backend (twice as high), but it's hard to do any "real" benchmark by moving the mouse in a random pattern while still using the binary provided on SF. Comparing 64bit SDL vs OpenGL: OpenGL has a higher page fault rate (~0.5k/sec compared to ~0.27k/sec) but slightly fewer branch misses (at the same branch rate, 2.8% branch-misses compared to 3.4%).

Frame rates (full-screen, map as previous, zoomed-out on 2x1680x1050 = 3360x1050 screen):
32bit SDL - 5fps* (mouse really not responsive after zooming out, problems even opening a menu)
64bit SDL - 8fps (mouse laggy or sometimes misses clicks or drags to scroll the map)
64bit OpenGL - 12fps (mouse laggy)...

The CPU counters look "better", but the feeling is quite the opposite - it could also be an effect of rather good hardware (i7 960, 12G RAM in triple-channel configuration, fair nVidia GFX) and multi-threading.

So next step could be to compile 32bit binary with same settings as 64bit one and I need some reading how to do this ;)

Regarding the compilation flags: the LIBS variable in the Makefile is set to "-lglew32" for any OSTYPE other than "mac". For Linux, regardless of its bit width, it should be set to "-lGLEW" - the diff for a workaround is in my first post.

Regarding the SDL2 failure: the version is 2.0.2 with some Ubuntu patches (exact number: 2.0.2+dfsg1-3ubuntu1.1)

One more bug in OpenGL version: trying to capture screen-shot via "c" key causes memory access violation.


Thanks for your attention and Happy New Year!

Ters

I wonder how this patch affects your SDL performance. The main idea is to avoid all the individual SDL_UpdateRect calls when MULTI_THREAD=1, but at the cost of ignoring the dirty rectangles. It should cause behavior more similar to how the OpenGL backend does it, although starting Simutrans with -use_hw should be the most similar to how the OpenGL backend works. Maybe you can try -use_hw first? My only useable Linux box is too old to try out this in a meaningful way.

Wheart

So here are results of "perf" for five cases:

1. 64bit OpenGL (Debug = 1, multithread, optimise = 1)
2. 64bit SDL before patch
3. 64bit SDL after patch
4. 32bit SDL binary from SF
5. 64bit SDL after patch and debug commented out

HW: i7 920 (slower clock than before), 16GB RAM (three channels, but in configuration 2 x 4GB + 2 x 2GB + 2 x 2GB), slightly better GFX (still nVidia)

"Zoom out" test:
Resolution: 3600 x 1200
Map: 1024x1024
1: 11-12fps
2, 3 and 5: about 8 fps
4: 4 fps (problems with scrolling map using mouse - no response or response delayed and treated as "click", not "drag")

Additional problem: loading and saving a map with the patched versions (3 and 5) takes about 5-10 times longer than with the other versions...
In contrast, the OpenGL version loads (or saves) a map instantly (!), without even trying to paint the progress bar. The unpatched SDL versions work "normally".

The patch solves the main issue with buildings using "dynamic hide".

The only visible effect of "-use-hw" is shorter "mouse losses" (the mouse pointer temporarily disappearing) when trying to move the zoomed-out map.

At this point the "winner is OpenGL", but perf gives some more data. The patched version with debug turned off has the best CPU stats. Perf results are posted as attachments. The "testing" procedure was as follows: (auto)load the last game, open the "Graphics" dialogue to observe FPS, then zoom out the map and centre it on screen, then try to move it using the mouse, and then quit.

TurfIt

Are you by chance using a compositing window manager? There were previous reports of such also having indigestion with SDL - compiz and KWin were both mentioned, and of course newer Macs. SDL2 was developed for these, but not well tested on anything but Mac. Several hacks were applied to allow SDL2 to work - it's possible one of those disagrees with your system.

One hack was forcing SDL2 to use the "opengl" renderer rather than autodetecting. Using MinGW, SDL2 would auto select the "direct3d" renderer, and would then crash if you changed the Simutrans window size repeatedly. There appeared to be a humongous memory leak somewhere internal in SDL2/direct3d/video driver. Since both MinGW and OSX were happy with using "opengl", it was hardcoded.
You could try allowing the autodetect, or forcing something else.
Ref: Was there an SDL 2.0 project?


There were also reports of some Linux distros having clock issues with Simutrans - IIRC Ubuntu was one...
In essence the CPU ends up stuck in a low clock state - the speedstep never kicks in. There was some kernel flag to set to fix it...
In this case, forcing Simutrans to actually try using more CPU by using a less efficient backend (opengl vs SDL) could kick the speedstep into working and clock up.
Ref: Performance problems under Linux (Xubuntu 12.10_x64)

Did you try -use_hw and -async? (note: -use_hw not -use-hw)  Does your system actually report hardware available? (as seen in the Simutrans startup messages on the console)

The mismatched memory ranks in your system don't help things - you'd do better to drop to 12GB with 6x2, unless you really need 16.
Hyperthreading can help the multithreaded display (if you can feed it enough memory bandwidth which ^^^^ is hurting). You can try going to thread=8 in simuconf.tab. Don't expect miracles, but every little bit can help.

A self compiled 32 bit is needed to rule out some weird slowdown there... who knows what the file from SF is compiled as...

In general, your results look like those of others in the past with systems that simply hate SDL1. The OpenGL backend is unlikely to see further development; it's basically obsoleted by the SDL2 one, notwithstanding your crash issue with it.

Some other performance discussions:
Why is SDL with USE_HW not default?
Performance - GDI vs SDL ( Windows XP vs OS X 10.8.2)


Ters

SDL2 works fine on my x64 Linux box, with compositing both on and off (xfce). A developed 1024x1024 pak64 map runs at a decent 12 FPS zoomed full out, while both cores are busy compiling something else. At any other zoom level, it manages 25 FPS. Then again, this old box only has a 1280x1024 display, and Simutrans is only known to start struggling in the backend when going beyond 1920x1080. (At some point, even the frontend will be unable to cope with more pixels.) I don't have any bigger monitor than 1920x1080, and that's for my Windows machine, nor plans on getting any bigger screens in the near future.

SDL1 by comparison, only manages 7 FPS on the outermost zoom, and 14 on the next. So SDL2 seems to be an improvement over SDL1 on Linux as well as on Mac. If only we can figure out why it crashes on some (so far only one known) system.

Wheart

So, put away the performance for a while, my basic problem is:
- SDL has broken "dynamic hide" - artefacts on high buildings, when moving cursor down (leaves portions of "shade")
- SDL2 crashes on window maximize (KDE, LXDE, openbox)
- SDL2 has same problem as SDL with "dynamic hide" (even on "forest" industry, moving cursor down) UNLESS "-use_hw" is specified in command line
- OpenGL (tested because both SDL versions fail for me) has a messy message bar.

As far as I understand - OpenGL backend is "deprecated" and won't be developed in the future, may be removed some day. Something that I didn't understand at first, because "it works for me well" ;)

For now:
Option "-use_hw" has one visual effect with SDL2 - artefacts disappeared. Can you look at it?

One more issue (but really low priority): all tested backends except SDL2 when running with "-fullscreen", use all monitors connected (both HW setups are dual-monitor). SDL2 uses only "primary" one.

At start, simutrans compiled against SDL2 states:

SDL Driver: x11
Preparing display ...
Renderer: opengl, Max_w: 0, Max_h: 0, Flags: 14, Formats: 1, SDL_PIXELFORMAT_ARGB8888,
Renderer: opengles2, Max_w: 0, Max_h: 0, Flags: 14, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Using: Renderer: opengl, Max_w: 16384, Max_h: 16384, Flags: 10, Formats: 3, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_YV12, SDL_PIXELFORMAT_IYUV,


How to tell, if there is hardware support or not?

For further investigation of the SDL2 crashes on window maximize I will wait until Ubuntu 16.04 LTS is released - it could be that the Ubuntu version of the library is messy (like the sqlite library, which crashed several apps and hasn't been patched to this day). I tested KDE (KWin), openbox alone and lxde - the SDL2 version crashes regardless of window manager. Currently I have only two boxes for tests and no possibility to test another Ubuntu version or another distro (closed drivers for GFX).

And about performance - for me the "better performance" of the OpenGL version on my HW is a side effect, and I was really surprised when you said that it's considered slower and won't be developed - in my environment it works better. There must be something I overlooked, or there are some other "side effects" (i.e. with at least four threads running, Simutrans makes great use of an 8-threaded CPU, even with mismatched RAM modules). When I have some more spare time, I will look at it again.

Thanks for your time, and since the SDL2 version is (as I understand) "the right one", I'll play on it (and send bug reports if I find something ;) )

TurfIt

Quote from: Wheart on January 01, 2016, 11:46:38 PM
- SDL has broken "dynamic hide" - artefacts on high buildings, when moving cursor down (leaves portions of "shade")
Present in all backends when using dirty system. I have a possible fix I can try finishing next week.
Also, it should be noted this feature comes with a good performance hit, so if you're having problems in that, might be an idea to avoid...


Quote from: Wheart on January 01, 2016, 11:46:38 PM
- SDL2 crashes on window maximize (KDE, LXDE, openbox)
Please try updating your SDL2 first.  2.0.3 is current. And your video drivers...


Quote from: Wheart on January 01, 2016, 11:46:38 PM
- SDL2 has same problem as SDL with "dynamic hide" (even on "forest" industry, moving cursor down) UNLESS "-use_hw" is specified in command line
-use_hw and -async were for SDL1 only. The flags were hijacked for debugging SDL2, and ended up in trunk, I see.
For SDL1:
  -use_hw: tells SDL to use hardware surfaces if available. I've yet to see a system that reports them available, and there's a good slowdown when trying to use what isn't there.
  -async: only in conjunction with -use_hw, turns off vsync.
For SDL2:
  -use_hw: turns off the Simutrans dirty tile system. That's why you don't get the smart hide artifacts!
  -async: turns ON vsync.


Quote from: Wheart on January 01, 2016, 11:46:38 PM
- OpenGL (tested, because both SDL versions fails for me) has messy message bar.
IIRC things go wonky if your system is pbo_able (it was an unfinished feature, I think).
You'd need to edit the source to turn it off:
simsys_opengl.cc:234
// Now, GL_ARB_pixel_buffer_object

pbo_able = GLEW_ARB_pixel_buffer_object;

Just force pbo_able to false.


Quote from: Wheart on January 01, 2016, 11:46:38 PM
As far as I understand - OpenGL backend is "deprecated" and won't be developed in the future, may be removed some day. Something that I didn't understand at first, because "it works for me well" ;)
SDL2 by default does largely the same thing, and rather than reinvent the wheel implementing our own opengl 2d blitting system...


Quote from: Wheart on January 01, 2016, 11:46:38 PM
One more issue (but really low priority): all tested backends except SDL2 when running with "-fullscreen", use all monitors connected (both HW setups are dual-monitor). SDL2 uses only "primary" one.
Simutrans simply tells the library it wants a fullscreen window. I guess SDL2 interprets that differently than the rest. Either a feature or bug in SDL2 - nothing for Simutrans to fix. Maybe SDL2.0.3 has already done something?


Quote from: Wheart on January 01, 2016, 11:46:38 PM
At start, simutrans compiled against SDL2 states:

SDL Driver: x11
Preparing display ...
Renderer: opengl, Max_w: 0, Max_h: 0, Flags: 14, Formats: 1, SDL_PIXELFORMAT_ARGB8888,
Renderer: opengles2, Max_w: 0, Max_h: 0, Flags: 14, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Using: Renderer: opengl, Max_w: 16384, Max_h: 16384, Flags: 10, Formats: 3, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_YV12, SDL_PIXELFORMAT_IYUV,

That's probably troublesome. Simutrans requires RGB565. If your video drivers don't support it, they're likely doing on-the-fly conversion, usually at a great performance hit. Unfortunately 16-bit color support is slowly vanishing...
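
To give an idea of what such an on-the-fly conversion involves per pixel, here is a Python sketch that expands one RGB565 value to ARGB8888. It uses the common bit-replication method; what a particular driver or SDL2 actually does internally is not known here, so treat it as an illustration of the work involved, not as their implementation.

```python
def rgb565_to_argb8888(pixel):
    """Expand one 16-bit RGB565 pixel to a 32-bit ARGB8888 value (opaque alpha)."""
    r5 = (pixel >> 11) & 0x1F
    g6 = (pixel >> 5) & 0x3F
    b5 = pixel & 0x1F
    # Replicate the high bits into the low bits so that full-scale
    # 5- and 6-bit values map to 255 rather than 248 or 252.
    r8 = (r5 << 3) | (r5 >> 2)
    g8 = (g6 << 2) | (g6 >> 4)
    b8 = (b5 << 3) | (b5 >> 2)
    return 0xFF000000 | (r8 << 16) | (g8 << 8) | b8


for sample in (0x0000, 0xF800, 0x07E0, 0x001F, 0xFFFF):
    print(f"{sample:#06x} -> {rgb565_to_argb8888(sample):#010x}")
```

It is only a handful of shifts and ORs, but it has to run for every pixel of every frame, and the output is twice the size of the input, so the cost adds up at large resolutions.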

You might want to also try the opengles2 and even software renderers that are being reported as available.
Again code editing:
simsys_s2.cc:174
if(  strcmp( "opengl", ri.name ) == 0  ) {

simply change the string to the name of the other renderer to try.
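
To see which of the reported renderers list a given pixel format before editing the source, the startup messages can be scanned with a few lines of Python. The log text below is copied from the startup output quoted above; the function names are made up for this sketch, and it only reads the log, it does not talk to SDL2.

```python
import re

LOG = """\
SDL Driver: x11
Preparing display ...
Renderer: opengl, Max_w: 0, Max_h: 0, Flags: 14, Formats: 1, SDL_PIXELFORMAT_ARGB8888,
Renderer: opengles2, Max_w: 0, Max_h: 0, Flags: 14, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Using: Renderer: opengl, Max_w: 16384, Max_h: 16384, Flags: 10, Formats: 3, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_YV12, SDL_PIXELFORMAT_IYUV,
"""


def renderer_formats(log):
    """Map each listed renderer name to the pixel formats on its log line."""
    result = {}
    for line in log.splitlines():
        if not line.startswith("Renderer:"):
            continue  # skips the "Using:" line and unrelated messages
        name = line.split(",")[0].split(":")[1].strip()
        result[name] = re.findall(r"SDL_PIXELFORMAT_(\w+)", line)
    return result


def renderers_with(fmt, log):
    """Names of the listed renderers that report the given pixel format."""
    return [name for name, fmts in renderer_formats(log).items() if fmt in fmts]


print(renderers_with("RGB565", LOG))
```

In this particular log only the software renderer lists RGB565. The formats on the "Using:" line differ from those listed for opengl beforehand, so the list may change once a renderer is actually created; a test run with each name is still the real check.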


Quote from: Wheart on January 01, 2016, 11:46:38 PM
How to tell, if there is hardware support or not?
Please also provide the startup messages for SDL1 and opengl backends. SDL1 with and without -use_hw.
For SDL2 it's implicit with the opengl renderer.


Quote from: Wheart on January 01, 2016, 11:46:38 PM
And about performance - for me "better performance" of OpenGL version on my HW is a side effect and I was really surprised that You said, that it's considered slower and won't be developed - in my environment it works better.
If you look at the second-to-last post in the Performance GDI vs SDL thread I linked to in my last posting, you'll see the per-frame timings I measured with various backends / settings.

Of course this is all platform specific, so it's certainly possible your mileage will vary!
Ultimately I expect we'll end up with only SDL2, or whatever is current at the time. GDI is already calling very very deprecated Windows APIs, more problems seemingly with every new version of Windows. SDL1 appears no longer maintained - abandoned for SDL2.

Ters

Quote from: Wheart on January 01, 2016, 11:46:38 PM
So, put away the performance for a while, my basic problem is:
- SDL has broken "dynamic hide" - artefacts on high buildings, when moving cursor down (leaves portions of "shade")
- SDL2 crashes on window maximize (KDE, LXDE, openbox)
- SDL2 has same problem as SDL with "dynamic hide" (even on "forest" industry, moving cursor down) UNLESS "-use_hw" is specified in command line
- OpenGL (tested because both SDL versions fail for me) has a messy message bar.

The cases where visual artefacts are missing are just a case of two wrongs making a "right".

Quote from: Wheart on January 01, 2016, 11:46:38 PM
For now:
Option "-use_hw" has one visual effect with SDL2 - artefacts disappeared. Can you look at it?

It is as expected. I wrote that SDL2 with -use_hw will behave like the OpenGL backend.

Quote from: TurfIt on January 02, 2016, 05:05:36 AM
Ultimately I expect we'll end up with only SDL2, or whatever is current at the time. GDI is already calling very very deprecated Windows APIs, more problems seemingly with every new version of Windows. SDL1 appears no longer maintained - abandoned for SDL2.

What deprecated API? Most of it is exactly the same API that SDL uses. And mingw doesn't support any other API, and by the looks of it, never will.

Wheart

Quote from: TurfIt on January 02, 2016, 05:05:36 AM
Also, it should be noted this feature comes with a good performance hit, so if you're having problems in that, might be an idea to avoid...

As I stated before, performance wasn't the initial problem. The problems were the messy "message bar" and the wrong linker flag for the GLEW library on Linux (-lglew32 instead of -lGLEW) in the Makefile.
Then came the question of why I use the OpenGL backend, and the discussion drifted to artefacts with "dynamic hide" and to performance. I used "dynamic hide" as the example because it is the easiest way to reproduce the artefacts - they appear in some other circumstances too, but I can't reproduce those "on demand" (or didn't think of reporting them and don't remember how I triggered them  ;) ).

Quote from: TurfIt on January 02, 2016, 05:05:36 AM
IIRC things go wonky if your system is pbo_able (was an unfinished feature I think.)
You'd need to edit the source to turn it off:
simsys_opengl.cc:234
// Now, GL_ARB_pixel_buffer_object

pbo_able = GLEW_ARB_pixel_buffer_object;

just force pbo_able to false;

That just breaks "dynamic hide" - more artefacts than on SDL. But the "message bar" works OK.

Quote from: TurfIt on January 02, 2016, 05:05:36 AM
That's probably troublesome. Simutrans requires RGB565. If your video drivers don't support it, they're likely doing on-the-fly conversion, usually at a great performance hit. Unfortunately 16-bit color support is slowly vanishing...

You might want to also try the opengles2 and even software renderers that are being reported as available.
Again, this needs code editing:
simsys_s2.cc:174
if(  strcmp( "opengl", ri.name ) == 0  ) {

simply change the string to the name of the other renderer to try.

Please also provide the startup messages for SDL1 and opengl backends. SDL1 with and without -use_hw.
For SDL2 it's implicit with the opengl renderer.

So...
Test conditions:
SDL2, different renderers, fullscreen@1920x1200, i7-920 machine (the one with mismatched memory), 1024x1024 map loaded, zoomed out and centred.

Setting the renderer to opengles2 just kills Simutrans - I got 1 fps when zoomed out, and it was still losing frames...
I had to kill the game because it didn't respond to the mouse.

Console messages + perf:

wheart@centurion:~$ perf stat /usr/local/games/simutrans-120.1.1/sim-sdl2-gles -fullscreen
Use work dir /usr/local/games/simutrans-120.1.1/
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
parse_simuconf() at /home/wheart/simutrans/simuconf.tab: Reading simuconf.tab successful!
SDL Driver: x11
Preparing display ...
Renderer: opengl, Max_w: 0, Max_h: 0, Flags: 14, Formats: 1, SDL_PIXELFORMAT_ARGB8888,
Renderer: opengles2, Max_w: 0, Max_h: 0, Flags: 14, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Using: Renderer: opengles2, Max_w: 16384, Max_h: 16384, Flags: 10, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Loading font 'font/prop.fnt'
font/prop.fnt successfully loaded as old format prop font!
Init done.
[...]

Performance counter stats for '/usr/local/games/simutrans-120.1.1/sim-sdl2-gles -fullscreen':

     493764,067439 task-clock (msec)         #    1,060 CPUs utilized         
            562481 context-switches          #    0,001 M/sec                 
              3711 cpu-migrations            #    0,008 K/sec                 
             29271 page-faults               #    0,059 K/sec                 
     1400294681923 cycles                    #    2,836 GHz                     [83,48%]
     1175084527074 stalled-cycles-frontend   #   83,92% frontend cycles idle    [82,74%]
      737407770923 stalled-cycles-backend    #   52,66% backend  cycles idle    [66,98%]
      667861382230 instructions              #    0,48  insns per cycle       
                                             #    1,76  stalled cycles per insn [83,58%]
       74469042501 branches                  #  150,819 M/sec                   [83,40%]
         983533175 branch-misses             #    1,32% of all branches         [83,40%]

     465,855755516 seconds time elapsed



When I switched to the software renderer (expecting even worse results), I got 13-14 fps.
One interesting thing: with the software renderer "dynamic hide" works OK - no artefacts!
One more interesting thing: it doesn't crash on window maximize (for me the most annoying problem with SDL2 so far)!


wheart@centurion:~$ perf stat /usr/local/games/simutrans-120.1.1/sim-sdl2-software -fullscreen
Use work dir /usr/local/games/simutrans-120.1.1/
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
parse_simuconf() at /home/wheart/simutrans/simuconf.tab: Reading simuconf.tab successful!
SDL Driver: x11
Preparing display ...
Renderer: opengl, Max_w: 0, Max_h: 0, Flags: 14, Formats: 1, SDL_PIXELFORMAT_ARGB8888,
Renderer: opengles2, Max_w: 0, Max_h: 0, Flags: 14, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Using: Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Loading font 'font/prop.fnt'
font/prop.fnt successfully loaded as old format prop font!
Init done.
[...]

Performance counter stats for '/usr/local/games/simutrans-120.1.1/sim-sdl2-software -fullscreen':

     148642,557455 task-clock (msec)         #    1,149 CPUs utilized         
            421670 context-switches          #    0,003 M/sec                 
              5704 cpu-migrations            #    0,038 K/sec                 
             32771 page-faults               #    0,220 K/sec                 
      353913515964 cycles                    #    2,381 GHz                     [83,29%]
      177899630038 stalled-cycles-frontend   #   50,27% frontend cycles idle    [83,64%]
       96569833253 stalled-cycles-backend    #   27,29% backend  cycles idle    [66,28%]
      456983066283 instructions              #    1,29  insns per cycle       
                                             #    0,39  stalled cycles per insn [83,18%]
       46503058092 branches                  #  312,852 M/sec                   [83,46%]
        1499475997 branch-misses             #    3,22% of all branches         [83,33%]

     129,316795735 seconds time elapsed




Just for reference: SDL2 with the OpenGL renderer gives 14-15 fps.

wheart@centurion:~$ perf stat /usr/local/games/simutrans-120.1.1/sim-sdl2 -fullscreen
Use work dir /usr/local/games/simutrans-120.1.1/
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
parse_simuconf() at /home/wheart/simutrans/simuconf.tab: Reading simuconf.tab successful!
SDL Driver: x11
Preparing display ...
Renderer: opengl, Max_w: 0, Max_h: 0, Flags: 14, Formats: 1, SDL_PIXELFORMAT_ARGB8888,
Renderer: opengles2, Max_w: 0, Max_h: 0, Flags: 14, Formats: 4, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888,
Renderer: software, Max_w: 0, Max_h: 0, Flags: 9, Formats: 8, SDL_PIXELFORMAT_RGB555, SDL_PIXELFORMAT_RGB565, SDL_PIXELFORMAT_RGB888, SDL_PIXELFORMAT_BGR888, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_RGBA8888, SDL_PIXELFORMAT_ABGR8888, SDL_PIXELFORMAT_BGRA8888,
Using: Renderer: opengl, Max_w: 16384, Max_h: 16384, Flags: 10, Formats: 3, SDL_PIXELFORMAT_ARGB8888, SDL_PIXELFORMAT_YV12, SDL_PIXELFORMAT_IYUV,
Loading font 'font/prop.fnt'
font/prop.fnt successfully loaded as old format prop font!
Init done.
[...]

Performance counter stats for '/usr/local/games/simutrans-120.1.1/sim-sdl2 -fullscreen':

     101058,072183 task-clock (msec)         #    1,906 CPUs utilized
            433503 context-switches          #    0,004 M/sec
              2176 cpu-migrations            #    0,022 K/sec
             29207 page-faults               #    0,289 K/sec
      259375571029 cycles                    #    2,567 GHz                     [83,46%]
      152078228158 stalled-cycles-frontend   #   58,63% frontend cycles idle    [83,38%]
       79223728296 stalled-cycles-backend    #   30,54% backend  cycles idle    [66,55%]
      240156492801 instructions              #    0,93  insns per cycle
                                             #    0,63  stalled cycles per insn [83,19%]
       37242133778 branches                  #  368,522 M/sec                   [83,32%]
        1183587056 branch-misses             #    3,18% of all branches         [83,30%]

      53,023610081 seconds time elapsed



Startup messages for:
OpenGL backend:

wheart@centurion:~$ sim
Use work dir /usr/local/games/simutrans-120.1.1/
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
parse_simuconf() at /home/wheart/simutrans/simuconf.tab: Reading simuconf.tab successful!
Preparing display ...
Screen Flags: requested=12, actual=12
Renderer is NPOT able.
Renderer is PBO able.
Renderer supports textures up to 16384x16384.
Hardware acceleration available, vendor: NVIDIA Corporation.
Loading font 'font/prop.fnt'
font/prop.fnt successfully loaded as old format prop font!
Init done.

With "-fullscreen" the flags change:

[...]
Preparing display ...
Screen Flags: requested=80000002, actual=80000002
Renderer is NPOT able.
Renderer is PBO able.
Renderer supports textures up to 16384x16384.
[...]


SDL1 without -use_hw:

wheart@centurion:~$ sim-sdl1
Use work dir /usr/local/games/simutrans-120.1.1/
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
parse_simuconf() at /home/wheart/simutrans/simuconf.tab: Reading simuconf.tab successful!
Preparing display ...
SDL_driver=x11, hw_available=0, video_mem=0, blit_sw=0, bpp=32, bytes=4
Screen Flags: requested=10, actual=10
dr_os_open(SDL): SDL realized screen size width=704, height=560 (requested w=704, h=560)
Loading font 'font/prop.fnt'
font/prop.fnt successfully loaded as old format prop font!
Init done.


With "-fullscreen" this changes to:

Preparing display ...
SDL_driver=x11, hw_available=0, video_mem=0, blit_sw=0, bpp=32, bytes=4
Screen Flags: requested=80000000, actual=80000000
dr_os_open(SDL): SDL realized screen size width=3600, height=1200 (requested w=3600, h=1200)

(yes, two monitors with different resolutions in effect give a strange 3600x1200)

SDL1 with "-use_hw":

wheart@centurion:~$ sim-sdl1 -use_hw
Use work dir /usr/local/games/simutrans-120.1.1/
Reading low level config data ...
parse_simuconf() at config/simuconf.tab: Reading simuconf.tab successful!
parse_simuconf() at /home/wheart/simutrans/simuconf.tab: Reading simuconf.tab successful!
Preparing display ...
SDL_driver=x11, hw_available=0, video_mem=0, blit_sw=0, bpp=32, bytes=4
Screen Flags: requested=40000011, actual=10
dr_os_open(SDL): SDL realized screen size width=704, height=560 (requested w=704, h=560)
Loading font 'font/prop.fnt'
font/prop.fnt successfully loaded as old format prop font!
Init done.


And with "-fullscreen":

Preparing display ...
SDL_driver=x11, hw_available=0, video_mem=0, blit_sw=0, bpp=32, bytes=4
Screen Flags: requested=c0000001, actual=80000000
dr_os_open(SDL): SDL realized screen size width=3600, height=1200 (requested w=3600, h=1200)
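
The "Screen Flags" lines can be decoded, assuming the values are printed in hexadecimal and are SDL 1.2 video flags. The sketch below is only an illustration: `decode_sdl1_flags` is a made-up helper, and the bit values are the SDL 1.2 constants as I know them from SDL_video.h, so check them against your own headers.

```cpp
#include <cstdint>
#include <string>

// Made-up helper: turns an SDL 1.2 video flag word into a readable string.
// Bit values are assumed to match SDL 1.2's SDL_video.h.
std::string decode_sdl1_flags(std::uint32_t flags)
{
	struct Flag { std::uint32_t bit; const char* name; };
	static const Flag known[] = {
		{0x00000001u, "SDL_HWSURFACE"},
		{0x00000002u, "SDL_OPENGL"},
		{0x00000010u, "SDL_RESIZABLE"},
		{0x40000000u, "SDL_DOUBLEBUF"},
		{0x80000000u, "SDL_FULLSCREEN"},
	};
	std::string out;
	for (const Flag& f : known) {
		if (flags & f.bit) {
			if (!out.empty()) {
				out += "|";
			}
			out += f.name;
		}
	}
	// SDL_SWSURFACE is 0, i.e. the absence of any of the bits above.
	return out.empty() ? "SDL_SWSURFACE" : out;
}
```

Read this way, requested=40000011 is SDL_HWSURFACE|SDL_RESIZABLE|SDL_DOUBLEBUF while actual=10 is only SDL_RESIZABLE, and requested=c0000001 against actual=80000000 keeps only SDL_FULLSCREEN. So "-use_hw" asks for a double-buffered hardware surface but does not get one, which fits hw_available=0 in the same log.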


I'm aware that I am a "niche user of a niche operating system" and that this is a platform-specific problem.
It seems that on Linux (on an Intel/nVidia box using the proprietary nVidia drivers) SDL2 + software renderer gives the best results at a small cost in performance (on my box about 1 fps compared to SDL2/OpenGL, though I have no real reference for a 32-bit binary in a 64-bit environment). Maybe another define to set/enforce the default renderer at compilation time?
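
What I have in mind for such a define is roughly the following. This is only a sketch: the macro name SIM_DEFAULT_RENDERER and the helper are invented here, nothing like it exists in the Simutrans sources as far as I know.

```cpp
#include <cstring>

// Invented macro: the renderer name to prefer, overridable at build time,
// e.g. with -DSIM_DEFAULT_RENDERER='"software"' in CFLAGS.
#ifndef SIM_DEFAULT_RENDERER
#define SIM_DEFAULT_RENDERER "opengl"
#endif

// True if `name` is the renderer selected at compile time. This would
// replace the hard-coded "opengl" string in the renderer comparison.
bool is_default_renderer(const char* name)
{
	return std::strcmp(SIM_DEFAULT_RENDERER, name) == 0;
}
```

Without the define the behaviour stays as it is now (opengl); with it, a distribution or user could build a binary that prefers the software renderer without editing the source.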

prissi

The image transfer is limited by the main bus anyway. As such, the software transfer, which only updates the changed regions, will use the least bandwidth and hence may be the fastest.

And SDL2 supports input methods other than 1:1 keyboard, which is why it should become the default if possible. (But I have not managed to compile SDL2 on Haiku or on Windows.)

TurfIt

Quote from: Wheart on December 30, 2015, 08:14:02 AM
On both SDL and SDL2 using "transparent instead of hidden" in addition to "Smart hide objects" leaves some artefacts (shaded building elements)
r7736 should have far fewer artifacts when using the smart hide cursor. Tops of buildings extending off the map edge still leave some (many other things also go wonky when displayed over the void). Plus, don't move your cursor faster than the update rate!

Wheart

I've tested Simutrans r7736 on Linux-x64 with both SDL1 and SDL2 (with the OpenGL renderer) - it works OK (except for the known SDL2 crash on window maximize, to be investigated with an updated SDL2 library).

Thanks :)

Compilation with the OpenGL backend on Linux (although not advised) still fails:

===> LD  build/default/sim
/usr/bin/ld: cannot find -lglew32
collect2: error: ld returned 1 exit status


The workaround is in my first post in this thread: on Linux the linker flag should be -lGLEW rather than -lglew32.