Performance problems under Linux (Xubuntu 12.10_x64)

Started by donjuan, January 23, 2013, 10:00:17 PM


donjuan

My configuration: Xubuntu 12.10 (kernel 3.5, fglrx driver 13.1), CPU: Intel i5 2500K 3.5 GHz, RAM: 4 GB

Hello,

I've installed Simutrans from the repository (version 111.3-1) and via download from SourceForge (version 120.1) with pak64. Both versions lag when I scroll over the map. The game is nearly unplayable.

The only way I have found to fix this bug is to disable all CPU energy-saving features in the BIOS. I suppose the CPU stays in energy-saving mode while Simutrans is running.
By the way, the Windows version works fine under Wine, even with the CPU energy-saving features enabled.

It's annoying for people with fewer computer skills that there is a buggy version in the Ubuntu repository. Simutrans is a great game. I hope a developer can fix this problem.


Markohs

Can you download and try one of the nightlies? Maybe that solves your problem.

http://nightly.simutrans-germany.com/

donjuan

Hi Markohs,

I've tried the nightlies; unfortunately they show the same behavior. I've tested every version: Linux/gcc 3/gcc 4 and Linux 64/gcc 4, version 112.1-6296.

Markohs

Can you check how many threads Simutrans creates when playing? On Linux I think every thread gets its own PID, so running "top" should show multiple simutrans entries, or look for the LWP or threads column.
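A minimal sketch of such a thread check, assuming a Linux system with /proc mounted. The helper name count_threads and the process name "simutrans" passed to pidof are illustrative assumptions, not something confirmed in this thread:

```bash
# count_threads PID
# Prints the number of threads of a process, read from the "Threads:" line
# of /proc/PID/status (Linux only).
count_threads() {
    while read -r key value rest; do
        if [ "$key" = "Threads:" ]; then
            echo "$value"
            break
        fi
    done < "/proc/$1/status"
}

# Example (assumes the game process is called "simutrans"):
#   count_threads "$(pidof simutrans)"
# Thread count of the current shell, just to show the call:
count_threads "$$"
```

The same information should also be visible with "top -H" or "ps -L -p PID", which list one row per thread.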

You can also simulate an artificial workload to see if that forces the CPU to keep a high frequency. Just open a tcsh and do something similar to:

> 1:
> goto 1

or on bash I think you can do something similar to:

while true; do echo test; done

Keeping some of those processes active might force the CPU frequency up, just as a diagnostic.
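A bounded variant of the busy loop might be easier to handle, since it stops on its own. This is only a sketch; the helper name busy_for is made up for illustration:

```bash
# busy_for N
# Keeps one core busy for roughly N seconds, then returns.
busy_for() {
    end=$(( $(date +%s) + $1 ))
    while [ "$(date +%s)" -lt "$end" ]; do
        :   # nothing to do; polling the clock is the load
    done
}

# Example: load four cores for 60 seconds while testing the game.
#   for i in 1 2 3 4; do busy_for 60 & done; wait
```

Starting one loop per core in the background mirrors the idea of keeping several such processes active at once.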

Does /var/log/dmesg, or some file in /proc/cpu or similar, record CPU frequency (SpeedStep) decreases?

You can always have a look at simutrans.log for strange messages, but I think it's related to your system, not to Simutrans. But I can be wrong, ofc. :)

prissi

You can try to compile simutrans with -MULTITHREAD=1. This could remove those problems.

donjuan

Quote: Can you check how many threads Simutrans creates when playing?
There is only one thread running, shared across all four CPU cores.

Quote: You can also simulate an artificial workload to see if that forces the CPU to keep a high frequency...
I've used prime95 and BOINC to simulate the highest workload on all cores, with the result that Simutrans works much better but still lags when I zoom out to the maximum (with the CPU stressed that much, some lag is normal). Furthermore, there are no entries in /var/log/dmesg. The CPU stays at its highest frequency (3.5 GHz) when prime95 or BOINC is running, but when only Simutrans is running, the CPU stays at its lowest frequency (1.6 GHz, measured with xfce4-cpu-freq-plugin).
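The frequency can also be read without a panel plugin. A small sketch, assuming the cpufreq sysfs interface is present; scaling_cur_freq reports kHz, and the helper names are illustrative:

```bash
# khz_to_mhz KHZ
# Converts a cpufreq value (kHz) to whole MHz.
khz_to_mhz() {
    echo $(( $1 / 1000 ))
}

# Prints the current frequency of every core that exposes cpufreq data.
show_frequencies() {
    for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        [ -r "$f" ] || continue          # skip if cpufreq is not available
        cpu=${f#/sys/devices/system/cpu/}
        cpu=${cpu%%/*}
        echo "$cpu: $(khz_to_mhz "$(cat "$f")") MHz"
    done
}

show_frequencies
```

Running it once with only Simutrans open and once with prime95 in the background should show the difference between the lowest and highest step.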

Quote: ...but I think it's related to your system. But I can be wrong, ofc.
Yep, who knows :P ? But I've found people with the same problem on AMD processors (Cool&Quiet) in a German Simutrans forum (www.simutrans-forum.de); the last post is from Dec. 2011, but there is no solution. And please imagine how many people use Linux, how many of them play Simutrans, how many of them use a multi-core CPU, and how many of them register on this forum and post their problems ;). I don't believe I am the only one with this problem.

Quote: You can try to compile simutrans with -MULTITHREAD=1
@prissi
Unfortunately, I am just a technician, not a programmer. I don't know how to compile Simutrans. But your reply gave me an idea: I went into the BIOS and deactivated 3 of the 4 cores (with the energy-saving features enabled). On my "new" single-core system I tested Simutrans, and it works as well as with prime95 running, maybe better. (What I haven't tested is whether the CPU stays at 1.6 GHz.)

Markohs

So it looks like the problem comes from Ubuntu shipping a single-threaded version of Simutrans: the Ubuntu kernel apparently does not consider one core being used 100% of the time enough reason not to step down the CPU. This looks more like an Ubuntu kernel problem than a Simutrans one, and it would happen with all single-threaded processes and games.

That said, Ubuntu should ship a multi-threaded version of Simutrans, I'd say with 4 threads, since all modern CPUs nowadays have a minimum of 4 cores.

Maybe there is a way to tweak the Linux kernel to use a less aggressive underclocking strategy, since Windows seems to handle this better. I'm not a Linux expert, so I have no idea how. :)

Does anybody in this forum know how to notify Debian/Ubuntu of this problem, so they ship a threaded version of Simutrans?

If you want to compile Simutrans with threads to confirm our theory, you just need to do the following (from memory, so some details may be wrong):

sudo apt-get install g++
sudo apt-get install libsdl1.2-dev
sudo apt-get install subversion
svn checkout --username anon svn://tron.homeunix.org/simutrans/simutrans/trunk
cd trunk
cp config.template config.default
--- Edit config.default
These options need to be uncommented:
BACKEND = posix
COLOUR_DEPTH = 16
OSTYPE = linux
MULTI_THREAD = 4
OPTIMISE = 1 # Add umpteen optimisation flags
SDL_CONFIG     = sdl-config
---
make -j4

This should generate a working binary, test it out.
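The config.default edit could also be scripted. The following is only a sketch: it assumes that disabled options in the template look like "#KEY = VALUE" (the exact template layout is not shown in this thread), and the helper name enable_option is hypothetical:

```bash
# enable_option FILE KEY VALUE
# Uncomments an existing "#KEY = VALUE" line (keeping any trailing comment).
# If no such line exists afterwards, appends "KEY = VALUE" to the file.
enable_option() {
    file=$1
    key=$2
    value=$3
    pattern="${key}[[:space:]]*=[[:space:]]*${value}"
    sed "s/^#[[:space:]]*\(${pattern}\)\([[:space:]].*\)\{0,1\}\$/\1\2/" "$file" > "$file.tmp" &&
        mv "$file.tmp" "$file"
    grep -q "^${pattern}" "$file" || printf '%s = %s\n' "$key" "$value" >> "$file"
}

# Example, using the options listed above:
#   enable_option config.default BACKEND posix
#   enable_option config.default COLOUR_DEPTH 16
#   enable_option config.default OSTYPE linux
#   enable_option config.default MULTI_THREAD 4
#   enable_option config.default OPTIMISE 1
#   enable_option config.default SDL_CONFIG sdl-config
```

Editing the file by hand works just as well; the script only saves a few keystrokes when rebuilding repeatedly.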

donjuan

Hi Markohs,

If I open the activity log of simutrans in Synaptic, this name appears: Ansgar Burchardt <ansgar@debian.org>.
I believe it could actually be a Debian-based problem (Ubuntu is entirely based on Debian, with a lightly modified kernel), but I am no expert.
By the way, do you know why there is no multi-threaded Linux version on SourceForge?

Quote: If you want to compile Simutrans with threads to confirm our theory, you just need to do the following (from memory, so some details may be wrong):

I will try to compile it with your "building manual".

Ters

In my experience, you need to push Simutrans hard in order for it to use significant CPU. I haven't bothered building Simutrans with multithreading, because it uses at most a third of one of my eight 2.3 GHz cores. (My map might not be as big, relatively speaking, as it was when I started it years ago.) Spreading it across multiple cores wouldn't help: it would still be the same amount of work, plus thread synchronization overhead, which should be small. Reducing the number of cores would, however, increase the proportional CPU use.

Could it be that low CPU usage throttles down other things like bus speed or graphics, without noticing that those are (possibly) more heavily used than the CPU?

Markohs

That makes sense.

The lower CPU frequency might be reducing the bus speed too, including memory and the PCI transfers that write the frame to the VGA. And Simutrans uses quite a lot of bus bandwidth, since the whole frame is redrawn.

But Ters, do you play with trees on? That usually causes lots of CPU use; with trees hidden, CPU usage is not high.

Vonjo

The weird thing is that the Windows version, running under Wine, runs fine. This makes me wonder whether something like an LLVM build of Simutrans would make a difference. I still have an LLVM build of Simutrans from a few months ago if you want to test it; multi-threading is not enabled, though.
---
It is true that the Debian version doesn't have multi-threading enabled.

Ters

Quote from: Markohs on January 24, 2013, 06:58:57 PM
But Ters, do you play with trees on? That usually causes lots of CPU use; with trees hidden, CPU usage is not high.

I had transparent trees, which shouldn't have mattered much or maybe made it slower, but I was running on the laptop monitor only. Now I hooked up my HD monitor, switched trees fully on and panned through a forest. Total CPU usage on the computer peaks at 10%, Simutrans only about 4%. It's "only" a 1024x1024 map. I'm not sure about the number of vehicles. I tried turning on pedestrians, which brought Simutrans up to 5-6%. That's still only about half of one core.

prissi

Actually, the fact that a single core helped rather indicates that Linux does something right, i.e. it uses deeper sleep states on inactive cores. Unfortunately this has the downside that those cores take longer to wake up. That means that when the update routine starts four threads, each of them will take longer to wake up. As a result, four threads will take much longer than one more or less constantly running single thread.

The nightly is built with four threads, as far as I know. Thus building a single-core binary should rather help (i.e. MULTI_THREAD = 1).

Markohs

A desktop kernel, when not running under power-saving restrictions, should minimize latency on the process run queue, so going into deeper sleep makes no sense at all, imho. I guess this is one of the reasons Windows is the preferred OS for gamers.

A desktop kernel should give high priority to interactive applications, especially games. It makes no sense to lower the frequency to save power. Servers in certain circumstances and laptops/mobile devices are another thing.

Imho this is a problem in how Ubuntu customized the kernel.

But this is just my opinion. ;)

Maybe tweaking the pthreads parameters, or somehow hinting to the system that we need low latency and high frequency, could help with this problem. I guess this is possible; if someone has any details we can have a look at it. I know it was possible in Solaris; I don't know Linux in depth.

Markohs

Found something related to this:

http://embraceubuntu.com/2005/11/04/enabling-cpu-frequency-scaling/

Comment 5 mentions "governors"; you can maybe try to set it to "performance" and see if it makes a difference. It looks like there is no way for programs to give hints to the kernel about their computing needs in Linux.

More info:

http://wiki.debian.org/HowTo/CpuFrequencyScaling
http://ubuntuforums.org/showthread.php?t=248867

It might be related to a bug cited here (even though it's from 2009; but well, Xubuntu is a fork, so maybe the bug remains there):

https://bugs.launchpad.net/ubuntu/+source/cpufreqd/+bug/344252

Ters

Quote from: Markohs on January 25, 2013, 12:27:23 AM
A desktop kernel, when not running under power-saving restrictions, should minimize latency on the process run queue, so going into deeper sleep makes no sense at all, imho. I guess this is one of the reasons Windows is the preferred OS for gamers.

A desktop kernel should give high priority to interactive applications, especially games. It makes no sense to lower the frequency to save power. Servers in certain circumstances and laptops/mobile devices are another thing.

Unless you're doing heavy processing 24/7, computers today have such a high peak performance compared to average use that power saving makes sense on desktops too. Save the planet and all that stuff. And how should a kernel know that something is a game? Simutrans doesn't call any markProcessAsGame() function. The kernel just sees a process that doesn't use all the CPU available.

What puzzles me is that the Windows version works under Wine. Is the Windows version single-threaded, or does Wine add just enough overhead to go above the throttling threshold?

Markohs

Yep, I know it's because of the CPU usage pattern. But it's doing a horrible job of recognising it.

Yeah, I'm quite sure Wine adds that extra bit that's enough for the kernel to switch GHz.

Markohs

Can you please run:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

And post the result here?

If it doesn't output "performance", do:

echo "performance" > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

Then test the game again to see if it makes any difference.

You might need to do this for cpu0, cpu1, ... all your CPUs.
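Checking every core at once could look like the following sketch. It only reads the sysfs files; the helper name show_governors and its optional base-directory argument are illustrative:

```bash
# show_governors [BASE]
# Prints the active cpufreq governor of every core found under BASE
# (default: /sys/devices/system/cpu).
show_governors() {
    base=${1:-/sys/devices/system/cpu}
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue          # skip if cpufreq is not available
        cpu=${f#"$base"/}
        cpu=${cpu%%/*}
        printf '%s: %s\n' "$cpu" "$(cat "$f")"
    done
}

show_governors
```

The governors a system actually offers are listed in scaling_available_governors in the same directory.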

EDIT: Extra reflection:

Could it be that, given Simutrans' inherent tendency not to push the CPU harder than it can handle, all the logic in karte_t::update_frame_sleep_time (which tries not to force the CPU to draw more frames than it can handle) goes in precisely the opposite direction to Ubuntu's CPU frequency scaling?

See: Simutrans tries to save CPU => kernel lowers CPU speed => Simutrans detects low FPS again and raises the frame time to save CPU => kernel lowers CPU speed again => Simutrans uses high frame times, seeing that it has very little CPU available.

Could it be that we should instruct Simutrans to push the CPU harder?

donjuan

Quote: cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Output: ondemand

The correct command to change the governor in Xubuntu and its forks (as root in Debian, without "sudo") was:
echo performance | sudo tee /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor

I've done this for all cores (cpu0-cpu3), with the result that Simutrans now runs as well as under Wine, until the next reboot.
We can bury the multithreading theory.

Okay, I even found a way to permanently change the governor to "performance" (by adding 8 lines to /etc/sysfs.conf),
but what about other people who want to try out Simutrans, and what about my electricity bill :-) ? I hope there is a clever solution that fixes this problem by adding a few lines of code to Simutrans.
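One way to limit the effect on the electricity bill might be to switch the governor only while the game is running. The following is a sketch, not a tested fix: the helper names, the CPU_BASE and SUDO variables, and the assumption that all cores share the governor read from cpu0 are illustrative.

```bash
CPU_BASE="${CPU_BASE:-/sys/devices/system/cpu}"   # overridable for dry runs
SUDO="${SUDO-sudo}"                               # set SUDO="" when already root

# set_governor NAME
# Writes NAME to the scaling_governor file of every core.
set_governor() {
    for f in "$CPU_BASE"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue
        echo "$1" | $SUDO tee "$f" > /dev/null
    done
}

# with_governor NAME COMMAND [ARGS...]
# Switches to NAME, runs COMMAND, then restores the governor cpu0 had before.
with_governor() {
    new_gov=$1
    shift
    old_gov=$(cat "$CPU_BASE/cpu0/cpufreq/scaling_governor")
    set_governor "$new_gov"
    "$@"
    status=$?
    set_governor "$old_gov"
    return "$status"
}

# Example:
#   with_governor performance ./simutrans
```

This still needs root rights for the switch, so it is a workaround for individual users and not a replacement for a fix inside Simutrans.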

I wrote that Mr. Burchardt is responsible for the repository. That's not entirely correct: there seems to be a "Debian Games Team", and he belongs to this team. The link is: http://packages.debian.org/wheezy/simutrans (Ubuntu and its forks use the Debian repositories).

Markohs

Mmm... yep, we need to do something in Simutrans so this setting is not necessary; there is certainly something not working well here.

I'm trying to fix it, but I'm afraid of touching that code because it has some relation to network mode and I might break something. The idea is to add a minimum allowed FPS to Simutrans, so even if we are not reaching, let's say, 25 fps, Simutrans keeps using CPU to reach those 25 instead of giving up on them and targeting lower FPS/higher frame times.

If Dwachs/prissi could have a look into this it would be better; it will take me time to fully understand that part of the code.

I think replacing in simintr.cc

bool increase_frame_time()
{
   if(frame_time < 255*FRAME_TIME_MULTI) {
      frame_time ++;
      return true;
   } else {
      return false;
   }
}

with something like

if(frame_time < 2*FRAME_TIME_MULTI) {

can maybe force the game to 15 fps max, but I haven't tried it and I'm probably wrong. :)

prissi

Simutrans tries to reach 25 fps. If it can achieve that, it will pause; otherwise it will not pause at all. Thus the power management will not kick in if it cannot do 25 fps.

You can replace the pause instruction with a polling loop, and Simutrans will run at 100% single-core CPU. But honestly, this is an error in Ubuntu rather than in Simutrans. The kernel should find out quickly that it has had 100% load for some time, or a lot of latency due to waking up sleeping cores, and then just sleep less.

Especially with power management, extra threads may be a problem. The only active core runs fine, but waking the other cores every 50 ms may take a lot of time (on Intel). Wine may run a lot on a second core, which quite effectively prevents very deep sleep states of the processor and thus allows a fast wakeup of the other cores for the multithreading.

Markohs

Are you sure we can't avoid this situation in simutrans code? donjuan, do you know if other games in ubuntu are affected by this?

I suspect the problem is also in simutrans code, but you know the code better than I do.

BTW, it's not so much about deep sleep on the cores as about the kernel lowering the frequency of the cores.

Isn't this part of the code lowering the CPU requirements of Simutrans? I think it might be causing the problem:


void karte_t::update_frame_sleep_time(long /*delta*/)
...
        // way too slow => try to increase time ...
        if(  last_ms-last_interaction > 100  ) {
            if(  last_ms-last_interaction > 500  ) {
                set_frame_time( 1+get_frame_time() );
            }
            else {
                increase_frame_time();
                increase_frame_time();
                increase_frame_time();
                increase_frame_time();
            }
        }

donjuan

Quote: ...do you know if other games in ubuntu are affected by this?

Yes, I checked. I've tried different games which are similar to Simutrans (map-based and not GPU-accelerated), for example LinCity-NG, Micropolis, NetPanzer, OpenTTD and Unknown Horizons.

Every game runs smoothly on my machine (with the ondemand governor, of course). Only Simutrans is in trouble.

Ters

Quote from: Markohs on January 25, 2013, 01:43:52 PM
Are you sure we can't avoid this situation in simutrans code?

Simutrans could start a thread that just does while(true) {++i;}, but that seems like a terrible hack.

Does anyone know if those other games are threaded?

Markohs

The version Ubuntu ships is not threaded, so that's not the problem, according to what donjuan said.

Yeah, that's a horrible hack; it makes no sense to include that kind of code. There has to be a problem somewhere.

sdog

I thought that in Ubuntu the cpufreq daemon queries ACPI to check whether a power supply is connected and, if so, sets the governor to performance mode. There used to be a problem that this daemon overrode conservative user settings by doing this, setting the system to performance mode since a desktop is always connected to a power supply. I remember I stumbled upon something like that last autumn.

Having ondemand as the setting might mean either that they fixed the above behaviour and broke something else, or that there is some problem with ACPI not detecting that it is powered.

prissi

OpenTTD is single-threaded (except for autosave), but revisits 1/7 of all tiles during a frame update. Due to the calculations possible with GRFs and the way data is stored, it is even more CPU-intensive with large screens; but it runs with a fixed frame rate of 15 (if memory serves me right) and really enforces it. Thus it might seem more responsive.

Markohs

What CPU usage % does Simutrans show when running in ondemand mode, donjuan?

If it's not 100%, it's not just a problem in ubuntu, imho.

donjuan

Quote: What's the CPU usage % simutrans gives when running in ondemand mode, donjuan?

Low. The average usage of each core is 7-14%.

Markohs

donjuan, do you mind trying a fix?

You have to open simintr.cc, and in line 67 change the line:


   frame_time = clamp( time, 10, 250)*FRAME_TIME_MULTI;


To:



   frame_time = clamp( time, 10, 40)*FRAME_TIME_MULTI;



And recompile.

Test it in ondemand mode, please.
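To see what the clamp bounds would mean in frames per second, here is a small arithmetic sketch. It assumes the clamped value is a frame time in milliseconds, which this thread does not state explicitly; the helper name is illustrative:

```bash
# fps_for_frame_time MS
# Converts a frame time in milliseconds to whole frames per second
# (integer division, so the result is rounded down).
fps_for_frame_time() {
    echo $(( 1000 / $1 ))
}

for ms in 10 40 250; do
    echo "$ms ms per frame -> $(fps_for_frame_time "$ms") fps"
done
```

Under that assumption, lowering the upper bound from 250 to 40 would raise the slowest allowed rate from 4 fps to 25 fps, which matches the 25 fps target mentioned earlier.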

donjuan

Hi Markohs,

Sure. I've changed the code and tried to compile it, but unfortunately I got a "fatal error" message. It seems a header file is missing.

bzlib.h: Can't find file or directory
compling aborted.
make: *** [build/default/dataobj/loadsave.o] Error 1


What can I do?

sdog

It's related to the external library for bzip2, which is used to compress savegames. In Ubuntu (you're using it, aren't you?) it's easily fixed. Just get the package, either with the Software Centre or directly in the shell with:

sudo apt-get install libbz2-dev

(I'm not 100% sure it's really in that package; you might want to try other libbz2 or bzip2 packages if the above doesn't work.)

prissi

The change Markohs suggested is just a very ugly hack. If you do this clamp, do it at least between 20 and 100. Network games often run at 15 fps. However, having seen in the other thread that the CPU step-down happens with idle=0ms, I highly doubt that this will help.

donjuan

The code has been compiled successfully, thanks to sdog for the advice to install libbz2-dev.

I don't know what has happened (I had uninstalled Wine for a while), but I can't start Simutrans any longer. Not only the self-compiled Simutrans, but even the files from SourceForge no longer start. Only the Simutrans from the repository and the 64-bit nightly build start. Of course, I marked all "simutrans" files as executable.

When I try to start it in the terminal I get this message:
user@P67A-UD3-B3:~/simutrans$ ./simutrans
./simutrans: error while loading shared libraries: libbz2.so.1.0: cannot open shared object file: No such file or directory


I tried to find this file with "find / libbz2.so.1.0", but there was no match. Synaptic shows that the package libbz2-1.0 is installed.
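A note on the search itself: find treats every argument before the first option as a directory to search, so "find / libbz2.so.1.0" looks for a starting path called libbz2.so.1.0 instead of matching file names. A sketch of a name-based search follows; the helper name find_lib is illustrative:

```bash
# find_lib NAME [DIR]
# Lists files below DIR (default: /) whose name starts with NAME.
# Permission errors are hidden to keep the output readable.
find_lib() {
    find "${2:-/}" -name "$1*" 2>/dev/null
}

# Examples:
#   find_lib libbz2.so.1.0
#   ldd ./simutrans | grep 'not found'   # shared libraries the binary cannot resolve
#   file ./simutrans                     # shows whether the binary is 32-bit or 64-bit
```

One possibility, which this thread does not confirm, is that the binaries that fail are 32-bit builds and only the 64-bit variant of the library is installed; the output of "file" and "ldd" would help to check that.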