Author Topic: Desynch  (Read 1343 times)


Offline zacekjakub

Desynch
« on: July 24, 2016, 05:21:12 PM »
Hello all,

we played Simutrans some time ago without trouble, but when we tried it last time, multiplayer worked fine for a couple of minutes and then got desynched on the client side. Both the server and the client run Ubuntu. We tried about 10-15 versions of the game, even the latest nightly version. The game becomes unplayable. For example, if I place something on the server, the client sees it about 2 minutes later. Any advice, please?

Thank you.

Jakub

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 8561
  • Total likes: 253
  • Helpful: 226
  • Languages: De,EN,JP
Re: Desynch
« Reply #1 on: July 24, 2016, 07:39:48 PM »
The Ubuntu repository version and a self-compiled/downloaded server version might be incompatible (there was an issue with the SHA implementation).

Did you get your Simutrans from the exact same place?

Offline zacekjakub

Re: Desynch
« Reply #2 on: July 24, 2016, 11:10:24 PM »
Yes, it was the same compiled version every time. We also found out that this happens with the version from Steam, which has to be the same every time...

Offline DrSuperGood

Re: Desynch
« Reply #3 on: July 25, 2016, 04:21:53 AM »
The Simutrans network synchronization protocol is highly unreliable. If the latency between the server and its clients is unpredictable, e.g. as a result of using wireless internet or network congestion, then there is a very good chance the clients will fall out of sync with the server. The same can happen if the server suffers from a lack of resources so has to slow down.

The reason is that clients can advance past the time the server schedules a command for before the command is received. Since the time at which the command had to be executed has passed, the game goes out of sync. The server always executes the command at the scheduled time, as it has no latency, and ultimately decides which clients are in sync.

The solution has been mentioned many times: add safety frames, which are sent multiple times per second, and prevent the client from advancing past the time scheduled by a safety frame. Commands get scheduled after the last received safety frame and before the next one, in a safe time that cannot yet be reached. The result is that the client stops execution upon reaching the last safety frame, a point up to which no more commands could be received, and waits for more commands and the next safety frame before resuming execution at full speed. I believe such a system is used by RTS games such as Warcraft III and StarCraft II, both of which have very robust multiplayer.
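
To make the safety-frame idea concrete, here is a minimal Python sketch. It is not Simutrans code: the class `LockstepClient` and its methods `on_command`, `on_safety_frame` and `tick` are hypothetical names chosen for illustration, and the network is replaced by direct method calls. It only shows the rule that a client never simulates past the last safety frame it has received, so a late command still lands on a frame that has not run yet.

```python
from collections import defaultdict


class LockstepClient:
    """Toy client that never simulates past the last safety frame it received."""

    def __init__(self):
        self.frame = 0                      # next frame to simulate
        self.safe_until = 0                 # frames below this may be simulated
        self.commands = defaultdict(list)   # frame -> commands scheduled for it
        self.executed = []                  # (frame, command) in execution order

    def on_command(self, frame, command):
        # Assumption: the server schedules commands only at or after the last
        # safety frame it sent, so this check should never fail.
        if frame < self.frame:
            raise RuntimeError("desync: command for a frame already simulated")
        self.commands[frame].append(command)

    def on_safety_frame(self, frame):
        # The server promises that no more commands arrive for frames < frame.
        self.safe_until = max(self.safe_until, frame)

    def tick(self):
        """Advance one frame if allowed; return False when the client must wait."""
        if self.frame >= self.safe_until:
            return False
        for command in self.commands.pop(self.frame, []):
            self.executed.append((self.frame, command))
        self.frame += 1
        return True


client = LockstepClient()
client.on_safety_frame(3)            # frames 0..2 may be simulated
while client.tick():
    pass
stalled_at = client.frame            # the client waits here instead of running ahead

client.on_command(4, "build road")   # arrives late, but frame 4 has not run yet
client.on_safety_frame(6)
while client.tick():
    pass
print(stalled_at, client.frame, client.executed)
```

The cost of this design is that a client on a bad connection stalls visibly while it waits for the next safety frame, rather than silently running ahead and being dropped later.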

Until then you can try forcing lag frames on the server. Try setting it to something large like 30 to see if there is any difference. Judging by how the client is running 2 minutes behind, it sounds like that is probably not going to be enough. For 2 minute latency to occur, either the server and client are not running at the same speed (one is resource bound) or the connection between the computers is extremely unreliable (near 100% packet loss).

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 4451
  • Total likes: 141
  • Helpful: 105
  • Languages: EN, NO
Re: Desynch
« Reply #4 on: July 25, 2016, 08:31:38 AM »
With 2 minute latency, pretty much any game beyond chess or slow-paced card games will have serious trouble. That is so slow one could suspect there is a link using IP over Avian Carriers.

Offline zacekjakub

Re: Desynch
« Reply #5 on: July 25, 2016, 10:23:37 AM »
Hello,

I can try to modify the safety frames count. I think I tried it some time ago, but I am not sure. The computer is not the issue, it is a **** strong machine with SSD RAID and so on. :) The internet connection is not the problem either: it is a 15/15 Mbps line, not perfect every time, and sometimes I get some loss, but it is acceptable for any other multiplayer game like Dota2 and others.

Thank you, I will let you know.

Jakub

Offline DrSuperGood

Re: Desynch
« Reply #6 on: July 25, 2016, 01:04:58 PM »
Quote
15/15 Mbps line
Such lines are easily saturated, especially if someone is streaming or file sharing on it.

Offline Ters

Re: Desynch
« Reply #7 on: July 25, 2016, 07:26:45 PM »
Quote
The computer is not the issue, it is a **** strong machine with SSD RAID and so on.

The hard drive(s) have nothing to do with it (unless you are so low on RAM that the OS needs to swap pages all the time). Simutrans should mostly be bound by CPU, bus and RAM speeds. Its ability to utilize multiple cores is limited, and its ability to use the GPU is non-existent (except possibly for converting 16-bit graphics to whatever the monitor is fed). There have been several reports of performance issues with Simutrans on machines that run the most modern games just fine. I believe this is because Simutrans operates on old principles that don't scale as well on modern machines as modern games built on modern principles do.

Quote
Such lines are easily saturated, especially if someone is streaming or file sharing on it.

That makes me wonder what kind of home lines aren't.

Offline TurfIt

Re: Desynch
« Reply #8 on: July 25, 2016, 07:59:28 PM »
Quote
I can try to modify the safety frames count. I think I tried it some time ago, but I am not sure. The computer is not the issue, it is a **** strong machine with SSD RAID and so on. :) The internet connection is not the problem either: it is a 15/15 Mbps line, not perfect every time, and sometimes I get some loss, but it is acceptable for any other multiplayer game like Dota2 and others.

The safety frames count (additional_client_frames_behind) is more to help if you're getting an actual desync disconnection, but it sounds like your problem is simply the client running behind. There was a bug in older versions that would cause this, mostly from the server not fully keeping up, but it should be fixed in 120.1.3 (and maybe a couple of versions before - can't remember...)

A **** strong machine might not be quite so **** strong when it comes to Simutrans, as it tends to stress different things than modern games and doesn't set any artificial limits, letting you happily shoot yourself in the foot. E.g. trying to run full resolution 5K with pak64 fully zoomed out on a laptop - not happening. Or it might simply be an incompatibility with your system - there have been a couple of reports of issues on various Linux distributions in the past.

One slowdown has been with the SDL backend and compositing window managers. Mint and Compiz have been implicated there; likely others too. The fix is to use SDL2 instead - same as OS X's indigestion with SDL1.

I also remember some people having issues with their CPUs remaining in low power mode when running Simutrans - i.e. not clocking up. I think Ubuntu was the one here. So check that's not occurring.

Finally, the online games were usually configured to run at a lower framerate - the server controls the speed of all clients. If a client can't actually maintain that framerate, it will endlessly fall behind. 10 is generally sufficient for the game itself, but the GUI gets rather annoying - 15 improves on that while still letting slower running clients join.
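
To put rough numbers on "endlessly fall behind", here is a back-of-the-envelope Python sketch. It assumes a deliberately simple model that is not taken from the Simutrans source: the server advances `server_fps` frames per real second, the client manages only `client_fps`, and each frame represents 1/`server_fps` seconds of game time. The function name `lag_after` is made up for this example.

```python
def lag_after(seconds, server_fps, client_fps):
    """Seconds of game time the client is behind after `seconds` of real time.

    Simplified model: the client never catches up, and a client that is fast
    enough simply stays level with the server.
    """
    if client_fps >= server_fps:
        return 0.0
    frames_behind = (server_fps - client_fps) * seconds
    return frames_behind / server_fps


# A client managing 20 fps against a 25 fps server, over one minute and ten minutes.
print(lag_after(60, server_fps=25, client_fps=20))
print(lag_after(600, server_fps=25, client_fps=20))
```

In this model the lag grows linearly with play time, which is why lowering the server framerate to something every client can sustain matters more than any fixed safety margin.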

Offline zacekjakub

Re: Desynch
« Reply #9 on: July 27, 2016, 08:57:13 AM »

I said it is a strong computer; the SSD RAID was just an example, of course. 24 GB RAM, triple channel, i7, 3.6 GHz, 4 cores with HT. I really don't think the computer is the issue... :) 15/15 Mbps is enough, and I can take care of the load using strict shaping and other methods...

Jakub


Thank you,

we are going to try SDL2. I will let you know.

Jakub


« Last Edit: July 27, 2016, 06:17:09 PM by Isaac.Eiland-Hall »

Offline Ters

Re: Desynch
« Reply #10 on: July 27, 2016, 10:53:19 AM »
Quote
I said it is a strong computer; the SSD RAID was just an example, of course. 24 GB RAM, triple channel, i7, 3.6 GHz, 4 cores with HT. I really don't think the computer is the issue...

My point is that it was a very bad example. I also have a machine with SSD RAID, but that machine only has two cores at 2 GHz and 2 GB RAM. And the RAID probably slows it down more than it speeds it up. There are different types of RAID, remember. And I suspect RAID doesn't have quite the same potential to speed up SSDs as it does for HDDs.

Yet it runs Simutrans just fine, and has done so for ten years. (Not in HD, but my 2.3GHz laptop can. On a single core, my custom builds being single threaded. Although being an i7, the OS and other stuff can use the other cores. 4K is likely out of the question on my laptop, and probably a significant number of machines.)

The machine on its own may not be the issue, in a sense, nor is Simutrans. For all its eccentricities, it has worked well for many years for many users. (And most of the eccentricities weren't eccentric way back then.) But the combination might be, which is where SDL2 might come into play. Either that, or it is the network. What else can there be?

Offline zacekjakub

Re: Desynch
« Reply #11 on: July 27, 2016, 12:18:23 PM »
Hi,

just an idea: when the client falls behind, is it logged? If I take the logs from the time this happens, can we tell what exactly the problem was?

Thanks.

Jakub

Offline TurfIt

Re: Desynch
« Reply #12 on: July 27, 2016, 07:59:42 PM »
If you're running a debug build with the -debug 3 parameter, the following will be printed to the console (and to simu.log if also using -log):
Code:
Message: network_command_t::rdwr:       read packet_id=9, client_id=0
Warning: network_check_activity():      received cmd id=9 nwc_check_t from socket[996]
Warning: NWC_CHECK:     time difference to server 0
Message: network_world_command_t::execute:      do_command 9 at sync_step 2305 world now at 2300
Warning: karte_t:::do_network_world_command:    sync_step=2304  server=[rand=1655761279 halt=1 line=1 cnvy=1] client=[rand=1655761279 halt=1 line=1 cnvy=1]
A negative time difference means the client is running ahead of the server and risks a disconnection. It should be around zero.
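
The do_network_world_command line in that output prints a set of values for the server and for the client. A small Python sketch like the following can pick those lines out of a log and report which values differ. It assumes, without confirmation from this thread, that the bracketed rand/halt/line/cnvy values are state checks that should match on both sides; the function names `parse_fields` and `compare_checksums` are made up for this example.

```python
import re

# Matches the sync_step and the two bracketed groups in the log line.
CHECK_RE = re.compile(
    r"sync_step=(?P<step>\d+)\s+"
    r"server=\[(?P<server>[^\]]*)\]\s+"
    r"client=\[(?P<client>[^\]]*)\]"
)


def parse_fields(text):
    """Turn 'rand=1 halt=2' into {'rand': 1, 'halt': 2}."""
    return {key: int(value) for key, value in (item.split("=") for item in text.split())}


def compare_checksums(line):
    """Return (sync_step, names of differing fields), or None for unrelated lines."""
    match = CHECK_RE.search(line)
    if match is None:
        return None
    server = parse_fields(match["server"])
    client = parse_fields(match["client"])
    names = server.keys() | client.keys()
    mismatched = sorted(name for name in names if server.get(name) != client.get(name))
    return int(match["step"]), mismatched


line = (
    "Warning: karte_t:::do_network_world_command:    sync_step=2304  "
    "server=[rand=1655761279 halt=1 line=1 cnvy=1] "
    "client=[rand=1655761279 halt=1 line=1 cnvy=1]"
)
print(compare_checksums(line))
```

An empty list of differing fields would then suggest the two sides agreed at that sync step, so the client is merely late rather than out of sync.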

By purposely overloading my client, I get:
Code:
Warning: NWC_CHECK:     time difference to server 0
Warning: NWC_CHECK:     time difference to server 325
Warning: NWC_CHECK:     time difference to server 2100
Warning: NWC_CHECK:     time difference to server 3875
Warning: NWC_CHECK:     time difference to server 5575
Warning: NWC_CHECK:     time difference to server 7325
etc.
That was by forcing 40 fps on the server, and then zooming out all the way with pak64 at 2560x1600 resolution - I can only sustain 30 fps like that, so I rapidly fall behind. Zoom back in one step, and the client runs at 90 fps to catch back up. In terms of a 'strong' computer - that's on an i7-6700K overclocked to 4.6 GHz with 32 GB DDR4-3200. (And Simutrans loves fast RAM, unlike 99% of the programs out there - performance scales almost linearly with RAM clock [within reason].)
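
For anyone who wants to check their own log, here is a short Python sketch that extracts the NWC_CHECK values and says whether the client is falling behind or catching up. It relies only on the line format shown above and on the sign convention that a negative difference means the client is ahead; the function names `time_differences` and `trend` are made up, and the unit of the values is not stated in this thread, so the sketch treats them as plain numbers.

```python
import re

NWC_RE = re.compile(r"NWC_CHECK:\s+time difference to server (-?\d+)")


def time_differences(log_text):
    """Return every NWC_CHECK time difference found in the log, in order."""
    return [int(match.group(1)) for match in NWC_RE.finditer(log_text)]


def trend(diffs):
    """Describe how the difference changed between the first and last sample."""
    if len(diffs) < 2:
        return "unknown"
    change = diffs[-1] - diffs[0]
    if change > 0:
        return "falling behind"
    if change < 0:
        return "catching up"
    return "steady"


log = """\
Warning: NWC_CHECK:     time difference to server 0
Warning: NWC_CHECK:     time difference to server 325
Warning: NWC_CHECK:     time difference to server 2100
Warning: NWC_CHECK:     time difference to server 3875
Warning: NWC_CHECK:     time difference to server 5575
Warning: NWC_CHECK:     time difference to server 7325
"""

diffs = time_differences(log)
steps = [later - earlier for earlier, later in zip(diffs, diffs[1:])]
print(diffs)
print(steps)
print(trend(diffs))
```

Roughly constant steps between consecutive checks point at a client that is steadily too slow, whereas erratic jumps would fit a network problem better.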

You can run the graphical client as a server too - can it maintain the framerate specified? (25 is default) Note the frame timing between online and offline modes is different. Offline framerate can vary, online it's fixed so clients must be fast enough to handle.
What resolution are you trying to drive? Fullscreen or windowed? At least on Windows, running fullscreen is a good 25-30% performance hit. Neither Nvidia nor ATI produce drivers that work well with 16-bit fullscreen programs. I don't know the Linux situation with 16-bit support.


Offline Ters

Re: Desynch
« Reply #13 on: July 28, 2016, 01:00:14 AM »
Quote
(And Simutrans loves fast RAM, unlike 99% of the programs out there - performance scales almost linearly with RAM clock [within reason].)
So you got some measurement on that? I think it follows naturally from the way Simutrans works, but I've never had any actual measurements to back up my beliefs.

Quote
I don't know the Linux situation with 16-bit support.
My NVidia powered Linux box started out a bit slow, then got better with newer drivers, but that was over five years ago now, so that's probably not very useful information. The newest drivers don't support cards that old anymore, although they did release an updated version of the old driver around the time Linux 4.0 came out, since the new kernel changed an API the driver was using. I don't think I ever ran Simutrans as anything but windowed on that machine, although that window was maximized.

Linux can be more forgiving for old stuff at times. Linux + Wine is as far as I know the only way to run 16-bit and 64-bit Windows applications side by side. Although it now requires a kernel switch, due to a security issue when switching between 16-bit and 32-bit stacks on the x86 architecture. When using the drivers that are part of the Linux kernel, one might find that someone has taken the time to write a patch with optimizations for 16-bit because they run a lot of 16-bit stuff themselves.

Offline zacekjakub

Re: Desynch
« Reply #14 on: July 31, 2016, 12:52:08 AM »
Hello,

so the logs we got from the client side are here. But unfortunately, I don't think it is going to give you a hint what exactly caused the issue. We changed both configs and forced the fps to be 15, nothing changed.

Warning: NWC_CHECK:    time difference to server 13600
Message: network_world_command_t::execute:    do_command 9 at sync_step 77697 world now at 77347
Warning: karte_t:::do_network_world_command:    sync_step=77440  server=[rand=450850961 halt=27 line=6 cnvy=14] client=[rand=450850961 halt=27 line=6 cnvy=14]
Message: network_command_t::rdwr:    read packet_id=9, client_id=0
Warning: network_check_activity():    received cmd id=9 nwc_check_t from socket[11]
Warning: NWC_CHECK:    time difference to server 12040
Message: network_world_command_t::execute:    do_command 9 at sync_step 77825 world now at 77514
Warning: karte_t:::do_network_world_command:    sync_step=77568  server=[rand=3717493219 halt=27 line=6 cnvy=14] client=[rand=3717493219 halt=27 line=6 cnvy=14]
Message: network_command_t::rdwr:    read packet_id=9, client_id=0
Warning: network_check_activity():    received cmd id=9 nwc_check_t from socket[11]

...

Warning: NWC_CHECK:    time difference to server 3960
Message: network_world_command_t::execute:    do_command 9 at sync_step 78465 world now at 78356
Message: wegbauer_t::route_fuer():    setting way type to 1, besch=city_road, bruecke_besch=NULL, tunnel_besch=NULL
Message: tool_build_way_t():    builder found route with 2 squares length.
Message: wegbauer_t::calc_costs():    construction estimate: 30.000000
Message: interaction_t::interactive_event(event_t &ev):    calling a tool
Message: network_command_t::rdwr:    write packet_id=8, client_id=2
Warning: nwc_tool_t::rdwr:    rdwr id=8 client=0 plnr=2 pos=42,235,0 tool_id=4110 defpar=city_road init=0 flags=0
Message: packet_t::send:    sent 70 bytes to socket[11]; id=8, size=70

Thanks,

Jakub