Odd network desync issue - testing assistance requested

Started by jamespetts, October 01, 2017, 12:20:58 PM

jamespetts

Yesterday, I discovered a problem where a fix to random number generation had stopped 64-bit Linux and 32-bit Windows builds of Simutrans-Extended from staying in sync with one another in a networked game. I pushed a fix for this, and confirmed that it worked by connecting my Windows computer to my Linux computer in a network game over my home network, using the same saved game as on the Bridgewater-Brunel server; this ran stably for many hours.

However, the Bridgewater-Brunel server itself cannot now maintain a stable connexion with either a Windows or Linux client. Yesterday, before implementing these changes, I was able to connect to the Bridgewater-Brunel server with my Linux client and remain connected stably to it (but the Windows client would disconnect instantly). Now, I can connect the Windows and Linux machines to each other over my local network stably, but neither will connect to the remote server stably. The same issue appears to occur with the British sandbox server. I explicitly checked yesterday whether the problem was with the command line build by running the command line build on my local Linux machine and connecting the Windows machine to that, but that, too, ran stably.

I cannot think of any way of narrowing down what the difference might be between the Linux machine on my local network and both the Bridgewater-Brunel server and the British sandbox server, given that they are all running an identical executable, pakset and saved game on the same platform (64-bit Linux).

I should be grateful if anyone could have a go at designing and running some tests to try to find out the circumstances in which clients will and will not stay in sync with one another over a network in the latest builds on the master branch: the more data that I have, the better chance that I will have of being able to find and fix the problem.

Thank you in advance for any help.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

jamespetts

#1
I have just tested, and I can only reproduce this with Pak128.Britain-Ex, not Pak128.Sweden-Ex, although the reason for this is far from clear.

Edit: Further testing suggests that this problem is related to simuconf.tab settings: copying the simuconf.tab from the server to my local (Linux) machine and running a client using the server's settings results in it remaining connected stably, but using the default client settings results in it disconnecting in a very short time, even with a relatively simple map.

Edit 2: Connecting a (Linux) client to a remote server (either the Bridgewater-Brunel server or the sandbox server) desyncs regardless of whether the client was started using the server's version of simuconf.tab.

Edit 3: Further testing suggests that, with the server simuconf.tab, it will desync when connecting to a local server on some occasions, but remain connected stably on others, with the relatively simple map.

Edit 4: With a local client/server using the server's simuconf.tab file and the same saved game from the Bridgewater-Brunel server, the client seems to be able to stay in sync without difficulty.

Edit 5: For some reason, I am now unable to reproduce the desyncing on the local server at all, but it still desyncs in all configurations when connecting to any of the live internet servers.

Edit 6: I have managed to reproduce the desync locally in some very specific circumstances: the first attempt to connect after changing the simuconf.tab file, with a mismatched simuconf.tab file between client and server, will desync, but subsequent attempts with the same mismatch, even after quitting and restarting the client, do not seem to desync.

Edit 7: I realised that I had failed to change the simuconf.tab file when testing above, so the first connexion attempt resulted in a desync even with the same simuconf.tab file. However, the same behaviour seems to occur when I actually do change the simuconf.tab file: the first connexion fails, and then every subsequent connexion, even after quitting and restarting, works.
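For anyone designing tests around mismatched settings, a small script can list exactly which simuconf.tab entries differ between server and client before connecting. The following is only a rough sketch: it assumes the files consist of `key = value` lines with `#` comments, and the function names and the keys in the example are placeholders rather than real Simutrans-Extended settings or tools.

```python
def parse_tab(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments.

    Assumption: everything after a '#' is a comment, so a value that
    itself contains '#' would be truncated by this sketch.
    """
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def diff_settings(server, client):
    """Return {key: (server_value, client_value)} for every key that differs.

    A key missing on one side is reported with None for that side.
    """
    keys = sorted(set(server) | set(client))
    return {
        k: (server.get(k), client.get(k))
        for k in keys
        if server.get(k) != client.get(k)
    }


# Placeholder contents; in practice, read the two files from disk.
server_cfg = parse_tab("example_a = 1\nexample_b = 2  # server value\n")
client_cfg = parse_tab("example_a = 1\nexample_b = 3\n")
for key, (srv, cli) in diff_settings(server_cfg, client_cfg).items():
    print(f"{key}: server={srv} client={cli}")
```

Recording this list alongside each test result would make it easier to tell whether a given desync correlates with any particular setting.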

jamespetts

#2
Further testing shows that the system for assigning the random number seed for the multi-threaded passenger and mail generation does not appear to be implicated in this: fixing the seed at 100 for all threads results in identical behaviour (immediate desync on connexion to the server, with all random number seeds out of sync in the very first check) to that seen when it is set correctly.

Edit 1: Disabling the multi-threaded path explorer in network mode did not make any difference.

Edit 2: I think that I have finally managed to find and fix this problem. What appears to have happened is that, for some reason, the text files on the server had not been kept up to date with the current version (I have corrected this now). When I added town name prefixes and suffixes, the random number generator was used to prevent every town name being able to have every possible prefix and suffix combination in a single game; but, if the language files were different on client and server, this could result in the random number states being different, causing a desync. I have now fixed this problem by using a non-synchronous random number generator (sim_async_rand) for the town names, which will not affect the state of a network game.
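The mechanism described above can be illustrated with a toy model. This is not the Simutrans-Extended code (which is C++), and the `Peer` class, `pick_prefixes` and `in_sync` are invented for illustration; the only assumption is that the number of random draws depends on how many prefixes the local text files contain.

```python
import random


class Peer:
    """Toy model of one participant in a lock-step network game."""

    def __init__(self, seed, prefixes):
        self.sync_rng = random.Random(seed)  # state must match on every peer
        self.async_rng = random.Random()     # local only, never compared
        self.prefixes = prefixes             # loaded from local text files

    def pick_prefixes(self, rng):
        # One draw per prefix, so the number of draws depends on the
        # length of the local prefix list.
        return [p for p in self.prefixes if rng.random() < 0.5]


def in_sync(a, b):
    """Compare the synchronised generator states of two peers."""
    return a.sync_rng.getstate() == b.sync_rng.getstate()


# Server and client share a seed, but the client's text files are newer
# and contain one extra prefix.
server = Peer(42, ["Great", "Little", "New"])
client = Peer(42, ["Great", "Little", "New", "Old"])

# Using the synchronised generator: the client draws once more than the
# server, so the states diverge.
server.pick_prefixes(server.sync_rng)
client.pick_prefixes(client.sync_rng)
print(in_sync(server, client))  # False

# Using the local generator instead: the synchronised state is untouched.
server = Peer(42, ["Great", "Little", "New"])
client = Peer(42, ["Great", "Little", "New", "Old"])
server.pick_prefixes(server.async_rng)
client.pick_prefixes(client.async_rng)
print(in_sync(server, client))  # True
```

The trade-off is that results drawn from the local generator can differ between peers, which is only acceptable for data that does not feed back into the simulated game state.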

The Bridgewater-Brunel server has been restarted with this fix - you will need to download the latest binaries in order to connect to it.

Junna

I get a desynch with the (current) build if I connect to my own server after a few minutes, and my friend gets desynched (since a number of versions ago) immediately upon building a stop (but not ways). I suppose this latter issue may have been related to the text thing, though, as I have not tested this yet on a newer build.

jamespetts

Thank you for the report. I cannot reproduce the immediate desync on stop building on the Bridgewater-Brunel server. Can I ask whether you are able to stay connected to the Bridgewater-Brunel server?

Junna

It seems I can stay connected there.

There's an awful lot of trains getting stuck wherever there's non-signal signalling in use though.

jamespetts

For your own server, can you check that the text files are the same as on the client in case my fix did not work properly?
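One way to make this check is to hash every file in the text directory on both machines and compare the results. The sketch below is only a suggestion: the function names are made up, and the directory paths in the usage comment are placeholders that need to be replaced with wherever the text files actually live on each installation.

```python
import hashlib
from pathlib import Path


def digest_tree(root):
    """Map each file's path (relative to root) to the SHA-256 of its bytes."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def mismatched(server_digests, client_digests):
    """List files that differ, or that exist on only one side."""
    names = set(server_digests) | set(client_digests)
    return sorted(
        n for n in names if server_digests.get(n) != client_digests.get(n)
    )


# Usage (placeholder paths):
#   print(mismatched(digest_tree("server/text"), digest_tree("client/text")))
```

If the two directories are on different machines, `digest_tree` can be run on each and the resulting dictionaries saved and compared afterwards.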

Junna

Well, for my friend connecting, the text files were probably not the same, but for my own, surely that would not be an issue? The desynchs were different in nature, however.

jamespetts

By your own server, do you mean that you are running a server on your own computer in your home, the same one on which you are playing?

However, I see the point about the desyncs being different in nature. It is possible that the desyncs that you are getting are entirely unrelated and arise, for example, because the server goes more slowly than the client, resulting in a timing desync.

I am afraid that it is impossible for me to test for what is causing your desync without a way of being able to reproduce it reliably myself.