Desync issue (devel-new-2) with Linux Server/Windows client

Started by Ves, October 22, 2016, 09:03:44 PM

Ves

The executable for 8a59a0d52e501ab26c4badbba2f45f196584856a is now on devel-new, but it desynced on the server before I could finish this post.

jamespetts

#71
I am having trouble reverting to Visual Studio 2012, as it does not support the thread_local keyword that I have used in implementing multi-threading of the private car route finder.

Does anyone knowledgeable about these matters have any idea what might be the cause of a desync specific to Visual Studio 2015? This seems to be a very difficult sort of problem. Has anyone tried compiling the latest Standard build in Visual Studio 2015 and seeing whether that stays in sync with the same Standard build in Linux?

Edit: Also, is anyone able to compile in MinGW to see whether this will stay in sync with a Linux client?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Vladki

So, the server is restarted with the latest version. The Linux-Linux connection is fine.

I tried to connect to bridgewater-brunel.me.uk, and got almost instant desync or even crash:
*** Error in `/home/vladki/simutrans/simutrans-experimental': double free or corruption (!prev): 0x0000000018f40b10 ***

Perhaps just because they run different versions?

jamespetts

Perhaps - I have pushed a further (minor) change to devel-new-2 and am currently rebuilding the Bridgewater-Brunel server with this newer version now. The newer version will almost certainly not sync with the immediately previous version.

If anyone is able to test a Windows build made with MinGW to see whether that stays in sync, that would be very helpful, as it would help to narrow down why Visual Studio 2015 builds are not staying in sync.

Ves

As you probably remember, I only have Visual Studio 2015 on Windows 10, so I can't assist with that. However, the newest builds (made with Visual Studio 2015) are on devel-new for testing.

jamespetts

#75
I am slightly confused by what you mean by the last part of the post; when you refer to the newest builds using Visual Studio 2015 being on devel-new, do you mean that you have builds from the (old) devel-new branch made with Visual Studio 2015, as opposed to the Visual Studio 2012 with which they would have been built when the commits of which they are builds were current?

Edit: Also, Vladki, would you be able to revert the server to 69ff5d7d2d1bface984c5c0546bc4004e90b63c4? Ters has built a Windows binary from that version using MinGW, and it would be instructive to test whether this desyncs with a Linux build or not. Thank you.

Ves

I'm sorry if I caused any confusion! I only meant that I had compiled commit 9a7696b.

Vladki


jamespetts

#78
I have not had time to look at this recently, and am now visiting my parents for Christmas, so will not have the opportunity to look properly into this for a few weeks. However, what I do notice while I am here (with a Linux desktop computer and no Windows machine) is that I cannot stay in sync between my own Linux machine and either the Bridgewater-Brunel server or server.exp.simutrans.com, which also both run on Linux with the current latest commit of devel-new-2.

Does anyone else notice this, or can others connect properly?

The last testing for desyncs that I did was a few weeks ago when fixing the multi-threaded passenger generation, testing with a Windows client connecting to a Windows server over the loopback interface, which appeared at the time to work correctly. I do not think that any of the changes made since then will affect network synchronisation without any interaction, yet I disconnect almost instantly from the Bridgewater-Brunel server, and I not only disconnect but sometimes crash when trying to connect to server.exp.simutrans.com.

This issue is likely to require lengthy investigation in the new year. However, it would greatly reduce the amount of time that I spend on this (and therefore increase the amount of time that I am able to spend on other things for Simutrans) if anyone could run tests to see which is the last Github commit in which a Linux client can connect to a Linux server without desyncing.

I should be most grateful if anyone could have a go at this test to assist me greatly in advance of the possibly gargantuan task of trying to fix this problem in the new year.
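The search for the last commit that stays in sync can be done by bisection rather than by testing every commit in turn. The following is only a minimal sketch: the commit labels and the `stays_in_sync` check are hypothetical stand-ins for a manual test (build that commit, connect a Linux client to a Linux server, and note whether it desyncs), and it assumes that once a commit desyncs, every later commit desyncs too.

```python
from typing import Callable, Optional, Sequence


def last_good_commit(
    commits: Sequence[str],
    stays_in_sync: Callable[[str], bool],
) -> Optional[str]:
    """Return the newest commit for which stays_in_sync() is True.

    Assumes commits are ordered oldest to newest, and that every commit
    after the first desyncing one also desyncs (hypothetical simplification).
    """
    lo, hi = 0, len(commits) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if stays_in_sync(commits[mid]):
            best = commits[mid]  # good so far; look for a newer good commit
            lo = mid + 1
        else:
            hi = mid - 1  # bad; the last good commit must be older
    return best


# Illustration with made-up labels: pretend everything up to "c2" stays in sync.
history = ["c0", "c1", "c2", "c3", "c4", "c5"]
print(last_good_commit(history, lambda c: c in {"c0", "c1", "c2"}))  # prints c2
```

With this approach, six commits need at most three test connections, and a few hundred need fewer than ten. The `git bisect` command automates the same kind of search over real history.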

Edit: Very oddly, I cannot reproduce this issue when I am testing on my own Linux desktop over the loopback interface, for reasons that I cannot at present fathom. Either there is something different between the client and the server (I cannot see what as I have downloaded and built the latest pakset and code sources on both), or the desync arises from the act of actually connecting over the network (which seems unlikely as I have been able to get a stable connexion in the fairly recent past from my Linux desktop to the Bridgewater-Brunel server, over wifi, no less).

It would still be very helpful if anyone running a Linux client could let me know whether they can connect and stay in sync with the Bridgewater-Brunel server, however.

Vladki

Server.exp.... was not updated for several days. Can you try it with the Linux client and pakset provided there? Also check for interference on wifi. My problems with desync disappeared when I switched from 2.4 GHz to 5 GHz.

jamespetts

#80
Quote from: Vladki on December 21, 2016, 07:43:36 AM
Server.exp.... was not updated for several days. Can you try it with the Linux client and pakset provided there? Also check for interference on wifi. My problems with desync disappeared when I switched from 2.4 GHz to 5 GHz.

With the executable from server.exp.simutrans.com (but without changing the pakset; I am not aware of having made any changes that will affect sync since the 13th of this month), the behaviour is the same as with the latest executable, viz. it will crash within a second of connecting.

As to wifi, I am on a wired connexion, so that will not be an issue.

Can I ask whether others are able to connect to either server without desyncing or crashing?

Edit: I have now tried connecting to the Bridgewater-Brunel server from a newly installed Debian package, and it still desyncs. It is unclear why.

Vladki

Can you try changing the pakset as well? It reminds me of the problem with the not completely disabled rescaled bus, which created a broken pak file, ".routemaster.dat".

jamespetts

I skip the pakset checks by loading using net:server.exp.simutrans.com in the load dialogue, so the change would have to produce an actual desync. Can I check - are you able to connect to bridgewater-brunel.me.uk with the latest binary and pakset?

Vladki

I first tried my old binary (9ea227ca62aecc2ea326744d9467324cc91e4c58). I can connect to server.exp.simutrans.com just fine. I get an immediate desync or crash with bridgewater...
I'll now compile a fresh version and see. Edit: it is the same with the latest build: 617dd75fc13f62c6cc715ae873ecc68467b2ccfe

jamespetts

Quote from: Vladki on December 21, 2016, 07:42:29 PM
I first tried my old binary (9ea227ca62aecc2ea326744d9467324cc91e4c58). I can connect to server.exp.simutrans.com just fine. I get an immediate desync or crash with bridgewater...
I'll now compile a fresh version and see. Edit: it is the same with the latest build: 617dd75fc13f62c6cc715ae873ecc68467b2ccfe

Thank you very much for testing: that is most helpful.

When you say that "it is the same", can you clarify what you mean by that? You described two different behaviours on connecting to two different servers; are both the same, or just one?

Vladki

With both versions (9ea... and 617...) I could connect to server.exp.simutrans.com and play without problems (both British and Swedish paksets).
If I connect to bridgewater-brunel, I get an immediate desync, and a crash on the second try, again with both versions of the client (Linux 64-bit).

I still have the feeling that the problem may be in the pakset. I have this fix to remove the funny ".routemaster-rescaled.pak" file:

diff --git a/bus/routemaster.dat b/bus/routemaster.dat
index 1535554..b20d268 100644
--- a/bus/routemaster.dat
+++ b/bus/routemaster.dat
@@ -39,7 +39,7 @@ EmptyImage[N]=./images/routemaster.0.6
 EmptyImage[NE]=./images/routemaster.0.7
 ---
 # For TESTing of rescaled vehicles only - delete when testing complete.
-#obj=vehicle
+obj=vehicle
 name=Routemaster-rescaled
 copyright=JamesPetts&JamesHood
 waytype=road

With the patch applied I have a proper file: "vehicle.routemaster-rescaled.pak"
Before this patch I had problems even with server.exp.simutrans.com.
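Judging only by the two file names above, a properly built object file carries its object type before the first dot ("vehicle."), while the broken one starts with the dot itself. A small check for such files might look like the sketch below. The function name is made up for illustration, and the naming rule is an assumption drawn from this one example, not from makeobj documentation.

```python
from typing import Iterable, List


def nameless_paks(file_names: Iterable[str]) -> List[str]:
    """Return .pak file names that begin with a dot.

    Assumption: a well-formed name looks like "<type>.<name>.pak", so a
    leading dot means the object type was missing when the file was built.
    """
    return sorted(
        name
        for name in file_names
        if name.startswith(".") and name.endswith(".pak")
    )


# The two names from this post, plus an unrelated hidden file for contrast.
names = [
    ".routemaster-rescaled.pak",
    "vehicle.routemaster-rescaled.pak",
    ".gitignore",
]
print(nameless_paks(names))  # prints ['.routemaster-rescaled.pak']
```

For a real pakset folder, the names could be supplied by `os.listdir(path)`.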

jamespetts

Thank you - that is a useful clarification. There may be an issue with the Bridgewater-Brunel server. I am currently working on improving the road vehicles, so am on the road-vehicle-rescaling branch, whereas the server is running the half-heights branch. However, installing afresh on a completely different computer (my father's, whom I am encouraging to take up Simutrans; he has built a few stage coach lines in 1750 already) also produced a desync, and that install was from the .deb package on my nightly server, which is from the half-heights branch.

It is hard to see how the problem can be the Routemaster 'bus, which is introduced in 1956, when the game on the Bridgewater-Brunel server is currently in 1909, however.

Vladki

I have to say that my client pakset matches server.exp.... but does not match bridgewater... (not only the rescaled bus; the pedestrians have also changed).
Can you try connecting with my pakset? I have a gut feeling that a pakset mismatch may be the cause.
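One way to make a pakset comparison concrete is to hash every .pak file on each side and list the differences. This is only a sketch under stated assumptions: it treats a pakset as a flat folder of .pak files, SHA-256 is an arbitrary choice, and it is not the check that Simutrans itself performs when a client connects. The function names and the example file names other than the two Routemaster ones are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def hash_paks(pakset_dir: str) -> Dict[str, str]:
    """Map each .pak file name in a folder to the SHA-256 of its contents."""
    return {
        p.name: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(pakset_dir).iterdir()
        if p.is_file() and p.name.endswith(".pak")
    }


def compare_paksets(
    client: Dict[str, str],
    server: Dict[str, str],
) -> Tuple[List[str], List[str], List[str]]:
    """Return (only on client, only on server, present on both but different)."""
    only_client = sorted(set(client) - set(server))
    only_server = sorted(set(server) - set(client))
    differing = sorted(
        name for name in set(client) & set(server) if client[name] != server[name]
    )
    return only_client, only_server, differing


# Illustration with made-up digests instead of real folders.
client = {"vehicle.a.pak": "1", "pedestrian.p.pak": "2", ".routemaster-rescaled.pak": "3"}
server = {"vehicle.a.pak": "1", "pedestrian.p.pak": "9", "vehicle.routemaster-rescaled.pak": "3"}
print(compare_paksets(client, server))
```

Files that appear on only one side would point to stale or renamed objects, while files that differ would point to one side having an older build.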

jamespetts

It could be. The Bridgewater-Brunel server should have the latest pakset on the half-heights branch, but it may be that the updating is not working: if it has different pedestrians, that would suggest that the pakset is a few weeks old. I will have a go at this when I have a moment.

Vladki

Do not forget to clean up old .pak files. I remember that some objects were removed/renamed.

jamespetts

#90
Quote from: Vladki on December 22, 2016, 01:52:52 PM
Do not forget to clean up old .pak files. I remember that some objects were removed/renamed.

Yes, this is a particular issue with this system of having one .pak file per object. I will have to look into automating this by deleting the pakset folder entirely before rebuilding it.

Edit: I have found that using "make clean" works for this purpose. I will have to test it on the server.

Edit 2: The nightly pakset build on the server is now set to "make clean" before it "make"s, so, as of to-morrow, there should be a proper clean pakset in place. I should be interested to see whether this makes any difference.

Vladki

I have tried to connect to the bridgewater-brunel server, but the pakset used by the server does not match the nightly build.

jamespetts

Quote from: Vladki on December 31, 2016, 12:26:49 AM
I have tried to connect to the bridgewater-brunel server, but the pakset used by the server does not match the nightly build.

I will have to look into this when I get a chance.

Vladki

Just one more note: as you work on transparent vehicles, you could completely remove the partial definition of the rescaled Routemaster bus, to avoid the pak file starting with a dot, which is in the nightlies. Sync the pakset for the bridgewater server (deleting obsolete paks) and see whether it helps. Or try connecting to the Swedish pak server.

jamespetts

I have removed the "routemaster-rescaled" from the road-vehicles-rescaling branch. When work on this branch is complete, I will be able to merge it back into the half-heights branch and hopefully we will then be able to test whether this helps.

Ves

I made some tests today to see whether I could log on to the bridgewater-brunel.me.uk server game. It desynced and crashed quite heavily, but in a curious pattern:

1st attempt: lost synchronization after 26 seconds
2nd attempt: crash to desktop after 12 seconds

after restarting Simutrans:
3rd attempt: lost synchronization after 23 seconds
4th attempt: crash to desktop after 13 seconds

after restarting Simutrans again:
5th attempt: lost synchronization after 21 seconds
6th attempt: crash to desktop after 13 seconds

The pattern seems to be that when the server game is accessed for the first time in a game session, it lasts for around 20-25 seconds before it desyncs. When connecting again (without restarting Simutrans) you can only be there for 12-13 seconds before the entire game crashes, as if the first attempt to log onto the server game influences the second attempt. Also note that the initial attempts to start the server game after a crash appear to trigger the desync earlier and earlier.

Using:
Windows 10
Executable compiled with msvs 2015: 2d60c8ffe5ecbc6192e5a846fc59907e7a6442d7
Pakset (half height branch) compiled with corresponding makeobj: 65b85f3f8c231057f35bdfb7deb8b7dd1b8f02e3

I don't know if you can use this in any way, but thought I should report it to you anyway!

Happy new year!  ;D

jamespetts

Thank you for letting me know, and happy new year to you, too!

Vladki

I saw the same pattern as Ves a few weeks ago (desync, crash, desync, crash, ...).

Linux 64-bit

jamespetts

I have just pushed a fix which involves removing an instance of casting away const, which is undefined behaviour. I have not had a chance to test the effect on networking yet, but I wonder whether anyone could test whether this helps when a Linux server has a Windows client connected to it?

Ves

May I ask where I can see which version the server is running? Is it always the latest nightly? Can I rely on the date that is shown as the "last modified"?

My results:

Connecting to the server.exp.... Swedish server with 99d6634f2c44dacc986c84715fb91f942e7a79f3 yields no problems whatsoever.
Connecting to the server.exp.... British server with the same commit lets me stay synced. However, it feels a bit unstable: if I tamper with some of the deadlocks, for instance, it might desync me.
Connecting to the bridgewater server with a8ab51179693fa413f17c9d5040724e895c34ae8 yields the same results as described previously in this thread (desync after 20 sec on first attempt, crash after 12 sec on second).

jamespetts

Hmm - interesting. I thought that server.exp.simutrans.com had previously had desyncs without interaction within seconds of connecting?

Perhaps you could try using the same map from the Bridgewater-Brunel server on server.exp.simutrans.com to see whether you can reproduce the desyncs there?

Vladki

Quote from: jamespetts on January 09, 2017, 11:38:32 PM
Hmm - interesting. I thought that server.exp.simutrans.com had previously had desyncs without interaction within seconds of connecting?

Perhaps you could try using the same map from the Bridgewater-Brunel server on server.exp.simutrans.com to see whether you can reproduce the desyncs there?

Here you are - port 13354 - connected just fine

Quote from: Ves on January 09, 2017, 11:26:09 PM
May I ask where I can see which version the server is running? Is it always the latest nightly? Can I rely on the date that is shown as the "last modified"?
You can rely on "last modified" or the info in README.txt. However, if the file is very fresh, it may be that the server is running the previous version and will be restarted soon. All is done manually, and the order of operations (upload/restart) may be random.

Quote
Connecting to the server.exp.... Swedish server with 99d6634f2c44dacc986c84715fb91f942e7a79f3 yields no problems whatsoever.
Connecting to the server.exp.... British server with the same commit lets me stay synced. However, it feels a bit unstable: if I tamper with some of the deadlocks, for instance, it might desync me.
Connecting to the bridgewater server with a8ab51179693fa413f17c9d5040724e895c34ae8 yields the same results as described previously in this thread (desync after 20 sec on first attempt, crash after 12 sec on second).
Yeah, I have the same with the Linux client.

Ves

Quote
You can rely on "last modified" or the info in README.txt. However, if the file is very fresh, it may be that the server is running the previous version and will be restarted soon. All is done manually, and the order of operations (upload/restart) may be random.
I'm sorry, I did not mean the server.exp... I meant the bridgewater-brunel server.

Quote
Hmm - interesting. I thought that server.exp.simutrans.com had previously had desyncs without interaction within seconds of connecting?
Perhaps you could try using the same map from the Bridgewater-Brunel server on server.exp.simutrans.com to see whether you can reproduce the desyncs there?
Connecting to port 13354 with 99d6634f2c44dacc986c84715fb91f942e7a79f3 only caused "normal" desyncs after around 30 seconds for me. No crashes though!

jamespetts

Do I recall correctly that Vladki, you are running Linux, and Ves, you are running Windows? And do I correctly understand that Ves is referring to a "normal" desync as one that occurs without any user interaction?