The International Simutrans Forum

 

Topic: Instability on the Bridgewater-Brunel server

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • Posts: 17636
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #70 on: October 06, 2018, 09:42:35 PM »
The result of the third test is significant and interesting. I took the 1937 game that runs without losing synchronisation, then used the public player tool to advance the year to 1940 without changing the game-state other than the date, and re-ran the test. This time, the game lost synchronisation again after a few minutes. I re-tested with the 1937 game and confirmed that it did not lose synchronisation. This suggests that there is an issue with some item automatically placed in the game which has an introduction or retirement date around 1939, the most obvious candidates being roads.
Edit: Further testing has shown that copying the latest (client) simuconf.tab to the server (save for replicating the server's original network settings) does not prevent the loss of synchronisation from occurring with the >1939 saved game.
« Last Edit: October 06, 2018, 11:00:18 PM by jamespetts »

Offline DrSuperGood

  • Dev Team
  • Devotee
  • Posts: 2496
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #71 on: October 07, 2018, 04:06:21 AM »
Might be worth trying to advance the server beyond 1940 to see if there is a date at which the problem stops. This could help locate the specific objects causing the problem. Obviously the year has to be advanced either offline, with the server restarted, or with the server re-joined afterwards, since one can assume that any Windows client touching 1940 will be out of sync instantly and is only booted later when a checksum check is performed.
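
The gap between going out of sync and being booted can be sketched as follows. This is only an illustration with made-up state values and a hypothetical `check_interval`; it is not how the Simutrans network code is actually written. The states diverge at one step, but the mismatch is only noticed at the next periodic checksum comparison.

```python
import zlib

def checksum(state):
    # Illustrative checksum over a stand-in game state (assumption: any
    # deterministic hash of the state would do for this sketch).
    return zlib.crc32(repr(state).encode("utf-8"))

def divergence_and_detection(server_states, client_states, check_interval):
    """Return (step at which states first differ, step at which a periodic
    checksum comparison first notices), using None where not applicable."""
    diverged = detected = None
    for step, (s, c) in enumerate(zip(server_states, client_states)):
        if diverged is None and s != c:
            diverged = step
        # Checksums are only compared every check_interval steps.
        if step % check_interval == 0 and checksum(s) != checksum(c):
            detected = step
            break
    return diverged, detected

# Made-up data: the client silently diverges at step 7.
server = list(range(20))
client = [s if s < 7 else s + 1 for s in server]
print(divergence_and_detection(server, client, check_interval=5))
```

With these made-up values the states differ from step 7 onwards, but the comparison only catches it at step 10.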

It might also be worth clean installing Simutrans on the server (making sure not to lose all saves). Although the pakset is hash checked by clients, files like simuconf.tab are not.

Of course one should make sure that the server is really going out of sync with the clients. It could be something to do with the time, starting somewhere in 1940, that causes a false positive out-of-sync (OoS) detection.
« Last Edit: October 07, 2018, 06:50:56 AM by DrSuperGood »

Offline Ves

  • Devotee
  • Posts: 1520
  • Languages: EN, SV, DK
Re: Instability on the Bridgewater-Brunel server
« Reply #72 on: October 07, 2018, 12:21:59 PM »
I went ahead and looked at which ways were becoming available in the time frame leading up to 1940. These objects are taken from all dats in this directory: https://github.com/jamespetts/simutrans-pak128.britain/tree/master/ways

Name=hr-asphalt-road-medium
intro_year=1935
intro_month=6

name=BrickViaduct
intro_year=1838
intro_month=7

Name=city_road
intro_year=1932
intro_month=1

name=ConcreteSteelCantileverRoad
intro_year=1937
intro_month=5

Name=concrete_road
intro_year=1936
intro_month=9

Name=runway
intro_year=1938
intro_month=9

name=airport_oneway
intro_year=1938
intro_month=9

Name=taxiway
intro_year=1938
intro_month=9

---- close retire dates ---- (not a complete list, since I didn't think of checking the retire dates until midway through the list...)

Name=macadam_road
retire_year = 1936
retire_month = 7

Name=WoodenTretleElevatedNarrow
retire_year=1938
retire_month=7

name=WoodenTrestleNarrow
retire_year=1938
retire_month=7
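
A manual pass like the one above could also be scripted. The following is a rough sketch, assuming the dat files consist of `key=value` lines (keys in any letter case, optional spaces around `=`) with objects separated by lines starting with `-`; the function names are made up for this example. It lists every introduction or retirement date that falls inside a given window of years, so that retire dates are not missed.

```python
def parse_dat_objects(text):
    """Split dat-style text into objects; each object is a dict of
    lower-cased keys to string values."""
    objects, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("-"):  # assumed object separator
            if current:
                objects.append(current)
            current = {}
            continue
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        current[key.strip().lower()] = value.strip()
    if current:
        objects.append(current)
    return objects

def dates_in_window(objects, first_year, last_year):
    """Return (name, kind, year, month) for every intro/retire date whose
    year lies in [first_year, last_year], sorted by date."""
    hits = []
    for obj in objects:
        for kind in ("intro", "retire"):
            year = obj.get(kind + "_year", "")
            if year.isdigit() and first_year <= int(year) <= last_year:
                month = obj.get(kind + "_month", "")
                month = int(month) if month.isdigit() else None
                hits.append((obj.get("name", "?"), kind, int(year), month))
    return sorted(hits, key=lambda h: (h[2], h[3] or 0, h[0]))

# A few entries from the list above; the "---" separators are an assumption.
sample = """
Name=concrete_road
intro_year=1936
intro_month=9
---
Name=runway
intro_year=1938
intro_month=9
---
Name=macadam_road
retire_year = 1936
retire_month = 7
---
name=BrickViaduct
intro_year=1838
intro_month=7
"""
hits = dates_in_window(parse_dat_objects(sample), 1935, 1940)
for name, kind, year, month in hits:
    print(name, kind, year, month)
```

To run it over a real pakset directory, the text of each `.dat` file would be read and passed to `parse_dat_objects` in the same way.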

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • Posts: 17636
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #73 on: October 07, 2018, 08:38:15 PM »
Dr. Supergood - that is a very useful suggestion. I have tried advancing the time to 2000, and there is no loss of synchronisation with this. I will try a few intermediate dates to see what the cut-off is.
Edit: The loss of synchronisation still occurs in 1950.
Edit: The error also seems to occur in 1975.
« Last Edit: October 07, 2018, 10:09:05 PM by jamespetts »

Offline DrSuperGood

  • Dev Team
  • Devotee
  • Posts: 2496
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #74 on: October 08, 2018, 08:03:17 AM »
Might be worth binary searching the exact start and end year.

It could be coupled to town buildings/attractions, industry or private cars since all of those are subject to introduction or phase out with year.
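
The binary search over the year could look roughly like the sketch below. The predicate `desyncs(year)` is hypothetical: in practice each call would be one manual round of testing with the saved game advanced to that year. The sketch also assumes a single switch-over point between a year known to stay in sync and a year known to lose sync.

```python
def first_failing_year(desyncs, known_good, known_bad):
    """Return the earliest year in (known_good, known_bad] for which
    desyncs(year) is True.

    Assumes desyncs(known_good) is False, desyncs(known_bad) is True and
    that the result switches only once between the two years.
    """
    if known_good >= known_bad:
        raise ValueError("known_good must be earlier than known_bad")
    lo, hi = known_good, known_bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if desyncs(mid):
            hi = mid  # problem already present: look earlier
        else:
            lo = mid  # still in sync: look later
    return hi

# Stand-in predicate for illustration only, not a real test result.
pretend_desyncs = lambda year: year >= 1939
print(first_failing_year(pretend_desyncs, 1937, 1940))  # prints 1939 for this stand-in
```

The end year can be searched for in the same way by negating the predicate, so that "fails" means "stays in sync". Each halving costs one round of testing, so a 60-year span needs about six rounds rather than sixty.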

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • Posts: 17636
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #75 on: October 08, 2018, 10:08:06 AM »
Each round of testing takes a considerable amount of time, so it will take a long time to get to the point of checking the exact year. I am planning to try to find it as precisely as possible, however.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • Posts: 17636
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #76 on: October 12, 2018, 12:21:08 AM »
Further testing has revealed an error in the earlier testing, but that error itself has revealed interesting data. When I advanced to 2000 initially, I had used a game saved in 1937. However, the initial tests of 1950 and 1975 had used the game saved in 1939, after the problem had arisen. Re-testing in 1952 with the game saved in 1937 shows that the client is able to stay in sync with the server.

This suggests that it is the presence in the game of an object that is automatically built sometime in the 1939-1952 era that causes the problem, rather than the building of the object while the client is connected.

I will have to test further when I have more time to see in which year the problem first goes away.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • Posts: 2496
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #77 on: October 12, 2018, 03:56:21 AM »
There is a limit to what objects are automatically created or manipulated.
  • Trees.
  • City buildings/attractions.
  • Walking passengers.
  • Private vehicles.
  • Terrain slopes (due to construction of city buildings).
  • City roads.
  • Resurfacing of all existing roads, rails, etc., potentially to a different type due to obsolescence.
  • Industries, and industry linking.
  • Power consumption/generation.
  • Bridges, and hence grounds, due to the construction of city road bridges over obstacles.
« Last Edit: October 12, 2018, 05:47:04 AM by DrSuperGood »

Offline prissi

  • Developer
  • Administrator
  • Posts: 9238
  • Languages: De,EN,JP
Re: Instability on the Bridgewater-Brunel server
« Reply #78 on: October 12, 2018, 04:48:43 AM »
Are there exponents or square roots used in any generation routine? It may be that these deviate slightly only for numbers generated in that era. Because if there is no desync when running both under Linux, I would suspect something like this ...

Offline DrSuperGood

  • Dev Team
  • Devotee
  • Posts: 2496
  • Languages: EN
Re: Instability on the Bridgewater-Brunel server
« Reply #79 on: October 12, 2018, 05:46:38 AM »
Quote
Are there exponents or square roots used in any generation routine? It may be that these deviate slightly only for numbers generated in that era. Because if there is no desync when running both under Linux, I would suspect something like this ...
There is a software implementation for these which should be deterministic between platforms. The software implementation is heavily used by vehicle physics, which cannot directly be the cause, since there are dates at which the game remains in sync for hours despite ~10,000 vehicles.
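
To illustrate why a software implementation can be deterministic between platforms, here is a minimal sketch of an integer-only square root. It is not the code used in Simutrans, just an example of the general idea: it uses only integer addition, division and shifts, so it produces the same result everywhere, whereas floating-point library functions such as `pow` or `exp` are not guaranteed to round identically across C libraries.

```python
def isqrt_newton(n):
    """Floor of the square root of a non-negative integer, using only
    integer arithmetic (Newton's method)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n < 2:
        return n
    # Initial guess: a power of two that is at least sqrt(n).
    x = 1 << ((n.bit_length() + 1) // 2)
    while True:
        y = (x + n // x) // 2
        if y >= x:  # no further decrease: x is the floor square root
            return x
        x = y

print(isqrt_newton(10))  # prints 3
```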

Anyway, an idea that occurred to me was to disable multithreading on both server and client for a test. If this stops it going out of sync, then the cause is something multithreading-related.
« Last Edit: October 12, 2018, 07:21:19 AM by DrSuperGood »