Instability on the Bridgewater-Brunel server

Started by DrSuperGood, September 06, 2018, 03:21:35 PM

Rollmaterial

Technically I didn't "use" them, I just sent out the trains that were in the depot. If I remember correctly they were made of 5 carriages: brake front+middle+restaurant+middle+brake rear. They were arranged in the same order as the train with SR stock currently running on the line.

VOLVO

Here are a few things that will hopefully bring new ideas about what went wrong:
1. At some stations there were no post offices or the like; I just put a bus stop with postboxes or a roadside loading bay with postboxes.
2. The whole Northern Frontier Express line only has one dedicated mail train running with no schedule (as opposed to the Eastern Frontier Express, which has some trains with mail brake carriages).
3. The mail line of the Northern Frontier Express (Northern Mail Express) is separate from the passenger Northern Frontier Express: at Bealdean Rye the mail train line terminates at DrSuperGood's Terminus, and the passenger line terminates at my own station. (Current testing of mail trains seems to be running them on the passenger line.)
4. My lines have a strong mix of old semaphore signals and modern 4-aspect light signals.
5. Some signals are placed at the second or even third platform tile, because the signal was built first and the platform was later extended for operational reasons. I have done this for a very long time, so I doubt it is the cause, but I thought I would mention it in case anyone can think of something.

I also notice that the branch line for Ves's Green Quantinglow Airport had been built when the strong desync came, and by that time the NFE mail trains had already been running for quite a few game years. In that period all trains on the Northern Frontier Express had been replaced with LNER A4s and only the mail train was running with an A3, so testing with A1s or A3s may not represent the rolling stock of the desync period.

jamespetts

Interesting - thank you both for that. I have re-started the trains that were already in the depot at Elmley which have the formations indicated by Rollmaterial (A3s with GNR corridor carriages as specified).

Loss of synchronisation was encountered, but only after quite a long time.

I should be grateful if others could also test the server in its present state, with these trains running, for ~1 hour to check for loss of synchronisation. This suggests that the problem may be specific to certain types of rolling stock, but this will need more testing to narrow down.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

jamespetts

I have only had a very brief chance to look at this, but the position appears to be that neither the 1937 nor the 1939/1940 versions of this game carried mail on the Northern Frontier Express. The 1937 version used the same A1s and GNR corridor carriages as were found in the depot in 1939 (these sets apparently having been withdrawn from service on upgrade), and the 1939 version used A4 locomotives and the later LNER corridor carriages. The 1937 saved game already had a few trains with the A4s and LNER corridor carriages which had been purchased in that year. However, in the 1937 version, the classes had not been reassigned on the newer LNER carriages, whereas by 1939, Bay Transport had reassigned the classes on all the trains on that line (including those with newer carriages) to very low. The earlier trains with the GNR carriages had their classes already reassigned to very low.

Loss of synchronisation thus appears from this preliminary study to occur when either a mail train is run on this line (even though this line did not originally have mail trains) or when the later LNER carriages are run with reassigned classes.

This is an odd pattern of failure and does not suggest anything useful about the underlying code. However, it would be worthwhile carrying out tests by checking to see whether adding the A4s and LNER carriages to the line again (removing all existing trains on the line) without reassigning the classes causes loss of synchronisation or not, then reassigning the classes after testing for ~1 hour and testing for ~1 hour again to see whether loss of synchronisation results this time.

My time for testing at present is very limited: since these are tests that anyone can perform on the server, it would be extremely helpful, and would help to make progress towards a fix, if anyone could test this on the server and report the results here.

Rollmaterial

Reassigning classes seems to affect the time after which sync is lost: it happens after 5-10 min with classes reassigned to very low and after ~30 min with the default low class, independently of the choice of rolling stock. This suggests a correlation with the number of passengers carried, and that the issue is in passenger routing or generation. I will now test with the classes reassigned to medium.
Update: Desyncs after ~40 min.

jamespetts

That is very interesting - thank you for that. That the reassignment causes the loss of synchronisation to happen more quickly when reassigned to very low and more slowly when reassigned to medium does seem consistent with your hypothesis that the problem is with passenger routing somehow and that the faster loss of synchronisation is caused by the greater number of passengers in the lower classes who are transported when reassigned to very low.

This does not, therefore, by itself get to the bottom of what the essential changes are between 1937 and 1939 such that the former is stable for >1 hour even with reassigned classes and the latter is not.

Were there any changes to the schedule of the Northern Frontier Express between these dates, may I ask?

DrSuperGood

I assume you have tried after removing all his passenger aircraft? There is a known issue with broken passenger routing over air routes. The symptoms are that changing any passenger related schedule anywhere causes all air passenger (not mail) routes to be lost from the network. This is to rule out the rail line being a manifestation of this routing issue spilling over somehow.

SuperTimo

Quote from: jamespetts on November 08, 2018, 10:59:43 AM
Were there any changes to the schedule of the Northern Frontier Express between these dates, may I ask?

There were several occasions when the trains got stuck, which might have led to a large number of passengers on the line once the route was running again.

jamespetts

Quote from: DrSuperGood on November 08, 2018, 11:49:31 AM
I assume you have tried after removing all his passenger aircraft? There is a known issue with broken passenger routing over air routes. The symptoms are that changing any passenger related schedule anywhere causes all air passenger (not mail) routes to be lost from the network. This is to rule out the rail line being a manifestation of this routing issue spilling over somehow.

I am aware of this issue, which I was unable to diagnose after quite a few hours of testing. I did not continue testing for that because this issue then arose. That issue was found to be not specific to aircraft, but rather an issue affecting higher classes of passengers (becoming progressively worse the higher the class if I recall correctly; aircraft of this period default to the "very high" class, so exhibit this behaviour more readily).

Nonetheless, there was a test (which I believe is documented in detail on this thread) involving removing first all aircraft, then all ships, then all road vehicles until only trains remained and the loss of synchronisation still occurred. Then it was discovered that liquidating Bay Transport prevented the issue; then it was discovered that removing the Northern Frontier Express prevented the issue, and now it has been discovered that the problem occurs or not depending on what rolling stock is run on the Northern Frontier Express, but no intelligible pattern can yet be discerned from these data. The problem is that each round of testing takes such an enormous amount of time and so many rounds of testing are required to get good data that even narrowing the issue further is likely to take an extremely long time (i.e. many, many months given the amount of time currently available to me) without considerable assistance in testing.

Once the problem has been narrowed down more accurately, I can start looking at the code in more detail and isolating specific pieces of code and testing with those disabled to see wherein the problem arises.

SuperTimo

The current state of the NFE led to de-sync for me within ~10 mins.

James, what would you like tested next?

I believe that Rollmaterial tested switching to the A4s with reassigned classes; currently the line is running A4s with the class reassigned to medium.

jamespetts

Thank you - that is most kind.
I think that the locomotive is not relevant, as it has been tested hauling freight wagons without any loss of synchronisation.

The next useful test would be to reset all classes on that line to the default and, making no other changes, try to stay in sync for circa 1 hour.

SuperTimo

I reset the class for the route and was able to stay in sync for an hour. I left of my own accord.

jamespetts

That is extremely helpful, thank you.

The next useful test would be to try re-assigning the class from low to medium rather than from low to very low and try to stay in sync for ~1 hour and see whether that makes any difference.

SuperTimo

I have done that. I have de-synced three times, usually in <10 mins.

jamespetts

Quote from: SuperTimo on November 10, 2018, 01:10:51 PM
I have done that. I have de-synced three times, usually in <10 mins.

This is very interesting - thank you. The next thing to test would be to try some different rolling stock (e.g. GWR carriages) and see whether you lose synchronisation with those (a) with default classes; or (b) with reassigned classes (i) to very low; and (ii) to medium.

Thank you very much for this most useful work.

SuperTimo

I have replaced the rolling stock with GWR stock, same formation |brake - normal - dining - normal - brake|. It has stayed in sync for at least 40 mins; however, I did suffer a loss of synchronisation during the process of changing the trains. I will test changing the classes tomorrow.

edit: somehow I misspelt the brake the second time.

jamespetts

Quote from: SuperTimo on November 10, 2018, 09:18:12 PM
I have replaced the rolling stock with GWR stock, same formation brake - normal - dinner - normal - break. Has stayed in sync for at least 40 mins, however i did suffer a loss of synchronisation during the process of changing the trains. I will test changing the classes tomorrow.

That is extremely helpful, thank you. Another thing that might be worth testing is, for any state where you get loss of synchronisation, try removing the dining cars and seeing whether this makes a difference when everything else remains the same.

SuperTimo

So far I have the following results:

- GWR default class: de-synced once, but I think that was caused by something else, as otherwise I was able to stay connected for >50 mins.
- GWR very low: constant de-syncing, <10 mins each time.
- GWR medium: yet to de-sync; connected for around 20 mins so far, will see if this lasts.

I will test with the dining cars removed either later or tomorrow, as I want to play another game and this is using up a fair amount of RAM and processing power.

edit: medium class still in sync after around an hour so it seems pretty stable.

jamespetts

That is very helpful - thank you very much. I shall look forward to your dining car tests.

SuperTimo

I have started to test the trains without the dining car. Something I have noticed is that someone is playing on the server as a company called Northside Transport. They have built a railway line using some mothballed track and have several bus lines. I am not sure whether this is affecting things, but either way it is potentially compromising the testing we are trying to undertake.

edit: So far the results without the dining car are the same as with it. Setting the class to low results in no de-sync; setting it to medium causes de-sync. I am about to test it on very low, but I think it is likely this will result in de-sync.

Something I have noticed is that when changing class the de-sync only seems to begin once the trains have removed all of the passengers of different classes.

edit 2: switching to very low also causes de-sync without a dining car.

jamespetts

Quote from: SuperTimo on November 12, 2018, 11:07:52 AM
I have started to test the trains without the dining car. Something I have noticed is that someone is playing on the server as a company called Northside Transport. They have built a railway line using some mothballed track and have several bus lines. I am not sure whether this is affecting things, but either way it is potentially compromising the testing we are trying to undertake.

I have noticed that, too: however, if we can get a state where one set of conditions results in loss of synchronisation and another does not, then that should suffice for the test to be independently verifiable. The problem would arise only if what Northam Transport is doing causes a loss of synchronisation in any event, which would be detectable if there ceased to be any states in which the loss of synchronisation did not occur.

DrSuperGood

I think people should be more concerned about that player wasting their time playing on a server where no progress will be retained as it is being used to debug...

I have noticed very glitchy mechanics with the passenger class system in the past, shortly before the problem started. For example, a train that could only hold around 400 people max (including standing) was suddenly holding over 2,000 people when I made a change in passenger class. It still listed the correct maximum capacity, but it completely ignored this number and pretty much loaded as if it had 400% extra capacity.

jamespetts

Quote from: DrSuperGood on November 12, 2018, 06:08:15 PM
I think people should be more concerned about that player wasting their time playing on a server where no progress will be retained as it is being used to debug...

I have noticed very glitchy mechanics with the passenger class system in the past, shortly before the problem started. For example, a train that could only hold around 400 people max (including standing) was suddenly holding over 2,000 people when I made a change in passenger class. It still listed the correct maximum capacity, but it completely ignored this number and pretty much loaded as if it had 400% extra capacity.

If you have a reproduction case for this, I should be grateful if you could post a fresh bug report for this.

VOLVO

Quote from: jamespetts on November 08, 2018, 10:59:43 AM
That is very interesting - thank you for that. That the reassignment causes the loss of synchronisation to happen more quickly when reassigned to very low and more slowly when reassigned to medium does seem consistent with your hypothesis that the problem is with passenger routing somehow and that the faster loss of synchronisation is caused by the greater number of passengers in the lower classes who are transported when reassigned to very low.

This does not, therefore, by itself get to the bottom of what the essential changes are between 1937 and 1939 such that the former is stable for >1 hour even with reassigned classes and the latter is not.

Were there any changes to the schedule of the Northern Frontier Express between these dates, may I ask?

I usually increase the frequency a little after the trains get stuck, to prevent them from blocking stations that are shared with other networks or lines.

I will come back to it this weekend, when I should have more free time than previously, and help out with the debugging.

jamespetts

That is very helpful - thank you.

After the dining car/catering tests have been completed, the next tests that I want to run involve creating some special rail vehicles just for debugging, being (1) the identical LNER carriages used on the Northern Frontier Express, but with the default class set to very low; (2) the identical LNER carriages but with no overcrowded capacity; and (3) the identical LNER carriages with the default class set to very low and no overcrowded capacity. Each should be tested in turn for ~1 hour (or until earlier loss of synchronisation) to see whether this makes any difference. Each test should change only the carriages used on the line and nothing else.

The aim of these tests is to see whether the reassignment of classes itself causes the problem, whether the problem is caused by overcrowding (as is known to occur on that line) or whether the problem is caused by some combination of reassignment and overcrowding. If these tests yield a result showing evidence of a possible causal relationship between overcrowding and/or class reassignments and the loss of synchronisation, then the area of problematic code in question can potentially be narrowed considerably and work can begin to isolate/disable sections of the code for testing.
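
The three debugging variants, together with the unmodified carriages, form a small two-factor test matrix. A minimal sketch of how one might enumerate it is below, purely as a bookkeeping aid: the class name, field names and labels are hypothetical and do not correspond to anything in the pakset or the game code.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CarriageVariant:
    """One carriage configuration to run on the line (hypothetical labels)."""
    very_low_by_default: bool   # class set in the vehicle itself, no reassignment
    overcrowding_allowed: bool  # vehicle keeps its overcrowded capacity

    def label(self) -> str:
        cls = "very-low default" if self.very_low_by_default else "stock default class"
        cap = "overcrowding" if self.overcrowding_allowed else "no overcrowding"
        return f"{cls}, {cap}"


def build_test_matrix() -> list[CarriageVariant]:
    # The first entry is the unmodified carriage (baseline); the other three
    # correspond to debug variants (2), (1) and (3) as numbered in the post.
    return [CarriageVariant(v, o) for v, o in product((False, True), (True, False))]


if __name__ == "__main__":
    for variant in build_test_matrix():
        print(f"Run for ~1 hour or until sync is lost: {variant.label()}")
```

Changing only one factor between runs is what allows a difference in outcome to be attributed to class reassignment, to overcrowding, or to their combination.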

VOLVO

With the currently running Castle Class and GWR carriages there is no desync at all.

After changing the stock to A4s and LNER carriages with very low price: desync after over 2 hours.

Also, if there's a way to disable all the 'no route' message boxes, that would be very nice, as it's difficult to work with everything popping up.


jamespetts

Can I ask - did you reassign the classes for the GWR carriages? Also, did you run the GWR carriages with a catering vehicle?

To disable the no route pop-up windows, go to Message centre > Options and uncheck the right hand and centre checkboxes for "warnings" and "problems".

VOLVO

No, they were in very low configuration and no catering vehicles.

jamespetts

Thank you for the confirmation.

The next step is for me to set up some special debugging carriages identical in every respect to the existing LNER carriages save that they are of "very low" class by default - testing with these (only) will help to determine whether it is the re-assignment of classes that is the problem.

jamespetts

I have now added the debug versions of the Gresley carriages. These will be available from to-morrow's nightly build of the pakset. They can be distinguished because they have no translated name and their default name starts with DEBIUG1_. They have very low classes by default.

It would be very helpful if somebody could run a test to see whether, using these carriages (and no other) on the Northern Frontier Express without any class reassignments, there is any loss of synchronisation or not.

I am very grateful for the testing work done so far.

jamespetts

I have been testing with the debugging carriages - so far, I am able to remain connected stably to the server for quite some time with a number of trains of the special debugging carriages (including dining cars).

I should be grateful if others could test and confirm whether they are also able to stay connected. If so, this strongly suggests class reassignment as the source of the problem.

Rollmaterial

Just tested. I lost sync after ~10 min twice, then managed to stay connected for over an hour. The desyncs may have had something to do with there being a train running in reverse schedule, which I fixed at some point during the second or third attempt. I have also noted that the game takes a considerable time to resume after the client has connected and loaded the map.
Edit: Tried again, once again desynced after ~10 min twice then stayed connected longer on the third attempt.

jamespetts

Thank you for testing: that is helpful. Can I ask why you think that running in reverse schedule is relevant? Have you tested for this?

Rollmaterial

My second round of testing seems to indicate that running in reverse schedule isn't related to the bug.

jamespetts

Thank you for letting me know. Can I ask what tests produced this result?

What we really need to do is to conduct tests to deduce which features unique to this line cause a loss of synchronisation where other lines do not. So far, the tests have been oddly inconclusive. What we may need to do next is duplicate the line, run on it trains that we know cause a loss of synchronisation, and element by element alter the line's schedule until the loss of synchronisation no longer occurs; then return to the original schedule and make the last change alone to see whether that change is decisive in and of itself or whether it is cumulative with other changes, and, if the latter, with what changes it is cumulative.
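
The element-by-element procedure can be sketched roughly as follows. This is only an illustration of the testing logic, not game code: `find_decisive_change`, the change labels and the `loses_sync` callback are all hypothetical, and in practice each call to `loses_sync` stands for a manual ~1 hour test on the server.

```python
from typing import Callable, FrozenSet, List, Optional, Tuple


def find_decisive_change(
    changes: List[str],
    loses_sync: Callable[[FrozenSet[str]], bool],
) -> Tuple[Optional[str], bool, List[str]]:
    """Apply schedule changes cumulatively until sync is no longer lost.

    Returns (last change applied, whether that change alone is enough,
    the earlier changes it may be cumulative with). If sync is still lost
    after every change, returns (None, False, all changes).
    """
    applied: List[str] = []
    for change in changes:
        applied.append(change)
        if not loses_sync(frozenset(applied)):
            # Return to the original schedule and make only the last change.
            decisive_alone = not loses_sync(frozenset([change]))
            return change, decisive_alone, applied[:-1]
    return None, False, applied


if __name__ == "__main__":
    # Stand-in for real testing: pretend sync is kept only once both
    # (made-up) changes "remove_wait" and "drop_stop" have been applied.
    def fake_test(applied: FrozenSet[str]) -> bool:
        return not {"remove_wait", "drop_stop"} <= applied

    print(find_decisive_change(["remove_wait", "drop_stop", "reverse_route"], fake_test))
```

If the last change is not decisive on its own, the returned list of earlier changes is the set to re-test in combination with it.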