News:

Simutrans.com Portal
Our Simutrans site. You can find everything about Simutrans from here.

[BUG] Direct Routes Not Updating.

Started by DrSuperGood, April 23, 2019, 05:17:21 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

DrSuperGood

Bridgewater Brunel server. Running the nightly at the time of posting connected to reliably with client.

The "Direct routes from here" section of stops is not updating. Looking at what is going on, it seems routing information is not being updated in general.

For example a mail route I set up 8 game months ago to redirect all mail from Sandy Bay station (5102,2313) to Wardmead station has not updated the routing of Sandy Bay station which is still trying to send the mail via a train line that no longer services Sandy Bay station. This means actual routing information is not being updated, and not just that the "Direct routes from here" list is broken.

Another example is the South island of Stashridge (4639,3368). Both the mail and passenger stops and lines here are brand new. None of them are getting any "Direct routes from here" information. The line has been operational for 3 game months. I suspect the only reason there is still some passenger traffic is because it still thinks the old routing to the island exists. No mail is being generated at all, with the stops listing "No mail service".

I even waited with the server game running for an entire game month and no change was observed. Still incorrect or missing routing information. This means either the routings are updating impractically infrequently (game years...), or that they are not updating at all.

jamespetts

Thank you for your report.

Can I ask you to check whether an actual cycle of the pathfinding algorithm has run? You can find the current status of the path explorer in the display dialogue.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

#2
I am guessing
Quote from: jamespetts on April 23, 2019, 05:54:08 PMCan I ask you to check whether an actual cycle of the pathfinding algorithm has run? You can find the current status of the path explorer in the display dialogue.
It fails to run fully. It holds mostly on Status of "Passengers (p_class_0) / explore" and "Passengers (p_class_1) / explore". There are some other Status briefly between, but those seem to be the big ones and none observed for any classes other than those. All other values shown remain constant.

I am guessing it times out due to how complex the map is. It finishes passengers class 0 and starts on passengers class 1 but times out and then begins again back on passengers class 0. As such it never touches passengers class 2, 3 and 4 and never touches mail classes 0 and 1.

The expected behaviour is that it should cycle through all passenger classes sequentially, and then all mail classes sequentially. Even if a save/load cycle is performed, it should begin again where it left off so as to not starve updating any classes.

I will leave the game running a few more cycles to confirm this pattern and edit this post if there was any change.

EDIT:
Other stages run between the two observed, but usually last only a small part of a second or so.

EDIT2:
Can confirm. It just does p_class_0 and p_class_1. It does not progress beyond that or run for any mail classes. Since it is not completing the stop routing tables are not being updated.

jamespetts

Thank you for that. There is no algorithm that stops the path explorer if it has been running for too long, so there is no applicable concept of it "timing out". It can in theory keep going on any one phase for eternity.

You refer to other stages running "between the two observed"; do you mean between 1 and 0, or between 0 and 1, or both?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

QuoteYou refer to other stages running "between the two observed"; do you mean between 1 and 0, or between 0 and 1, or both?
I think both. But they change so fast that I cannot really remember the exact names or sequence of them. In any case they only involve p_class_0 and p_class_1. I observed for around an hour and only ever those two with most of the time (>99%) spent on "explore".

jamespetts

Quote from: DrSuperGood on April 25, 2019, 12:06:12 PM
I think both. But they change so fast that I cannot really remember the exact names or sequence of them. In any case they only involve p_class_0 and p_class_1. I observed for around an hour and only ever those two with most of the time (>99%) spent on "explore".

That is the time consuming phase. What you are describing suggests that the substantive work for subsequent phases of the path explorer may be being skipped for some reason. I am not currently sure what that might be, and this may require very extensive investigation.

Relevant recent changes to the path explorer that will require such investigation are the path explorer data saving algorithm and the modifications to the path explorer multi-threading management.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

It is entirely possible it was not working on the server for a very long time. Before the paths were rebuilt on every load so them not updating during play might not have been noticed due to how frequent reloads are. Just now that paths are being saved it makes a huge difference since it is they only way they get updated.

Rollmaterial

Quote from: DrSuperGood on April 25, 2019, 12:18:27 PM
It is entirely possible it was not working on the server for a very long time. Before the paths were rebuilt on every load so them not updating during play might not have been noticed due to how frequent reloads are. Just now that paths are being saved it makes a huge difference since it is they only way they get updated.
That is what I have been observing too. The path explorer never gets beyond p_class_3 before the next back-up save.

jamespetts

Thank you for your reports. I have spent some hours on this this evening, but have been unable to obtain any reproduction of this.

The difficulty is that this does not appear reproducible in a smaller sized game. In the Bridgewater-Brunel game, it takes such a long time to refresh the paths that it makes it impractical for testing (a game that I loaded about half an hour ago is still on the same phase of refreshing; I would need to reproduce the actual occurrence of the issue many scores of times to carry out enough tests to narrow down the problem, which would take potentially hundreds or even thousands of hours of continuous work, which will not ever be possible). Even in a fairly substantial game (the britain-2012 saved game), I cannot reproduce any case in which anything which causes the path explorer to refresh does not cause it to refresh all categories.

One thing that I have noticed with a smaller sized game is that the path explorer will refresh all of the classes and categories and then go on immediately afterwards to refresh some of them again. I have not looked in detail at what causes this, as I cannot reproduce any case in which this actually causes a problem, and in any event, the number of classes/categories for which this occurs is not consistent, suggesting that it would take an extremely large amount of work to find the cause of this in circumstances where it may not be relevant at all. I cannot deduce any mechanism by which this could prevent certain classes/categories of the path explorer running. There is no known way of the path explorer failing to refresh a requested category/class when it comes to the time to do so. Refreshing all the categories/classes properly, then refreshing some but not all of them again is a known behaviour, but this is not capable by itself of causing some classes/categories never to update at all.

Could it be that what Dr. Supergood is seeing is the second superfluous run of the path explorer as described above, which sometimes does terminate short with lower classes (it is indeterminate which lower class will terminate it)? In this case, what will normally happen is that the next trigger for the path explorer to re-run will trigger it to resume from the point where it left off.

At present, I am not able to find any way of reproducing this at all.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

I guess this is another 6 month to fix issue then lol. Since the server is basically unplayable until it is fixed as outside of upgrading existing lines it is impossible to create functional new stops as well as major line schedule alterations.

Maybe if more information was shown, or logged. Such as printing to debug what stages the path explorer jumps to and when. I would be willing to sit for an hour or so on the server to capture these, if the server logging itself is not sufficient.

One must also consider that the map gets save/load cycled fairly frequently. What happens to the current intermediate path explorer results during such a cycle?

ACarlotti

I have suspicions that a call of path_explorer_t::refresh_all_categories(true) may be to blame, since this appears to reset the current category. I intend to find out where this is called from, and try using that knowledge to trigger the symptoms described above. Assuming I am correct in all of the above, and provided I can find no good reason that the current cateogry needs to be reset, then the fix will likely be to remove those two lines of code.

ACarlotti

I actually tried something slightly different, which is to replace all calls of path_explorer_t::refresh_all_categories(true) with calls of path_explorer_t::refresh_all_categories(false). It doesn't seem to result in any crashes, and the path explorer seems to behave reasonably even with that change. This now prevents the path_explorer resetting when convoys enter depots, the map is enlarged or rotated, or reroute_check_interval_steps steps have occurred. The last one had a comment added saying that "This is not the computationally intensive bit of the path explorer.", which seems a little misleading to me - by itself it isn't computationally intensive (it just resets a bunch of values), but it triggers a full regeneration of path explorer data, which will be computationally expensive.

I've pushed this (and some other stuff) to Github. (The other stuff is a a bit more thread refactoring along the same lines as previously, a fix to an #ifdef which was on the wrong condition, and a fix to prevent crashes if MSG_LEVEL>=4).

DrSuperGood

How do save/load cycles effect it? Running the path explorer on the server game might take anywhere from 15 to 30 minutes from start to end. Hopefully with the fixes I can measure it.

jamespetts

Thank you for your work on this, especially A. Carlotti. I have now incorporated these changes, and I should be grateful if anyone could re-test with to-morrow's nightly build to check whether this has helped.

What I noted when testing was that there were no circumstances in which the path explorer would not run a full refresh cycle on all classes of the passenger category if a passenger line was changed, even if a load/save cycle was run part-way through the refresh or even if a new refresh was triggered part way through an existing refresh. I tested on a game large enough that the refreshing took a few dozen seconds so that the refreshes could be triggered part way through an existing refresh, but not so large that the refreshes took an unfeasibly long time.

However, if the data were actually erased during one of these refreshes, this conceivably might account for the symptoms being shown, so it may well be that A. Carlotti's fix is the correct solution to this. One possible side effect is that, on smaller or medium sized maps, changes that occur during a refresh may have to await the next refresh in order to propagate - on a very large map, this is unlikely to make any practical difference (save to fix the problem, if indeed it does that).

Incidentally, if the path explorer is now interrupted by a load/save cycle, it should save all its partly completed work and continue where it left off. It is not impossible that there is a bug somewhere that prevents this from working properly, but I have not seen evidence of such a bug so far.

In any event, we shall await the outcome of testing with A. Carlotti's fix to see whether this improves matters. Thank you all for your reporting/testing/fixing: it is much appreciated.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

SuperTimo

I connected to the server and the path explorer seems to be running okay. Unfortunately I didn't get a chance to run it through all the classes as someone joined and this stopped it running. However I did download the server save and ran the sever game locally, and the explorer managed to cycle through all stages (including mail) and this has appropriately displayed the 'direct routes' for mail routes I showed in the other thread.

It took about 20 mins running the save locally to cycle through all the different freight/passenger classes in the path explorer.

ACarlotti

Quote from: SuperTimo on April 28, 2019, 11:15:36 AMsomeone joined and this stopped it running
What do you mean by 'stopped it running'? In this situation I think it is supposed to resume running exactly where it paused before the save/load cycle.

SuperTimo

#16
Quote from: ACarlotti on April 28, 2019, 12:56:09 PM
What do you mean by 'stopped it running'? In this situation I think it is supposed to resume running exactly where it paused before the save/load cycle.

I am on the server now and someone is trying to join so I will see whether it restarts after the game un-pauses.

edit: so the game saved and un-paused but it has not resumed the path explorer for the class it was on (Passenger class 3). Instead it has restarted at passenger class 0.

jamespetts

Thank you for looking into this. This does not seem to be correct behaviour (the path explorer resuming from somewhere different to where it had reached when the save commenced), but it will be fantastically difficult to find the cause of this without a reliable reproduction case that does not take many hours to execute.

Can I ask - has anyone created any new routes since the latest nightly build and can anyone confirm (1) what type of load that they carry (i.e., passengers, mail or a specific type of goods); and (2) if passengers or mail, what class tat they carry? Knowing this (and preferably several of these data points) would help to understand with more precision and clarity the scope and nature of the issue.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

SuperTimo

Quote from: jamespetts on April 28, 2019, 03:08:45 PMCan I ask - has anyone created any new routes since the latest nightly build and can anyone confirm (1) what type of load that they carry (i.e., passengers, mail or a specific type of goods); and (2) if passengers or mail, what class tat they carry? Knowing this (and preferably several of these data points) would help to understand with more precision and clarity the scope and nature of the issue.

I have created a route for carrying livestock, and two for carrying meat today, if that helps? I think the server will sort out the routes eventually it just needs some time without someone else joining the server. Is it possible for you as an admin to have the server run with no one connected so it can have some time to calculate the routes?

jamespetts

Quote from: SuperTimo on April 28, 2019, 03:45:11 PM
I have created a route for carrying livestock, and two for carrying meat today, if that helps? I think the server will sort out the routes eventually it just needs some time without someone else joining the server. Is it possible for you as an admin to have the server run with no one connected so it can have some time to calculate the routes?

Would someone also be able to create some passenger routes for comparison? This would be very useful.

I do not think that the exercise that you suggest would help, as this would not be materially different from the test that you carried out already of running the game in single player mode.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

#20
Appears better?. Finally got my mail routes updated. It takes about 2-3 in game hours (not real hours, closer to 30 real minutes) for all updates.

I am unsure if save/load effects this. When I joined passengers were updated but not mail. I do not know if anyone else joined earlier today. It may be fixed, or may not be fixed, this needs some time to test.

EDIT: Possibly not fully fixed. When another client joined it reset to p_class_0 when it was still in the middle of doing goods. I then desynced from the server...

jamespetts

That is very helpful information, thank you. It will be helpful to have some more data points from which to form a hypothesis as to possible causes on which basis I can then test whether that hypothesis is correct.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

ACarlotti

#22
I connected once, and when I joined it was running the path explorer for passenger class 0. I waited for it to reach class 1, then left and rejoined, at which point it was still running the explorer on class 1. I then joined with a second client (simultaneously), and it then reset to 0.

Can anyone confirm whether or not it is resetting when, and only when, there is already a client connected upon joining the server?

I'll look at the code a bit more, to see if I can work out where it's being reset.

EDIT: Right, this is a rather silly bug. The reason it's not resuming from the right point is simply because we don't save the current step. The reason it seemed to sometimes be resuming from the right point was because it would skip over the previous categories/classes if they didn't need to be refreshed.

I'll sort out a fix now, along with making some other changes (firstly, to reduce save sizes significantly by not saving classes that don't exist; secondly, to improve robustness if the number of classes has changed since a save was made).

ACarlotti

I've posted a fix for these to Github. The file size savings aren't huge - on the current server game it should theoretically reduce the size of the uncompressed saved path explorer matrices by about 10%, and in practice two saves from before the change were 636MB and 610MB, whereas after the change I have sizes of 583MB and 616MB. (One reason for variation in file size will be a dependence on the state of the path explorer, although I think all of those saves occurred while it was processing passengers).

jamespetts

Thank you for this: that is very helpful.

However, I notice one issue with the fix, which is that it alters the pattern of saved data without incrementing the revision number: games saved from before this fix may therefore exhibit errors on loading, as they will be missing the data that the program expects to read at these points.

Would you be able to increment the version to 10 and modify the code so as it is able to read games saved with the code as it was before it was modified? That would be very helpful. Thank you again for your assistance with this.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

ACarlotti

The first commit (7d2e473) doesn't change the save version number, which means that by itself it will still fail to save the current category/class. The second commit (5f70ad3) introduces another change to the save format, and then also increments the save version number. I believe this is consistent with how multiple changes to the save format in the same version have happened in the past. I can't see any mistakes in the tests for revision numbers, but if there then I can amend those commits before they're merged.

(Incidentally, as well as a rotation fix mentioned elsewhere, I've just corrected some journey time types to uint32, which is now sitting on Github too).

jamespetts

Splendid, thank you for that clarification; I realise that I had misread your code in the later commit: my apologies. I have pushed a very slight non-functional change to make it a little clearer (< 10 rather than <=9).

Thank you also for fixing the journey time types: that is most helpful. I must have missed these when I changed the journey time from 16-bit to 32-bit a little while ago.

I should be very grateful if people could test with the next nightly build on the server to see whether the cumulative effect of these fixes has solved the underlying problem. I am very grateful to A. Carlotti for dealing with this particularly challenging area of the code.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

It appears kind of better. I am still seeing a lot of time spent on p_class_0 when I join or after a save/load cycle but currently have insufficient evidence to prove that it was not on p_class_0 before the save/load cycle.

At least mail routes are updating again now.

jamespetts

Quote from: DrSuperGood on April 30, 2019, 11:46:49 PM
It appears kind of better. I am still seeing a lot of time spent on p_class_0 when I join or after a save/load cycle but currently have insufficient evidence to prove that it was not on p_class_0 before the save/load cycle.

At least mail routes are updating again now.

That is very helpful, thank you. It would help if you and others could monitor the situation over the next few days to make sure that there is no obviously anomalous behaviour. I logged in earlier this evening and added a new 'bus route to test this, so I should be interested to see whether this route is working when I log in next.

How are those with air routes (with p_class 4/"Very high") finding that new routes are working?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Ves

It takes a while for new routes to get recognized, but I believe that they eventually are. I made some new routes today to three newly built airports and they have all three of them quite alot of passengers, both Very high and high. However, my planes more and more just use "high" class routes (p_class[ 3 ]) now.

DrSuperGood

Out of interest, is there a reason why one cannot incrementally update the routes for specific stops instead of waiting for the general path explorer to get to them? Even if the route is an inaccurate approximation that has to wait for the path explorer to refine it would still be better than no route for a good part of a game month.

ACarlotti

At the moment that would basically require running a second path explorer in addition to the main path explorer, and I don't think it's worth introducing that amount of complexity at this point in time.

However, I have been thinking about the path explorer more generally, and this is certainly something I would bear in mind in any future versions. I need to spend some time working out the figures to see if it's computationally, but if the numbers work out favourably then this could be implemented at the same time as a change in path explorer algorithm.