The International Simutrans Forum

 

Author Topic: Modelling private car traffic without breaking the CPU bank  (Read 25797 times)

0 Members and 1 Guest are viewing this topic.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Modelling private car traffic without breaking the CPU bank
« on: March 24, 2012, 12:16:18 AM »
Edit: This thread was originally titled, "Bug fix uncovers performance issue. Ideas for solution sought". Renamed to clarify more expansive recent discussion. Re-titled 2nd of January 2013.

I have been working on fixes to some of the bugs in 10.10, including the bug that has stopped the server game for the time being, and found an additional bug whereby road connexions between towns were never updated as they should be once they were put into the database in the first place. Once I fixed that bug and re-tested with the saved game from the server, I found that the amount of re-checking of road connexions between cities overwhelmed even my fairly powerful i7 such as to make the game unresponsive. I had thought all along that that the road checking was not causing performance problems, but it transpires that this was only because it was not, in fact, happening at all after the first check.

The issue is related to the number of cities. In the server game, there are 675 cities. 675 ^ 2 = 455,625. Because the game will assume that routes are bidirectional, the number of computations are halved, making 227,812. Further, these are spread over twelve months, one twelfth of cities being re-checked every month, making 18,984 re-checks that must be run every month if the road network is updated in any way. This is just for cities - routes must also be found to attractions and industries.

(By way of comparison, in a game with 128 cities, there would be just 683 re-checks every month, and in a game with 64 cities, there would be only 171 re-checks).

This is a rather tricky issue. Without the checking mechanism, there is no way of telling which cities are connected to each other (or attractions) by road. One conceivable solution would be to have some very sophisticated pathfinding mechanism similar to that which Knightly wrote for the routing of goods, but that is beyond my capabilities. A simpler, although somewhat less satisfactory solution, would be to find a mechanism of spreading the number of re-checks evenly so that one can specify in simuconf.tab a maximum number of re-checks that may be performed every X ticks, although even implementing that would be really rather tricky.

If anyone has any ideas to offer by way of assistance, I should be very grateful indeed.

Edit: A crude timing test shows that each road check operation takes between less than 1ms and well over 100ms each. Assuming an average of 50ms per operation, performing 18,984 re-checks would take 949.2 seconds, or 15.82 minutes (that is not counting any other operation that the computer might be performing during that time - the actual time to complete the re-checks would therefore be considerably greater).
« Last Edit: January 02, 2013, 05:36:06 PM by jamespetts »

Offline ӔO

  • Devotees (Inactive)
  • *
  • Posts: 2345
  • Hopefully helpful
  • Languages: en, jp
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #1 on: March 24, 2012, 01:15:20 AM »
Just a question, but how hard would it be to multithread the route calculator?

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #2 on: March 24, 2012, 01:19:12 AM »
I don't know. I suspect very hard. Simutrans generally is not multi-threaded at all. C++ has no built-in support for multi-threading, so a different implementation for every target platform would be required. I suspect that much work would be necessary to make multi-threading work in Simutrans. If anyone would like to try, however, I shan't be complaining!

Offline transporter

  • *
  • Posts: 136
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #3 on: March 24, 2012, 04:57:58 AM »
Idea 1:
I'm not sure if software is able to do this, much less if it's even possible, but what about doing the checks only after a player makes a change to the roadways?  I'm assuming that any changes made by the public service player will not affect how cities are connected since I haven't seen any times that the public service player has deleted ways. Then the game would be able to assume that connections haven't changed since the last check. So maybe a possible cue to check is if an active human player makes a change to the map.

Idea 2:
Maybe the checks can only be performed on a section of map at a time. This could maybe be a changeable variable in the settings page to be customized to your computer or server. Or it could be based automatically on map size and city distance. For example, for a 100x100 map, it would take the top left 30x30, then move over 20 units each time to overlap 30x30 for a continuous check for the rest of the map, then when it gets to the edge of the map it moves down 20 and either just moves the other way or starts back on one side.

Idea 3:
A sort of combination of the 2 where the checks occur whenever the player makes a change to the map, but the checks happen within a certain area of the change itself. Using the same rules as stated above.

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #4 on: March 24, 2012, 08:13:42 AM »
You don't need check connection for every pair. It is very slow indeed. You only need to run some kind of flood algorithm to mark connected components of road network. In fact, i think said algorithm already exists somewhere in simutrans code. Ask Standard development team about it.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #5 on: March 24, 2012, 11:08:27 AM »
Transporter - thank you for your thoughts. Idea 1 is already implemented - the checks will only be run if somebody changes the road network somewhere. That is not much help, though, as the game becomes unresponsive when anyone does change the road networks. Idea 2 is interesting, although this is just another form of staggering, one form of which is already achieved by the system of allocating each city a month in which its road connexions are to be checked. I could spread this out further, but it would have to be so spread out, I think, to avoid causing problems on really large maps that it would take many, many game years for road connexions to refresh fully.

Inkelyad - thank you likewise for your suggestion. The problem is that we need to find not just the answer to the binary question of whether a city is connected at all, but the quantitative answer as to what the quality of the connexion is. A connexion by a meandering dirt track should attract far less traffic than a connexion by a motorway.

Edit: I do wonder whether multi-threading might be the only possible answer here. That would be a great amount of work - certainly not suitable for a bugfix release. I might well just have to release 10.11 as it is, and set the server game to "assume_everywhere_connected_by_road" for the time being. The trouble with multi-threading, though, is that it would be enormously difficult to synchronise accross the network. The only way that I can think to do it reliably is for the server only in a network game to run a private car route finding thread (a single thread dedicated for the purpose running constantly), and for it to push the results of its updating work to all of the clients as it goes. That way, on server games, only the server itself would have to have a multi-processor machine in order to get acceptable performance.

Edit 2: What do people think of OpenMP?
« Last Edit: March 24, 2012, 11:26:30 AM by jamespetts »

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #6 on: March 24, 2012, 11:33:29 AM »
The problem is that we need to find not just the answer to the binary question of whether a city is connected at all, but the quantitative answer as to what the quality of the connexion is. A connexion by a meandering dirt track should attract far less traffic than a connexion by a motorway.
Pick a city as start point. Flood road network  with "journey_time to start point" values. For each city in connected component store value from flood to where you need it. There is no need to repeat this via calc_route for each pair.


Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2346
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #7 on: March 24, 2012, 11:48:34 AM »
There's some discussion elsewhere about multithreading in simutrans at the moment. There's limited scope for it but many functions need to be carried out in a single threaded manner as it stands at present.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #8 on: March 24, 2012, 12:15:38 PM »
Kieron - interesting. Is this discussion on one of the public boards? If so, I should be interested to see it.

Inkelyad - hmm, perhaps I am being dim, but I do not quite follow. What do you mean by "store value from flood to where you need it"? What I had imagined that you meant was a system that produced a binary answer to the question, "is city A connected to city B?" by inferring in some cases that, if city A is connected to city C and city C is connected to city B, then city A must be connected to city B. This method is not useful, as there might be a faster route between city A and city B than via city C, and we need to know the journey time of the route. If this is not what you meant, apologies for having understood you incorrectly - clarification would be much appreciated!

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #9 on: March 24, 2012, 02:11:20 PM »
Dijkstra's algorithm where cost of edge == journey time.
Then you can build private car routes from resulting tree.

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #10 on: March 24, 2012, 04:31:03 PM »
You shouldn't need to worry about synchronization with the rest of Simutrans when writing a parallel algorithm -- you just use parallelization to speed up that particular computation, then return the answer when it is done. OpenMP is one particular way of doing this, it is good at fork-join style parallelization. You should probably consider it. There is an associated library (GNU OpenMP) and a compiler flag (-fopenmp) to help support it in GCC. Bear in mind, not every graph algorithm is amenable to parallelization, and don't be surprised if your initial attempts make things worse.

Other things to consider: dynamic updates to SP problem. http://www.cs.uwaterloo.ca/~epfchan/publication/sptdg.pdf

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #11 on: March 24, 2012, 05:41:18 PM »
Looking at the source a bit, it seems that you are doing A* search tile-by-tile and running that for every (unordered) city pair. If that's the case, then there ought to be significant opportunities to trade off space for time, as there should be overlap between many different routes.

Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2346
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #12 on: March 24, 2012, 05:58:17 PM »
Summary of discussion:
* What slows down simutrans is not generally one run through a particular algorithm, it is the cumulative time taken by repeated runs. Almost always these runs must be made in the same order (even more so for network code). So performance increases through multithreading are quite difficult to achieve.
* There also tend to be unforeseen consequences of multithreading as almost all operation in simutrans are connected to other parts of the code.
* Multithreading does not help all platforms equally, even where multiple cores are present. Memory bandwidth can often be a limiting factor.

That said, I've managed to get certain operations (rotation for example) to go faster through multithreading, but these are relatively few and far between.

In this case I doubt multithreading will help as you still need 15 minutes of processing time (on a relatively powerful machine), so you will have to change to a more sophisticated algorithm, or simplify the code which currently requires it.  15 minutes divided by 7 (8 cores minus one being used for rest of game) for example is still 2 minutes per game year - still hugely excessive for one algorithm.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #13 on: March 24, 2012, 11:57:45 PM »
Thank you for all the suggestions and information - most helpful. Inkelyad - I don't see how Dijkstra's algorithm could assist here, as there would be no way of knowing (as far as I can tell) whether, for any given city pair, there was not a direct route that did not pass through any other cities that was shorter than any other route without doing the very same A* search between each city pair as is done already.

As to multithreading, it would, as Kieron points out, be very hard to synchronise without it being done exclusively on the server as I propose (and I worry slightly that doing it that way would greatly increase network traffic). The problem is that different clients' worker threads would finish any given element of the task at a different point in the execution of the main thread to each other, which would cause different clients to behave differently, causing network desyncs.

If there was one thread just for the private car routes on a powerful multi-core server updating all the routes in the background in a game in which only the server had to perform this work (and simply push the updated results to the clients when they are ready), the fact that it takes quite a long time should not matter too much, especially in a game where the length of each month is quite long, although I accept that it is not ideal. If there was some way to optimise the algorithm, that would be the best option, although I'm not sure what it might be yet, unless I have misunderstood the suggestion relating to Dijkstra's algorithm somehow.

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #14 on: March 25, 2012, 12:37:40 AM »
Parallelism is entirely deterministic. If you're not seeing that behavior, then either there's a bug in your code, or you've actually implemented a "concurrent" algorithm. But kierongreen is right that at best, you will only see a constant speed-up which is not sufficient.

I think ultimately you need to find ways to avoid repeating work (e.g. use dynamic programming). I also think that there must be some way of taking advantage of your relatively simple representation of maps and road networks; either through partitioning or heuristics such as the one that A* uses already.

Speaking of the heuristic, I have a slight concern about its current implementation. If I'm understanding correctly (and I don't know German), you are using straight line distance as the heuristic (a natural choice for maps). However, it is possible that this heuristic is non-monotonic. Following it one step could lead you to a situation where you have more difficulty reaching your end goal (lower speed limits) than before.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #15 on: March 25, 2012, 12:53:49 AM »
I haven't written any multithreading code for Simutrans yet - what I write above is based on what I understand of how multithreading works, although I have never looked very deeply into it, so forgive me if my understanding is incorrect. I do not know the exact meaning of a "concurrent" algorithm, but suspect that that is what I had in mind, in that there would be a thread created at the start of the game and persisting thereafter working its way through updating road connexions marked as stale by the main thread and unmarking them stale, only interacting with the main thread to lock the variables that it updates at the moments that it writes to them. A system that requires the main thread to wait for the road connexions thread to reach a certain point before continuing would not really help much in terms of speed, I don't think.

As to the A* algorithm, this comes from Standard, and is the same algorithm used for player vehicles finding their way about the map: I have not altered this at all. If you can think of any way of making it faster, I should be most interested, and the Standard developers might be interested, too.

As to dynamic programming - can you elaborate?

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #16 on: March 25, 2012, 02:17:11 AM »
Yes, that is a concurrent algorithm. Generally speaking, if you start thinking about locks, communication, or interleaving, you're dealing with concurrency. Parallel threads, on the other hand, have no interaction with each other. For example, OpenMP provides a construct that allows you to execute the iterations of a "for loop" in parallel. In order to use it safely, you must not write code which touches the data other iteration steps are using (pseudo-syntax):

Code: [Select]
parallelfor(i=0; i<10; i++)
 array[i]++; // OK

Code: [Select]
parallelfor(i=1; i<10; i++)
 array[i] += array[i-1]; // bad!

Dynamic programming is the method of breaking down a problem into subproblems in order to solve it -- in particular, most people think of overlapping subproblems -- so that the solution to one subproblem can be stored and used later when it is needed again.

Shortest path algorithms tend to fall into this category. For example, suppose you had already computed the best route from Chicago to San Francisco. Now you are working on the best route from New York to San Francisco, and you find yourself at Chicago. Great! You already know the answer from here, so you look it up and are done.

The general shortest path problem is usually broken down into categories: Single-source, Multi-source, and All-pairs. The Bellman-Ford algorithm was one of the first single-source algorithms, and also one of the first examples of dynamic programming. Dijkstra is as well, and faster for graphs with non-negative edge weights. A* is single-source using an admissible heuristic to guide breadth-first search. Some famous examples of all-pairs shortest path are Floyd-Warshall and Johnson's algorithm. These also happen to be easily parallelizable.

Although you are trying to solve shortest paths for all pairs of cities, since you are working tile-to-tile, the all-pairs algorithms are too computationally and memory intensive to be used. Instead, you want to solve it between a (relatively) small subset of the vertices. That's why I suggested some kind of dynamic programming solution. Suppose you are searching for the best path between cities A and C, and you expand a vertex that is on the already-computed path between B and C. Then you can reuse that portion of the path. I think this will come up a lot because the intercity roads are probably used by many different paths in most games.

That still leaves an awful lot of data to store. You need to put some kind of indication on each vertex to say what paths it belongs to. One possibility to reduce this overhead might be to only do it periodically, instead of for every vertex. Or you could try to identify road hierarchies (e.g. determine which roads are intercity highways and which ones are local roads).

Offline isidoro

  • Devotee
  • *
  • Posts: 1140
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #17 on: March 25, 2012, 02:33:38 AM »
Just a quick idea to deal with some cases.

Prerequisite: you have a table with pair distances for every pair of cities.

Somebody breaks your network at C (erases a tile there).

Calculate all possible distances from C to all cities.

Now we will select all cities that are affected by point C (mark them dirty):
  repeat for every pair of cities not dirty at the moment A and B
    if distance(A,B)=distance(A,C)+distance(C,B), mark dirty A and B

Rebuild your table with your present algorithm, bun only for dirty cities.



Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2346
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #18 on: March 25, 2012, 09:16:44 AM »
Quote
In order to use it safely, you must not write code which touches the data other iteration steps are using
And simtrans does this, many times, in many places. Short of a complete code rewrite there's not a lot that can be done about this. So while a pathfinding routine might be safe to run in parallel threads on it's own, it is not with the rest of the simutrans code.

Offline isidoro

  • Devotee
  • *
  • Posts: 1140
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #19 on: March 25, 2012, 10:57:18 AM »
Well, there is much to be said about all that of multithreading...  In fact, I work on that myself.

Generally speaking, multithreaded programming is much, much more difficult that single threaded and requires some expertise.  A lot of unexpected errors may happen and, worst of all, those errors are erratic (some times the application works, some times it doesn't) because there is a factor we don't control: thread scheduling.  Either parallel threads don't use the same data (or the state is irrelevant, which is nearly always not the case) or you have to build critical sections among them.  But in that case, a dreadful threat arises too: deadlocks...

Not to mention problems with compatibility among systems.  For instance, I don't know a portable way for separate threads to deal with GUI input for each separate window apart from the trivial one that one thread takes all and delivers messages to the other by other means...

In a big, running project like this, I would advise you not to go into this field unless you know very well what you are doing...


Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #20 on: March 25, 2012, 04:28:44 PM »
And simtrans does this, many times, in many places. Short of a complete code rewrite there's not a lot that can be done about this. So while a pathfinding routine might be safe to run in parallel threads on it's own, it is not with the rest of the simutrans code.

It wouldn't be running concurrently with the rest of Simutrans code. It would be replacing what is now done serially. As long as each parallel thread does not trample on the data the other threads are using at that moment, then there will be no difference to the rest of the system, and it will only be noticeable by the way it uses more than 100% CPU.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #21 on: March 25, 2012, 04:59:27 PM »
Update: I have released version 10.11 as it is for the time being, and set assume_everywhere_connected_by_road=1 on the server game while a further solution is sought. This issue is clearly one that can only be solved with considerable work, and in a major release.

Chicken - do you think that a concurrent algorithm is the right approach for this scenario, or is there some advantage to using a parallel approach here? Isidoro - would the concurrent approach that I suggest (pushing updated data to the road connexion tables on all the clients simultaneously) circumvent the difficulties that you enumerate above?

As to optimisation, things are not as simple as one might imagine: if the program has already found a shortest path between City A and City B, and is looking for a path from City C to City A, if it has reached somewhere in the city limits of City B, that will not necessarily assist in re-calculating the part of the route from City B to City A, as the routes are calculated from one particular tile in each city to the equivalent tile in each other city (the road tile immediately outside the town hall), so, unless the best route happened to pass through that particular tile, the method that you suggest above would not assist in this case (and, overall, would probably have only a very limited impact).

Isidoro - the problem with simply checking for breaks in the network is that this does not account for the (important) possibility that the existing route is left entirely intact whilst a new and better route is built between the cities. This is why I implemented a periodic re-check algorithm instead of a system that looks specifically for broken links.

Thank you all for your input so far!

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #22 on: March 25, 2012, 05:16:40 PM »
Either may help, but a concurrent algorithm is subject to all the ordering and synchronization problems you have brought up. Carefully applied parallelism can bring some constant factor speed-up to certain algorithms in the code, but that doesn't sound like it's enough.

Recall that in order to find the shortest path between City A and City B, you also had to find the shortest path to City B on all the tiles that lie in between (this is called 'optimal substructure' by the way). So I'm proposing to hold onto that information in some form. This is the trade-off between space and time. We could save a lot of time by remembering paths that have been computed recently, and reusing them whenever possible, but storing all this information could also lead to an extreme blow-up in memory size. So it is likely that some compromises will need to be made. For example, you could only store path information on vertices with degree at least 3.

Code: [Select]
A ---------+----------- B
           |
           |
           C



Since you have computed A-to-B, when working on C-to-B you find that at the '+' you already know the shortest path to B, so you are done. Also, since we're working bidirectionally, you now know the shortest path C-to-A.


Perhaps that is still too much memory, in which case, you could consider some sort of road classification system based on certain criteria. For example, it may be possible to identify "intercity" roads and only save information about degree-3 nodes while traversing them.


Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2346
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #23 on: March 25, 2012, 06:07:58 PM »
Well, I think optimal memory usage I think you can hope for is to store direction from each tile to proceed to a particular city. This requires 3 bits per tile (2 for direction, 1 to indicate if it is possible). If you store that for all tiles that is mapsize_x*mapsize_y*(3*number_cities/8) bytes - for a 1024x1024 map with 200 cities this would be 75MB, but 2048x2048 with 800 cities would be 1.2GB. So you would have to store it for road tiles only somehow. Storing a value like this would instantly allow you to check when deleting a tile whether this broke an existing connection, however the stored route might become stale (not the fastest) over time without refreshing.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #24 on: March 25, 2012, 06:58:08 PM »
I had considered using fragmentation in the past, but it strikes me as extremely difficult to implement reliably. One idea would be simply to store routes, but this might lead to very high memory consumption given the number of possible routes as calculated above. Kieron's idea about storing optimum directions is interesting, although is still possibly subject to the problem that the data might become stale by the implementation of a faster route somewhere else that makes the correct direction on any given tile different to the previous optimal direction in a way that can only be tested by a complete recalculation of the whole route without the aid of the previously stored directions, and this sort of recalculation would have to be done frequently if better routes were not to be missed for a long time, so it seems on the face of it that this cannot be optimised further by this method.

I should be interested in any other thoughts on optimisation, however, that are able to surmount the perennial issue that there might be a completely different and faster route built at any moment.

As to a concurrent algorithm, in the example of the particular concurrent algorithm that I suggested, where does the possibility for deadlocks, etc., arise? If all that the private car routes thread does is work its way through a list of stale connexions to be recalculated in the order that they became stale, recalculate them (which would not itself require changing the state of anything on the heap), then interrupt the main thread to push the new results (which, if the main thread were running on a server, it would then push to the clients: the clients would not run their own private cars thread), where is the possibility for deadlocks, race conditions or desynchronisation between client and server?

Offline isidoro

  • Devotee
  • *
  • Posts: 1140
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #25 on: March 25, 2012, 07:12:28 PM »
@James:  I agree with chicken that running some threads would not benefit the problem very much.  You can only get rid of the unresponsiveness somehow.  But there may be not enough time to keep the roads connections statistics updated if the city number is big.

Regarding the algorithm I proposed, it is only an example of possible optimizations.  The rationale behind them is that most modifications in the road network will surely not affect but some of the cities and you recalculates all of them.  And time of calculations depends much on the city number.

Another optimizations can play with physical distance in the map (based on coordinates) vs. distances by road, since the latter will always be greater than the former.

You can extend my idea to adding new roads too.  In that case, you would have:

Prerequisite: you have a table with pair distances for every pair of cities.

Somebody joins your network from points D to E.

Mark dirty all new connected cities

Calculate all possible distances from D to all cities.
Calculate all possible distances from E to all cities.

Now we will select all cities that are affected by point D and E (mark them dirty):
  repeat for every pair of cities not dirty at the moment A and B
    if distance(A,D)+distance(D,E)+distance(E,B)<distance(A,B) or
       distance(A,E)+distance(E,D)+distance(D,A)<distance(A,B), mark dirty A and B

Rebuild your table with your present algorithm, bun only for dirty cities.

Edit: you have posted while I was posting...  You can have problems in the following scenario:
  • Even if your algorithm doesn't modify data
  • Even if it doesn't depend on modifications of the data by ST while it is running
  • But if the results of it is not uniformly spread and some nodes make decisions based on new data while others make them based on old data, you can have inconsistent behavior among nodes
The last may not apply to the current network model of ST.  I don't know how network mode is implemented.  So, the last points may not make sense...
« Last Edit: March 25, 2012, 07:21:07 PM by isidoro »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #26 on: March 25, 2012, 07:20:31 PM »
Isidoro,

that's an interesting idea - a sort of radius effect? The complexity there is that different roads might have different speed limits, so the existing connexion might be the shortest but not the fastest, so there would have to be a margin of latitude, and then a method of calculating the margin of latitude, but in principle it should be possible to say: if a change to the road network occurs, only mark as stale connexions to and from any cities within X radius of the change (and then only update the cities connexions in their assigned months). One problem with this might be that the addition of a new city road by a city is (and needs to be) enough of a change to a road network to make an existing route that could possibly go through that tile possibly stale, and city roads are being updated automatically all the time, so, in a well developed game with fast growing cities, this method might well produce only marginal savings. I assume, incidentally, that you had imagined this being done in addition to the multi-threading?

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #27 on: March 25, 2012, 07:24:19 PM »
I wasn't proposing to store the data long-term, only until the current recalculation is complete. The point is to avoid redundantly computing the same path over and over again by memoizing intermediate results.

Offline isidoro

  • Devotee
  • *
  • Posts: 1140
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #28 on: March 25, 2012, 07:24:38 PM »
Again, we have crossed our postings again...  :D I have to go now (to the movies).  I'll keep the conversation later.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #29 on: March 25, 2012, 07:30:31 PM »
Isidoro - it was for this reason that I proposed using the server to push the results to the clients rather than simultaneously calculating the results on server and client (as is done for everything else in Simutrans). Would this not suffice to solve the problem?

Chicken - hmm: what exactly would count as "the current calculation" here? The current system, and so far all imagined alternatives, comprise in effect a constant rolling recalculation.

Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2346
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #30 on: March 25, 2012, 07:37:52 PM »
Using directions we can use the following to recalculate directions when a road section is built :

how many existing road sections is the new road connected to?

0:
set all directions to unconnected

1:
set all directions to the connection if this is connected

2 or more:
loop over each connection counting weight to destination - if no connection then discount this connection. if no connections left set new route as unconnected

test to see if any tiles in these routes overlap (even if it is destination tile) - if so we have a route "triangle" with sides 1:weight(connection a->overlap), 2:weight(connection b->overlap) and 3:weight( connection a->connection b) - if side 1+side 3<side 2 or side 2+side 3<side 1 then we need to recalculate either side 1 or 2. To recalculate we work out the crossover point (where the route along sides 1 and 3 equals that along side 2) and point all directions closer than that to new route. We then discount this connection from further calculations in the step below

then for each tile in the new section choose direction leading to lowest (weight tile to connection)+(weight connection to destination)

Since each tile already knows the fastest route to the destination calculating the route weighting should be relatively quick. Upgrading an existing way would be slower, as you would first have to simulate breaking the connection, then re-adding it.

Edit: we need a final step, if we have connected a connected and unconnected connection - in this case we work back along unconnected connection setting directions toward us (if any of these happen to meet then we will need to compare weightings and set directions appropriately)
« Last Edit: March 25, 2012, 07:42:55 PM by kierongreen »

Offline chicken

  • *
  • Posts: 62
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #31 on: March 25, 2012, 07:38:33 PM »
Well, for example, you take 12 months to recompute the data, so any stored information could be considered stale 12 months after it is calculated.

Online Vladki

  • Devotee
  • *
  • Posts: 3714
    • My addons, mostly roadsigns, pak128.cs
  • Languages: EN, CS
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #32 on: March 25, 2012, 07:45:00 PM »
Hi all,

I'm not sure I understand the background of this problem, so excuse me if I'm totally wrong. I think it's worthless to calculate connections for each pair of city - complexity of n^2. It should be enough to check (and store) direct connections from each city. Just follow the roads leading out of the city and stop in the first city you get to. Then the complexity is n*m (where n - is number of cities, and m is number of roads leading out of the town - usually less then 10). Indirect connections then can be derived from stored direct connections much easily.

And of course the recalculation should happen only if there is some new city (attraction, factory). If some road modifications happened in the area od some city then recalculation is needed only for direct connections of the affected city, not the whole map.

Multithreading will not help much - reducing complexity is needed.

Offline isidoro

  • Devotee
  • *
  • Posts: 1140
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #33 on: March 25, 2012, 10:56:46 PM »
Vladki, your post makes sense to me at first sight.

Isidoro - it was for this reason that I proposed using the server to push the results to the clients rather than simultaneously calculating the results on server and client (as is done for everything else in Simutrans). Would this not suffice to solve the problem?

I think that I answered before, but we were crossposting then...  The problem may arise if the new information doesn't arrive at the same simulation time in all nodes and happens that some of them use the old values and some of them use the new values.  As I said before, with concurrency there are always problems lurking around at unexpected places...

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20720
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Bug fix uncovers performance issue. Ideas for solution sought
« Reply #34 on: March 25, 2012, 11:13:35 PM »
Isidoro - by "nodes", do you mean network clients?

Edit:

Kieron - how does this get around the issue of it not being helpful to know the directions in the event that a new, faster route has come into existence and that it is rarely possible to know that a new, faster route has not come into existence?

Vladki - thank you for joining the conversation! New ideas always appreciated. The problem here, though, is that, although we are discussing things in terms of connexions between cities, that is slightly imprecise: the actual routes are calculated between a specific road tile in one city and a specific road tile in another city (the road tile immediately outside the town hall). So, if we get the direct connexions from City A as going to City B and City C, and we are looking for a connexion from City A to City D, which has a direct connexion to city C, the fastest route from City D to city A might well go through City C (as in, through the city limits of City C), but not go through the town hall road tile in City C. In some cases, this difference might be trivial, but in other cases it might be very great - by orders of magnitude of journey time in the right circumstances.
« Last Edit: March 25, 2012, 11:19:13 PM by jamespetts »