The International Simutrans Forum

 

Author Topic: Possible memory leak  (Read 3595 times)

0 Members and 1 Guest are viewing this topic.

Offline Rollmaterial

  • Devotee
  • *
  • Posts: 599
  • Languages: EN, FR, DE, FI, SE
Possible memory leak
« on: February 10, 2020, 09:43:21 PM »
When loading this save, performance drops a few seconds in and RAM usage starts steadily increasing.

.sve: https://drive.google.com/file/d/1nSlcyFaPKBoPlthRhXW1OBlMK0xeMEWZ/view?usp=sharing

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #1 on: February 10, 2020, 10:08:20 PM »
Thank you for the report. The memory usage increase seems to come from adding values to the private car routes hashtables in the ciies. It is not immediately clear why this is. Can I ask whether you have rotated the map recently?

Offline Rollmaterial

  • Devotee
  • *
  • Posts: 599
  • Languages: EN, FR, DE, FI, SE
Re: Possible memory leak
« Reply #2 on: February 10, 2020, 10:26:39 PM »
I have not rotated the map. The map was saved in the nightly build from last Friday.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #3 on: February 11, 2020, 12:10:31 AM »
This is a complex problem and not one that I have time to solve this evening - what seems to happen is that the private car routes are generated and stored in the cities faster than they can be processed at the current constrained rate of processing (constrained to conserve performance), and the piled up routes use a very large amount of memory.

There was also an issue in which routes may end up being rechecked before the previous routes had processed, but I have now fixed this part of the issue.

To solve this problem, I will have to improve the scheduling of route finding to be more synchronised with route processing, and I do not have time to complete that this evening.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #4 on: February 14, 2020, 01:02:45 AM »
In more detail - this is not actually a memory leak at all: the private car routing system as currently coded actually uses this much memory. I will need to change the way in which the feature works at a fairly fundamental level to solve this.

Currently, the private car routes are stored in two separate places: (1) in the ways themselves, which is needed for the actual private car routing to work; and (2) in the cities.

The city storage serves two purposes: (1) it allows private car routes to be generated by a multi-threaded algorithm, and to store those routes temporarily pending them being stored in the ways (as it would not be possible for the multi-threaded algorithm to store them in the ways at the same time as private cars were reading those same data from the ways); and (2) to allow the routes to be deleted so that they can be refreshed periodically.

However, on a large map, it seems that the amount of data storage required is unfeasible. Each city will store two sets of routes: the currently used routes and those currently being written by the multi-threaded process. Each route consists of a vector (list) of koord3d (3 dimensional co-ordinate) objects, one for each tile of road. Each koord3d object uses 48 bytes (16 x 3).

The saved game linked above has 600 cities. Each city stores a route to each other city and also each attraction and each industry on the map (if reachable by road). On this map, all the cities were on one large land map, and so all connected to each other by road. Cities typically stored circa 1,700 routes each.

The exact length of the routes will vary enormously, but it would not be unreasonable to assume a very approximate average of 1,000 tiles for such a large map. Thus, 48 bytes x 1,700 routes x 1,000 tiles x 600 cities = 48,960,000,000 bytes, or ~46 gigabytes. This could be as much as doubled by the fact that the cities store the routes twice (although usually both sets will not be full in every city). (My own empirical testing showed that the test map actually consumed only circa 12Gb, the memory consumption stopping at that level, which, while still unreasonable, suggests that the average of 1,000 tiles per route to be an overestimate).

The other method of storage of private car routes - storing them in the road tiles themselves - is much more efficient. This is because duplication is eliminated: there is only one co-ordinate (koord3d object) stored for each destination in each tile. Even if routes from 599 cities pass through a tile to get to the 600th city, only one route co-ordinate pair (destination and next tile) will be stored in that tile of road. Thus, if we were able to eliminate storage in the cities entirely, we could bring memory consumption back to reasonable levels.

However, the efficiency of this route storage has significant problems. First of all, deleting a route is a problem, as there is no way of telling where the route is from. Secondly, multi-threaded creation of the routes is a problem. This is why I stored routes in the cities in the first place.

However, after some careful thought over the last few days, I have hit upon a potential series of solutions. I should be grateful for feedback from more technically minded people in respect of this in case I have missed anything important.

Firstly, deletion. The routes are effectively a singly linked list starting at each townhall road. The problem is that we need two kinds of deletion: (1) periodic deletion, just before checking to see whether a newer, better route is available; and (2) immediate deletion when a tile of route is deleted or made inaccessible to private car traffic (e.g. by a no entry sign, private way barrier or the like). With immediate deletion of a tile part way along the route, there is no way of knowing the previous tiles and thus being able to delete up to that point: one can only delete forwards. However, deleting from the townhall road cannot work, as, once a missing tile be encountered, the route beyond that point becomes unreachable and undeletable. However, if we combine the two deletion strategies, so that immediate deletion takes effect for all road tiles in front of the point in question, and periodic deletion from the townhall road will deal with tiles up to that point, this should ensure that all tiles of the route come to be deleted by the time of the next periodic refresh even if one or more tiles of road on the route have been deleted in the meantime.

As to multi-threading, the only way that I can think to do this is to have an array of two sets of destinations stored in the road tiles, and to designate one as the reading array and the other as the writing array. Each time that the multi-threaded private car route finder starts afresh on the list of cities to process, it will swap which one counts as the reading and which counts as the writing array, and then delete all of the writing array routes by starting from the townhall tile and deleting forward from there. (When using the immediate delete method, we will need to delete both the reading and writing set, so we need to wait for the private car route finding to finish before we do this. This means that we need a special thread synchronisation function, await_private_car_threads() that works in the same way as the current await_path_explorer() method. Indeed, we probably need this anyway, as any private car route finding that happens concurrently with a road tile deletion/change will be indeterministic as to whether that particular tile of a prospective route be processed before or after the deletion/change, leading to a loss of synchronisation in a network game. I have noticed occasional losses of synchronisation in network games recently, and this may be the cause of at least some of them). A complexity which I have not yet considered in full is that cities are not processed from beginning to end in terms of their private car routes, rather they are added to and removed from a list of cities awaiting processing dynamically, so knowing when to swap the reading and writing status (which would have to be global) is not easy.

In any event, I have been very busy this week, so have not had much time for coding; but I hope to look at this over the week-end.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #5 on: February 14, 2020, 11:58:33 PM »
Sounds reasonable.
The number of routes in such a scenario (each city to each city bi-directionally, each city to each industry and attraction bidirectionally) is exactly c*(c-1)c+2c*(a+i)=c²-c+2c(a+i) in O(c²+c(a+i))
Additionally, map size affects the size of each individual route, so we quickly get huge amounts of data, just as observed.
We can optimise this here and there but that characteristics will remain when storing the routes as a whole, so I fully support your conclusion that we should get rid of these linear lists storing the routes as a whole, replacing it with something more meory efficient. I.e. without or at least with much less redundancy.
The current information stored in the roads is a good structure for this purpose (and it fits our needs much better than you thought)

So, let's simply throw the routes stored in the cities away and see what happens.
Now we got following issues as mentioned:
1. We cannot simply delete roads as this will unlink parts of our graph and make them unreachable from any origin => We will need to ensure that we always know where our routing instructions are stored.
2. We cannot calculate routes in parallel as directly writing it to our worlds ways will desync the game => We will need some kind of temporary storage for newly calculated routes.

At first, I would not store the routing information directly in the road tile but instead maintain an own graph for this, storing that graph in the world.
That graph persists of nodes, which are stored in a koord=>node map.
A node has a destination=>node map, where vehicles will read routes from. An edge is an entry in that map.
Additionally the node will store the coord of the related road.
The graph will maintain a second koord=>node map, where the route calculation thread can write to.
Additionally, the graph will store a list of dirty kords.
We will see the use of it later on.

Further note, the approach will only support to reroute all lines in a run, as that is what I have understood from your post.
Adapting it to support for reroutes of only a few lines is however possible with not too much additional effort.
We will need to use edge counters again and we will need to maintain an additional map of dirty nodes.

About 1.
As suggested by James, when removing a way, we could immediately remove the remaining path.
However, I see a huge performance issue here in case there are quite a lot routes stored on that tile and it might not be too tough immediately throwing away the stored routing informations.
Performancewise Keep in mind what we need to do (immediately in step) when removing a node:
For each edge remember the edge count, follow that route to the destination and decrease the edge count to that destination by that number. That will quickly freeze the game for a noticeable time.

Luckily we don't need to do complex route deletion stuff yet.
The only thing we have to do is setting the dirty flag in the node, adding the koord to the dirty koord list and deleting the road.
The same goes for any other changes to a road tile that affect any route, i.e. removing directions from a road, placing private gates and stuff like that.


About 2.
Generally this is difficult. However, luckily we store a writing map.
If there is only one process involved in calculating the routes, we can simply write the calculated data to the writing map.
If there are multiple threads involved in calculating data, there needs to be one thread managing the writes to the table.

Once we have finished calculating, we will inform the server about this. If we are the server, we will wait for all clients to be ready.
Once everyone is ready, we will swap the reading and writing map.
Once that's done, we can prepare the writing maps for the next routing run, that means emptying the writing graph.

Private cars will behave roughly the same as they do currently, however, in detail they will need to access the graph, instead of accessing the world.


Further note, we could maintain the dirty stuff on a per-edge level, instead of on a per-node level, which will require storing links to connected neighbors to a node.

Additional thoughts:
The described mechanism itself should work without desyncs.
However, the calculated routes, which is not described preciesely here, might differ, thus cause desyncs.
The reason for this is a race condition in vehicles moving along the world causing congestion and the route calculation considering that congestion.
If a road is just a very little "better" than another on the server, it might be that a faster client calculated the same route a little bit earlyer, where that road was still a little bit worse than the other one.
This affects the current congestion calculation and car routing in just the same way as the new congestion and car routing.
I cannot imagine a simple approach to get these in sync without effectively not calculating routes in sync or working on a copy of worlds road network but creating that copy will also need a reasonable amount of single threadded time.

It does not seem that these desyncs happen too often, thus it might be the better option to compare chunks of calculated routes (by their hash) in between server and client. If they don't match, compare individual lines hashes and enqueue those that didn't match to try again later within the same reroute run. Anything that matched will be written to the writing map of the graph.
« Last Edit: February 15, 2020, 12:38:36 AM by Freahk »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #6 on: February 15, 2020, 12:25:49 AM »
Thank you for your thoughts in relation to this. I am not entirely sure that I see the advantages of a node map (which would take considerably more work to code) than an array of the existing structures.

For performance on deleting, all that we need to do is follow the routes from the deletion point; if this proves to intensive to put into a sync step, we can store a list of recently deleted road tiles and process them in a step, or possibly bit by bit in a step.

As to the server synchronisation, generally, there is no need for clients to send data to the server to indicate to the server that they have performed some internal calculation: the way that Simutrans networking code works is that it simply assumes that client and server will do the same work in the same order in the same step or sync step, so, as long as one makes sure that this is in fact what happens all the time, there is no need to send data. Data need only be sent for user interaction. So, for multi-threaded code, we just need to have a thread barrier at an appropriate point in karte_t::step(), and the server and all clients will all wait for the threads to finish their run at that point before progressing. This is how the other multi-threading works.

It would help if you could illustrate more clearly the comparative advantages and disadvantages of a node map over and above simply using the existing road tiles (which are a node map in their own right, of sorts) for this work.

Thank you again for your thoughts.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #7 on: February 15, 2020, 02:20:22 AM »
Well it shouldn't require that much more work. It's really just a class storing two maps, and a list of dirty koords.

The main advantages of the node map over the world are the follwoing:

1. At any time, we know where routing information is located in the world, as any node in the graph relates to a road in the world that has routing information.
-We conceptionally cannot leave any unreferenced routing information in the world.
Technically these are not unreferencedd as they are still reachable through the world but effectively we will never access them that way.
We could permanently grow a savegame and as long as the amount is not very extreme, it's very  unlikely anybody will ever notice that.
We could for sure write a cleanup at map save or load but well, that's additional code.
-The tradeoff is an additional hashmap storing data that we have to store anyway.
Accessing a node in the koord=>node map might be slower than accessing the way but that depends on how we get the way. In any case, both is not that slow.


2. We get a very fast and simple operation to flush the newly calculated routes to the world.
-The only thing we have to do in parallel is swapping two references and validating each node at a koord in the dirty list and clearing the dirty list.
Later on in parallel we will need to clean up the writing map but that cleaning up also neeeds to be done in the road based approach.
When storing the routing the the roads directly, we cannot simply swap the whole road network at once. We cannot simply maintain a whole shadow world and swap it. We will need to maintain the reading/writing map on a per-road level. That means flushing the updates to the world involves iterating the whole graph and swapping the references.
In addition, how do we even iterate the whole graph? Well sure it's possible, but implementing a graph iterator is again more code than simply using the existing iterator of an already implemented hash map.


3. We do barely any calculations in step when changing roads. The only additional things we do there is setting a dirty flag. The "tradeoff" is a !node.dirty check when moving.


So in all I would say the code is simpler and involves less computation in the main thread.

Offline Octavius

  • *
  • Posts: 52
  • Languages: EN,NL,DE
Re: Possible memory leak
« Reply #8 on: February 15, 2020, 02:31:16 PM »
I haven't looked at the source code so I may be saying something incredibly stupid here, but I have some thoughts.

— Dijkstra's algorithm (at least simple versions, maybe not some highly optimised variants) can give us the shortest path from every source to a particular destination in the same time as it can give us the shortest path from the farthest source to that destination. As we want to know the shortest (or actually, fastest, but that doesn't matter) routes from everywhere to everywhere, there's no need to calculate the route from any specific place to any other specific place. You can also calculate the shortest paths from a particular source to every destination simultaneously, but as cars care about their destination, not their source, that isn't useful. There's never any need to store routes from any particular place to any other particular place, only routes to a particular place.

— On every road tile a car can go in 4 directions, which can be stored in 2 bits. You need an additional bit to indicate that a destination is unreachable. So we can store directions to 2 destinations in a byte, to 10 destinations in a 32-bit word or to 21 destinations in a 64-bit word. Using the index of the destination, it's trivial to find the right bits giving directions assuming that every road tile gives directions to every destination. At 0.4 bytes per road tile per destination, this seems doable to me. Add a few more bytes per tile to hold temporary information for each thread running the route calculations.

Whenever routes change, there may still be cars on the old route and they shouldn't get confused. So the shortest route to a destination should still be indicated at a road tile, even if it's a road leading from nowhere. This takes some extra memory, but makes calculations easier.

To do this multithreaded, one needs a write buffer and a read buffer on every road tile. Swapping them is simple. Just toggle a boolean that tells which one is for reading and which for writing. Then a loop over all road tiles is required to clear the write buffer, indicating that no route to any destination exists. This can be done in a separate thread. Then the route finding will fill in the right bits. To keep things in sync, one has to use available road tiles, speed limits and congestion data from the previous month, not the current month, so this has to be double buffered too.

— The difficult part is quickly fixing a route when a road is deleted (or made a one way street or blocked for private cars). The part of the route downstream of the disruption is unaffected and cars already on that part of the route should continue to their destination. Most of the upstream parts may still be correct too and we shouldn't confuse drivers by immediately deleting the entire route upstream of the disruption. What one can do is find a bypass, the shortest path from the tile directly upstream of the disruption to the tile directly downstream of the disruption. Wherever this bypass coincides with the old upstream route, the bypass overwrites the old route. Where the bypass coincides with the old downstream route or a different branch leading to the same destination, unaffected by the disruption, the bypass terminates and cars can follow the old route. This has to be done for each destination, but this can be done simultaneously by keeping track of the destinations for which a solution has already been found. This is unlikely to find the best route, but it may find a useful route.

Sometimes there is no bypass possible. This may happen because of one-way streets or because different parts of the road network are now indeed disconnected. If no bypass can be found from the last tile before the disruption, this tile and all tiles that can be reached from it get marked as "destination unreachable". The search is performed again from each of the neighbours of this area. If after a number of tries no solution is found, the algorithm gives up and leaves a dead-end route, leading to confused drivers when they reach the end of the signposted route. If the entire upstream route were deleted at once, those drivers would be confused too. Then we leave it to the slow multithreaded routefinding to either calculate the new routes or mark the destination as unreachable later. Having a few confused drivers every now and then is no problem.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10124
  • Languages: De,EN,JP
Re: Possible memory leak
« Reply #9 on: February 15, 2020, 02:40:59 PM »
Also, when storing the next tile in a crossings, instead of every way, one could reduce memory consumption even more, maybe another factor of ten. And even if old tiles on a detour route remain, newer crossing on a new route would show the right way. So stale crossings would not be a problem. And with coordinates and just the next direction as two bits, indeed one coordinate pair would be 54 bytes per destination and crossing.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #10 on: February 15, 2020, 04:47:25 PM »
Also, when storing the next tile in a crossings, instead of every way, one could reduce memory consumption even more, maybe another factor of ten
I had suggested this in the initial design. It was rejected for some reasons, so I did not include this in later suggestions anymore.
However, that's one of the things I meant when mentioning "we can optimise here and there". However, that will reduce memory consumption by a more-or-less constant factor but won't change anything about the O(n²)*average_line_length scaling characteristics.

Storing directions instead of koords was also part of the initial design but was replaced by koords for performance reasons. Maybe storing a direction indexed array of neighbor koords would be a good compromise. i.e. storing 48*4 additional bits in each road to save (48-2)*#destinations bits at cost of an additional array access.


— Dijkstra's algorithm
A Dijkstra is used there. However, I don't know how exactly the route seraching itself is implemented.
"In the same time" might be quite misleading and inpreciese here. To get the route from all origings to one destination, we will in fact need more time than seaching for only a single source-destination path, especially when these are quite close, in terms of cost, not in terms of hops, to each other.
However, it will in any case be much faster to calculate all-to-one for each city instead of calculating one-to-one for each pair of cities, so I support this.

— On every road tile a car can go in 4 directions, which can be stored in 2 bits.
We are not talking about 20 destinations but moreover 2000 destinations, preciesely 1700 in the above example map.
That gives us 6000 bits, or 750 bytes per node.
We should however keep in mind that there will be quite a lot "unreachable" destinations stored, so we might need data about how many routes are actually stored in each route to decide which one is better. I would guess the destination bits approach is better when it comes to both, memory and acces times.

Also, I agree with the bypass solution as already suggested in the initial design but that one was also rejected.
In any case, we should not leave old broken routes in the world forever. We need to keep track of them and delete them at some time.
I do however also agree, that immediately removing routed forwards from impassible roads is not too tough as cars on that section of the route can still perfectly use it.
The "backwards" dire is a more difficult thing. Optimally we want to locally fix the route but we should not put too much effort in enforcing this as there might be situations where there simply is no easy quick fix for that route e.g. when an intercity road was deleted, there most often is not a nearby alternative.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #11 on: February 16, 2020, 12:49:33 AM »
Thank you all for your thoughts. Some questions for Freahk and then some clarifications.

First of all, you refer to a map structure and a "hash map". Are you referring to the existing Simutrans hashtable class or something else? If referring to the hashtable class, what had you imagined using as the key, and what the value? Note that the Simutrans hashtable class requires a custom hashing algorithm for each type of key data. There are three types of hashtable used in Simutrans at present: a koordhashtable (a hashtable of co-oordinates), an inthashtable (a hashtable of integers) and a ptrhashtable (a hashtable of pointers). the latter cannot safely be used in network mode without generating losses of synchronisation if it is ever important to iterate it in a deterministic sequence as the sequence of iteration will depend on the actual physical hardware addresses in memory assigned to each object in the table. The current system uses a koordhashtable on the road tiles. If you imagined a data structure other than a Simutrans hash table, may I ask what in particular that you had in mind?

As to the clarifications: the private car routing algorithm already uses Dijekstra's Algorithm in the way suggested and has done for some time (long before private cars actually followed the routes; originally, the routes were discarded and only the times stored).

As to whether it is necessary to delete the route downstream of a deleted tile, this is complex. Certainly, there is no immediate need to do so, as the downstream route will, at least initially, remain valid. However, with the system of storing the routes in the roads as actually used, not deleting the downstream route immediately would mean that the downstream route would be inaccessible when later seeking to refresh the routes, so that even if a better route were to become available, the old route would be undeletable. It is interesting to ponder whether this would actually be a problem or not, and answering that question strikes me as less obvious than it first seemed. Given that immediate deletion is something that might have an unacceptable performance impact, it would be very useful to have a good answer to this question.

In relation to storing directions only at crossings: this was initially the idea. However, it turned out that private cars' movement code is far more computationally intensive when they are not following a route, and it is far more computationally expensive to look up a tile based on a direction than simply to retrieve that co-ordinate from memory, so the system of storing routes in all tiles was used instead. This also has the advantage that if any tile along the routes becomes an intersection after the route is created, nothing needs to be done and it works perfectly well.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10124
  • Languages: De,EN,JP
Re: Possible memory leak
« Reply #12 on: February 16, 2020, 01:24:30 AM »
When you store dirs at crossings, then the lookup of the next tile is not so computationally inefficient but you save 6 bytes. The only thing that needs to be computed is the z-coordinate of the next tile. On the map the ground next to the current one will get anyway in the level 1 cache during the hop, since the check for ways etc must be done anyway.

With the remaining byte (since the structure will use 8 bytes in memory for sure), you can have a counter, that reduces each month/season/passing vehicle and thus force rerouting after some time.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #13 on: February 16, 2020, 01:55:54 AM »
I am not refering to any specific class but in general the concept of hashmaps i.e. a key=>value data structure that performs access in O(1) (and a few more characteristics that are not pretty relevant here)
However, as we want to do koord3d=>node mapping, a simutrans koordhashtable seems to fit.
How exactly that node looks like depends on the exact implementation as discussed above. The most simple (but I guess by-far more memory consuming) approach is the implementation that is already used in the road tiles currently.
I will try to call these hashtables, but I am pretty used to the term hashmap, so sorry when I forget about it.

However, with the system of storing the routes in the roads as actually used, not deleting the downstream route immediately would mean that the downstream route would be inaccessible when later seeking to refresh the routes, so that even if a better route were to become available, the old route would be undeletable
Sure, if you don't know where your routing informations are spread around the world. The graph would know that.
However, also without the graph solution there is a quite simple solution:
When a road gets deleted -we should better say "becomes invalid" I guess- search the routing instructions for all neighbors the invalidated road points to and store these to a list.

The dirty flag btw. is a remainder of the original design, which allowed for updating single routes in the graph but later on I realised you want to recalculate and swap the network as a whole anyway, so I reduced it to a simpler form.
In the graph, we can simply delete a node including edges pointing to it from the graph. We will need to navigate one step backwards to delete these edges, which hower is not a problem if we store our neighbors in the node. These 4 additional koords won't hurt.


Edit: I see, koordhashtable is for 2d koords. However, adjusting the code to koord3d seems to be 3 lines of code.
Alternatively we could maintain the reading/wring structure on a per-node level, instead of on a per-graph level, in which case the graph can be a simple array or list of nodes with an additional active_map flag, indicating which one is used for reading curently. Access might be (well likely will be) faster and swapping works also immediately.
Cleanup will be slightly more complicated.
Thinking about it again, I guess that's the better way to go.

With this implementation we will:
1. At any time know where routing information is stored in the world if we maintain a simple list.
2. Get a very fast and simple operation to flush the newly calculated routes to the world by toggling a bool.
3. Do barely any calculations in step when changing roads. We won't need to take care of unreachable nodes due to 1 and the fact that we calculate the whole network in a run.

Further, about the removal of old routes: When we don't clear the writing table before writing new routes to it but instead sync it to the reading table and store a counter to each edge indicating how long ago it was last modified, we can indeed retain old routes for the next few reroute runs, so cars on that route can safely follow them to their destination.
However, this assumes the routes don't lead somewhere we cannot escape from, in which case it would be better not to retain that route.
This will need special consideration.
I guess with an attempt to locally fix a route when it was cut and deleting routes that don't lead to their destination should work well in combination.
However, if that is worth it or not highly depends on how long a full reroute cycle takes to calculate and thus, how often we want to calculate it.
« Last Edit: February 16, 2020, 03:39:14 AM by Freahk »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10124
  • Languages: De,EN,JP
Re: Possible memory leak
« Reply #14 on: February 16, 2020, 04:08:13 AM »
At a dead end, cars will turn around and then tell the next crossing to trigger a new search to its destination. That could take care of dead ends.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #15 on: February 16, 2020, 11:07:20 AM »
Yes, this needs to be reconsidered. It was also rejected previously for some reasons.
An issue with this is that are non immediate dead-ends. I.e. sections of road that only have a single entry but have intersections.

In any case we should remove routing towards immediate dead ends as this is quite inexpensive and will already be sufficient in many cases.
« Last Edit: February 16, 2020, 11:18:44 AM by Freahk »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #16 on: February 16, 2020, 11:39:23 AM »
Thank you for your reply. I am afraid that I am still slightly unclear about your suggested implementation - I suspect that this is me being a little dim about these concepts owing to being a self-taught amateur at computer programming.

Your suggested system would use a hashtable with koord3d objects as keys if I understand you correctly. Presumably, the koord3d objects would refer to the co-ordinates of the tiles in the world where routes from that tile are stored? If so, what do you envisage the value of the hashtable being - another hashtable, this time with the destination/next tile key/value pairs, or something else?

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #17 on: February 16, 2020, 01:59:20 PM »
Nearly as you guessed, pseudocode following:
Code: [Select]
struct graph{
  koord3dhashtable<koord3d, node_t>* read_table;
  koord3dhashtable<koord3d, node_t>* write_table;
}

struct node_t{
  inthashtable_tpl<uint16_t, edge_t> edges; //assuming destinations are stored as uint16 destination IDs.
  koord3d own_location;
  koord3d[4] referenced_from; //north, west, south, east or any other fixed order. Neighbors pointing towards us to allow for removing routes towards us.
}

struct edge{
  node_t& node; //The node this edge refers to.
  uint8_t last_updated; //to allow for retaining edges for some time after route recalculation.
}
I hope I didn't miss any attributes, but in any case this should clearly show the intention... I hope.


As mentioned before, we might want to stick with the current approach of maintaining routing information in the roads, in which case we will have to implement the read/write structure on a per-node level. We will further need to store a list or set of koord3d containing each koord that currently has routing information stored.
Code: [Select]
struct road_routing_network{
  bool current_active_tables; //indicates which ones are currently used to read from.
  std::vector<koord3d> routing_locations; //a list of all coords where roads hold (or held if they were deleted in the meantime) some routing information.
}

struct node_t{ //stored in the road
  edge[#destinations] edges; //An array of all destinations, indexed by destination ID. A destination may be marked "unknown" (no route to destination on this tile) or "unreachable"(dead-end). Also see notes below.
  koord3d own_location; //might already be stored in the road anyways.
  koord3d[4] neighbors; //north, west, south, east or any other fixed order. Allows us to quickly translate the 3-bit direction information to a neighbor.
}
//I am aware we cannot simply implement it this way, we will likely need to compose multiple std::bitset to implement a dynamic* sized bitfield type.
//*dynamic at compile time, but not growing once instantiated.
//How exactly we store edges is excahngeable, so we could simply use the hashtable approach here or use the bitset approach in the above.

struct edge_t{
  bit[3] directions;
  bit is_new;
}
« Last Edit: February 17, 2020, 02:34:59 AM by Freahk »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #18 on: February 17, 2020, 12:23:43 AM »
Thank you for your response. I am somewhat unclear on how having a hashtable of hashtables will be any better than storing the hashtables in the road tiles; what advantage does the node hashtable have over storing these data in road tiles?

As to the graph structure, I see that what you propose is a hashtable of nodes, each of which contains a hashtable of edges, each of which edge contains a node. Presumably, you mean for the edge objects to contain references to nodes rather than copies of nodes? Also, I am not sure what advantage that using an inthashtable would bring, since this would require another entirely new and complex data structure to map the nodes to numbers - we already have koord objects that can do this.

As to the second version of the system in which the data are stored in the road - do you propose an actual naked array, with all its ease of slight coding errors causing very difficult to find memory corruption and crashes and spending months chasing down bug reports about this - or a vector, which will be much more reliable and maintainable, but will take more overhead?

I am still struggling to understand the benefits of this: forgive me again for being dim. Each node is presumably a tile in the map on which there is a road? It would contain a list of edges, presumably being links to other nodes. I am not clear on how this is functionally different from the current much simpler system of storing the destination and next tile in a hashtable in the road tiles (and, remember, this data structure needs to be very efficient for reading, as it is read by every single private car object in the world in a sync_step()). What data does this system store that the current system does not that prevents there from ever being unreferenced routing information? Have I misunderstood what data are intended to be in the edges? The complexity of this suggestion appears to be getting more extreme (such that this might take months of intensive work for me to implement and debug), and I am struggling to understand the benefits. I need to know whether I am misunderstanding something critical before embarking upon the originally planned system, however. Again, my apologies if I have fundamentally misunderstood what the edges of this graph are intended to represent.

For deletion, we either have to iterate over all way tiles in the game and check which one is a road, which of those that are roads have some routing data and which of those that have routing data have data to a given destination, which will be expensive, or we delete using the methods described in my post above. I am not convinced that maintaniing a separate list of all road tiles with routing data will ultiamtely be any faster than the all ways method, however.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10124
  • Languages: De,EN,JP
Re: Possible memory leak
« Reply #19 on: February 17, 2020, 01:53:00 AM »
If you entries are uint16 hashes, you could use a sorted array. It has much smaller memory footprint, and for a low number of entries (less than 1000, which is probably the case) it has also quite fast access. Especially, since all entries are closed together in memory and likeliek already in the cache after the first access, unlike list.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #20 on: February 17, 2020, 04:39:38 AM »
First, yes I meant a reference to a node for sure. Updated this little but meaningful typo in the above post.

Thank you for your response. I am somewhat unclear on how having a hashtable of hashtables will be any better than storing the hashtables in the road tiles; what advantage does the node hashtable have over storing these data in road tiles?

I would not go for the graph. It was originally designed to allow for partly updating the graph with a few routes instead of recalculating the whole. In that case it had some noticeable advantages but was later on narrowed down to  not support this anyway. For the current requirements, any advantages of this approach can be achieved with little effort in the other approach.

As mentioned, after some thought, I would stick with storing nodes in the roads and maintaining an additional list of koords to keep track of which roads store routes.
However, about the inthashtabe, it's not that entirely new nor complex. You can find it in tpl/inthashtable_tpl.h
The idea is saving a few bytes storing ints instead of koords.
However, I would assume the bitset solution or prissis suggestion to consume less memory and perform better anyway.

[Big section of implementation details rather than conceptual thoughts]
The second version is what I would indeed go for. The evil naked array is not that evil as we don't do much black magic with that piece of memory: We get a fixed size of memory, remember that size and always access within these bounds.
However, If you really hate them that much, you could go for a std::array, it's fine. I don't think you want to go for an std::vector here.You know the exact size of the array when you initialise it and you won't add nor remove elements once you have initialised it.
Each road stores a node, I guess that is just the same as-is the nodes will just look a little different.
A node stores a bitset representing the edges.
Each edge requires exactly 3 bit of memory, thus can represent 8 different states: north, west, south, east, unknown, unreachable and two further unused ones.
However, due to hardware limitations we cannot store or adress 3 bit directly, so we need to fill up to a byte, thus we can use pairs or four bit instead.

The point in the bitset structure is that access will be quite fast. Very likely faster than the hashtable at a fairly low memory consumption of ceil(destinations/2) byte per node incuding the additional was_updated bool.
The alternative suggestion of a sorted list might be slightly slower but will reduce less memory.
In any case both will be quite fast and likely suitible.
Hashmaps might also be suitible but have a quite large memory footprint.
[/Big section of implementation details rather than conceptual thoughts]

What data does this system store that the current system does not that prevents there from ever being unreferenced routing information? Have I misunderstood what data are intended to be in the edges?
The data in the edges won't prevent this. The whole discussion about how exactly we want to store the edges is an implementation detail discussion rather than a conceptual one. All of the three options have their downsides and all of them can be exchanged without affecting the concept.

What conceptionally makes the difference here is the additional collection of (unique)koords we maintain to hold the locations of routing information in our world which will ensure that we don't leave any routing information in the world and will be quite handy to iterate the nodes, which we will have to do to clean up after recalculation.

About the deletion:
I am not quite sure if I understand you correctly.
Do you refer to the cleanup process after we have recalculated all roads and swapped the reading/writing datastructures?
We will need to prepare all datastructures for write anyway, thus we will need to iterate them all if we want or not.
If we have a collection of all routing roads, we can simply iterate it. If we don't have one things will be more complicated and even less efficient.

If you, however refered to deleting a single road, we won't need to do anything special. Delete the road and we are fine. Especially leave the forward route as-is. Some simple attemts to fix the backward route might be nice but are not neccessary.
The collection ensures that we won't lose any routing information in the world. There may now be edges and a collection entry refering to the koord that doesn't hold a road anymore but that's not an issue, we can simply check if there is a road at that location whenever we retrieve a koord.
There are tougher solutions in terms of memory consumption e.g. storing the start of each route fragment but if the whole road network is not noticably huge in memory, where each tile stores a koord and much more, a simple collection of koords to only a part of that road network will be negligible.
The same argumentation goes to iterating the whole collection: we will only do this once a run before calculating the routes. If we can calculate all these routes and write their routing informations to the nodes, which means we access many nodes quite often, simply iterating each node exactly once should be negligible.

One important point about the collection is that it should hold each koord only once, otherwise it's pointless.
To ensure this we will need to add a koord to the list if we are the first one writing to it and delete it, when we encounter at cleanup that the node is finally outdated.
To quickly know if we are the first one writing to it, we might need to maintain an additional bool in each node.

About the complexity of this approach getting extreme, I guess we might just have confused you with too many different possible implementations and their details.
The concept is even simpler than the current approach imho as we don't need to take special care of many things thanks to the constraint of always recalculating the whole network.

That said, I will start my week now.
« Last Edit: February 17, 2020, 05:08:33 AM by Freahk »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #21 on: March 01, 2020, 05:02:12 PM »
I have been working on this to-day in the private-car-routing-memory-saving branch. I now have a working version, and I should be grateful if anyone could test this.

However, I am finding that performance levels are unacceptable with the saved game from this thread: the game becomes unresponsive for very long periods, to the extent that it was not possible to run a meaningful test of memory consumption.

Using a performance profiler, most of the CPU time seems to be spent waiting for the mutex to clear while relaxing the private car routes. I will have to look into attempting to have only one thread of private car route finding running at any given time to see whether this assists; however, having this running single threadedly did not improve performance at all.

If anyone has any ideas as to how to improve the performance of this on very large maps, I should be very grateful.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #22 on: March 01, 2020, 05:51:21 PM »
I have now investigated further and pushed some further enhancements. The saved game in the opening post is now playable, but there are still long pauses intermittently. This is because these long pauses are the time that it takes the game to check all the routes from just one city. I have made these checks less frequent (once every X number of steps, where X is half the number of cities, with a minimum of one), and set it to check only one city per step if the total number of cities are > 64 to avoid excessive mutex clashes, but still each individual city takes > 3 seconds to check with my fast computer (Ryzen 3900x).

This is in large part because this is a truly huge map (600 cities), and, unlike the Bridgewater-Brunel server game, which has a number of islands of various sizes, each and every city on this map is connected to every other city, giving an enormous routing depth (there are a very large number of routes to relax for every city being checked).

I am not sure whether or not there is anything that can be done about this other than either (1) a very sophisticated system involving suspending an individual city's checking after it has done a set amount of work in a step and resuming it in a later step to spread the workload (this would be a good solution in principle, but would take a vast amount of very difficult work and may delay by many, many months any other progress); or (2) advising players with very large maps which are fully interconnected by road to use the assume_everywhere_connected_by_road setting and avoid private car routing entirely.

At this stage, I do not know what the threshold for this latter suggestion would be if it were to be adopted: this would be likely to need a very large amount of very time consuming testing.

However, one thing worthy of note is that preliminary testing has shown that this new code does seem to limit memory consumption effectively, although this map still takes circa 9.2Gb.

Edit: After running for about 45 minutes on fast forward, it is up to 11.2Gb and still many private car routes have not been found.

Edit 2: It seems to have reached 12.0Gb about 2-3 hours in, but there still seem to be routes not found.

Edit 3: Some more testing this evening: (1) the new version on this branch seems to consume very significantly less memory than the current master version with the very large map in the OP; and (2) using std::unordered_map makes performance worse, not better (it took ~3x as long adding elements to the unordered map than it did to the koordhashtable).

Edit 4: Further testing has shown that any attempt to use more than one route-finding thread at once causes drastic loss of performance because of mutex clashes when adding routes to tiles. Previously, this was not an issue, as there was no need to have a mutex for route memory storage. However, because the routes are now put directly in the tiles, a mutex is necessary, and the frequency of access to these data means that it is quicker overall to have just the one private car thread than multiple threads. Even in small games such as the Stephenson-Seimens game, although the performance impact is not easily noticeable on my fast computer, a performance profiler shows that, with multiple parallel private car route finding threads, private car route finding (and the mutex wait in particular) takes a large proportion of CPU time. Setting this to a single thread only greatly reduces this. I wonder whether there is a way to define one mutex per tile of way to avoid this issue?
« Last Edit: March 02, 2020, 10:26:10 PM by jamespetts »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #23 on: March 02, 2020, 11:49:19 PM »
I have improved the thread model somewhat by now having one mutex per tile of way. This does improve performance notably when running multiple private car route checks at once, but this can still take a long time for a single run to complete. On the game in the OP of this thread, on my fast computer (albeit with only 4 threads being used simultaneously), it can take anywhere from 5-10 seconds for a single set of 4 cities (multi-threadedly) to check. If this check is set to run every step, the game will (obviously) become totally unresponsive and unplayable. If it is set to run every X steps where X is 1/4 the number of cities, in the game in the OP, it runs every ~14 seconds.

This means that, if we assume a 20 second cycle time (14 seconds not processing private car routes, and 6 seconds processing them) for 4 cities, 600 cities will take 3,000 seconds to process their private car routes (that is, 50 minutes). After those 50 minutes have elapsed, the game will return to checking again from the first city. By contrast, running a single private car route checker at once, the time is ~4 seconds for a single run, giving an 18 second cycle time, meaning that 600 cities would take circa 10,800 seconds to process (that is, 180 minutes or 3 hours).

This also has the effect that the game will become unresponsive for 5-10 seconds at a time every 14 seconds or so, and this is on a fast computer. This is plainly not a usable way of interacting with a game. The real issue seems to be the amount of memory bandwidth required for writing the routes to memory as they are detected: a single city finding all available routes to 600 other cities plus perhaps the same number of industries and attractions again with an (estimated) average route length of (say) 1,024 tiles would have to write 1024 x 1200 hashtable entries, which amounts to 1,228,800 (~1.2 million entries). The trouble is that this cannot simply be optimised away: it is an inherent limitation of the very idea of having private car routes. Using a different data structure for the storage of private car routes (e.g. a graph) would not help here, as the same number of data members would have to be written.

In theory, it would be possible to spread the workload of finding private car routes over every step, so, instead of having a long wait every X steps, a little bit of the finding could be done in every step. However, this is likely to be extremely difficult to implement. The route finding algorithm uses the find_route() method, which is used both single- and multi-threadedly for different purposes: it is the exact same method as is used by vehicles finding their nearest depot, for instance, or by trains using choose signals to find an appropriate platform. There is no architecture in this method for storing data between cycles: it is all stored in local variables. Furthermore, it is performance critical optimised code implementing something computationally complex (Dijkstra's algorithm), so any fundamental modification would be exceptionally difficult.

The only thing of which I can think to investigate is whether there is some wait mechanism for threads that is not a barrier (i.e. is not mutual) in pthreads, so as to stop and start the running of the thread at will during the cycle so as to break this down into small segments that can be run each step rather than processing a whole set of cities all at once and needing to wait for them to end. In this way, there would be no need to make fundamental changes to the route finding algorithm, although how to make sure that any wait commands work only with the private car route checker and are ignored by all other uses (e.g. choose signals) will have to be considered carefully.

Given the amount of time that this algorithm actually takes to run, however, it is questionable whether even breaking it down in this way will yield acceptable performance on larger maps. Thought will have to be given to whether there is a limit above which the assume_everywhere_connected_by_road should be set, and whether it is sensible to introduce more granularity to this setting to allow very large games to revert to the old behaviour when this was disabled, viz. checking whether towns are connected by road for the purposes of generating private car journeys but not actually mapping out private car routes.

In the meantime, any testing of this branch would be most welcome. This does need to be tested before I incorporate its changes, I think.

Edit 1: Preliminary testing with a saved game from the Stephenson-Seimens server (1987 in-game year) suggests that this branch will remain in synchronisation over a network.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10124
  • Languages: De,EN,JP
Re: Possible memory leak
« Reply #24 on: March 03, 2020, 01:04:30 PM »
Since even the most complex games rarely spawn a car per step, would it not make much more sense to route on demand. I.e. when a car is generated, find its route and mark all tiles on its way accordingly. (The routing would stop, when a tile with a known route is encountered.) Also when you play with 20000 convois, almost any step a convoi is routed, and still the game is responding. So maybe rather making citycars like public player convois (i.e. contain a route) sound less memory and computational intense.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #25 on: March 03, 2020, 01:26:39 PM »
Since even the most complex games rarely spawn a car per step, would it not make much more sense to route on demand. I.e. when a car is generated, find its route and mark all tiles on its way accordingly. (The routing would stop, when a tile with a known route is encountered.) Also when you play with 20000 convois, almost any step a convoi is routed, and still the game is responding. So maybe rather making citycars like public player convois (i.e. contain a route) sound less memory and computational intense.

Thank you for your thoughts. I believe that I did try (albeit some time ago with a slower computer) to have private cars route in the same way as player vehicles, but the game immediately became completely unresponsive. On a larger map in more modern times, the number of cars can be very great, so I have doubts that this is viable.

However, I believe that I have devised a possible solution to the scheduling problem. If we have a counter (a local variable) in the routing algorithm for the number of routes processed since the start of the run, and if we then, once the counter reaches a certain value, call a pair of barriers waits before allowing the algorithm to continue, this will allow the workload of the route finding to be spread over multiple steps.

It will not be possible to save this state, so provision must be made for completing all cities before progressing further whenever a save (or map rotation) is triggered. This can be done by having a global variable (something like "suspend_private_car_routing") that can be set which will prevent the code from reaching the barrier waits embedded in the routing algorithm and also prevent the main threaded routing algorithm from initiating any route searches. Thus, we would set that global variable to true, call await_private_car_threads(), then start_private_car_threads() and then await_private_car_threads() again, perform the save or rotate, and then set the global variable to false again.

Ideally, the maximum number of stored routes per thread per step would be configurable in simuconf.tab, but a hard coded value will be used for initial testing.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #26 on: March 03, 2020, 02:14:36 PM »
I don't think it's that easy tbh.
Since even the most complex games rarely spawn a car per step, would it not make much more sense to route on demand.
I guess I don't understand that relation. Why do you expect on demand routing to perform better if there are less than one car per step spawned?
I would not expect this.
Additionally, currently we can use Dijkstra as an all-to-one or one-to-all shortest path algorithm. I don't think this can still be done with the suggestion can it?

Storing the whole route in each citycar also sounds like exploding memory consumption tbh but it's hard to tell and greatly depends on
1. how many privatecars share an equal route
2. how long these routes are
3. how many routes to the same destination each tile shares on average
4. how many different routes are used at each point in time.
I fear storing the whole route to each privatecar will affect memory consumption on the same level as storing each existing route in the townhall did.
We will be storing fewer different routes, but also store many routes that are exactly the same multipe times.

Combining these approaches, however, might work well:
- Use Dijkstra to calculate all routes to a destination
- In the origin city, write the routing information to each road tile that is part of the route (as-is)
-Outside of the origin city, write the whole route to the next hub to the road tile. A hub is an intersection where two or more routes from different origins to the same destination merge.
Dijkstra needs to know predecessors as the shortest path from an origin to a destination is determined by backwards iterating from destination to origin, so we should get the hubs calculated just for free and with an appropriate graph iteration on the Dijkstra result, we could also ensure to visit each node only once.

I guess I don't need to note that privatecars will need to copy each part-route instead of referencing it, otherwise "garbage collection" will be pretty complicated when a route gets recalculated.
Further, we will somehow need to store all nodes that belong to any route to a given destination, otherwise we won't be able to properly invalidate/remove all routes to a given destination.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #27 on: March 03, 2020, 08:15:30 PM »
One useful thing that the individual cars might do is help with route deletion: because the individual cars store their origin and destination, we can immediately delete the route from the origin to that destination (as far as the break, at least) as soon as a private car finds a gap in the route.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #28 on: March 04, 2020, 12:40:49 AM »
I have made some progress with this: I have managed to get a system of iterating through a fixed number of routes (16) per thread per step working. This greatly improves performance in the saved game from the opening post. However, I do not know how long that it will take to build all the route tables at this rate: it might be a long time.

I have not yet made the number of routes per thread per step configurable by simuconf.tab, and there is much cleaning up of the code to do (which will alter saved game compatibility).

Also, I have not implemented the necessary improvements to deletion as yet. However, I should be very grateful if people could test this, as it would be very helpful to have feedback from people who have different specifications of hardware trying this on a mixture of different saved games.

Thank you again for all of your feedback.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #29 on: March 04, 2020, 04:57:38 PM »
I have now removed the old deprecated city route storage code, which has altered the saved game versioning to what will hopefully be the final versioning. I should be grateful if people could test the branch for performance and stability before I incorporate this.

Offline Matthew

  • *
  • Posts: 385
    • Japan Railway Journal
  • Languages: EN, some ZH, DE & SQ
Re: Possible memory leak
« Reply #30 on: March 04, 2020, 05:33:11 PM »
Thank you for continuing to work on improving this area of the game, James.

I take it that you want us to test the private-car-routing-memory-saving branch?

If so, I get a crash at or just after pakset loading. GDB backtrace:

Code: [Select]
Thread 1 "simutrans-exten" received signal SIGFPE, Arithmetic exception.
__pthread_barrier_wait (barrier=0x555555cf0f80 <karte_t::private_car_barrier>) at pthread_barrier_wait.c:117
117 pthread_barrier_wait.c: No such file or directory.
(gdb) backtrace
#0  __pthread_barrier_wait (barrier=0x555555cf0f80 <karte_t::private_car_barrier>) at pthread_barrier_wait.c:117
#1  0x0000555555947d2a in karte_t::load(char const*) ()
#2  0x00005555558dc20f in simu_main(int, char**) ()
#3  0x00005555558f2632 in sysmain(int, char**) ()
#4  0x00007ffff6748b97 in __libc_start_main (main=0x5555555ab1d0 <main>, argc=5, argv=0x7fffffffdfd8, init=<optimised out>,
    fini=<optimised out>, rtld_fini=<optimised out>, stack_end=0x7fffffffdfc8) at ../csu/libc-start.c:310
#5  0x00005555555ab23a in _start ()

EDIT: This was built with commit 5c8008650, non-debug version, pak128.Britain-Ex.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #31 on: March 04, 2020, 06:44:04 PM »
Yes,. you have correctly understood what I have asked to be tested: thank you for this.

However, the error given is rather mystifying: I do not understand how it is possible for a thread barrier to raise an arithmetic exception, and I have not been able to reproduce this myself.

I have noticed a thread scheduling problem that seems to be causing thread deadlocks in certain situations, however, so it is possible that these are somehow related.

Can I check whether anyone else is experiencing this issue when testing this branch? It might be helpful to run the test multiple times, as thread errors can be non-constant.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10124
  • Languages: De,EN,JP
Re: Possible memory leak
« Reply #32 on: March 05, 2020, 02:52:24 PM »
How many citycars you have? In my savegame library I did not find a network game (or any other developed map) which had much more city cars than convoys (they were both with in the order on a factor of three at most), much less citycars (factor of 100 or less) than waytiles.

So assuming a route will cost about 4000 bytes (for an average route) per vehicle, even 10000 citycars will just consume 40 MB. A hashtable per waytile will consume at least (131*8)+(5+1+8)*destinations (let us assume just 150 destinantions, so we can hash them as a byte), so each tile will need roughly 2000 bytes. (One can of course cut this a lot using a vector sorted by hash.) However, a map will have much more way tiles than cars. In my calculation, individual routes per citycar will only consume more memory, when you reach half as much citycars as way tiles. But at this poiny yhe congestion will become unbearable.

Moreover, the saving of routes is rather trivial, and the game state is immediately recovered after saveload, which is important for network games. And as a bonus, overtaking citycars becames as predictable as overtaking normal convois.

As optimisation, routing to non-reachable destinations is the most expensive. Hence, after a citycar could not find a route to a city, it could notify the spawning city, that for this month there is notroute to xyz. Thus saving a lot of wasted route finding.

Offline Freahk

  • Devotee
  • *
  • Posts: 1244
  • Languages: DE, EN
Re: Possible memory leak
« Reply #33 on: March 05, 2020, 04:30:36 PM »
How many citycars you have?
That's hard to tell.
What I can say for sure is that there currently are 1006 convoys in that world and the average journey time of bus lines in between two neighbored cities is roughly 2 hours, so one hour from one city to another.
Latest stephenson-siemens save, which is a rather small mal in both, size and inhabitants, had ~500k private car journeys spawned per month, which is roughly one car every 8-th tick (2^22 ticks per month)
So, it is reasonable to estimate at least 80k private cars to exist around the world at any point in time, which is much more than the above mentioned number of convoys.
I would even expect it to underestimate the number of cars by a factor of 2-4.

Storing routes to private cars would require us to store the full routes to each of the 80k cars. Calculating all routes in the world results in much less routes that can even share a lot.

Imho, we should rather optimise the road network stored approach. There are still quite a lot things we can do:
1. optimise the way how nodes store routing informations.
2. reduce the memory consumption of road tiles that don't store any routing informations.
3. reduce the number of nodes that store routing informations.
4. reduce the number of destinations stored per node.

About 1.
If the rather simple implementation of hashtable_tpl_t is an issue (it always uses a fixed number of 101 buckets, so it likely is far from optimal), consider optimising it or choose (well tested) std::unordered_map instead.
Alternatively, try out an ordered/sorted vector, it will be smaller and might perform better or worse, depending on the number of stored destinations.
Mentioned bitmap also is a promising candidate: We can store routing informations to 21 destinations per uint64, which results in 96*sizeof(uint64)=768 bytes to store routes to all of the above mentioned 2000 destinations.
 
About 2.
A rather small memory consumption optimisation. Within cities there are quite a lot road tiles that won't hold any routing data. We can save a little memory when we don't reservate 101

About 3.
Most of the roads with routing informations can be interpreted as part of an edge instead of as a node.
Those nodes don't need to act like nodes, thus we don't need to store a whole hashtable with quite a lot routings here.
Storing routing informations is only required if there are routes diverging or merging on that tile. In any other case it is sufficient to maintain two linked lists or a double linked list (take care of oneway roads!) We don't need sto store any destinattions or whatever here. Cars will simply tell the road "I move along edge 0" and the road as part of the edge says, "fine, move to (1,2,0)".

There is one thing left:
The above assumes we have such a structure and cars already know which route they follow, but how do we build that structure up? And how do cars find the nodes if there are only very few nodes containing routing informations?

As we calculate routes using Dijkstra in an all-to-one manner, detecting merging sections to the same destination is quite easy.
To detect other diverging or merging routes, however, we will need the help of our edges. To help us, they will need to know their neighbors along the edge. Optionally, they can also know the related nodes to speed some things up at cost of a little memory. I will assume this but everything can also be done without knowing them directly.

When adding a new route to the world, we will iterate along that route from origin to destination.
The very first road of our route is by definition a node, as edges cannot exist without a node at both ends.

Whenever we leave a Node, we will add the destination to the routing table.
Further, we will remember that node as our previous node, ask that node for the next node in direction of our next road tile and remember it as the next node.

Whenever stepping from one roat tile to the next, there are the following options:
1. we are on a known edge. In this case, we will have the next node stored. The next tile will either be
  -part of the edge from our remembered previous node to the next node. In this case we will simply enter that road without any action.
  -the next node. In this case we can again enter that road without action. We will update previous node and next node and continue.
  -not an edge nor a node. In this case we cannot simply continue. We are leaving an existing edge, so we have to upgrade¹ our current tile to a node.
2. we are not on a known edge. In this case we will not yet know the next node. The next tile will either be
  -a node. In this case we will link² the nodes.
  -part of an existing edge. In this case we will upgrade¹ that tile to a node, and link² the nodes.
  -not a node nor an edge. In this case we will simply enter that road without any action.

¹Upgrading an edge to a node means
1. creating the datastructure itself.
2. fill the routing table. Destinations that can be reached via A without passing B will refer to the edge to A, the same goes for destinations that can be reached via B.
3. link² A and B to the new node, using the existing path from A to B.

²Linking nodes means
1. Add the next node as an neighbor of the previous node
2. Jump to A and iterate along the given path to B and transform each tile to a node, that means write the previous and next tile to it and reference the related nodes.


About 4, I won't go in detail as this can later on be implemented if point 1-3 are not sufficient. Further note that this will finally make the whole network dependant! Recalculating routes to a single town will most likely invalidate quite a lot other routes, so this can really only be done if the whole network is calculated and swapped as a whole.

We can reduce the number of destinations per node using the same principle that real-world roadsigns use:
When you start in town A on one side of the map and you want to get to town B on the other side of the map, you won't find any town B on the roadsigns of your starting city.
However, you have road map at home, so you looked up your route (yes, very old fashioned)
After reading the map, you now know to reach town B from town A, you will need to drive via town C, D and E, which are (hopefully) on the roadsign of each previous town. Sometimes you won't even reach one of those towns along your route as before entering that town, there is a roadsign to the next town.

The same can be applied ingame.
We will quite often be able to describe the requested route from A to B as a sequence/list like C, D, E, B
It is important to note that these are some kind of chunks/areas of the map instead of exact (city center) koords.
When we calculated a route, we can keep track of which chunks we pass. The sequence of these chunks is what I will call the large scale routing.
The situation is a little bit tricky when we have routes depnding on routes that have not yet been calculated.

... to be continued when required ...
« Last Edit: March 06, 2020, 12:01:37 AM by Freahk »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 20190
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Possible memory leak
« Reply #34 on: March 05, 2020, 11:42:00 PM »
The number of private cars is not the full story: we need to know routes between all points to be able to know whether to generate private cars in the first place, as passengers will take their private car only if the destination is reachable by road, the private car journey is within their journey time tolerance, and either the private car journey is faster than any other method, or this particular passenger always prefers to use a private car where possible.