Pathfinding algorithms

jamespetts · March 31, 2009, 04:06:31 PM

In my work on Simutrans-Experimental, I have found that, in order to have competition between different transport providers, and to give full effect to some of the new features such as average speed and comfort, it may well be necessary to re-write part of the pathfinding engine: that part that deals with finding the route for passengers/goods to go from their origin to their destination.

From my investigations, the current routing code seems to work something like this: each packet of goods or passengers has a quickstone (modified pointer) to (1) the final desintation; and (2) the next transfer. Whenever a convoy calls at the station at which goods are waiting, it checks to see whether it calls at the next transfer: if it does, it boards that convoy. At the next transfer, the packet disembarks, and calls the routing method (described below) to see what should be set as the next transfer.

For each station, there is a list of stations (called "warenzeil", stored in a vector) that can be reached directly from that station. This list is updated periodically (I have not investigated to see what triggers the update), and does so by scanning the schedules of lines and lineless convoys to check (1) whether they call at that station, and (2) at what other stations that they call. The list is in no particular order.

Then, on each re-routing, a list of structs is created. The structs (called HNodes) contain a quickstone pointer to a station, a quickstone pointer to a parent node, and some positional data. The structs are stored in a fixed-size array of 65335 elements. Initially, the list contains only one node: the starting station. The warenzeil data for the starting station is then obtained, and all the stations in that list added to the array, in the same order as they are in warenzeil (in other words, random order, or, possibly, in order from the oldest to the newest in building dates). The parent link is set to point to the starting station in each case. The program then checks whether the destination station is anywhere in the list. If it is not, then the process is repeated for the next item on the list (the next station), and all of the parent links from the items added to that list are set to that station, and so on until either (a) a fixed maximum number of transfers is exceeded (usually 7 or 8); or (b) a route is found. If no route is found within the requisite number of steps, the journey is marked as "no route".

A demonstration using ASCII graphics might help. S represents the starting station, and D the destination. Other letters represent intermediate stations. The <-- shows which part of the list is currently being processed. The left-hand column represents the station, and the right-hand column, the parent.

List as it starts:

S NULL <--

Next step:

S NULL <--
A S
B S

Next step:

S NULL
A S <--
B S

Next step:

S NULL
A S <--
B S
F A
G A
H A

Next step:

S NULL
A S
B S <--
F A
G A
H A

Next step:

S NULL
A S
B S <--
F A
G A
H A
I B
J B
K B
L B

Next step:

S NULL
A S
B S
F A <--
G A
H A
I B
J B
K B
L B
M F
N F
D F

Next step:

S NULL
A S
B S
F A
G A
H A
I B
J B
K B
L B
M F
N F
D F <<< DESTINATION FOUND - stop looking

Then, the packet of goods or passengers looks like this:

Destination: D
Next transfer: A

The packet will then board the next free convoy going to A, where it will recalculate its route from there, and set the next transfer to F, and repeat that process until it reaches D. The trouble with that is that the route that it finds is largely random: although the route that it finds is optimal in the very narrow sense that there will be no possible route with a fewer number of transfers, it will not take into account any aspect of the quality of that connexion. So, the A - D route might be rickety old 'buses with an average speed of 25kph taking a terribly circuitous path, whereas there might be another route via B (say, B - C - D), which might be two express trains going using a direct route, with an average speed of 100kph each. This is not good if any element of competition between transport providers is to be simulated.

The difficulty is that, as stated in the code comments, the method for searching for the route is very processor intensive: changing it to a method (such as A* routing) that takes into account the cost of each connexion, as well as whether it is possible, would add another set of computational steps to each iteration over the cycle in the routing system, and substantially increase the amount of CPU time dedicated to routing. The result would likely be that large games would be unacceptably slow on common hardware: large games can already be slow.

So, I have come up with an alternative, and I should be very interested in any comments before I start to implement it, just to make sure that I have not missed anything, or that there is not a better way of doing it. In the method used to build the list of stations that can be reached by direct connexion from the existing station, the journey time (based on the average speed and the approximate route distance, taken by calculating the distance between all the intermediate stops) and (for passenger-carrying convoys or lines) comfort is also recorded. That information (and possibly some information on waiting time) is placed in a struct together with the quickstone pointer to the stations, and those structs, rather than just the quickstone, as put into the list. Where more than one convoy or line serves the same destination, the best values are put into the struct.

Then, the stations call a new route-finding method, which uses an A* algorithm to find the best route to each reachable destination from that station (up to the maximum number of transfers). The journey time (in the case of passengers, combined with the comfort) is used as the cost for the A* algorithm, and the distance (possibly the Manhattan distance) between the stops used as the heuristic. That list of ultimately reachable destinations, again with the total journey time and average comfort throughout the whole journey (taken by apportioning the comfort between each leg of the journey depending on the journey time of that leg) stored in a struct. Also in the struct would be a list (possibly a queue) of all of the intermediate stops, in order. This list would be re-built in that way whenever the lists need to be rebuilt, with the same calls as for the method to re-build the lists as it stands now. Those would be stored in a hashtable at each station, the key to which would be a quickstone pointer to the destination.

Then, whenever a packet of goods or passengers is generated at a station with a destination, it would simply ask the station for a route: the station would then retrieve the pre-calculated route from its hashtable, and pass that to the packet waiting in the station. The packet would not then need to re-calculate its route at each stop: it would be presented with a pre-calculated optimum route to its destination. Whenever the routes need to be recalculated, the packet could simply request a new route (queue of intermediate transfers) from the station at which it is currently sitting.

One thought that I have had, but I am not sure about, is whether the route finding for one station should make use of the routes already found by another station. Take the example used above for demonstrating the existing model. Suppose that station S was re-calculating its routes: ought it, when it gets to station B, simply ask station B for its pre-calculated route to station D, or ought it calculate all the remaining steps itself, by checking C's list of direct connexions? Would there be some danger of recursion and not updating properly if pre-calculated routes could be used to pre-calculate other routes?

In any event, this method would find the optimum route to any given destination station from any given starting station. What about where there is a choice of starting stations? The present code in Simutrans-Standard just picks a starting station at random. The current version of Simutrans-Experimental looks for up to 4 starting stations, and then compares the number of transfers from each, and chooses the route with the fewest number of transfers. With the system described above, a packet in search of a starting station could look up the route to the destination given by each starting station, and query each to check the journey time and average comfort, and thereby choose the optimal route.

This would facilitate competition between players and the AI (or other players in a multiplayer game, either asynchronously using the current code, or using the network code that is being developed): players could build a better route than their competitors, and thereby secure all the traffic (until it started to get overcrowded, in which case the ratings would decline, and so on with negative feedback). Even if a player has a monopoly, it would ensure that passengers/goods made more intelligent choices about their routes, giving more realistic routing behaviour, making better routes more popular (and therefore profitable), and worse routes less popular.

I have also been considering adding some sort of waiting time to the mix, to test, not just the time that it takes to get from one station to the next, but the overall time spent by a packet of goods, including time spent waiting at stations. I had previously thought this impossible because of the way in which goods packets were merged at stations, but have since devised a workaround: where packets are merged at a station, it is always the case that one of the packets being merged has newly arrived at the station, either by being generated locally, or by being unloaded, and can be assumed to have a waiting time of zero. Therefore, the average waiting time of the merged packet can be calculated: if there is a packet with 10 passengers who have been waiting for 10,000 ticks, and a new packet unloaded at that station with 20 passengers, the average waiting time of the whole packet can be calculated as 1/3rd of 10,000 ticks (10 of 30 passengers waiting for 10,000 ticks, 20 of 30 passengers waiting for 0 ticks), and the waiting time should be 3,333 ticks. The number of ticks can be mapped to a notional time calibrated to the equivalent of the times used for calculating the journey time for the purposes of the new revenue model. The waiting times per destination (here, not the ultimate destination, but the next transfer) per station can be stored at each station, and form part of the total journey time of the route.

Furthermore, taking from Isidoro's old quality of service patch, a packet can initially be set not to board a convoy unless its average speed/comfort matches the optimum calculated when calculating the route (the actual line/convoy might need to be identified). This would stop passengers who had arrived at a station with the intention of catching the express train boarding the local stopping service to the same destination merely because it arrived sooner (even though the express train overtakes it en route). However, after a certain amount of time has passed, the packets would be willing to board any convoy going to their next transfer (the elapsed time would be checked whenever a convoy arrived). Furthermore, after a very long time, the packets might be discarded entirely (adding to the number of unhappy passengers at the station, which would negatively impact on revenue for services starting at that station, under the New Revenue Model for Simutrans-Experimental).

I should be very grateful for any feedback on these ideas, or any suggestions for how they could be made more efficient or otherwise improved. As can be imagined, this coding will take a long time, and it may be several weeks before I am able to release the next version of Simutrans-Experimental containing the New Revenue Model.

The Hood · March 31, 2009, 04:58:28 PM

James, no offence intended here or anything but I can't really tell whether that's a good idea or not because of the sheer length of that post! It sounds good, but each time I read it I reach overload about a quarter of the way through! Any chance you could summarise it for ease of digestion?

jamespetts · March 31, 2009, 05:26:26 PM

Quote from: The Hood on March 31, 2009, 04:58:28 PM
James, no offence intended here or anything but I can't really tell whether that's a good idea or not because of the sheer length of that post! It sounds good, but each time I read it I reach overload about a quarter of the way through! Any chance you could summarise it for ease of digestion?

I appreciate that it is long, but that is because I am discussing the suggestions at a very detailed, techincal level in order to get feedback from coders on the more technical aspects of the pathfinding algorithm, rather than the gameplay questions of whether the features that I intend to put in would be desirable of themselves. That is why this is in the general "Development and Bug Reports" forum, not the Simutrans-Experimental subforum (that, and partly because this thread documents the existing path finding engine in some detail, which some people might find useful in itself).

However, if you would find it interesting, I will summarise briefly the gameplay effect of what I wish to achieve, in bullet-point form:

Game automatically finds the best route to any given destination, rather than the first route that the program happens to find
What counts as the best route assessed by reference to the journey time and (for passengers) the average comfort
Waiting times at stations to be measured: passengers/goods will initially wait for the best service, but, if they have been waiting too long, will board anything that will get them to their destination. If they have been waiting a very long time indeed, they will leave entirely, and record as "unhappy".
Waiting times to be taken into account when deciding the best journey time
When there are multiple possible starting stations for passengers or goods, they will choose, not at random, as in current Simutrans-Standard, nor on the basis simply of the fewest transfers, as in current Simutrans-Experimental, but on the basis of the best route
Pathfinding algorithm to be rewritten so that it is called less often, hopefully increasing speed (or, at least, offsetting any speed reduction caused by using a more accurate method of finding the best route)

I hope that this is clear - do ask if you want to know more :-)

The Hood · March 31, 2009, 06:36:51 PM

Much clearer thanks! Looks great - if you can achieve all that without making simutrans slower than a sloth on tranquilisers then you are doing well!

VS · March 31, 2009, 07:15:10 PM

Je n'ai fait celle-ci plus longue que parce que je n'ai pas eu le loisir de la faire plus courte.

--Blaise Pascal

I don't know what exactly, but the sheer amount of data needed to express your ideas suggests that you are doing something wrong.

This is the internet. People have attention span of a goldfish - myself included :(

prissi · March 31, 2009, 07:30:20 PM

This is unfourtunately exactly what gerw was trying without sucess. And he did only try to minimize transfers ... A* is theoretically faster, if you have a good heuristic. Manhatten distances is not a good one, since most transfer routes tends to be way longer than the initial ones and in the end you have to check the whole tree.

Also the list is generated, wheneve a schedule is changed (because that rebuilding itself is very processor intensive. Perhaps long time player remember the freezing of the game for the first day every month => This was due to stations recalculating all their connections.) But when I start a convoi or assign a line the travel time is still infinity (or zero, or whatever you initialize it with). Thus the wrong time will remain even though the connection is open. Thus no one would go there if he/she/it cannot go something else.

Imagine an express train set to 100% load: Nobody would board it until is has a travel time, i.e. it travelled at least one. But for that it needs to be filled once ... and hao to treat connections with multiple cars going that connections?

Imho the only meaningful measure could be intermediate stops, which is still on my todo list. Apart from that I see little chances of including real travel times of congestions or whatever. And not too forget: For the overwhelming number of people passengers of simutrans are intelligent enough ...

jamespetts · March 31, 2009, 11:25:43 PM

VS,

half of that post was spent explaining the existing routing system; do you say that there is something wrong with the existing system for the same reason?

Prissi,

thank you very much for your feedback - it is very helpful. I will have a think about that, and reply further when it is not so late. I appreciate you taking the time to respond in relation to this :-) Thank you.

gerw · April 01, 2009, 07:08:49 AM

Quote from: prissi on March 31, 2009, 07:30:20 PM
Imho the only meaningful measure could be intermediate stops, which is still on my todo list.

I think, this isn't hard to include. Therefore you have only to add the stops to warenziel, which are directly reachable from that station (without intermediate stops). Then, of course, you have to increase the number of transfers and you have to take care, what to do with passengers at intermediate stops.

jamespetts · April 01, 2009, 08:00:12 PM

Prissi,

I have now had further chance to consider the points that you made. Firstly, thank you again for replying and considering my thoughts, recorded at some length. Secondly, I am having some difficulty in understanding why the Manhattan distance is not a good heuristic for the A* pathfinding system; is it not true that, for any given possible transfer station, it is more likely to be part of the shortest route to the destination the nearer that it is to the destination?

Thank you for pointing out the issue about uninitialised average speeds: I had not considered that. However, on consideration, I do not think that the problem is insurmountable: the program could simply detect when the values are uninitialised, and use a heuristic instead of an actual figure in those cases (such as assuming that the convoy's average speed is half its maximum speed - slightly generous, but it is better to be biased in that direction for a starting heuristic, the assumption being that people are keen to try new things in the hope that they will be good). That should deal with the express train with the 100% loading setting problem, I think.

So, for that reason, I disagree that intermediate stops are the only meaningful measure: the initialisation problem can be circumvented by using a heuristic in those cases, and, in real life, passengers do strongly prefer routes with lower journey times. Indeed, any effective competition between transport providers really needs a system such as this. It is, perhaps, because we disagree on topics such as this that Simutrans-Experimental exists in the first place

But that is understandable: different people have different preferences, and, with more options, more people's preferences can be catered for.

As to the speed, I am aware of Gerw's previous efforts, although, from what I understand, they did not add any function, simply sought to make things faster, and testing showed them to be somewhat (but not enormously) slower. I am not sure that I understand precisely how Gerw's system worked, nor why it was slower, but the reason that I am looking at doing it this way is that, if I am to use a different (and possibly more intensive) pathfinding algorithm, then the test performed on Gerw's code is not valid for showing that this code would be slower done per transfer rather than per schedule rebuild, and the more intensive the computation, the more will be gained by trading off processor time against memory consumption. Do you think that using the A* method with journey times as the cost and the Manhattan distance as the heuristic would really be faster if it is done by the ware packets per transfer, rather than by the stops per rebuild? If you do think that with the code with the added function that I propose to add, it would be faster to do it that way than the way that I propose, then I should probably do it as you suggest. I do think that this functionality is important for the greater economic depth for which Simutrans-Experimental aims, so, unless there is no way of making such a method acceptably fast, the only outstanding question is whether the method that I have proposed is the fastest means of achieving the functionality that I want to achieve. If you think that it is not, please let me know.

Thank you again for your reply :-)

prissi · April 01, 2009, 08:49:57 PM

Actually I would really appreciate if you could limit your post to less than 2kB ...

Heuristics with A* are a very non-trivial thing. I fthe heuristic overestimates it will be not the best route but a weird one. This is not a problem with Manhattan distance. However, if the distance is underestimated by about 10% you are testing a lot of unneccessary halts too. If it is lower than that, you will test as many tiles are for breath search but with more overhead, i.e. A* will be considerably slower than the normal undirected search.

One might consider some other strategies, i.e. sorting the stops by the number of intermediate stops and then by the number of connections. This would result in more stable, more "expected" connections.

@z9999
Recording the intermediate stops is simple. However, then the loading routines need chages too, since then the goods should only load into the convoi with this number of intermediate stops.

jamespetts · April 01, 2009, 09:03:46 PM

Prissi,

ahh, sorry: how do I check the the size of my posts in kilobytes? Thank you very much for that interesting information - I shall have to consider that carefully. That is very kind - thank you :-)

prissi · April 01, 2009, 09:07:16 PM

I would say keep them to less than ten lines and that will make them more readable especially to the overwhelming number of non-native speakers like me too.

jamespetts · April 01, 2009, 09:21:59 PM

How would I get all the information accross in ten lines?

mwoodburn81 · April 01, 2009, 09:22:13 PM

James,
If it makes you feel better, I like reading your long posts. Perhaps in the future if postlength > 7 paragraphs you could do a executive summery type thing in the beginning like you did in your other thread. It would be useful to those who don't have the time (or attention span) to read the whole post.

Regarding the post it self. I don't know if this would help or not, but maybe if updating every station list at once becomes to CPU prehaps instead you could just update one at a time every so often, keeping track which was the last one to be updated. That way you spread the spike in CPU over the course of the whole month rather than just one the first day.

Also, I like to say, all the effort and changes you have made so far is very impressive in my book, and I would love it if you keep up the good work.

Thanks

Michael.

VS · April 01, 2009, 09:29:37 PM

Finally found the willpower to read it all!

The one idea that springs to mind immediately is that routes between station could cache number of convoys on that connection (reference counter) and thus avoid most recalculations. However if I understand the intent behind this change, it does not actually get to the point which is that some indirect connections could be favoured over direct, due to quality. And updating the quality would get here a lot more tricky - some sort of weighed average would be (probably) needed so as not to shift resource usage to "different department" so to say.

Maybe the scheme could be inverted: instead of keeping local lists of connections, one global list could exist, probably hashed by the two end stations. Given that connections are symmetrical, sorting the inputs could lower data amount to half.

jamespetts · April 01, 2009, 09:36:29 PM

Michael,

thank you for your reply; and also - welcome to the forums! I am very pleased that you are interested in the work on Simutrans-Experimental. I think that I shall take your advice about the executive summaries - they do seem to help :-) I am glad that you find my posts interesting.

As to the checking all at once, that is an interesting thought. That thought had crossed my mind, too, but the trouble with it is, I think, that if not everything was updated at once, one might end up with certain things being broken: in other words, the reason that one updates the routes is because a schedule has changed somewhere. There is no reliable way (as far as I can tell) of checking which routes will and will not be affected (or at least, no reliable way that is quicker than re-checking everything). If you can think of such a way, however, that would be very much appreciated!

VS,

thank you for reading and considering it all - very interesting thoughts! I am not sure that I fully understand the first point about reference counters, however: could you explain a little more, perhaps? Why is the number of convoys on a particular route important? How would knowing that avoid recalculations?

I am also a little unclear on why the system that I have described would not favour routes with lower average journey times/higher comfort: A* routing is designed to take into account the movement cost between nodes, and be able to find a more indirect route if the total movement cost of the indirect route is lower than that of the more direct route.

The global list idea is a very interesting one, though. However, routes are not necessarily symmetrical: consider a circular 'bus route that only has 'buses going in one direction: going back might well be quicker using a different 'bus route (not part of the circle) than going forwards. Can you tell me a little bit more about your hashing idea?

Thank you again for replying - your input is much appreciated.

mwoodburn81 · April 01, 2009, 10:12:59 PM

You could still have a unsymmetrical global map where the origin station and the destination station are the keys. You just have to remember that, AtoB != BtoA

jamespetts · April 01, 2009, 10:58:46 PM

Michael,

could you expand on how that might work? Also, how would that be an advantage?

mwoodburn81 · April 02, 2009, 01:06:03 AM

Disclaimer: the last time i did any significant programming in C++ was in high school, and my compiler didn't even have the Standard template library stuff back then. So any suggestion I make may be way of base or just out right wrong.*

I am not sure if it would save in calculating time, but I would think it would be faster/easier to look up... Basically I just envision a table where you have a two dimensional array where you lookup what the next transfer will be.

e.g. viaNextTtransfer = array[StationA][StationB] ;

--
_{*I am looking at learning C++ again. I been doing some web stuff in php and perl so it shouldn't be to hard. This past weekend I made a little C++ program to plot data (that I copied from my current simutrans game) to make a map that looks like one of those subway/train maps that various transit authorities publish.. Nothing to fancy, and I don't really like the code I wrote, It is kind of sloppy.}

Really the hard part I would think is making the table. How do simutrans look up the next transfer now? Do it generate it on the fly or do it cache it?

z9999 · April 02, 2009, 06:22:33 AM

[off-topic]

Quote from: prissi on April 01, 2009, 08:49:57 PM
@z9999

Maybe @gerw ?
You often confound gerw and me.

gerw · April 02, 2009, 07:16:54 AM

Quote from: prissi on April 01, 2009, 08:49:57 PM
@z9999
Recording the intermediate stops is simple. However, then the loading routines need chages too, since then the goods should only load into the convoi with this number of intermediate stops.

I think, you misunderstood me. My idea is the following: Each good knows it's next intermediate stop and take any convoy, which has this intermediate stop as next stop in his schedule. Consider a good, traveling A-B-C-D (this are already the intermediate stops). Then it will take every convoy, which has next halt B, no matter, how long this good will travel with this convoy. So the transfers aren't minimized anymore.

Imho, this idea could be realized, if the haltestelle_t::warenziele only contain the next stops of the convoys serving this station.

prissi · April 02, 2009, 01:30:47 PM

@gerw
But then there will bi never loaded goods at A for a line A-B-C because in A C is not know?!?

@z9999
very sorry, will try to do better.

@mwoodburn81
The idea is nice as such map would need "only" about for 500 stations 500*500*2~0.5MB if it only contains solely the next stop on a route to this stop. It is a little worse, since it is needed for all good categories, thus is it 8MB assuming 16 different kinds of goods, but this is still possible and the rerouting would needs to be done immediately; apart fro the actual programming issues it might be a very fast but not too memoryconsuming implementation.

jamespetts · April 02, 2009, 04:19:09 PM

Prissi/Michael,

I am not quite clear how a global two-dimensional array would be better than using object-oriented features of C++ and storing the lists in stations ("halts"). Also, would a two-dimensional fixed array not be very inefficient in terms of memory, since the maximum possible amount of memory that it would ever need would need to be allocated from the start? The plan was to use a hashtable, using the quickstone pointers to the destinations as keys, and the structs, containing quickstone pointers to the next transfer and the average journey time as the values. Those hashtables (contained in each halt object) might themselves be contained in an array for each type of goods, which is a small and fixed number.

However, my one query on that is whether it is possible within the existing Simutrans templates to have a hashtable with quickstones as the keys, rather than straight pointers, and, if it is not, how best to go about adapting the pointer hashtable class to deal with quickstones.

(Incidentally, Michael, your last question, I think, was answered in my first post).

Edit: Incidentally, Gerw, your idea wouldn't do what I want to do, because it would not minimise journey times.

gerw · April 02, 2009, 08:08:15 PM

Quote from: prissi on April 02, 2009, 01:30:47 PM
@gerw
But then there will bi never loaded goods at A for a line A-B-C because in A C is not know?!?

But all goods going from A to C will know, that there next halt is B, so they will book every convoy travelling directly to B.

jamespetts · April 12, 2009, 05:37:37 PM

The pathfinding system as described above, with some modifications, is now included in the latest release of Simutrans-Experimental. Get it here. See here for the source code.

The paths are stored in a hashtable in each halt. Whenever a packet of goods tries to get a path, it looks to see whether a path to the destination is included in the hashtable. If it is, and the paths have not been marked as stale (for example, by a change in schedules, or the passage of time (default: two game months)), it will simply retrieve the next halt from the hashtable. It will not store the whole route: just (as now) the next halt. The same process will happen when it gets to the next halt, until it gets to its destination.

If it cannot find the destination in the hashtable, and the search has not been marked as complete, or in any event if the paths are marked as stale, it searches for a new route to that destination. I found that I could not use A* routing because there was no sensible heuristic, so I used Dijkstra's Algorithm instead. I use the binary heap template already in Simutrans for the open list, and the same hashtable as the closed list. Although it is slower than A* at finding any one destination, it has the advantage of finding the shortest paths to lots of destinations all at once, which are then stored in the hashtable. The search will run either until it is complete, or until it finds the shortest route to the goal stop, where it will abort early, having already stored the paths to all of the other stops to which it incidentally found the shortest route in the hashtable. The open list is stored so that it can resume where it left off the next time. If it finishes the search in finding, or without finding, the path to the destination, it will mark the search as complete. The depth of the search is limited with the "max_transfers" setting in simuconf.tab, but it can no longer accurately specify the maximum number of transfers for any packets of goods: rather, it limits the iterations through the algorithm to the total number of stations multiplied by the value of max_transfers.

If the hashtable lookup to the path fails to find any path to the destination, the search is marked as complete, and it is not stale, then the route will be marked as "no route" in the usual way.

In cities, passengers will select all the starting halts that are near their origin, that accept passengers, and that are not overcrowded. The passengers will then search for a route from each of those start halts to their destination, and choose the start halt with the shortest travelling time to their destination. That way, there can be effective competition between different players. I have not altered the system for industry choosing its origin for the time being, since there seems to be a sophisticated system for that already in place.

Alongside the next transfer stop to each destination, halts also store the identity of the fastest line or convoy to that next transfer stop: the line or convoy whose journey time would have been used as the basis for the calculation of the fastest journey in the first place. Goods (other than those with a 0% speed bonus rating) have a maximum waiting time (configurable in simuconf.tab), after which they will leave the stop, and, if passengers, register as unhappy. For the first one third of that waiting time, the goods or passengers will refuse to board any convoy or line not being the fastest convoy or line stored in the halt. After that time has expired, they will board the first vehicle that takes them to their destination. This can be used to increase competition at public stops, and can also make the utilisation of the player's network more efficient, and can even increase a player's profits (as a higher proportion of time-sensitive goods will be using faster services, which give a greater revenue). This behaviour is also more realistic than the default behaviour, which should make it easier to understand.

Because the paths are only recalculated periodically, not every time that passengers or goods need to find a route, the overall performance of this system is not noticeably worse than the default system, even though Dijkstra's Algorithm is an inherently slower method than the simple beam search used by Simutrans-Standard (described in the first post). However, there is a noticeable slow down for a short time when the routes are being recalculated.

One thing that I do wonder is how efficient that the Simutrans hashtable templates are at repeated lookups. Looking inside their code, it seems that they use singly linked lists, and iterate over every member whenever a key/value lookup is performed. Can one of the main developers confirm whether this is so, and, if so, whether a different kind of hashtable would make a noticeable difference to performance?

In any event, I should be very grateful for anyone who would like to download the latest release of Simutrans-Experimental and try out the latest system: bug reports in particular would be welcome.

prissi · April 12, 2009, 08:21:28 PM

Actually, simutrans already has (a very much optimized) Dijkstra incorporated. We have tried also your solution and in the end profiling showed that chaching was consuming quite a lot of memory compared to the gain.

What are the results of your profiling? I would be very interested, since in this area gerw and I worked for many month without a definitive conclusion. (In the end we always came back to the original algorithm.)

jamespetts · April 12, 2009, 11:00:29 PM

Prissi,

alas, I have yet to be able to find a way of setting up a (free) profiler. If you know how I can do that (in Windows), I'd very much like to know. The main reason that I switched to Dijkstra's Algorithm was to be able to measure the shortest journey, not because it performed better. If it was just a question of performance, I'd probably leave that to your judgment, and stick with what was in Simutrans-Standard :-) The work on performance was simply designed to ensure that using an inherently slower algorithm did not make the game slower: caching routes may well not have given a performance advantage using Simutrans-Standard's beam search, but is very likely to have made a difference to the search using Dijkstra's algorithm.

Actually, I don't think that the existing type of search that Simutrans uses is Dijkstra's Algorithm, is it? It is described in the code comments as a "beam search" (which is the name that I have given it above). Doesn't Dijkstra's Algorithm require that the nodes be weighted? The search in Simutrans-Standard (as I described in the first post) merely counts the nodes - it doesn't weigh them.

z9999 · April 13, 2009, 04:34:40 PM

Maybe, there might be momory leak.
I used the game which consumes only 64M in simutrans.
When I start to plaing this game in Simutrans-Experimental, it consumed 320M.
And after some minutes, it consumed 630M. And I gave up.

jamespetts · April 13, 2009, 04:43:07 PM

Ahh, thank you for the report. Can you provide a copy of your saved game? I have not noticed those symptoms myself.

Edit: Was the performance noticeably degraded? What are your system specifications?

prissi · April 13, 2009, 07:27:16 PM

For profiling you must setup mingw (which is fairly easy). In the config.default uncomment the PROFILE = 1 and start simutrans normall. A gmon.out will be created in My dociments/simutrans/ Copy this next to your executable and run "gprof sim.exe >results.txt" which will give you a prifilng. In MSVC 6.0 there was a profiler, but the free MSVC Express ones are deliberately without.

jamespetts · April 13, 2009, 07:35:09 PM

Prissi,

thank you very much! I shall have to have a go at that when I get a chance. (Currently working on updated help texts and bug fixing...).

Edit: As to the memory leak, I have fixed a memory leak in the latest release, version 2.2. I am not sure whether I have found all the memory leaks, however, although memory consumption seems to be much reduced. Do give it a try and let me know whether it works for you :-)

TrainMith · June 09, 2009, 06:24:20 AM

prissi: You mentioned how to profile under the MS Windows environment. What about under linux? I'm familiar with the gmon.out output and the gprof program. Is there anything else needed to be done during the compiling of the simutrans executable than the PROFILE=1? I also am assuming that the gmon.out would be placed directly in the same directory as the executable, under linux.

gerw · June 09, 2009, 06:39:10 AM

Quote from: TrainMith on June 09, 2009, 06:24:20 AM
Is there anything else needed to be done during the compiling of the simutrans executable than the PROFILE=1? I also am assuming that the gmon.out would be placed directly in the same directory as the executable, under linux.

I'm not prissi, but I can help you, too. It's sufficient to compile with "PROFILE=1". If you already compiled simutrans, you have to "make clean", otherwise simutrans won't be recompiled. And the gmon.out is placed in ~/simutrans/, if I remember correctly.

TrainMith · June 09, 2009, 06:45:59 AM

Thanks gerw!

z9999 · June 12, 2009, 01:48:04 PM

Quote from: gerw on April 01, 2009, 07:08:49 AM
I think, this isn't hard to include. Therefore you have only to add the stops to warenziel, which are directly reachable from that station (without intermediate stops). Then, of course, you have to increase the number of transfers and you have to take care, what to do with passengers at intermediate stops.

I have very interested this and tried.
Does this need doubled calculation for both outward and return ? Because shortest route is not always shortest for return route.

For example, I made 2 routes.

route1: A(town)-EX02-B-C-EX01-D-E(attraction)-D-EX01-C-B-EX02
route2: EX02-EX01-E

Shortest route from town to attraction is A-EX02-EX01-E.
But shortest route from attraction to town is not E-EX01-EX02-A.

If we don't calculate return route, they go from E to EX01, and next they will want to go from EX01 to E, because EX02 is not the next stop. As a result, they never ride off bus.
Maybe, they use wrong cashed route ?

So, I calculated twice for both outward and return. This is not slow and very interesting.
Only the problem is we can't know real transfer station name in freight list on station window and vehicle windows.

News:

Pathfinding algorithms