
[Bug v3.4] Waiting Mail with No Mail Delivery Line/Convoy

Started by knightly, May 05, 2009, 07:29:40 PM


knightly

I haven't set up any mail delivery line/convoy, but there is mail waiting at one of the stops. Please see the attached image.

This has been reported before by z9999, but it seems that the problem has not been completely solved.

jamespetts

Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

knightly

Really sorry, James. I thought that a screenshot would be sufficient, and so I didn't save the game.

If I encounter this problem again later, I will surely save a copy of the game for your investigation.

jamespetts

Thank you very much. A save game is very helpful, because otherwise it might be difficult to know how to recreate the conditions in which what you describe occurred.

jamespetts

I am afraid that I am having trouble tracking this one down at present...

knightly

I have made an attempt to find out where the problem lies, but without success so far. I suspect the problem lies in stadt::step_passagiere(), but your implementation of this function is very lengthy and difficult to follow, at least for me.

I don't know whether you are aware of this, but no mail connexions are created at the halts exhibiting this problem. That means that no mail route to another halt could be found, and if there is no mail route, the mail should not be added to the halt in the first place in stadt::step_passagiere(). I hope this information helps you narrow down your search a bit.

knightly

James, I finally know why there is mail waiting at halts when there are no mail-delivering convoys/lines. This bug is no less intricate than the default waiting time bug.

I started out by outputting a message whenever calculate_paths is invoked with the mail category, together with the halt name of any node explored in that call. I found that there are indeed some nodes other than the origin halt that are processed. So, I suspected there might be something wrong with haltestelle_t::get_connexions(c). I proceeded to output the count/size of the mail connexions retrieved from the current halt, but found that it is always zero, which is correct. My suspicion then turned to the iterator that wraps the connexions hashtable, and I went one step further to output a message whenever a new node is created in the relaxation phase. But the result is that no new node is being created, which is also correct.

If no new node is being added to open_list, the nodes must have existed in the first place. Continuing this line of thought, I suddenly realised what causes the problem -- while you have connexions, paths, waiting times, reschedule, connexion and path timestamps etc. for each ware category, you only have a *single* set of open_list, path_nodes[], search_complete, iteration, max_iterations, etc. to share among different ware types.

An example of how the problem arises: a halt is instructed to calculate paths for some pax to a certain destination, and successfully finds the destination, leaving the search unfinished. Later, that halt is instructed to get a path for mail. In get_path_to(), if mail is not rescheduled, or if its connexions and paths timestamps have not expired for that halt, get_path_to() will try to fetch a path from paths[mail]. Since no valid path is returned and search_complete is false (the previous search is unfinished), calculate_paths() is invoked. In calculate_paths(), the search continues from where it left off, popping nodes stored in open_list. These nodes are leftovers from the previous search for pax routes. Although these nodes will not have any mail connexions and will generate no new nodes, new paths are nevertheless created for these old nodes and are stored in paths[mail]. In this way, mail paths are accidentally and incorrectly created. If one of the old nodes' halts happens to be the destination halt of the mail packet, a valid path will be returned from get_path_to(). Needless to say, other mail packets can also take advantage of the added mail paths. Since specific conditions have to be met for such cases to occur, this explains why the problem happens only occasionally and not for all mail.

Possible solutions would be
(1) to have multiple sets of open_list, path_nodes and other related variables for each category, or
(2) to clear open_list and path_nodes and initiate a new search whenever the current search is performed on a ware category different from that of the previous search.
The 1st approach will increase memory consumption a lot; the impact could be reduced by setting the max size of path_nodes[] to (total_halts_supporting_certain_ware_type x max_transfers) instead of the current (total_halts x max_transfers).
The 2nd approach will not increase memory consumption much, but the benefit of resuming a search will be diminished: when a path to a certain destination does not already exist, the search has to start from scratch if the previous search was done on a different ware type.
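For illustration, the book-keeping for the 2nd approach might look like this sketch (search_state and begin_search are hypothetical names, not from the codebase, and a plain vector stands in for the real open list of nodes):

```cpp
#include <cassert>
#include <vector>

typedef unsigned char catg_t;

struct search_state
{
    std::vector<int> open_list;   // stand-in for the real open list of path nodes
    catg_t last_category;

    search_state() : last_category(255) {}

    // Call at the start of calculate_paths(): if the requested category differs
    // from that of the previous (possibly unfinished) search, discard the
    // leftover nodes so they cannot leak into the new category's paths.
    void begin_search(catg_t category)
    {
        if (category != last_category)
        {
            open_list.clear();
            last_category = category;
        }
    }
};
```

The point is simply that the leftover nodes are discarded before they can be turned into paths for the wrong ware type, at the cost of losing the partial search.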

I have thought of a different approach, though I am not sure whether it will work or whether it is efficient enough: how about getting rid of path_nodes[] altogether by allocating new path_nodes dynamically (taking due care, of course, to deallocate memory), and, instead of storing path objects in the paths[c] hashtables, storing path_nodes there instead? Path_nodes should also store the immediate transfer from the origin, so that recalculation is not necessary in each iteration of the path search or when the immediate transfer is requested from a calculated path. By combining the functions of path_node and path, paths[c] becomes both the search tree and the closed list, and then we only need extra open_lists for the different ware types. You may think that not much memory will be saved by such a merge, but please don't forget that the size of path_nodes[] is precalculated and it is not always fully utilised, while in this approach, only as much memory as needed will be consumed. (Note: paths[c] would need to store pointers to path_nodes, instead of path objects.)

jamespetts

Knightly,

thank you again for your detailed and very helpful work - I am extremely grateful. I will have to consider this in some further detail and decide how to fix the problem. Thank you very much!

James

Edit: Without having the code in front of me, one possible difficulty with your third approach is that the path_nodes would require more data to be stored for each node. Also, I suspect that dynamic allocation of path nodes would itself be a time-consuming process which would add significantly to the computational time of calculating paths.

knightly

You are most welcome, James. :) Although bugs are sometimes intricate and difficult to track down, the process of debugging is nevertheless an enjoyable experience for me.

My 3rd approach is just a preliminary idea, so maybe it won't work in the end. But before tagging it as certified (dead), let's brainstorm its feasibility.

Anyway, regarding your 2 concerns :

(1) It is expected that a path_node will now store more data, for it combines path_node and path into one. Comparing the 2 structs from your code:
Quote
   struct path_node
   {
      halthandle_t halt;
      uint16 journey_time;
      path_node* link;

      // Necessary for sorting in a binary heap
      inline bool operator <= (const path_node p) const { return journey_time <= p.journey_time; }

   };

   // Data on paths to ultimate destinations
   // @author: jamespetts
   struct path
   {
      halthandle_t next_transfer;
      uint16 journey_time;
      path() { journey_time = 65535; }
      
      //TODO: Consider whether to add comfort
   };
There is only one common member variable -- journey_time. But I am thinking: since path_node::halt is only needed during path searching but not after being added to paths[c] (the current halt becomes the key stored in the hashtable), and, on the other hand, path::next_transfer is not calculated until after being added to paths[c], maybe the same variable can serve both roles -- this is possible because the 2 roles do not overlap in time. So, we end up needing only halt, journey_time and *link, which is the same as the current path_node. But of course, we should not forget the pointer stored in paths[c]. So, the same amount of memory is needed per path/path_node. Having said that, we should not forget the unused nodes in the current path_nodes[], so overall, we are still saving memory.

(2) You are right that allocating memory dynamically is slower if it is done on a per-node basis. So, how about setting up a global pool and recycling discarded path_nodes? Since path_node already has a link variable, this can be exploited to chain all the discarded nodes in a linked list (a static member of a class). Whenever a new node is needed, just pop one out in LIFO order, and only dynamically allocate new nodes (maybe many nodes each time) when the pool is depleted. The same mechanism can also be applied to connexion objects, or any other objects that are created and destroyed frequently in large quantities.
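Such a recycling pool might be sketched as follows (node_pool and its member names are hypothetical, not from the Simutrans-Extended code; the point is that the existing link pointer doubles as the free-list chain):

```cpp
#include <cassert>
#include <cstddef>

// Discarded path_nodes are chained through their existing link pointer and
// handed out again in LIFO order; the heap is only touched when the pool is
// empty.
struct path_node
{
    unsigned short journey_time;
    path_node* link;   // doubles as the free-list chain while in the pool
};

class node_pool
{
public:
    static path_node* acquire()
    {
        if (head == NULL)
        {
            return new path_node();   // pool depleted: allocate fresh
        }
        path_node* n = head;          // pop the most recently recycled node
        head = head->link;
        n->link = NULL;
        return n;
    }

    static void release(path_node* n)
    {
        n->link = head;               // push the discarded node onto the chain
        head = n;
    }

private:
    static path_node* head;
};

path_node* node_pool::head = NULL;
```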

jamespetts

Knightly,

hmm, very interesting thoughts. A memory pool sounds like it might be hard work, but perhaps it might be worth it if the other alternatives involve using much more memory. Presumably, one would use a stack, rather than a linked list to hold the pooled nodes? If you have any more detailed thoughts on how this might be implemented, I should be very interested to know them. Thank you, as ever, for your hard work,

James

knightly

James,

Actually, I am talking about the kind of stack which has an underlying linked-list representation. (Or do you have a better internal representation for a stack than a linked list?)

I don't know about complicated stack implementations. Pushing and popping are done at the same end of the list, which is referenced by a head link of the pool class. In the case where recycled nodes are depleted, dynamically allocate a certain number of nodes as an array (the size may be a function of the total halts on the map), set a pointer initially to the address of array[0], and advance the pointer whenever new nodes are requested, until the new nodes are also depleted -- then we create another array of new nodes.

Edit :

For the newly created array of nodes, I am not sure whether it would be better to chain them up (i.e. setting new_nodes[ i ].link = &new_nodes[ i+1 ]) and set the head link to the first node. This takes a bit longer when setting up the new nodes, but all nodes (new and recycled) will be accessed using the same method, saving an extra check when nodes are popped out.
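The batch allocation and chaining described here might be sketched as follows (refill_pool, pop_node and the batch size of 32 are illustrative assumptions, not names from the codebase):

```cpp
#include <cassert>
#include <cstddef>

// When the pool is depleted, allocate an array of nodes, chain them through
// their link pointers, and point the head at the first one, so that fresh and
// recycled nodes are popped the same way.
struct path_node
{
    unsigned short journey_time;
    path_node* link;
};

static path_node* head = NULL;

static void refill_pool(std::size_t batch_size)
{
    path_node* new_nodes = new path_node[batch_size];
    for (std::size_t i = 0; i + 1 < batch_size; ++i)
    {
        new_nodes[i].link = &new_nodes[i + 1];   // chain the fresh nodes
    }
    new_nodes[batch_size - 1].link = head;       // splice onto any leftovers
    head = new_nodes;
}

static path_node* pop_node()
{
    if (head == NULL)
    {
        refill_pool(32);   // fixed batch size, as suggested in the thread
    }
    path_node* n = head;
    head = head->link;
    return n;
}
```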

As for memory consumption, I think we can even save the memory for pointers, by relocating the relaxation phase to right after adding the node to paths[c] but before determining the next transfer (before the current halt becomes the next transfer), and by fetching the actual address of the path_node stored in paths[c]:

Quote
if(current_node->link != NULL && paths[category].put(current_node->halt, new_path))
{

   // Track the path back to get the next transfer from this halt
   // current_node is updated to point to the actual path object stored in paths[category]
   path_node* temp = current_node;
   path_node* track_node = current_node = paths[category].access(current_node->halt);
   temp->recycle();

   // Original code for the relaxation phase
   current_connexions = current_node->halt->get_connexions(category);
   quickstone_hashtable_iterator_tpl<haltestelle_t, connexion*> iter(*current_connexions);

   linehandle_t previous_best_line;
   convoihandle_t previous_best_convoy;
   if(current_node->link != NULL)
   {
      previous_connexion = current_node->link->halt->get_connexions(category)->get(current_node->halt);
      if(previous_connexion != NULL)
      {
         previous_best_line = previous_connexion->best_line;
         previous_best_convoy = previous_connexion->best_convoy;
      }
   }

   while(iter.next() && iterations <= max_iterations)
   {
      next_halt = iter.get_current_key();
      if(paths[category].get(next_halt).journey_time != 65535)
      {
         // Only insert into the open list if the
         // item is not already on the closed list.
         continue;
      }

      current_connexion = iter.get_current_value();

      if(previous_best_line == current_connexion->best_line && previous_best_convoy == current_connexion->best_convoy)
      {
         continue;
      }

      new_node = &path_nodes[iterations++];
      new_node->halt = next_halt;
      new_node->journey_time = current_connexion->journey_time + current_connexion->waiting_time + current_node->journey_time;
      new_node->link = current_node;

      open_list.insert(new_node);
   }
   
   
   //while(track_node->link != NULL)
   //for(uint8 depth = 0; depth <= max_transfers; depth ++)
   for(uint16 depth = 0; depth <= 255; depth ++)   // uint16: a uint8 counter would wrap at 255 and never exit
   {
      if(track_node->halt.is_bound())
      {

         if(track_node->link->link == NULL)
         {
            paths[category].access(current_node->halt)->next_transfer = track_node->halt;

            // End of search.
            break;
         }
      }
      
      track_node = track_node->link;
   }
}


In this way, we now only need the amount of memory of the current path_node per node, and you don't need to modify paths[c] to store pointers to path_node (but you still need to change it to store path_node objects).

jamespetts

Knightly,

thank you for all your detailed thoughts. I am currently trying to think through a possible implementation. My preference tends always towards high-level/object-oriented solutions, largely because they are much easier to understand, although I realise that that is often slower.

Some fragments of thoughts that had occurred to me and on which your comments would be appreciated are as follows: a static (i.e., global) memory pool of paths/path nodes (which could indeed be combined in this implementation: thank you for your thorough analysis of the mathematics and mechanics of that) might work. It would be used instead of the array of nodes used for the open list, which in effect, acts as a sort of crude, temporary memory pool in the current implementation (that code was replicated from the existing code used for pathing in Simutrans-Standard).

The memory pool would then overload the new and (possibly) delete operators for path and connexions objects. New smart pointer classes for paths and connexions would be created, and the overloaded new operator for each would return a handle of that type. The memory pool would monitor the number of handles open for each path or connexion object, using an 8-bit unsigned integer per object*. The constructor for the handles would increment this number, and the destructors decrement it. Whenever the number reaches zero, the object would be transferred back to the stack of free objects for reallocation using the overloaded new. Delete would not, therefore, be called on the individual objects themselves, but only on their handles.

That way, the overloaded new operator could be called one at a time for each creation of path and connexions objects, and there would be no need manually to delete them, as the handles simply going out of scope somewhere would trigger the object to be thrown back into the pool stack. The risk of memory leaks would be minimised, as the memory management would be semi-automated. I should have to write a new version of the hashtable class to take, not Quickstone pointers, but the new type of handle that I should need to create for this implementation, but that would not itself be too hard.
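A stripped-down sketch of the handle scheme outlined here might look like this (pooled_object, handle_t and the returned_to_pool flag are illustrative stand-ins, not names from the codebase; a flag represents the push back onto the pool stack):

```cpp
#include <cassert>

// Each handle increments an 8-bit per-object counter on construction and
// decrements it on destruction; when the count reaches zero, the object would
// be returned to the pool rather than deleted.
struct pooled_object
{
    unsigned char ref_count;    // the 8-bit counter suggested above
    bool returned_to_pool;
    pooled_object() : ref_count(0), returned_to_pool(false) {}
};

class handle_t
{
public:
    explicit handle_t(pooled_object* o) : obj(o) { ++obj->ref_count; }
    handle_t(const handle_t& h) : obj(h.obj) { ++obj->ref_count; }
    ~handle_t()
    {
        if (--obj->ref_count == 0)
        {
            obj->returned_to_pool = true;   // stand-in for the pool push
        }
    }
    pooled_object* operator->() const { return obj; }

private:
    pooled_object* obj;
    handle_t& operator=(const handle_t&);   // assignment omitted in this sketch
};
```

As described, delete is never called on the object itself; the last handle going out of scope triggers the return to the pool.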

The links could indeed be used for the stack, but they would have to be the handles, rather than raw pointers, since there is always the possibility of the thing to which they link being re-used or deallocated between the time of their creation and their deployment in the model that you suggest (which is not the case in the present model, since they currently have a very short and deterministic life-cycle). This would not happen while the nodes were in the stack, but would happen while the nodes were out in the wild being used, and the link is serving its other function. One possible problem with that is that that would greatly inflate memory consumption, because the handles would take far more than 4 bytes.

The stack would not, actually, be too hard to implement: all that it would need to do is store the handle to the first node as its "head" pointer. For a push operation, the link of the stored node would be set to the current head, and the head then set to the pointer to the pushed node. For the pop operation, the current head would be returned, and the head set to the link specified in the node that is being returned. That simple implementation should be quite efficient - thank you for suggesting it!

In any event, I should be very grateful for any thoughts on that outline implementation. Also, I should be interested in your view as to whether any of the objects in the stack should ever actually be deleted, whether the stack should keep a count, how much memory that the stack should allocate in the first place (and how best to calculate it), and anything else on the automated system of re-pooling the memory by counting outstanding handles. Thank you for all your help!

* An interesting question arises as to how to store this. Perhaps a hashtable in the memory pool class with the keys the raw pointers to the objects, and the values the counters?

Edit: Another thought re: memory consumption: if the pool approach is to be used for connexions, the connexion objects will also need to have a link node to use for a stack, which would inflate memory consumption - unless, perhaps, the memory pool for connexions used a different type of stack...?

knightly

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
The memory pool would then overload the new and (possibly) delete operators for path and connexions objects.

I think overloading the new operator should be done in the path_node and connexion classes, by always returning a handle instead of a pointer. In that way, all path_nodes and connexions will have to be manipulated using handles. If the delete operator has to be overloaded (recycling the data if the current reference count = 1), it should be done in your smart pointer class, as only handles are available.

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
The memory pool would monitor the number of handles open for each path or connexion object, using an 8-bit unsigned integer per object*.

I suppose the pool is managed by the smart pointer class? The pool is just a stack, and it should not care about the objects once they leave its internal storage. It is the smart pointer's responsibility to store, and keep track of the reference count of, any object currently in use. I guess you will create a template class for the smart pointer and another template class for the stack, so that you can combine them like smart_pointer_tpl<path_node_t, stack_tpl<path_node_t> >.

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
One possible problem with that is that that would greatly inflate memory consumption, because the handles would take far more than 4 bytes.

May I know why your handle needs more than 4 bytes? Currently, a quickstone handle is only 2 bytes, as it uses a uint16 to store its only instance variable, entry. Even if you use uint32, that is only 4 bytes. So, does your handle store other things as well? But I agree with you that memory is quite a concern with your approach, as you also need a variable for the reference counter, and possibly a pointer to the path_node when storing it in the smart pointer's internal storage for objects in use.

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
In any event, I should be very grateful for any thoughts on that outline implementation.

I like the idea of a reference-counted pointer. It is systematic and safe from dangling pointers and memory leaks. If this can be implemented very efficiently, it would be good. Maybe when you have the code ready, we can discuss the performance and memory issues more concretely.

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
Also, I should be interested in your view as to whether any of the objects in the stack should ever actually be deleted, whether the stack should keep a count,

My understanding is that more nodes are needed as players add more halts or more lines, so demand usually grows rather than shrinks. Therefore, keeping track of the count and deleting nodes is not really necessary, imho. The overhead incurred by maintaining such a counter and performing deletion is not expensive per se, but we have to consider the overall performance when combined with the overhead of the smart pointer.

We should not forget our original goal -- we want to reduce the increase in memory consumption when supporting resumable searches for different ware types, while maintaining the original performance of route searching. We are trying to find an efficient way of obtaining new nodes instead of using slow per-node dynamic allocation. If the total overhead of smart pointers + pooling exceeds that of per-node dynamic allocation, I am afraid we had better stick with dynamic allocation.

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
how much memory that the stack should allocate in the first place (and how best to calculate it),

I suggest fixing the number of new nodes to be created each time the recycle pool is depleted, at something like 32 or 64. This applies to new games and ongoing games alike. Halts and transport links are usually added incrementally, so there will not be a sudden jump in demand for path_nodes. Besides, we will have a lower chance of allocating too many new nodes and wasting memory. Actually, it is more important to make the process of creating new nodes reasonably fast -- a complicated formula for determining the appropriate size will make the process slower, and if the calculated size is large, it will take quite some time to chain all the nodes up and assign them to the head link.

But for loading a save game, more nodes are needed. In your current approach, path_nodes[] may store a few nodes for the same halt during a path search, but in your new approach, such duplication is restricted only to nodes in the open_list, which is comparatively fewer -- this is because any popped node which is found in the closed list can immediately be recycled. Since each halt can have at most (total halts - 1) paths for a single ware type, and if we can estimate the average number of nodes stored in open_lists (this can easily be estimated by outputting such data in the testing phase), we can set the formula to total_halts x (total_halts - 1 + average number of nodes in the open list). However, this is just for one ware type. Thus, we need to calculate like this:

new_node_count = 0
for each ware type c
    routable_halts = no. of halts where connexions[c].count > 0
    new_node_count += routable_halts x (routable_halts - 1 + average_node_in_open_list)
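As a concrete illustration, the pseudocode above might be written out like this (estimate_new_nodes, its parameter names and the sample inputs are hypothetical; in the game, the per-category halt counts and the average open-list size would be measured, as suggested):

```cpp
#include <cassert>

// Sums, over all ware categories, the worst-case number of path_nodes:
// routable_halts x (routable_halts - 1 + average open-list size).
unsigned long estimate_new_nodes(const unsigned int routable_halts[],
                                 unsigned int categories,
                                 unsigned int avg_nodes_in_open_list)
{
    unsigned long new_node_count = 0;
    for (unsigned int c = 0; c < categories; ++c)
    {
        const unsigned long h = routable_halts[c];
        if (h > 0)
        {
            new_node_count += h * (h - 1 + avg_nodes_in_open_list);
        }
    }
    return new_node_count;
}
```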

Quote from: jamespetts on May 15, 2009, 04:07:48 PM
* An interesting question arises as to how to store this. Perhaps a hashtable in the memory pool class with the keys the raw pointers to the objects, and the values the counters?

One question: how do you store the objects themselves? In an array of pointers, just like quickstone? Or in a hashtable? If you are using a hashtable, probably the key will be the entry number of your smart pointer, and the data will be a node storing both the reference count and the object itself. However, I am not sure whether a hashtable is fast enough for such a performance-critical task.

Quote
Edit: Another thought re: memory consumption: if the pool approach is to be used for connexions, the connexion objects will also need to have a link node to use for a stack, which would inflate memory consumption - unless, perhaps, the memory pool for connexions used a different type of stack...?

If you want to do a similar thing for connexion objects, then the stack class has to be generalised to hold a linked list of stack nodes, each of which stores a pointer to your data object. This is not very efficient, because you have to allocate and deallocate new stack nodes (or recycle them with another stack :P [just a joke]). We are trying to minimise the overhead of getting new path nodes, so such generalisation is not desirable. Probably a separate stack class for objects without a link will be better -- maybe this stack can be implemented using an array of pointers instead, with an expandable size.
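Such an array-backed stack for link-less objects might be sketched like this (connexion here is a stand-in struct, pointer_stack is a hypothetical name, and std::vector provides the expandable array of pointers):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// An expandable array of pointers used in LIFO order, so the pooled objects
// themselves need no link member.
struct connexion
{
    unsigned short journey_time;
};

class pointer_stack
{
public:
    void push(connexion* c) { store.push_back(c); }

    connexion* pop()
    {
        if (store.empty())
        {
            return NULL;   // caller falls back to allocating a fresh object
        }
        connexion* c = store.back();
        store.pop_back();
        return c;
    }

private:
    std::vector<connexion*> store;   // grows as needed, no per-node links
};
```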





As for my approach above, I would say that it minimises memory consumption. Using raw pointers in my approach will be quite safe, because the life cycle of a path_node is short -- once created, it is added to the open_list, and when popped out of the open list, it is either immediately recycled (if already in the closed list) or recycled as soon as it is copied to the path in paths[c]; thus it never leaves the route searching function or the open list. I know that you are keen on using smart pointers right now, but if that ends up adding too much overhead and seriously affecting performance, please consider adopting my approach.


Edit :  Formula for calculating required new nodes is updated.