Simutrans Wiki Manual
The official on-line manual for Simutrans. Read and contribute.

Wrong Passengers get on the train

Started by dennosius, December 30, 2012, 07:04:12 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.


I put all the pros and cons into a table for better understanding:


I really wonder about one thing you remarked. You said that in you example in D was always enough capacity and the same for all other outgoing stops. Since simutrans intercity passengers are generated symmetrically (same number travel in one direction than in the other) this points to that either lot of passengers in D are not generated (due to overfull stops). Otherwise there would capacity lacking between D-C as well as between C-D. (Hence my remark about missing capacity.) Or you did not played long enough and the overflow at C will be gone just after some years by itself.

It could also be, that there is a second, almost equally good route for passengers to leave D going via F; but then you will always have to built excess capacities in one directions, no matter what loading. That would occur no matter what loading is done.

The current loading came from the request of many players who wanted to create distinct local and express lines. As yoshi pointed out, this is most easily done with first loading (and also very efficent). Since currently you advocate proportional loading and at least more than 10 people in the past requested distint local and express lines (as "realistic") I have no big incentive to introduce proportional loading.

Proportional loading will also require much more computations, because first a list is needed to sort passengers after next stops of the vehicle, then find of those pakets a fair share of the actualy passengers (since thos packets must be integer). It will generate many lonely packets with only ine passenger per destinations, making chae misses more frequent and increases memory demand. However, this was just my envelop backside though of implementations. If someone comes up with something decent, I would be open for a option.


In the model calculation, there was the line C-D-e. The model calculation took into account only pax that travel from C to D or E (just for the overhead calculation, I assumed the train goes back as full or empty as on the way there, which corresponds with what you said that the number of pax that travel from D or E to C should be just the same).

I did not take into account pax that travel between D and E. So in practice (and even in theory, if considered), the capacity would not be sufficient, but this applies to all loading logics. But this does in no way question the logic result. Because the logic result is right as long as the amount of pax that want to travel from D to E does not reach the difference between C-D and C-E passengers (so as long as they don't fill all empty seats in every train that occur when the C-D pax depart at D in case all trains serve E). And if they do, it obviously makes no sense to sometimes not serve E, so it's not the use case we are talking about.

But what you say is actually how I discovered that something goes wrong with current loading in this case,  I also assumed that demand should be equal for both directions. I did leave out E sometimes. So pax for E queued up in C. My first thought of course was, I just need more trains, but then I discovered that no one queued up at E, but trains were already leaving E almost empty (like 1/4 to 1/3 filled) and did not even fill up at D on their way to C. This is because if you look from the perspective of E, no stop is left out (all trains that leave from E serve all stops), so loading doesn't matter at E as long as there's enough capacity, and there was enough (already too much) capacity, as trains were leaving E so empty. So it became obvious to me that pax at C only queue up because the wrong pax are loaded and not because capacity was insufficient. This is how I discovered the whole problem.

So it's not about overcrowded stops, passengers not generated or a temporary thing. Just because of the loading, although all pax are carried from E to C (and D),  pax for E queue up at C. I know this sounds illogic. But this is reality, because the loading is illogic. That's exactly my point.


Quote from: Ters on January 03, 2013, 05:32:48 AM
Someone has to actually write an implementation and measure it to see if it's just wasted work or not, though.
I'd like to, but I can't figure out where to look since I don't speak german. I guess it's a feature that the code is protected against me XD .


Quote from: Iluvalar on January 08, 2013, 07:22:47 PM
I'd like to, but I can't figure out where to look since I don't speak german. I guess it's a feature that the code is protected against me XD .

You would have to adjust the haltestelle_t::hole_ab routine in Only one comment in German, rest is in English. But you will see that it is not really easy, not because of the language, but because the code is a little tricky, thanks to a few exceptions in the load-handling (avoid overcrowded option, can load different goods in one vehicle and the fact that the loading takes place per vehicle, not per convoi). It is called from vehikel_t::load_freight in, which is called from convoi_t::hat_gehalten in Other complication is that you have to round down the amount of freight to the nearest number when loading proportionally, which leaves you with a remaining capacity to be loaded (you could take one freight-item extra from each of the first destinations in the list).

In short, the complication is not the language but proportional loading itself, because it is quite processor-intensive.
Bob Marley: No woman, no cry

Programmer: No user, no bugs


Combuijs, Iluvalar,

In short, the complication is not the language but proportional loading itself, because it is quite processor-intensive

That is exactly right. I made a first implemetation of proportial loading in hole_ab. Initially it looked easy but now I am looking into methods of getting the remaining capacity distributed.

I attached working copy of the hole_ab with a lot of intermediate and trial code in.

Original post had wrong attachement. Now updated to the correct one.



QuoteInitially it looked easy but now I am looking into methods of getting the remaining capacity distributed.

Remaining capacity should be less or equal to the number of different station destinations, so you could add one freight unit to the first "remaining capacity" station destinations.
Bob Marley: No woman, no cry

Programmer: No user, no bugs


I dont get it, in this function there is 2 for loop that reset and use the same "i" int. Is that normal ?
hreintke, i dont see any change in your file. You probably posted the original one.[/size]


Loading per vehicle (and not per convoy) probably increases CPU load. Maybe it is possible to add up capacity for the convoy (by load type, of course) first, and then load the vehicles after the proportional calculation? Vehicles within a convoy do of course not have to be loaded proportionally, there's no reason at all to do this.

This would also allow to check first whether we need to calculate at all, because we don't need to do any calculations if freight to be loaded is not more than available capacity (or available capacity is zero).

Also, loading per vehicle makes rounding a slightly bigger issue, because the smaller the numbers are, the bigger become the rounding effects. There are dozens of possible approaches to deal with proportional distributions if number of indicators is larger than number of distributable items (most of them developed for distributing seats in parliaments).

It's possible to round down first. Then, one easy approach is to distribute remaining capacity in the order of the rounded decimals (for the last seat, randomly if equal decimals). But afaik, ordering always causes CPU load. Maybe it's good enough to just load anything (as Simutrans did some time ago, so what is in memory/array/whatever first) on remaining capacity.

Does anyone know which rounding causes less cases where too few/many capacity is distributed? My guess is that financial rounding leads to a correct number not necessarily, but in more cases than rounding down. So it may save CPU load to do financial rounding and correct the result only if necessary.

Another approach is, instead of rounding: All numbers of waiting pax (or whatever, I'm using pax as an example) for served destinations are divided by all numbers up to the available capacity. Each result is marked with the destination it comes from. Seats in train are distributed in the order of results. For small vehicles/convoys, this might be even better than rounding, but I don't have a good idea of what calculations cause more or less CPU load.


I still don't get why it's supposed to be CPU intensive. Here is what to do (sorry i'm coding in PHP most of the time so it's in pseudo-code):

instead of this part :

if(  tmp.menge > maxi  ) {
// not all can be loaded
neu.menge = maxi;
tmp.menge -= maxi;
maxi = 0;
else {
maxi -= tmp.menge;
// leave an empty entry => joining will more often work
tmp.menge = 0;

You do something like this :

weight=tmp.menge; // let's weight only on quantity right now
newArray[0]=i; //save the freight ID

Then you totaly exit the for loop, and start another one based on validPosibilities. This way you know totWeight which is necessary.[size=78%] After recovering the same environement, inside you do :[/size]

toLoad=floor(newArray[1]*maxi/totWeight); // maxi/totWeight should be cached actually since it is constant in that loop

neu.menge +=  toLoad;
tmp.menge -= toLoad;

Finaly, cycle again [size=78%]validPosibilities with a random offset to fill the last seats with it. The first possibilty should be enough[/size] unless there is very few freight in the stop.

I don't see what in that would be so CPU intensive. We have to store 2 short int in an arrray and there is 2 more addition and 1 more multiplication per freight. Unless there is hundreds of freight in the stop. Should be quick...

Or i'm completly wrong ?


This should be right. CPU load is a bit higher, if there are many stops on the line and/or many different load types in a convoy. But convoys with this constellation shouldn't be loaded every millisecond even on very large maps.

I'm also trying it with PHP-like pseudo code and human language:

$array[loadtype][destination][waiting|load] is the array that stores by load type and destination
$totalwaiting[loadtype] is the array that stores the total amount of waiting load (so the sum of all $array[loadtype][x][waiting])
$loadedcapacity[loadtype] is the array that stores the total amount of what we actually want to load (so the sum of all $array[loadtype][x][loaded])
$availablecapacity[loadtype] is the array that stores the total amount of available capacity for each loadtype

if loading_logic=='proportional' { //comes from GUI setting for line or simuconf parameter
   //First check what capacity is free in the convoy
   foreach vehicle { (check load type and add up free capacity for each,  add up in $availablecapacity) }
   //if there's no free capacity at all, skip going through schedule
   if (count($availablecapacity)) while (go through schedule until current stop is reached, even if another occurrence of current stop in schedule, or until a depot is reached) {
         (collect waiting load for each stop only for load types available in convoy; also add up total waiting load for each load type; store in $array and $totalwaiting)
   } //end if while
   //do loading calculation only if we have anything to load
   if (count($availablecapacity) && count($array)) foreach $availablecapacity[loadtype as $thisloadtype] {
      foreach ($array[$thisloadtype][destination as $thisdestination]) {     
      //Load proportionally if required, but load everything if capacity is sufficient
      $array[$thisloadtype][$thisdestination][load] = if ($totalwaiting[$thisloadtype] > $totalcapacity[$thisloadtype])  round(totalcapacity[$thisloadtype]/totalwaiting[$thisloadtype]* $array[$thisloadtype][$thisdestination][waiting] ) else $array[$thisloadtype][$thisdestination][waiting]
            $loadedcapacity[$thisloadtype] +=  $array[$thisloadtype][$thisdestination][load]
         } // end foreach destination
         //now deal with rounding error
         //check whether to add or substract
         if (($availablecapacity[$thisloadtype] < $totalwaiting[$loadtype) && ($loadedcapacity[$thisloadtype] > $availablecapacity[$thisloadtype])) { //need to substract
            $i = 0;
            while ($loadedcapacity[$thisloadtype] > $availablecapacity[$thisloadtype]) {
               if ($array[$thisloadtype][$i][load] > 0) { //we can't substract if nothing loaded already for this destination
                  $array[$thisloadtype][$i][load] -= 1
                  $loadedcapacity[$thisloadtype] -= 1
               } //end if actually substract
               if ($i < ($count($array[$thisloadtype])-1)) $i++ else $i=0 //go to next stop or back to first
            } //end while substraction
         } //end if substract
         elseif    (($availablecapacity[$thisloadtype] < $totalwaiting[$loadtype) && ($loadedcapacity[$thisloadtype] < $availablecapacity[$thisloadtype])) { //need to add
            $i = 0;
            while ($loadedcapacity[$thisloadtype] < $availablecapacity[$thisloadtype]) {
               if ($array[$thisloadtype][$i][load] < $array[$thisloadtype][$i][waiting]) { //we can't add if already everything for destination loaded
                  $array[$thisloadtype][$i][load] += 1
                  $loadedcapacity[$thisloadtype] += 1
               } //end if actually add
                 if ($i < ($count($array[$thisloadtype])-1)) $i++ else $i=0 //go to next stop or back to first
            } //end while addition
         } //end elseif add
         //add here the actual loading on vehicles of that load type
   } //end foreach load with free capacity
} else {
//current loading

Weight calculation has to be added, if that happens in loading. I have no idea how that works.

I think the effect for the capacity left free by rounding with this solution is that it also prefers early stops, but if there are 2 items free capacity, 1 will go to the first and one to the second stop (etc).

This can most probably be optimized by people who know coding better than me, but it should work this way. I think I considered everything in regard to dealing with the rounding errors (i.e. we can't repair rounding errors by substraction if there's already nothing to be loaded and we can't repair by addition if already everything is loaded for a destination).



hreintke, i dont see any change in your file. You probably posted the original one.

Sorry for the wrong post yesterday. Attached is the hole_ab including my current work.
I will update my original post also for "late viewers".



I just had a look at the code.

As I said, I don't have a good idea of what uses more CPU load or not. But the solution seems to go through all stops/waiting load and then check if the vehicle is compatible. Isn't it better to check vehicles first and disregard mismatching waiting load? Because it's much more likely that the stop has more waiting load types than the convoy has load types.

Another tiny idea for it (probably also applies to current loading): There should be no loading to stops behind a depot on the schedule (before the current stop is on the schedule again). It is possible to set depots on the schedule manually on any position, and as all load gets lost in depots, so there should be no loading for destinations behind the depot.


You are correct. First iteration is on stops then on goods. But as the comment says :

// prissi: first iterate over the next stop, then over the ware
// might be a little slower, but ensures that passengers to nearest stop are served first
// this allows for separate high speed and normal service

This is to make the current implementation work and

1/ The use in proportional loading in this loop is only to find out possible goods for loading
2/ I want to avoid duplicate code on other checkings already done in the current loop.
3/ The CPU intensive task comes afterwards anyhow when deciding what to actually load



I see. I don't know whether avoiding duplicate code has advantages in performance (or usability).

The whole hole_ab thing is called for each car? So if we wanted to load by convoy, we'd also need to change whatever calls hole_ab?

In the current loading, doing it by car is probably (almost) as good as doing it by convoy. You just fill up one car and then the next, without iterating through the stops again and again, because loading is stop-by-stop.

In proportional loading, iterating through the stops is needed for each car, if we load by car, isn't it? This wasn't effective.

Another question: What happens to load of the same load type, but different style (or how is the terminology? What I mean is, for example, books, glass, paper, beer etc require the same cars in pak128).


It is more CPU intensive, since if the vehicle is fully loaded after the first package the loading is finished and the other packets does not need to be touched. Not to mention the allocation of vectors and freeing them again.

The proportional loading has first to find out how many passengers are able to go with the convoi (first loop over all matching catg in the stop). In a second loop the loading can be done. Doing this in one run with the total number waiting will not give a proportional loading but rather something like the algorithm which was in place before. (Of course the results of the first loop could be cached.)

However, you implementation has a serious problem: If maxi << total_waiting, then the chances are very height that menge*loading_percentage<0.5 = 0. In this case nothing would be loaded at all. In any case it seems that the convois will never be fully loaded with the intended number. You need some management for fractions, either sum them up and then take an extra passenger every xxx packets. And the routine must be using exclusively integers, since the results of floating point operations can deviate between compilers and implementations.



I know about the "issue" that vehicles/convoys are not fully loaded when maxi<total_waiting.
That happens not only when "menge*loading_percentage<0.5 = 0" but due to rounding also at all other.
In fact you "loose" every fraction of  "menge*loading_percentage" for each of the goods you are loading

That is why I commented

// need to update with additional loading due to rounding

in the updated hole_ab.

The current is definitely only a working copy in which I wanted too show the way I was going.



Having thought about it during lunch :) I am convinced that financial rounding will do better than rounding down. Rounding down will almost certainly lead to a result we need to 'repair' afterwards by another distribution, while there's a good chance with financial rounding that the result fits exactly the free capacity, so we can skip the second distribution step, or the rounding effect is at least small.

Also, 'repairing' the result (which may require to add or substract) will lead to a pseudo-proportional (because it's random whether and where we need to add or substract) distribution even for the rounding effect over time.
I'll modify my 'code' right now.


Two updates in the new hole_ab with proportial routing.

1/ Solved an error that caused goods not to be removed from the halt
2/ Used financial rounding.

Financial routing indeed make the "reloading" less but in circomstances where limited number passengers can board on a vehicle it still leaves a lot to reload. An there is the possibility to "overload" in this version.

Next version will follow.



I'm really not good at even reading/understanding object-oriented code, so I already now apologize in case I misunderstood it.

But as it looks to me, you are making one calculation step too much by calculating percentage first and then applying percentage on waiting goods. The amount to be loaded for each destination is (capacity / all_waiting * waiting_for_destination). This may also cause additional rounding errors.

And do you actually add 0.5? Why? Is that a left-over from rounding down? Or are you trying to 'help' small amounts becoming loaded (the latter is not useful for small vehicles, I think, and not necessary for big ones)?

And yes, rounding still makes a difference and requires correction. Anyway, with financial loading, the probabilities of having to reload, having to remove load and having the right amount should be equal for any number, shouldn't they (again, I have not thought this to an end yet, but with financial rounding, it's at least possible that the amount fits, which is impossible with rounding down)?

Because of rounding effects, this all makes less sense the smaller the vehicles are and the more destinations they serve at the same time. That's why I propose to change loading so that it is done by convoy (but let's solve proportional by car first).

Edit: I just read what prissi wrote about different ways of handling fractions in different compilers. But if we correct the integer result at the end (by additional loading or unloading), it should always be fine. Worst thing that could happen is that different handling of fractions leads to one different loaded item, and probability even for this is really little (like once in billions of loadings).


Quote from: dennosius on January 13, 2013, 05:35:08 PM
Worst thing that could happen is that different handling of fractions leads to one different loaded item, and probability even for this is really little (like once in billions of loadings).

From what I understand, that is very bad in networked games as it causes players to be kicked out. Game state must be equal for all players, or things get ugly.


Then let's think whether it can actually happen. rounding (integer1/integer2*integer3) to the nearest integer cannot actually lead to different results, can it? I mean, no matter how the compiler deals with fractions, there's only one result that is mathematically correct.

And does, in network play, really each client do the loading? Or is it done by the server?

I can't imagine the whole Simutrans code works without fractional numbers to avoid differences.


Integer divisions are OK, but in general, the multiplication should come first. (integer1*integer3/integer2) I was (wrongfully) thinking of numbers like 0.5, which computer normally don't do mathematically correct.



I am doing exactly (capacity / all_waiting * waiting_for_destination) but doing it in two steps as capacity/all_waiting (= loading_percentage) is constant as soon as we know the two figures -> only calculate one time instead of inside the loop.

The addition of 0.5 get us from rounding down to financial rounding, this way no additional rounding routine needs to be called.

if outcome = 2.4
  > rounding down leads to 2
  > add 0.5 leads to 2.9
  > rounding down leads to 2

if outcome = 2.6
  > rounding down leads to 2
  > add 0.5 leads to 3.1
  > rounding down leads to 3




Took some more time as expected to get some special situations sorted out.

Attached is a updated proportial routing routine using only integer arithmetic.
It uses financial rounding to minimize the amount of goods to be loaded in the second phase.

For the last goods to load, for now I just took one by one from the availables.
Can be made much more fancy to get a really good proportial distribution.

Another possible optimization is to remember whether a destination halt already has been processed.
Would give loading_goods.append instead of loading_goods.append_unique for better performance.



Looks really good.

When doing the additional loading, are you checking whether everything for that destination is loaded already? Or is that a special situation you sorted out? I'm thinking whether this couldn't happen, but my first thought is it can.



Yes, that can happen and is avoided by the line :

if (loading_goods[i]->menge > 0)

The special situation I encountered was that a halt can be multiple times in the schedule.
In proportial loading we first have to get a distinct list of goods which is available for loading.
My first versions were not prepared for that and counted those goods more than once, giving issues when actual loading.



Alright. Subtracting if overloaded by rounding should be easy now (but I don't want to mess with your code).

Oh, a stop twice on the schedule is something I didn't think of. Additional entries for the same stop should of course be skipped.

What do you think of when you say 'make it more fancy'? What I can imagine is loading for destinations with the 'worst' rounding effect (rounded down most/up least) first. But that's probably a too intensive calculation. Any other idea?


Subtracting if overloaded by rounding is handled by not loading more than maxi in the line :

uint32 prop_load = min(maxi,((load_perc*lw->menge+50)/100));

The fancy one is indeed getting the additional loading more proportional using something like largest divider but expect that also to be too intensive. Then you come into sorting lists and I think that should be avoided.

What needs to be done anyhow is getting it embedded in overall simutrans code. This code just sets

bool use_proportial_loading = true;

but that needs to be and maybe even line gui configurable.



Yes, it'll need a GUI setting, probably in the schedule, if we want the current loading to be preserved for circular lines. But a simuconf setting may do for testing as a first step.

Is there any chance to get one step higher in the code and have convoys (and not vehicles) loaded? This doesn't matter much for the current loading, but it does for proportional loading (smaller numbers make rounding effects bigger).


Since the lenght of a convoi is important and can cause different loading of different cars, the answer is a clear in principle yes but ...


I would not care about rounding errors that much.

First, one can use

uint32 load_perc = min(XXX,((XXX*maxi)/loading_total));

with any XXX, preferably a power of 2.

Second, one can keep track of accumulated rounding errors like

// outside of loop
uint32 remainder = 0; // or XXX -1 ??
// inside loop
uint32 prop_load = min(maxi,  ((load_perc*lw->menge+remainder) / XXX));
remainder = (load_perc*lw->menge+remainder) % XXX;

Parsley, sage, rosemary, and maggikraut.



I understand that you are calculating the remainder of goods still to be loaded because of rounding.

That is what I do in my code by keeping track of
- the amount to be loaded in "loading_total"
- the amount actually loaded in the intial lop in "loaded"

Then using that in the second fase with

while ((maxi > 0) && (loading_total > loaded))

Or do you have something else in mind calculating the remainder ?



Quote from: hreintke on January 23, 2013, 10:01:28 AM
Or do you have something else in mind calculating the remainder ?
The idea is, to calculate the remainder during the first loop. With the goal, that after end of the loop the remainder is close to zero. Then the second loop can be omitted. This routine (hole_ab) is called very often and thus critical to performance.

Edit: here is the complete loop:

uint32 load_perc = min(100,((100*maxi)/loading_total));
                        uint32 remainder = 0; // or 99 ??

FOR(vector_tpl<ware_t*>,  lw, loading_goods) {
uint32 prop_load = min(maxi,((load_perc*lw->menge+remainder)/100));
                                remainder = (load_perc*lw->menge+remainder ) %100;

if (prop_load > 0) {  // proportional part can be too small
          // no second loop, as remainder is already too small ... hopefully
Parsley, sage, rosemary, and maggikraut.


OK, I understand your idea.

Will do some theoretical and pratical tests with it and see how it works out.