News:

Simutrans Wiki Manual
The official on-line manual for Simutrans. Read and contribute.

poor game flow in large maps

Started by Colin, September 01, 2009, 10:50:53 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Colin

I've got a game going on a 1024x1024 map  with several large towns and 8 fast trains, (4 Northbound 4 Southbound) Multiple trams running in all full growth towns (15000 or over). The game has become extremely laggy with no seeming cause. It can happen at the start of a month, two days in or halfway. All not needed processes are shut down and the graphics card is running at 'Turbo'.

The same game played in Experimental suffers no lagginess at all.

Maybe a look should be taken at Knightlys 'New Path Search' to see if it can be incorporated into Standard? Or maybe a new type of search path could be written to suit Standard?
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

Isaac Eiland-Hall

I suspect this is not a pak64-specific issue...

Meanwhile, have you compared the performance of the GDI executable vs. SDL?

Colin

Quote from: Isaac.Eiland-Hall on September 01, 2009, 11:08:30 AM
I suspect this is not a pak64-specific issue...

Meanwhile, have you compared the performance of the GDI executable vs. SDL?

Yes I have, It doesn't make any noticable difference. I haven't tried it with PAK128 as I only have PAK128 Britain which is a EXP game. I would try PAK128 but it takes weeks to get to the stage where I have PAK64.  But you could be right it may be a Simutrans-Standard problem.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

gerw


Colin

I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

whoami

#5
(Moving to Development...)

Direct link to the savegame is http://simutrans-germany.com/files/upload/Colin%27s%20Standard%202611.sve; (EDIT:) it is loadable with plain pak64/r175.

knightly

I have just tried Colin's save game.

There was a brief period of reduced responsiveness immediately after loading, but soon afterwards the game ran smoothly. When I tried to add a halt or adjust certain convoy's schedule, responsiveness was reduced again shortly afterwards for quite some time.

So, probably the reduced responsiveness is caused by the re-routing of all existing ware packets in one single world step after set_schedule_counter() is triggered by actions like halt construction/deletion or schedule changes.

gerw

Quote from: Knightly on September 02, 2009, 08:57:08 AM
There was a brief period of reduced responsiveness immediately after loading, but soon afterwards the game ran smoothly. When I tried to add a halt or adjust certain convoy's schedule, responsiveness was reduced again shortly afterwards for quite some time.

So, probably the reduced responsiveness is caused by the re-routing of all existing ware packets in one single world step after set_schedule_counter() is triggered by actions like halt construction/deletion or schedule changes.
I've noticed the same, while trying Colin's game. So it's the rerouting of goods, which makes the game laggy...

Colin

Quote from: Knightly on September 02, 2009, 08:57:08 AM
I have just tried Colin's save game.

There was a brief period of reduced responsiveness immediately after loading, but soon afterwards the game ran smoothly. When I tried to add a halt or adjust certain convoy's schedule, responsiveness was reduced again shortly afterwards for quite some time.


Reduced by quite a lot and when building a game, because one is continually adding halts or new schedules, it happens with repeated monotony. Every few game days actually, and not counting Auto Save which I've extended to every 3 months. Brave little Aussie fellow that I am. ;D ;D ;D
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

knightly

Quote from: gerw on September 02, 2009, 09:04:10 AM
I've noticed the same, while trying Colin's game. So it's the rerouting of goods, which makes the game laggy...

Gerw, is it possible to spread the load of re-routing ware packets across a few steps?

Quote from: Colin on September 02, 2009, 09:06:30 AM
Reduced by quite a lot and when building a game, because one is continually adding halts or new schedules, it happens with repeated monotony. Every few game days actually, and not counting Auto Save which I've extended to every 3 months. Brave little Aussie fellow that I am. ;D ;D ;D

Colin, I understand how poor the responsiveness would be when you continue to add/remove halts or adjust schedules, because I have a similar PC specifications as yours ;) The game almost freezes during such global re-routing.


gerw

Quote from: Knightly on September 02, 2009, 09:13:59 AM
Gerw, is it possible to spread the load of re-routing ware packets across a few steps?
It's currently spread across 32 steps, see spieler_t::step. But maybe this number should depend on the time, the rerouting needs.

knightly

Quote from: gerw on September 02, 2009, 10:47:49 AM
It's currently spread across 32 steps, see spieler_t::step. But maybe this number should depend on the time, the rerouting needs.

Thanks a lot for pointing out. :) It is quite incredible that after spreading re-routing across 32 steps, the game is still so unresponsive.

gerw

Quote from: Knightly on September 02, 2009, 11:34:41 AM
Thanks a lot for pointing out. :) It is quite incredible that after spreading re-routing across 32 steps, the game is still so unresponsive.
But one can think of a worst case, where all the 'expensive' stations (with many waiting goods) are rerouted in one step...

z9999

At least, there wasn't this problem in 99.17.1.
But 100.0 and after versions have this problem.

I heard many people saying 100.0 and after versions are too heavy.

whoami

#14
My observations with Colin's savegame:
1) After loading of the savegame, and also after deployment of a new schedule or a new goods category for an existing schedule, ST becomes unresponsive for some seconds, with down to 2(!) fps and 100% cpu/core utilization. => ST seems to recalculate routing for all goods packets, not just for the goods categories served by the new schedule/convoy. This should be fixable by filtering before re-routing.
2) The previous observation especially applies to the change of routing in an isolated network, which seems to still lead to re-calculation of routing on the whole map. For isolated freight lines/networks, it may be useful to limit the refresh, e.g. by walking through the graph of stations *and* connected convoys, also filtering(!) by goods category (see above), marking the stations (also convoys due to their contents?) for refresh, and refreshing only the marked ones in the next loop (removing the marker here again).
1)+2) could be changed as described (or by other means), but a large, fully interconnected P+M network will still not benefit directly from such changes, only indirectly by optimized routing for certain other cases. Further spreading of re-routing calculation may help.
3) side-note: station coverage setting in the savegame is 3 (=> 49 tiles to check at P+M+G generation), which doesn't matter for re-routing, but the additional calculation effort for newly generated passengers+mail increases overall load.

Colin

#15
I have just read whoami's post and it seems to me that he has hit the nail on the head, as they say. If his suggestions can be implemented we should see a, not vast, but quite substantial in crease in game flow. (Incidentally my FPS went down to 1)

I could set the station coverage back to 2 but surely that would lead to an increase in Trams/Tram lines/Stations, creating even more route finding?

I have question re this particular game.

modnote by whoami: this part is now discussed in http://forum.simutrans.com/index.php?topic=3203.msg31352#msg31352:
I have a passenger train (Not shown in the uploaded game)  Which travels between two stations only. It leaves one station with over 3000 pasengers and arrives at it's destination where they all disembark. It makes a loss of over 41 thousand (not in travel costs) when the passengers leave. It returns home with over 300 passengers and makes a profit of over 22k when the passengers leave. Have I got a setting in wrong in the config.tab or is this another bug?
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

whoami

#16
As I wrote, my suggestions (if they matter at all) will not help (much) in the case of the passenger/mail network.

Quote from: Colin on September 03, 2009, 08:29:17 AM
I could set the station coverage back to 2 but surely that would lead to an increase in Trams/Tram lines/Stations, creating even more route finding?
Directly, it will only increase routing calculation effort to a smaller extent, because of the larger number of stations ("hops"). But usually, you will have more different lines, more interconnection points and more transfers, and that makes a big difference.

QuoteIt makes a loss of over 41 thousand (not in travel costs) when the passengers leave. It returns home with over 300 passengers and makes a profit of over 22k when the passengers leave. Have I got a setting in wrong in the config.tab or is this another bug?
(I assume that you don't refer to ST-experimental.)
pay_for_total_distance = 2 can lead to such a result.
Also, cost deduction at switching of the month might be involved.

Colin

#17
Quote from: whoami on September 03, 2009, 11:22:18 AM
As I wrote, my suggestions (if they matter at all) will not help (much) in the case of the passenger/mail network.
Directly, it will only increase routing calculation effort to a smaller extent, because of the larger number of stations ("hops"). But usually, you will have more different lines, more interconnection points and more transfers, and that makes a big difference.

Do you mean that if I make multiple smaller Tram networks in my cities instead of one large network it would improve the flow of the game?

In Experimental the train made $721 profit on the outward journey with 3078 passengers as apposed to a $41k loss in Standard.

Additional info re-train behaviour.

Train travels A-B-A.
From A to B
journey 1. 3078 passengers, loss on arrival at B -$41k plus.
Journey 2. 3078 passengers, loss on arrival at B -$42k plus.
Journey 3. 3078 passengers, loss on arrival at B -$43k plus.
From B to A
journey 1. 325 Passengers, profit on arrival at A $22k plus.
Journey 2. 476 Passengers, profit on arrival at A $16k plus.
Journey 3. 124 passengers, profit on arrival at A $4k plus.

This doesn't make sense and it doesn't matter if I set the pay_for_total_distance to 0, 1, or 2. I realise that these settings may not affect a current SVE game. But it doesn't detract from the nonsensical profit and loss that is being generated.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

whoami

Quote from: Colin on September 03, 2009, 06:11:00 PM
Do you mean that if I make multiple smaller Tram networks in my cities instead of one large network it would improve the flow of the game?
More different lines means more schedules and transfers to check, so goods routing calculation will need more CPU time. From this aspect, a larger station coverage can support in reducing calculation effort.
The performance-wise effect of the station coverage setting depends, however, also on your network layout: The more tiles with houses are covered by several stations, the slower the game may become, because a generated passenger will check more than one station for suitable connections. But the developers can provide a better, substantiated analysis on this complex problem.

As your savegame doesn't contain an extremely large network, there may be a performance bug. Additionally, Z9999 may be right when he said that ST has become "too heavy".

Regarding the obscure revenue: It's unrelated to the program performance problems, so could you please open a new topic for that, also providing the savegame with the train? After that, I will move my own replies and any remaining related text there.

Colin

Quote from: whoami on September 04, 2009, 07:39:38 PM
As your savegame doesn't contain an extremely large network, there may be a performance bug. Additionally, Z9999 may be right when he said that ST has become "too heavy".

Oh dear! I thought my SVE contained a fairly massive network. I must be really naive.

Quote from: whoami on September 04, 2009, 07:39:38 PM
Regarding the obscure revenue: It's unrelated to the program performance problems, so could you please open a new topic for that, also providing the savegame with the train? After that, I will move my own replies and any remaining related text there.

Ok I'll upload my SVE with the latest additions and then open a new topic.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

prissi

The station coverage is not important.

The problem in your savegame are not the size of the map but the shear amount of waiting persons. You have several stations with more than 300000 people waiting. Updating such a monster will of course take some time.

I will look into the code to make it smoother, although no promises ...

Colin

Quote from: prissi on September 05, 2009, 11:38:36 AM
The station coverage is not important.

The problem in your savegame are not the size of the map but the shear amount of waiting persons. You have several stations with more than 300000 people waiting. Updating such a monster will of course take some time.

I will look into the code to make it smoother, although no promises ...

Thank You Prissi, I will look forward to any improvements you can make.

In my next game (which may be awhile away due to this one being nowhere near completed) I will reduce the Passenger Level.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

Colin

@ Prissi,

Does nightly 2640 contain your changes?
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

knightly

Hi Colin :)

According to the source code update history, Prissi's changes for speeding up re-routing have already been incorporated. So, does it improve the responsiveness of your game?

Knightly

Colin

Quote from: Knightly on September 07, 2009, 11:36:01 PM
Hi Colin :)

According to the source code update history, Prissi's changes for speeding up re-routing have already been incorporated. So, does it improve the responsiveness of your game?

Knightly


No, if anything it's getting worse. I'm sorry to say.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

z9999

Quote from: Colin on September 08, 2009, 04:07:40 AM
No, if anything it's getting worse. I'm sorry to say.

Same here.
What we want is not to solve jumping convoi, but to solve unresponsive.
In recent simutrans, idle time is zero during long time even though convoi keep moving well.

knightly

#26
Quote
void spieler_t::step()
{
   if(  halt_list.get_count()>0  ) {

      uint32 it = halt_iterator_start;
      slist_iterator_tpl <halthandle_t> iter( halt_list );
      while(  it>0  &&  iter.next()  ) {
         it--;
      }
      if(  it>0  ) {
         halt_iterator_start = 0;
      }
      else {
         uint32 units_handled = 0;
         while(  units_handled<8192  ) {
            if(  !iter.next()  ) {
               halt_iterator_start = 0;
               break;
            }
            halt_iterator_start ++;
            // iterator until 8192 passengers were handled
            units_handled += iter.get_current()->sum_all_waiting_goods();
            iter.get_current()->step();
         }
         INT_CHECK("simplay 156");
      }
   }
}

After looking at Prissi's changes, I think one of the reasons why the unresponsiveness problem has aggravated lies in the fact that >= 8192 ware (not ware packets) are re-routed in each step. For hub stations with large-sized ware packets (i.e. large menge), this will indeed help to prevent too much re-routing work in one step; but with non-hub halts, they usually have only small-sized ware packets (please see the attached image), so it needs to process many non-hub halts before the limit of 8192 can be reached.

IMHO, re-routing workload depends not on packet size (menge), but on packet count. More packets, more re-routing work, and vice versa. So, I would suggest to change from counting ware menge to counting the number of packets processed. Of course, the limit of 8192 also needs to be reduced accordingly.


prissi

The problem in Colin save are that there are several stations with 300000 passengers. Updating (rerouting) them takes nearly a second on my computer. During this no user interaction is possible, since such can be only done outside a step. The only other way would be that the stations themselves only handle 8192 each step, and indicate the result then. But this would require more complex logik, as changes of the passenger numbers must be also taken into account (i.e. something like a dirty flag for ware in stations.) I will look into this next week.

knightly

#28
Edit : Post removed as the provided screenshot was wrong.

HeinBloed

Knightly,

I really only stumbled accross this thread randomly, but looking at your screenshot, I strongly suspect that you fell victim to a display bug: the sum of passengers of the packages displayed is never 300k, not even close. I guess lots of destinations with letter >B are missing.
(previously known as "tttron")

prissi

Display works only with integer coordintes with 16bit; therefore I think the scrollbar just messed up. It is very easy to time those stations and they have like 4000-30000 packets and cn take up to 800ms or rerouting (on this computer here).

knightly

Prissi,

Really sorry! I didn't realise that the list is incomplete. I loaded Colin's save game again and looked at the same station, and the list kept growing gradually when I tried to scroll down.

Nevertheless, I think it is more logical to count the number of packets instead of the sum of menge. Hope you can consider this suggestion when you look at the code later.

Knightly



Dwachs

[offtopic]
this:

Quote from: tttron on September 09, 2009, 12:23:14 AM
I strongly suspect that you fell victim to a display bug: the sum of passengers of the packages displayed is never 300k, not even close. I guess lots of destinations with letter >B are missing.

should be fixed in revision 2642.
[/offtopic]
Parsley, sage, rosemary, and maggikraut.

gerw

In the current implementation, if the goods of a station are rerouted, simutrans does a BFS for each packet. Since no heuristic is used, it is possible to do this in only one BFS, until the target for each packet is found.

I will try to implement this, if there is any free time next week...

prissi

Please check again with curretn nightly. Should be much smoother (only lagging on my eeePC).

z9999

#35

bool haltestelle_t::step()
(...)
else if(reroute_counter!=welt->get_schedule_counter()) {
// all new connection updated => recalc routes
if(  !reroute_goods()  ) {

and

bool haltestelle_t::reroute_goods()
{
// reroute only on demand
reroute_counter = welt->get_schedule_counter();
(...)
// break after a certain number of reroute actions
if(  ++packets==255  ) {
return false;
}


Then, after breaking, reroute_goods() will not be called again, isn't it ?


prissi

Thank you for spotting this out. Indeed, the update needs to be moved to the end ...

z9999

#37
I tested with Colin's savegame.

For 1184 Halts.
rebuild_destinations() used 106 steps and reroute_goods() used 210 steps.
Sum of them is 316 steps.


[EDIT]
Another game: 3 players and public halts

player 1: 36 halts used 12 steps
public halt: 945 halts used 97 steps
player 2: 122 halts used 8 steps
player 3: 90halts used 24 steps
sums: 1193 halts

Numbers of halts per step is.
older version 1193/32steps=37.28 halts/step
this version 36/12+945/97+122/8+90/24=31.74 halts/step

Not so bad.

gerw

I didn't tested it, but I think, the changes contains a bug: last_ware_index isn't reseted in the loop in simhalt.cc:781.

However, it would be faster, if the BFS is done for all goods in one step instead of distributing it to several steps... And 316 steps are roughly 60 seconds...

prissi

Doing in one step caused the lag that we are trying to get rid of ... and you are of course right about the last step.

gerw

Quote from: prissi on September 20, 2009, 08:51:39 PM
Doing in one step caused the lag that we are trying to get rid of
But if you do all searches within one BFS run, it will take only a little bit more time than the BFS run for the packet with the most transfers. So it won't be more laggy than the current nightly and the rerouting would be done much faster.

prissi

Not sure, what "BSF" is; but rerouting all goods of one of the large stations took more than 100ms, with no user interaction (and thus no mouse movement) is possible. This is recieved as lagginess, which must be avoided. It severance of course depends much on the computer used.

Spike

Quote from: prissi on September 21, 2009, 09:07:51 AM
Not sure, what "BSF"

I'd assume it means "breadth first search", and he wrote BFS which matches quite nicely on that.

Colin

Hi Guys,

I've just downloaded r2650 and I'm pleased to inform you that the game has improved at least 90%. There is still a bit of random jerkiness and it's reasonably difficult to drag a menu screen across the map, Apart from that. :award: :award: :award:

Thanks Gwer, and many thanks to Knightly who has tried his hardest to guide this addle brained gamer through my trials and tribulations.
It's a pity James doesn't listen to you a bit harder, mate. EXP might improve for the better. 
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

gerw

Quote from: prissi on September 21, 2009, 09:07:51 AM
but rerouting all goods of one of the large stations took more than 100ms
It tooks only 100ms if you reroute each good for itself. If you take one "big" run, where you search the graph for all possible ways, you can calculate all new routes at once.

Wiki says to the Dijkstra algorithm (which is a generalization BFS or A* without heuristic):
QuoteFor a given source vertex (node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path) between that vertex and every other vertex. It can also be used for finding costs of shortest paths from a single vertex to a single destination vertex by stopping the algorithm once the shortest path to the destination vertex has been determined.

So its better, to calculate all connections at once.

knightly

#45
@Colin

Quote from: Colin on September 21, 2009, 10:41:16 AM
Thanks Gwer, and many thanks to Knightly who has tried his hardest to guide this addle brained gamer through my trials and tribulations.

The improvements in responsiveness are made by Prissi so far. Gerw's suggestion to combine search has not been incorporated yet. And you don't need to thank me -- I have done nothing at all to improve the game responsiveness in this case. :P

@Prissi

I have looked at the code and noticed that in spieler_t::step(), total menge (i.e. get_finance_history(0,HALT_WAITING)) is still used as a basis for limiting rerouting (though packet count is also used as a limit in haltestelle_t::reroute_goods()). IMHO, this will cause some steps to do more rerouting or less rerouting depending on the sizes of the packets.

I have created a patch to make spieler_t::step() dependent on packet count. It is based on r2650, relative to the trunk folder. You may notice that I have changed all last_index to last_catg_index -- this is only done to improve readability.

prissi

The proble is, that the initial connection calculation also takes some time. Since the routine does not know (yet) in which state it is, it has to have some guidance. THe current code is not finished, I plan to move it into haltestlle, which would iterate over all stops and then could also display the current stage of routine (for debug purposes).

knightly

Prissi,

I suppose the above reply is in response to my patch?

If you take a look at my patch again, I have decremented units_remaining by 2 for the case of rebuilding destinations (please see haltestelle_t::step()). So, at most 64 / 2 = 32 halts will rebuild destinations in 1 single step. If you think 32 halts in 1 step is too much, you can increase that number from 2 to say, 3 or 4.

Knightly

z9999

#48
Tested again with same savegames.

Colin's savegame
Player 1: 1184 halts
Public: 3 halts

prissi's patch
rebuild_destinations() used 112 steps and reroute_goods() used 212 steps.
Sum of them is 324 steps.

Knightly's patch
rebuild_destinations() used 37 steps and reroute_goods() used 945 steps.
Sum of them is 982 steps.


Another game: 3 players and public halts
player 1: 36 halts
public: 945 halts
player 2: 122 halts
player 3: 90halts
sums: 1193 halts

prissi's patch
rebuild_destinations() used 94 steps and reroute_goods() used 246 steps.
Sum of them is 340 steps.

Knightly's patch
rebuild_destinations() used 30 steps and reroute_goods() used 551 steps.
Sum of them is 581 steps.


I like the idea of Knightly's patch because it reduces number of halts/step on normal state.

prissi

Although 500 steps mean two minutes. In multiplayer, when a station is placed every two minutes rerouting may never be completed then ... ALso iterating over all lines takes a constant amount of time for most stations, thus the step routine needs to know the state of each station to decide what to do. Furthermore one player are finished before others; the proper way out is most likey a incorporation into haltestelle_t.

@Knightly
I overlooked you attachment. Will have a look soon.

z9999

Quote from: prissi on September 21, 2009, 02:22:20 PM
Although 500 steps mean two minutes.

You can change it by units_remaining.
I think that anyway it isn't possible to play such a big game in network mode even though without this problem.

The question is which counter is reasonable.

prissi

I did some testing with this and yoshis game and I think I reasonable balanced it. Further tries are appreaciated.

knightly

#52
@z9999 : Thank you very much for testing my patch :)

@Prissi : Thanks a lot for considering my patch and adopting it after adaptation.

As for Gerw's suggestion to combine multiple searches for each individual packet into a single search, I think he is right that such combined search can reduce search time significantly, by eliminating the need to rebuild search trees from scratch for each individual packet. However, I think search time cannot be reduced to such a short time as entailed by the single longest individual search, because each examined halt needs to be compared against a list of outstanding ware packets' destination halts to see if there is a match. With a large game like Colin's, some halts may have over 900 pax packets, so even a combined search have to take quite some time to finish, still causing unresponsiveness problem.

I think Prissi's current system has successfully distributed search time relatively evenly across a reasonable number of steps. Nevertheless, combined search can still be useful in this system : it is possible to determine the number of packets to perform path search as a batch for the current category, using Gerw's combined search routine. If this can successfully reduce search time by a large percentage, we can consider increasing units_remaining so as to shorten the lifecycle of a global reroute.

To be more concrete, here is a code block from haltestelle_t::reroute_goods() :
Quote
         if(waren[last_catg_index]  &&  waren[last_catg_index]->get_count()>0) {

            vector_tpl<ware_t> * warray = waren[last_catg_index];
            while(  last_ware_index<warray->get_count()  ) {

               if(  suche_route( (*warray)[last_ware_index], NULL, false )==NO_ROUTE  ) {
                  // remove invalid destinations
                  warray->remove_at(last_ware_index);
               }
               else {
                  last_ware_index++;
               }
               // break after a certain number of reroute actions
               if(  --units_remaining==0  ) {
                  return false;
               }
            }
            // now we are finisched with this array
            // => reset last_ware_index for the next categorie
         }
We can reroute as a batch a number of packets, determined by the smaller of (1) units_remaining and (2) (warray->get_count() - last_ware_index).

Colin

My apologies to Prissi. I did not realise that it was mainly your patch that fixed my game. Thank you mate.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

z9999

Quote from: prissi on September 21, 2009, 04:32:59 PM
I did some testing with this and yoshis game and I think I reasonable balanced it.

yoshi's game is only 1 player and amount of waiting people are large.
Assume munti player and small amount of waiting people but poor PC.

I tested another game.

player 1: 9 halts used 1 step
public: 252 halts used 28 steps
player 3: 77 halts used 3 steps
player 4: 114 halts used 3 steps
player 5: 95 halts used 3 steps
player 6: 245 halts used 4 steps
player 7: 25 halts used 1 step
player 8: 104 halts used 2 steps
player 9: 31 halts used 1 step
sum of halts is 952 halts

Halts per steps
older version: 952/32steps=30 halts/step
prissi's patch: 9+252/28+77/3+114/3+95/3+245/4+25+104/2+31=284 halts/step

Every step reculc 284 halts in your patch.

Colin

If anyone is interested I've just uploaded my latest SVE using r2650 to the Simutrans File Sharing site. http://simutrans-germany.com/files/upload/Colin's%20Standard%202650.sve

Can anyone tell me why my uploads always have a Red 'File Delete' where the uploaders name usually goes?
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

gerw

Quote from: Knightly on September 21, 2009, 06:13:47 PM
As for Gerw's suggestion to combine multiple searches for each individual packet into a single search, I think he is right that such combined search can reduce search time significantly, by eliminating the need to rebuild search trees from scratch for each individual packet. However, I think search time cannot be reduced to such a short time as entailed by the single longest individual search, because each examined halt needs to be compared against a list of outstanding ware packets' destination halts to see if there is a match. With a large game like Colin's, some halts may have over 900 pax packets, so even a combined search have to take quite some time to finish, still causing unresponsiveness problem.
If you sort the destinations of the 900 packets, you can use binary search and you will need at most log(900) = 10 comparisons each step (Maybe a hash table is even better?). Further, you can kick out the packets, for which the route is already found. Imho, it takes maybe the time of routing 5 packets (with long route).

knightly

@ Colin :

Quote from: Colin on September 21, 2009, 09:21:21 PM
Can anyone tell me why my uploads always have a Red 'File Delete' where the uploaders name usually goes?

Once you have logged on to the file sharing service, files that you have previously uploaded will show a Delete link so that you can remove them if you want. That doesn't mean that those files have been deleted.


@ Gerw :

Quote from: gerw on September 22, 2009, 07:25:58 AM
If you sort the destinations of the 900 packets, you can use binary search and you will need at most log(900) = 10 comparisons each step (Maybe a hash table is even better?). Further, you can kick out the packets, for which the route is already found. Imho, it takes maybe the time of routing 5 packets (with long route).

I see. Look forward to your patch then :)

gerw

Quote from: Knightly on September 22, 2009, 08:17:26 AM
I see. Look forward to your patch then :)
I don't feel like patching now. I've got not much time this (and the next 2) weeks and I will wait until prissi has finished his work. Otherwise, I have to rewrite my patch thousand times... Moving the rerouting to haltestelle_t is a good idea, but I don't like the idea of splitting, since Imho my suggestion is more elegant.

And it would also be possible to process not a fixed number of stops (or with the splitting: packets), but as many as 100 ms are gone, isn't it?

prissi

Actually colins game is a little atypical, since usually never more than a few tens of packets are waiting in any stop for all the large savegames I ever got. Usually the RESCHEDULING takes more time, since one has to loop over a lot of convois and lines. Therefore building a search tree most likely will take longer than the actual rescheduling; also most stations have also in colins game very few people waiting.

Spike

Maybe, for stations above some limit, newly arriving packets should be discarded from the game.

gerw

Quote from: prissi on September 22, 2009, 09:58:28 AM
Usually the RESCHEDULING takes more time, since one has to loop over a lot of convois and lines.
Maybe the lineless convoys and lines should also be cached at the stations?

Quote from: prissi on September 22, 2009, 09:58:28 AM
Therefore building a search tree most likely will take longer than the actual rescheduling; also most stations have also in colins game very few people waiting.
You don't need to build a search tree. You could take the code from suche_route, but in every step you test if the halt is the destination of any packet waiting there.

Colin

Quote from: prissi on September 22, 2009, 09:58:28 AM
also most stations have also in colins game very few people waiting.

Thats probably because I try to provide enough transport for everyone. Mind you I just picked a station at random and it had 136231 passengers waiting. Isn't that quite a lot?
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

z9999

Result of r2658. It seems to be good.
First 3 games are same savegame which were tested before, last 2 are new.

Colin's savegame (551 convois, 1187 halts)
r2650 rebuild_destinations()=112, reroute_goods()=212, sum=324
r2658 rebuild_destinations()=20, reroute_goods()=488, sum=508

2nd savegame (3 players, 6671 convois, 1193 halts)
r2650 rebuild_destinations()=94, reroute_goods()=246, sum=340
r2658 rebuild_destinations()=20, reroute_goods()=373, sum=393

3rd savegame (9 player, 870 convois, 952 halts)
r2650 rebuild_destinations()=27, reroute_goods()=28, sum=55
r2658 rebuild_destinations()=14, reroute_goods()=42, sum=56

regbombay1's London_C.sve (4297 convois, 8577 halts)
r2658 rebuild_destinations()=135, reroute_goods()=117, sum=252

fuzion_051's The World 3.sve (2791 convois, 839 halts)
r2658 rebuild_destinations()=70, reroute_goods()=368, sum=438

prissi

If they are for pak64 or some other plain vanilla pak (no addons etc) I would be very interested in your testgames.

z9999

Unfortunately, each savegame needs ton of addons, sorry.

Colin

r2664 SDL has done it for me. The game is running 99.99% more efficiently. I doubt if you'll get it any better, but it's amazing what you can do when you put your minds to it. Thanks a million guys, I'm enjoying the game again. :D :D :D. I don't use GDI so I can't comment on that.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

knightly

#67
Hi Prissi,

I have created another patch, based on r2682 and relative to the trunk folder, which speeds up route searching. The major changes are :

(1)
The check for destination halt is moved inside the loop which iterates over reachable halts of the current halt. It helps to find the destination halt earlier on and skip some unnecessary iterations and checking.

(2)
A per-ware-category counter for each halt is added, which records the number of lines/lineless convoys serving that halt for a particular ware category. This info is then used in suche_route(). Only origin halt, transfer halts and destination halt will be added to the HNode array.

I have performed some tests by rebuilding all halts' reachable destinations and rerouting all halts' goods in one step, recording the time used. Below are the results (the figures are in milliseconds) :



Before applying the patch :

**** Colin
Rebuilding destinations takes :  174
Rerouting goods takes :  43746
Complete refresh takes :  43920

**** Yoshi
Rebuilding destinations takes :  5063
Rerouting goods takes :  2039
Complete refresh takes :  7102

**** 1982
Rebuilding destinations takes :  40
Rerouting goods takes :  332
Complete refresh takes :  372

**** Wipi
Rebuilding destinations takes :  8
Rerouting goods takes :  67
Complete refresh takes :  75

**** Ferndale
Rebuilding destinations takes :  3
Rerouting goods takes :  20
Complete refresh takes :  23

       

After applying the patch :

**** Colin
Rebuilding destinations takes :  170
Rerouting goods takes :  4172
Complete refresh takes :  4342

**** Yoshi
Rebuilding destinations takes :  5114
Rerouting goods takes :  684
Complete refresh takes :  5798

**** 1982
Rebuilding destinations takes :  34
Rerouting goods takes :  145
Complete refresh takes :  179

**** Wipi
Rebuilding destinations takes :  7
Rerouting goods takes :  24
Complete refresh takes :  31  

**** Ferndale
Rebuilding destinations takes :  2
Rerouting goods takes :  3
Complete refresh takes :  5


From the above, it seems that rebuilding destinations only takes a little more time, but the reduction in rerouting time is quite significant. You may wonder why such reduction is especially larger in Colin's save game : it is because the proportion of transfer halts is lower in his game, and thus can benefit more from this patch.

Further testing is welcome.




Incidentally, I have removed the following cleanup code from suche_route() :
Quote
   /* Need to clean up?
    * Otherwise we just incease the mark => less time for cleanups
    */
   if(  current_mark == 0xFFFFFFFFu  ) {
      slist_iterator_tpl<halthandle_t > halt_iter (alle_haltestellen);
      while(  halt_iter.next()  ) {
         halt_iter.get_current()->marke = 0;
      }
      current_mark = 0;
   }
I think it is not necessary to reset the markers, as suche_route() only checks the inequality between the current marker and a halt's marker.

Besides, I have renamed some variables to better reflect their functions to enhance code readability.

Knightly

prissi

Well, apart from the servicable_counter: Could the purpose not be fulfilled by the warenziele array anyway? If this is empty (or rather it will has only one entry) then this halt is only connected to us.

I.e. only adding when reachable_halt->get_warenziele[ware_catg_index].get_count() > 1 ? Since hat gehalten just adds to warenziele anyway.

knightly

Prissi,

The presence of reachable halts in warenziele does not indicate whether a halt is a transfer halt or not. A halt may have some entries in warenziele, but they may come from a single line's schedule. A transfer halt needs to have at least 2 lines/lineless convoys serving it, for a certain ware category.

Knightly

z9999

Interesting. I tested only reroute_goods().
Most interesting thing is London_C.sve with max_hops 8000.
It's really the same as max_hops 2000.

2nd savegame (3 players, 6671 convois, 1193 halts)
standard 9799 ms
Knightly 3885 ms

3rd savegame (9 player, 870 convois, 952 halts)
standard 677 ms
Knightly 221 ms

regbombay1's London_C.sve (4297 convois, 8577 halts)
max_hops 8000
standard 37259 ms
Knightly 4548 ms
max_hops 2000
standard 5517 ms
Knightly 4534 ms

fuzion_051's The World 3.sve (2791 convois, 839 halts)
standard 7979 ms
Knightly 4476 ms

knightly

Z9999 :

Thank you very much for testing. :)

Quote from: z9999 on September 28, 2009, 05:34:04 PM
Most interesting thing is London_C.sve with max_hops 8000.
It's really the same as max_hops 2000.

As I have mentioned above, only origin halt, transfer halts and destination halt will be added to the HNode array. Thus, the number of HNodes used in a single search is largely reduced. Judging from your figures :

Quote from: z9999 on September 28, 2009, 05:34:04 PM
regbombay1's London_C.sve (4297 convois, 8577 halts)
max_hops 8000
standard 37259 ms
Knightly 4548 ms
max_hops 2000
standard 5517 ms
Knightly 4534 ms

I think all or most route searches managed to complete within the 2000 max_hops limit. Probably the proportion of transfer halts is rather low.

Knightly

z9999

Additional information. The percentage of transported passengers.

regbombay1's London_C.sve (4297 convois, 8577 halts)
max_hops 2000
standard 9%
Knightly 83%

:award:

knightly

I think the size of the HNode array can be reduced (by half?) to compensate for the additional memory consumed by the service_counters.

Colin

Here is my latest STD SVE you can see the improvement that has been made also please notice thatthe game has gone from 470mil in the red to 10mil in the black.

Thanks to gwer/Prissi/Knightly and everyone else who contributed to making this game playable. If I didn't mention your name I'm sorry, but I do know who you are, Fred, er, Joe, No its Vil something or was it Z9***.

Anyway Thanks

http://simutrans-germany.com/files/upload/Colin's Standard 2682.sve
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

knightly

I have attached an updated patch, also against r2682 and relative to the trunk folder. It includes 2 bugfixes and reduces the size of HNode array by half.

Colin

Knightly kindly sent me a copy of the GDI version of his patch and I'm here to tell you, the flow of my game improved beyond all expectations. The traffic (all modes) flows smoothly, there is no jerkiness of movement and no pauses at Auto Save.

As I pointed out to Knightly, GDI version has never worked with larger maps on my computer. I've always used SDL.  With his patch I can now use GDI.

Anyone who has downloaded my latest SVE should try it with this patch.

Thanks Knightly.
I may not agree with what you say, but I will defend to the death your right to say it

Thought for the day

When you are up to your backside in alligators, it is difficult to remind yourself that your initial objective was to drain the swamp.

prissi

Only the service counter is not included, since to me it seems like the warenziel-list does essentially the same.

knightly

#78
Prissi,

Quote from: prissi on October 04, 2009, 07:09:56 PM
Only the service counter is not included, since to me it seems like the warenziel-list does essentially the same.
Sorry, but I don't think so.

A halt X served by a single pax line which has a schedule containing, say, 8 halts, will end up having 7 halts in warenziele[0] for pax. That fulfils your criteria :
Quote from: prissi on September 28, 2009, 11:07:18 AM
I.e. only adding when reachable_halt->get_warenziele[ware_catg_index].get_count() > 1 ? Since hat gehalten just adds to warenziele anyway.
but X is definitely *not* a transfer halt. A transfer halt is where ware packets can switch lines/convoys. If you think that I am wrong on this, please state the reasons.

The concept of transfer halt is the *gist* of my patch, and distorting it will largely diminish the usefulness of this patch. My centralised search in ST EXP also focuses on transfer halts, and it works fine for a long time without any problem, and has significantly reduced search time. I can't see why you have to insist on removing the service counters (which is actually the core of my patch), unless it really takes up a lot of space or processing time per se.

However, the addition of service counters only consumes a little bit more of memory. 1000 halts with 16 ware categories will only use an additional (16 categories  + 4 byte pointer) x 1000 halts = ~20KB. And the additional processing time for clearing and incrementing the service counters is relatively small, according to my test results. You may perform profiling to be accurate.

Knightly

prissi

But your patch, if I have two convois with the same schedule, then it will be too larger than 1 ... Or did I mistake it completely? But it would be still an improvement for any case without this.

I will look at it again.

knightly

#80
Quote from: prissi on October 04, 2009, 09:44:19 PM
But your patch, if I have two convois with the same schedule, then it will be too larger than 1 ... Or did I mistake it completely? But it would be still an improvement for any case without this.

You are right, but only if both convoys are lineless convoys. I think most players will group 2 lineless convoys with identical schedules into a single line for easier management, so this is not much of a problem in general.

You can solve this problem by modifying hat_gehalten() to return a boolean value which indicates whether any halt has been added to warenziele in this function call, and increment the service_counter[catg] in rebuild_destinations() only if that return value is true. But I doubt if the additional processing time justifies the gain in general.

BTW, do you mind performing profiling on my patch? I am interested to know how much faster suche_route() is, and how much slower rebuild_destinations() is, after the modifications. Many thanks in advance! :)

Edit 1 :

To make sure that you understand the difference between using service_counter[catg] and your suggested check :
Quote from: prissi on September 28, 2009, 11:07:18 AM
reachable_halt->get_warenziele[ware_catg_index].get_count() > 1
I have attached a simple diagram (1st attachment) to illustrate it. Assuming only pax are involved. The letters are the names of the halts, and the numbers in bracket refer to the number of halts in warenziele[0]. The blue circle is served by Line P (A>B>C>D>E), and the green circle by Line Q (E>F>G>H>I). In this setup, only Halt E is a transfer halt.

With my patch, only Halt E will be identified as a transfer halt, since there are 2 lines (P & Q) serving it (i.e. service_counter[0] > 1). Searching route from A to H will only add origin halt A, transfer halt E and destination halt H to the HNode array.

But with your suggestion, all halts satisfy your criteria (i.e. warenziele[0] > 1), thus any intermediate non-transfer halts may also be added to HNode unnecessarily during the search. In this particular case, nodes are added for A, B, C, D, E, F, G, H during the search from A to H, wasting processing time -- the waste is not just on adding the nodes, but also on additional processing time involved in processing that node afterwards.

As you can see, my original code and your suggestion are very different, and your suggestion will largely reduce the effectiveness of my patch in reducing search time. Service_counter[catg] is the cornerstone of my patch, and it is necessary to make my patch work as intended. So, unless you can find a better way to determine transfer halt, please do not remove service_counter from my patch.

As a sidenote, during my development of ST EXP's centralised search, I realised that most games usually have only 50% or less of its halts as transfer halts (of course, there are exceptions, depending on player's strategies). That is the reason why focusing on transfer halts in suche_route() can significantly reduce search time.

Edit 2 :

Just in case you are not sure why we don't need to add non-transfer halts to the HNode array, please take a look at the 2nd attached image. In this diagram, there is only 1 single line serving halts A to E.

When searching for the route from A to E, A as the origin halt will first be added to the HNode array. Then, A's reachable halts (namely, B, C, D and E) are checked, one by one : with B, it is checked against the destination halt to see if we have found the destination, but no, this is not the destination; next, check if it is a transfer halt, but no, so no more processing is needed.

Rationale : if B is a non-transfer halt, there is only a single line/lineless convoy serving it; if current node A has B as its reachable halt, then A must also be served by the same line/lineless convoy, which in turn means that B's reachable halts are included in A's reachable halt's list (except A of course) and will be checked while processing A's reachable halts. Thus, there is no need to add B to HNode array, and no need to check B's reachable halts.

z9999

Quote from: prissi on October 04, 2009, 09:44:19 PM
But your patch, if I have two convois with the same schedule, then it will be too larger than 1 ...

Even so, nothing worse than current bahavior.

In current simutrans, search priority of hub station is very low, and passengers can't reach their destination. That causes poor game flow, if player set high max_hops.

Quote from: z9999 on September 28, 2009, 06:17:08 PM
Additional information. The percentage of transported passengers.

regbombay1's London_C.sve (4297 convois, 8577 halts)
max_hops 2000
standard 9%
Knightly 83%

In current simutrans, I can't play regbombay1's London_C.sve withmax_hops 8000 because of Re: poor game flow. But max_hops 2000 is not fun, bevause only 9% of passengers can travel.

@Knightly
Can I use your patch for any fork version of simutrans, if this was denied ?

prissi

It will be incorporated. I will probably just get this a more descriptive name (like number_of_non_unique_schedules or so ... )

knightly

@ Z9999

Quote from: z9999 on October 05, 2009, 12:45:13 PM
Can I use your patch for any fork version of simutrans, if this was denied ?

Sure, no problem :D Please take the liberty to use it anywhere you like.


@ Prissi

Quote from: prissi on October 05, 2009, 02:20:19 PM
It will be incorporated. I will probably just get this a more descriptive name (like number_of_non_unique_schedules or so ... )

You mean "service_counter"? Actually I am not sure how it should be named. "service_counter" is the best I can think of, meaning "serviced by how many lines/lineless convoys", but I admit that it is not ideal ... If you can think of a better yet concise description, please feel free to change it. ;)

z9999


knightly

Prissi,

A suggestion regarding connections_searched in rebuild_destinations() : instead of incrementing this variable by 1 per line/lineless convoy, we can increment it by fpl->get_count(). I propose this because players seem to vary a lot as to the size of their schedules : some prefer to have 7-8 halts per schedule, but others may prefer >30 halts. Longer schedules take longer to process, thus by using fpl->get_count(), you can better control the pace of rebuilding destinations. Of course, your formula for decrementing units_remaining in haltestelle_t::step() needs to be changed accordingly.

Knightly

prissi

That is a valid suggestion. ALso it came to my mind, that it is very easy to check, whether a schedule added something to a connection or not, i.e. whether there are more than one identical schedule: The counter of warenziele is large after addition. WIll go to implement that and your patch tonight I think.

knightly

Quote from: prissi on October 06, 2009, 06:02:08 PM
The counter of warenziele is large after addition.

Good idea. A simple check like this should do.

For games like Yoshi's with numerous lineless convoys, I agree with Gerw that the only way to reduce the time involved in rebuilding destinations is to register them at the halts like registering the lines. Do you think it is feasible?

prissi

#88
Yoshis game is from an 87 version several years ago where line management was more clumsy ... and it does not contains so much lineless convois. I think this could be deferred, but if you want to have a go at it. Certainly it may give advantage; but then convois need to handle changes from and to lines even more carefully.

EDIT:
I did profiling, but the differences are small; moreover, most improvement seems from less calls to halthandle_t::is_bound() and halthandle_t(). But as z9999 said, I think most people actually finding a route is more important.

knightly

Quote from: prissi on October 06, 2009, 06:32:11 PM
I did profiling, but the differences are small; moreover, most improvement seems from less calls to halthandle_t::is_bound() and halthandle_t(). But as z9999 said, I think most people actually finding a route is more important.

Do you mind posting the profiling results? Many thanks.

prissi

Well, they are about 12MB, and the change is not seen easily. But here you go, everything up to 0.5% contribution, no calling stack:


Flat profile: (improved system)

Each sample counts as 0.01 seconds.
 %   cumulative   self              self     total          
time   seconds   seconds    calls   s/call   s/call  name    
 9.29     19.95    19.95    16647     0.00     0.00  karte_t::sync_step(long, bool, bool)
 5.83     32.47    12.52  1529605     0.00     0.00  haltestelle_t::suche_route(ware_t&, koord*, bool)
 5.00     43.20    10.73  5044991     0.00     0.00  display_img_nc(short, short, short, unsigned short const*)
 3.82     51.41     8.21  1869586     0.00     0.00  haltestelle_t::hole_ab(ware_besch_t const*, unsigned int, schedule_t const*, spieler_t const*)
 2.97     57.80     6.39 1460434544     0.00     0.00  vector_tpl<ware_t>::get_count() const
 2.15     62.42     4.62  2094715     0.00     0.00  hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::put(sync_steppable*, sync_steppable*)
 2.12     66.97     4.55 940084741     0.00     0.00  vector_tpl<ware_t>::operator[](unsigned int)
 2.03     71.34     4.37 696776484     0.00     0.00  quickstone_tpl<haltestelle_t>::is_bound() const
 1.87     75.35     4.01 192406260     0.00     0.00  ding_t::get_flag(ding_t::flag_values) const
 1.82     79.27     3.92 130461686     0.00     0.00  planquadrat_t::get_boden_in_hoehe(short) const
 1.75     83.02     3.75 549425413     0.00     0.00  slist_iterator_tpl<hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::node_t>::next()
 1.69     86.66     3.64 32334462     0.00     0.00  convoi_t::sync_step(long)
 1.67     90.25     3.59 233174821     0.00     0.00  grund_t::get_hoehe() const
 1.67     93.83     3.58 574663478     0.00     0.00  quickstone_tpl<haltestelle_t>::operator==(quickstone_tpl<haltestelle_t> const&) const
 1.65     97.38     3.55 152483261     0.00     0.00  dingliste_t::bei(unsigned char) const
 1.64    100.91     3.53 937711102     0.00     0.00  quickstone_tpl<haltestelle_t>::operator->() const
 1.54    104.22     3.31 240439540     0.00     0.00  karte_t::ist_in_kartengrenzen(short, short) const
 1.52    107.49     3.27 18540007     0.00     0.00  convoi_t::calc_acceleration(long)
 1.48    110.66     3.17 139927198     0.00     0.00  grund_t::get_weg(waytype_t) const
 1.43    113.73     3.07 176234657     0.00     0.00  vehikel_basis_t::fahre_basis(unsigned int)
 1.33    116.59     2.86 487695305     0.00     0.00  koord::koord(short, short)
 1.20    119.17     2.58 249793805     0.00     0.00  operator==(koord const&, koord const&)
 1.20    121.75     2.58 62523573     0.00     0.00  vehikel_t::get_gesamtgewicht() const
 1.19    124.30     2.55 54759565     0.00     0.00  quickstone_tpl<simline_t>::is_bound() const
 1.09    126.64     2.34 359156267     0.00     0.00  koord3d::get_2d() const
 1.04    128.87     2.23 32300103     0.00     0.00  convoi_t::step()
 0.95    130.91     2.04 67751873     0.00     0.00  grund_t::get_vmove(koord) const
 0.94    132.93     2.02 82602392     0.00     0.00  ding_t::is_moving() const
 0.90    134.86     1.94 16125608     0.00     0.00  dingliste_t::remove(ding_t const*)
 0.83    136.65     1.79 648278617     0.00     0.00  vector_tpl<quickstone_tpl<haltestelle_t> >::operator[](unsigned int) const
 0.75    138.27     1.62 89449878     0.00     0.00  fussgaenger_t::sync_step(long)
 0.69    139.76     1.49 665227174     0.00     0.00  vector_tpl<quickstone_tpl<haltestelle_t> >::get_count() const
 0.60    141.04     1.28 28862148     0.00     0.00  stadtauto_t::sync_step(long)
 0.55    142.23     1.19 174167252     0.00     0.00  hashtable_iterator_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::next()
 0.54    143.40     1.17 240375956     0.00     0.00  karte_t::lookup(koord) const
 0.54    144.55     1.15 143737674     0.00     0.00  obj_besch_t::get_child(int) const
 0.53    145.69     1.14   133776     0.00     0.00  route_t::intern_calc_route(karte_t*, koord3d, koord3d, fahrer_t*, unsigned int, unsigned int)
 0.52    146.80     1.11 13741184     0.00     0.00  haltestelle_t::get_ware_summe(ware_besch_t const*) const




Flat profile: (previous system)

Each sample counts as 0.01 seconds.
 %   cumulative   self              self     total          
time   seconds   seconds    calls   s/call   s/call  name    
 9.45     21.54    21.54  1483245     0.00     0.00  haltestelle_t::suche_route(ware_t&, koord*, bool)
 8.50     40.93    19.39    17067     0.00     0.00  karte_t::sync_step(long, bool, bool)
 4.95     52.22    11.29  5431731     0.00     0.00  display_img_nc(short, short, short, unsigned short const*)
 4.43     62.31    10.10 1645188374     0.00     0.00  quickstone_tpl<haltestelle_t>::is_bound() const
 2.85     68.81     6.50 1898991967     0.00     0.00  quickstone_tpl<haltestelle_t>::operator->() const
 2.03     73.44     4.63 1596670257     0.00     0.00  vector_tpl<quickstone_tpl<haltestelle_t> >::operator[](unsigned int) const
 1.89     77.77     4.32  2050348     0.00     0.00  hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::put(sync_steppable*, sync_steppable*)
 1.89     82.06     4.30 963837657     0.00     0.00  vector_tpl<ware_t>::get_count() const
 1.89     86.36     4.30 132048612     0.00     0.00  planquadrat_t::get_boden_in_hoehe(short) const
 1.82     90.52     4.16  1817492     0.00     0.00  haltestelle_t::hole_ab(ware_besch_t const*, unsigned int, schedule_t const*, spieler_t const*)
 1.71     94.41     3.89 188892750     0.00     0.00  ding_t::get_flag(ding_t::flag_values) const
 1.69     98.27     3.86 155337440     0.00     0.00  dingliste_t::bei(unsigned char) const
 1.66    102.06     3.79 1666172006     0.00     0.00  vector_tpl<quickstone_tpl<haltestelle_t> >::get_count() const
 1.65    105.82     3.76 237458275     0.00     0.00  grund_t::get_hoehe() const
 1.55    109.36     3.54 32458683     0.00     0.00  convoi_t::sync_step(long)
 1.51    112.80     3.44 246075434     0.00     0.00  karte_t::ist_in_kartengrenzen(short, short) const
 1.51    116.24     3.44 141813284     0.00     0.00  grund_t::get_weg(waytype_t) const
 1.50    119.65     3.41 528617071     0.00     0.00  slist_iterator_tpl<hashtable_tpl<sync_steppable*, sync_steppable*, ptrhash_tpl<sync_steppable*> >::node_t>::next()
 1.48    123.02     3.37 172490871     0.00     0.00  vehikel_basis_t::fahre_basis(unsigned int)
 1.46    126.36     3.34 18589421     0.00     0.00  convoi_t::calc_acceleration(long)
 1.32    129.37     3.01 496337298     0.00     0.00  koord::koord(short, short)
 1.16    132.01     2.65 32234028     0.00     0.00  convoi_t::step()
 1.13    134.58     2.57 54660352     0.00     0.00  quickstone_tpl<simline_t>::is_bound() const
 1.10    137.08     2.50 367527603     0.00     0.00  koord3d::get_2d() const
 1.07    139.52     2.44 68549744     0.00     0.00  grund_t::get_vmove(koord) const
 1.04    141.90     2.38 252639444     0.00     0.00  operator==(koord const&, koord const&)
 0.98    144.13     2.23 577424393     0.00     0.00  vector_tpl<ware_t>::operator[](unsigned int)
 0.97    146.35     2.22 62518948     0.00     0.00  vehikel_t::get_gesamtgewicht() const
 0.94    148.49     2.14 83061006     0.00     0.00  ding_t::is_moving() const
 0.78    150.27     1.78 291959379     0.00     0.00  quickstone_tpl<haltestelle_t>::operator==(quickstone_tpl<haltestelle_t> const&) const
 0.78    152.04     1.77 85600845     0.00     0.00  fussgaenger_t::sync_step(long)
 0.78    153.81     1.77 15981170     0.00     0.00  dingliste_t::remove(ding_t const*)
 0.70    155.40     1.59 69182787     0.00     0.00  haltestelle_t::get_warenziele(unsigned char) const
 0.62    156.81     1.41 28987517     0.00     0.00  stadtauto_t::sync_step(long)
 0.60    158.18     1.37 246011854     0.00     0.00  karte_t::lookup(koord) const


Indeed, suche_route is much faster, it is no longer the most time consuming subroutine any more. I think the next release will be a big step towards playing larger maps or better perfomance on netbooks.

knightly

#91
I have written another patch to speed up suche_route() a bit when checking those already-processed halts. Although we now only check transfer halts, transfer halts also have reachable halts which have already been processed. Originally, halthandle boundness is checked before checking marke, meaning that 2 checks are required. But the boundness check is actually unnecessary for these already-processed halt entries.

This patch moves marke out of haltestelle_t, and consolidates them into a static array. In this way, we can check the marker first, and skip the boundness check if this halt has already been processed before.

For small games, the effect of this patch is not noticeable; but for large games like Colin's and Yoshi's, there is a noticeable reduction in search time. In general, this patch will have a greater effect on games with a higher proportion of transfer halts.

The static array consumes 262KB of memory, which is the amount of memory saved by reducing the size of the HNode array by half in the previous patch.

Here are some testing results :



Before applying the patch :

**** Colin - Finished ****
Rebuilding destinations takes :  112 ms
Rerouting goods takes :  7487 ms
Complete refresh takes :  7599 ms

**** Yoshi ****
Rebuilding destinations takes :  4963 ms
Rerouting goods takes :  616 ms
Complete refresh takes :  5579 ms

**** 1982 ****
Rebuilding destinations takes :  40 ms
Rerouting goods takes :  149 ms
Complete refresh takes :  189 ms

**** Wipi ****
Rebuilding destinations takes :  9 ms
Rerouting goods takes :  19 ms
Complete refresh takes :  28 ms

**** Ferndale ****
Rebuilding destinations takes :  2 ms
Rerouting goods takes :  5 ms
Complete refresh takes :  7 ms

       

After applying the patch :

**** Colin - Finished ****
Rebuilding destinations takes :  115 ms
Rerouting goods takes :  6855 ms
Complete refresh takes :  6970 ms

**** Yoshi ****
Rebuilding destinations takes :  4945 ms
Rerouting goods takes :  528 ms
Complete refresh takes :  5473 ms

**** 1982 ****
Rebuilding destinations takes :  40 ms
Rerouting goods takes :  147 ms
Complete refresh takes :  187 ms

**** Wipi ****
Rebuilding destinations takes :  8 ms
Rerouting goods takes :  18 ms
Complete refresh takes :  26 ms

**** Ferndale ****
Rebuilding destinations takes :  2 ms
Rerouting goods takes :  3 ms
Complete refresh takes :  5 ms


Note : Colin's save game is different from the one used previously.

Further testing is welcome.

prissi

This will certainly perform different on different processors, the longer the pipeline, the better.

I am quite torn on this; even in your old patch you used an unchecked array, which was marginally ok, since it was only accessed with an predefined counter. However, for a large project like simutrans, I would like to keep objects together. The mark of a station belongs imho to the station.

Of curse, that is only my view. But there were many more places liek the different tree types. The are accessed via a 9 bit variable. Nevertheless I put them into a vector, which will check boundness at every access, to catch a broken register or some other programm error before really bad things happen. Even if this costs same percent performance.

People tend to forget, what things belong together. Keeping things together with their data was imho the main idea of OOP. Maybe I have become to fundamentalistic in my views. But I am certainly open to challenge my optionions by arguments.

knightly

#93
Prissi,

Quote from: prissi on October 07, 2009, 07:08:27 PM
This will certainly perform different on different processors, the longer the pipeline, the better.

Most probably you are right on this. Pipeline issue aside, this patch will avoid many unnecessary halthandle boundness checks. More specifically, each halt's handle will be checked for boundness for no more than once in a single suche_route() call.


Quote from: prissi on October 07, 2009, 07:08:27 PM
I am quite torn on this; even in your old patch you used an unchecked array, which was marginally ok, since it was only accessed with an predefined counter.

If non_identical_schedules were to use vector, then I believe warenziele should also be a vector of vectors too, right? IMHO, if our code are written such that errors are already guarded against, why should we compel ourselves to use vectors to incur unnecessary overhead?


Quote from: prissi on October 07, 2009, 07:08:27 PM
However, for a large project like simutrans, I would like to keep objects together. The mark of a station belongs imho to the station.

I think this is more a point of view than hard fact. Extension buildings are part of a station, no doubt. But for markers, I think its sole purpose is to serve suche_route(), but not inherently part of a halt by definition. It's only working data for suche_route().


Quote from: prissi on October 07, 2009, 07:08:27 PM
People tend to forget, what things belong together. Keeping things together with their data was imho the main idea of OOP. Maybe I have become to fundamentalistic in my views. But I am certainly open to challenge my optionions by arguments.

I agree that things that go together should be grouped together in their respective classes. And I agree with the doctrines of OOP too. But sometimes we tend to assume that certain variables should belong to a certain class of objects; but if we think deeper, there may be no necessary connection between them.


As for the patch, I know in advance that you may not like the static array. Actually, if I do not need to initialise the array elements with zeros, I would have declared it as a static local array inside suche_route() like the HNode array, for I believe the markers are only working data of suche_route().


Edit :

I have made a small change to the patch (moving the init_markers() call out of an IF block in karte_t::init()). Updated patch attached.


Knightly

Spike

Quote from: prissi on October 07, 2009, 07:08:27 PM
Keeping things together with their data was imho the main idea of OOP. Maybe I have become to fundamentalistic in my views. But I am certainly open to challenge my optionions by arguments.

Quote from: Knightly on October 07, 2009, 08:44:38 PM
I agree that things that go together should be grouped together in their respective classes. And I agree with the doctrines of OOP too. But sometimes we tend to assume that certain variables should belong to a certain class of objects; but if we think deeper, there may be no necessary connection between them.

Trying to deliver arguments pro separate markers:

If the lifetime of data differs much, I think this is a sign that they are independent, and can/should be kept independently.

Stations have a lifetime of program runtime, the markers only for a search.

Also, if we think about threading, such marks must not be part of the station data, otherwise multi-threaded searches are not possible.

prissi

Looking at the code, I feel like it would make sense to make the markers part of halthandle_t, i.e. derive a class from quckstone_tpl. That way also the initialisation would be automatically correct. Even more, the marked and mar test could be then part of the more intelligent halthandle too. But I would like to here you opinion.