News:

Congratulations!
 You've won the News Item Lottery! Your prize? Reading this news item! :)

New "stacked_stations" branch to allow for different players to stack stations

Started by neroden, July 06, 2013, 09:00:33 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

neroden

So, in a fit of madness, I got distracted by a couple of annoyances in the existing code, and I ended up making a *huge* patch.

This is on the "stacked_stations" branch.

Basically, I figured out how to move the "halt" handle from planquadrat_t into grund_t.  This will slow down lookups a little, probably, due to one layer of pointer indirection.  But crucially, it means that we can finally allow two different players to have stations on top of each other.  (I thought I hadn't actually allowed that, but I just tested and apparently I did actually allow it.  Hmm!)

This allows for certain 19th century London Underground layouts which ought to have been possible before but weren't.  (Central Line station over the Waterloo & City line station, owned by different companies, etc.)

Interestingly a large amount of the code actually got shorter; it turns out a lot of the code which was looking up the "halt" had to look up the 3d location anyway.

I can probably get a version of this ready for standard if anyone is interested.

kierongreen

Are stations on different levels owned by the same player automatically linked still?

jamespetts

Very interesting! Kieron's question is pertinent - they probably should not be linked if they are owned by different players, but should be if they are owned by the same player.

Also, would you be able to profile this to see whether the slowdown is significant before a final decision is made as to whether to put this into the next release? It would be helpful if you could use the large Bridgewater-Brunel saved game to which I have linked before.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

Such a patch has been attempted twice in the past, but it never was really running as expected.

One problem is, that such stations are not unique where pax should go for travel or where to send goods, i.e. there is not really a closer stations (or you use z coordinates as tie breaker). Also making halt part of grund_t could increase grund_t by 4 bytes, i.e. will increase the memory footprint by 16%. Which something so often acessed it mean more frequent cache misses and hence a performance impact.

jamespetts

My brief testing using a non-optimised build on the saved game from the Bridgewater-Brunel saved game shows no noticeable slowdown using the method of comparing fast forward speeds with another non-optimised build running the same saved game.

Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

That will depend on the size of the map. For sizes smaller than 40000x4000 and depending on your CPU you will see little difference. (But in MSVC the grund structure is anyway already 24 bytes, and hence might not be affected by adding halthandle to it). You can check by "-sizes" on the commandline.

jamespetts

Testing memory consumption just after loading the same saved game:

stacked-stations branch (debug build): 851,192k
devel-new branch (debug build): 835,996k

The map in question is 2048x1664

Why do you think that the difference will be greater above 40,000 x 4,000 (or is it 4,000 x 4,000)?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

With 2048*1664 increase should be 13 MB. With 4000x4000 it will be 64 MB (or about 10%). Then the two indirection levels must be added, since looking up stops is very frequent.

I would expect a drop in performance (depending on chace sizes and so on) or 5% or so. It will be larger, when more convois are present, as those obviously do the lookup. Ships may have also more impact than trains and so on.

Anyway, if one increases grund_t, on should think of a lot of other optimisations, which could be done then.

neroden

Quote from: jamespetts on July 06, 2013, 09:44:53 PM
Very interesting! Kieron's question is pertinent - they probably should not be linked if they are owned by different players, but should be if they are owned by the same player.
They most certainly are linked if owned by the same player.

neroden

Quote from: prissi on July 06, 2013, 10:05:59 PM
One problem is, that such stations are not unique where pax should go for travel or where to send goods, i.e. there is not really a closer stations (or you use z coordinates as tie breaker).
Ah yes.  This is an issue I didn't think about.  I'm not sure where the "passengers/mail/freight chooses what station to go to" code is in standard.  I'm not sure what will need to be done for standard.


In experimental, passengers and freight will consider both stations and pick the route which appears faster overall, so this should not be an issue for experimental.  In addition, in experimental, passengers and freight can "walk" between the two stacked stations in "zero time".  So in experimental this simply seems to work exactly the way it should. 

QuoteAlso making halt part of grund_t could increase grund_t by 4 bytes, i.e. will increase the memory footprint by 16%
No, this is wrong.

It does increase grund_t by 2 bytes (not 4 -- remember that halthandle_t is actually a uint16) , but it *DECREASES* planquadrat_t by 2 bytes at the same time. 

Therefore this increases the memory footprint only a little bit -- it only increases it for tiles which actually have multiple, stacked grounds, not for any other tiles.

The memory savings in planquadrat_t should be realized in both standard and in experimental. 

In experimental, however, there is something else going on: experimental stores a pointer to the city in every planquadrat_t.  This uses a huge amount of memory.  In addition, there's an alignment issue, meaning that the memory savings in planquadrat_t will probably not be realized unless we reorder the elements.   I want to replace the pointer to a city with a "cityhandle_t" based on quickstone_tpl, like halthandle_t, to reduce this field to 2 bytes.  I may have mentioned this before.

QuoteWhich something so often acessed it mean more frequent cache misses and hence a performance impact.
The main performance issue is the *single* extra pointer lookup indirection.  Formerly planquadrat_t was looked up, but each planquadrat_t was directly located in the "plan" array.  Now, rather than looking in planquadrat_t for the halthandle_t, planquadrat_t contains the grund_t pointer, while grund_t contains the halthandle_t. 

This is a single pointer indirection added, in the usual case. Most of the time, the grund_t needed to be looked up *anyway* -- in some functions it had, in fact, already been looked up, which will *save* a call to karte_t::lookup -- but for some of the uses grund_t did *not* need to be looked up, and this is the main source of added delay.

There will be more delay, and more memory usage, in locations where there actually *are* stacked grounds, but this will continue to be uncommon and should not cost a significant amount of time.

If the "stacked stations" patch is adopted, I believe a lot of extra optimization can be obtained by reordering the elements of grund_t to properly align all the datatypes on 64-bit or 32-bit boundaries, and to put things which are commonly accessed together into the same cacheline.

---
I would like to give credit to earlier programmers.  The reason this was so easy is that most of the work had already been done.  Almost all the code was already checking z-values and looking at grund_t, with only a little of the code (which looked only at planquadrat_t) needing work.

kierongreen

QuoteIt does increase grund_t by 2 bytes (not 4 -- remember that halthandle_t is actually a uint16) , but it *DECREASES* planquadrat_t by 2 bytes at the same time.
Adding a variable of size 2 bytes increases the structure size by 4 bytes to ensure it aligns with word boundaries though...

prissi

In simutrans standard with GCC adding one bit (or an unit16) will increase the size of a structure by 4 bytes. The definition of stuff is made in such a way that on x86 GCC the structures use all bytes available. Thus removing 2 bytes from planquadrat will not gain anything on GCC but will add 4 bytes to grund_t; but as simutrans-experimental is distributed by MSVC built, both structures are anyway 4 bytes larger. Hence the adding and removing of halthandle_t might work as described, i.e. decrease planquadrat without increase grund_t.