The International Simutrans Forum

 

Author Topic: Terrible performance changing undergroud view mode or slice level.  (Read 1692 times)

0 Members and 1 Guest are viewing this topic.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
A long running problem with both standard and extended is that using the underground view, underground slice tool or changing underground slices with the tool active performs absolutely abysmally. Sure it is ok on a tiny 128x128 map or so, but on a massive one like the 8,000x4,000 odd the extended server uses it is absolutely unusable. On the extended server it takes my computer 5-10 seconds per slice change, and another minute or so to catch up that lost time.

Clearly the underground tools are doing something with a complexity proportional to the map tile count. However this makes no sense since a change in view should only depend on currently visible tile count and not the entire world tile count.

So off I set trying to find the cause of the terrible performance, here it is!
Code: [Select]
void karte_t::update_underground()
{
DBG_MESSAGE( "karte_t::update_underground_map()", "" );
world_xy_loop(&karte_t::update_underground_intern, SYNCX_FLAG); // THIS LINE IS EXPENSIVE TO RUN
set_dirty();
}
Ok so it seems that to change underground view mode and slice level one needs to iterate through every tile on the map and perform non trivial logic on it. Ouch and certainly explains it.

If one comments out this line performance of the tool is usable on all maps, not even dropping frames. However one can see why the line is needed. Basically it recalculates all artificial cliff and water images to be suitable for underground view. Without this line there are visual bugs where cliffs and water does not match the current underground view or slice.

The solution? One needs to do what I described above and only calculate for tiles that are visible to the player. This must be done when tiles first pan or zoom into view, or for all visible tiles when first loading into a map or changing underground views. This will decrease the performance of panning and zooming a bit, but greatly increase the performance of loading into a map or changing underground view. Currently the performance of changing underground views is not acceptable.

A temporary solution would be a game setting players can turn on that disables this recalculation in exchange for visual artefacts. Underground views become useable on large maps at the cost of worse graphics.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #1 on: June 26, 2018, 10:57:06 AM »
Sometimes, I think panning can be slow enough already.

Looking at grund_t::calc_back_image, there are several functions that are called over and over again. Some of these contain quite complicated logic. I wonder if the compiler can be expected to understand that it can reuse the same return value.

Parts of the calc_back_image logic also appear to be unnecessary when changing underground level, such as draw_as_ding. If only there were places to store the original values for walls, they wouldn't have to be recalculated when undoing the effects of sliced view either. Maybe some of the logic could have been done during rendering, so that the values in grund_t didn't have to change.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4533
  • Languages: EN, DE, AT
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #2 on: June 26, 2018, 02:16:08 PM »
yes, true. But there are only so many people working on patches.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #3 on: June 27, 2018, 05:08:20 AM »
Quote
Sometimes, I think panning can be slow enough already.
The speed decrease would be trivial as updating the tiles is semi-trivial. It is only a problem currently because it updates all the tiles on the entire map at once. To put that in perspective on the extended server game that is ~32 million tiles it has to update all at once. Any piece of code becomes non trivial when run 32 million times. Although optimizing the calculation function will help (I will look into this later), it still does not make changing to underground view on large maps usable as it still has to perform the code for every tile all at once.

I will then look into how the view port code works. See how easy it would be to make it calculate when panning.

A third solution is asynchronous updating of the tiles when changing views. Currently it tries to update all tiles at once within a single frame. One could defer the updating of tiles to occur over many frames, say at a rate of 50,000 tiles per frame. This would mean the graphics would initially be wrong but then correct themselves over time. Trades graphic correctness for scalability and performance.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1320
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #4 on: June 27, 2018, 06:40:15 AM »
The solution? One needs to do what I described above and only calculate for tiles that are visible to the player. This must be done when tiles first pan or zoom into view, or for all visible tiles when first loading into a map or changing underground views.
Keeping track of which tiles have been updated would require yet more updates --> memory accesses which is the problem in the first place with all the routines that scan the whole map -> memory latency sucks.
Not keeping track and just calculating each frame might turn out faster...  i.e. calculations cheap, memory access bad.


Although optimizing the calculation function will help (I will look into this later), it still does not make changing to underground view on large maps usable as it still has to perform the code for every tile all at once.
The entire map also gets updated periodically for the snowline changes. I added a flag to the image updating routines to bypass many irrelevant checks when called for only snowline updates. Something similar might be possible for underground on/off as well.


A third solution is asynchronous updating of the tiles when changing views. Currently it tries to update all tiles at once within a single frame. One could defer the updating of tiles to occur over many frames, say at a rate of 50,000 tiles per frame. This would mean the graphics would initially be wrong but then correct themselves over time. Trades graphic correctness for scalability and performance.
IMHO that is horrible. Many 3D games defer texture and even object loading and plop you into a barren world that slowly fills in are you look around. Horrid. If not ready, then get busy and load, don't drop the player into an unfinished scene.

The snowline update does that ~50000 tiles per step update. The problem is on faster computers, it really should be a higher number, but then slower computers would hiccup. i.e. How to calibrate...

---
Underground mode changes were one of the initial routines to get the hacked on multi threading which makes a huge difference. It's also one of the multithreaded routines that does benefit from hyperthreading if you have that available on your CPU. For a 8192x4096 map, I get ~4 seconds for a slice change with 4 threads, ~3 seconds with 8. Note however the main display threads see a performance decrease (10-20%) with more threads than actual CPU cores, and that's running every frame...  One day the game needs a hyperthreading selection to run more threads on those routines that can benefit. (and a way to set thread count automatically to match peoples systems [in a platform portable way which is lacking]).


Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #5 on: June 27, 2018, 08:11:45 AM »
Quote
Keeping track of which tiles have been updated would require yet more updates --> memory accesses which is the problem in the first place with all the routines that scan the whole map -> memory latency sucks.
Not keeping track and just calculating each frame might turn out faster...  i.e. calculations cheap, memory access bad.
The calculations involve a lot of memory lookup for surrounding tiles. Hence if sensible tracking was used it might be faster.
Quote
he entire map also gets updated periodically for the snowline changes. I added a flag to the image updating routines to bypass many irrelevant checks when called for only snowline updates. Something similar might be possible for underground on/off as well.
Still should not be updating every single tile on the map for a visual only change...
Quote
For a 8192x4096 map, I get ~4 seconds for a slice change with 4 threads, ~3 seconds with 8.
This is not viable on the extended servers as for some reason thread counter effects game state. Hence one can only join a server when using the same number of threads as the server is using. James still has not given a good reason as to why this is the case...

Even with that, 3-4 seconds is not at all reasonable. In the case of the extended server you are then looking at an additional 20-40+ seconds to catch up with the server.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #6 on: June 27, 2018, 08:40:42 AM »
The speed decrease would be trivial as updating the tiles is semi-trivial. It is only a problem currently because it updates all the tiles on the entire map at once. To put that in perspective on the extended server game that is ~32 million tiles it has to update all at once. Any piece of code becomes non trivial when run 32 million times.

Activating underground mode is a one-off thing. Changing slice level happens a bit more often, but that is still nothing compared to doing something every frame. Pak sets with small tile sizes, or every pak set when zoomed far enough out, can have many tiles moving into view during panning that needs updating.

Simutrans appears to have a not insignificant user base on rather old/slow computers. Having to play on smaller maps, but probably also experiencing noticeably slow underground switching times on smaller maps, they will be the hardest hit if switching times becomes dependent on window size rather than map size, as the former makes up bigger fraction of the latter.

I will then look into how the view port code works. See how easy it would be to make it calculate when panning.

Remember that there are also smaller viewports in many of the internal windows. For vehicles, these will often be moving.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #7 on: June 27, 2018, 11:46:06 AM »
Well my research into this has lead me to find a bug in the back wall draw code...

When in sliced view mode the computed wall height did not factor in that underground tiles have a front wall. The result was that it would skip drawing a segment of foundation wall if the height between the slice level and the ground in front was 2 elevations.

There was also a minor bug where walls would always try to be drawn for one direction because of the image maths involved to get the appropriate wall image resulted in a condition always being true. No idea if fixing this will have improved performance or not.

There appears to still be minor front wall bugs with tunnel entrances in sliced view, however these are much less noticeable that the bug fixed above.

I have committed the fixes. I will look into optimizing the underground view mode transition later.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9353
  • Languages: De,EN,JP
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #8 on: June 27, 2018, 03:39:05 PM »
3-8s is really nothing. Before the deferred zooming was introduced, Simutrans zoomed all images at once. Changing zoom level took something between a few second to minutes (on pak128).

Simutrans is not suited for playing such large maps in my opinion. Even with 14 players, one could barely touch all regions once and even more no install a balanced transport net in a reasonable time frame.

Having said this, I will be happy about patches of course. Just I will not have the motivation to look into this.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #9 on: June 27, 2018, 07:21:41 PM »
Insane distances between industries must also be a problem with such huge maps, unless someone is manually controlling their placement and contracts. Even at 2048x2048, I have to ignore most factories because their supplier and/or consumer is too far away to bother. (In fact, some early vehicles would probably get obsolete before making one trip.)

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #10 on: July 16, 2018, 11:37:19 AM »
It cost me some part of my sanity but here is a patch...

I ran a variety of test cases and paksets and did not notice any visual errors. Additionally the speed at which one can change underground slices is as good as completely decoupled from map size. To put it in perspective on a 2,048x2,048 map one can change 5-7 slice levels per second with far out zoom easily.

Why 2 separate drawing and view algorthims are needed I do not really understand. However I did make sure that both object views and map view are effected correctly.

I have not really gone into the performance impact of this patch. It seems trivial to me, and care was taken to minimize and optimize where possible. There is some room to optimize I think if performance is a problem.

James might want to try this with extended as that has a huge server game.
« Last Edit: July 16, 2018, 01:10:00 PM by DrSuperGood »

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4533
  • Languages: EN, DE, AT
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #11 on: July 16, 2018, 12:04:51 PM »
The new files (prepmap.*) are missing in the patch file. Also the change to grund.h/.cc can be committed immediately imho.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #12 on: July 16, 2018, 01:12:22 PM »
Quote
The new files (prepmap.*) are missing in the patch file. Also the change to grund.h/.cc can be committed immediately imho.
Ops, thought the file was added to the SVN file structure but it was not.

Also forgot that grund was purely code quality improvements in the end and had nothing to do with the rest of the system. At some stage I was hacking around in it trying for optimizations or alternative solutions but in the end only the code quality improvements were left.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1320
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #13 on: July 17, 2018, 11:52:29 PM »
Tunnel portals facing away are glitching as shown in attached. Every now and then trunk also displays the chopped portal. Not sure of the cause, but with patch it's always chopped, without, not chopped >99% of the time.

The draw_as_obj flag for grund_t as set in calc_back_image() depends on the back images of the tiles behind it (left and up). The update_underground_intern() routine removed by the patch updates the entire map top left to bottom right ensuring the back images are calculated in the dependent order. I don't see how the patch is handling this dependance...

In the patched main_view_t::display(), //prepare view area, the tiles are being updated in screen coord order (diagonal across the map). What was the rationale for that?   Updating in tile order, and hence accessing the preparation map sequentially would be quicker (and even better if the preparation map were packed bits instead of a bool array - see the dirty tile packed bits setup. bool array was tried there to a good slowdown.)

How were the iteration screen coord limits chosen? It looks like a straight copy of the "// and finally overlays" loops. Those limits purposely scan tiles below the screen looking for overlays since the overlay/sign/label is displayed above the tile. i.e. The overlay may be onscreen even though the tile it's located in is several tiles off the bottom.  pre-preparing the tiles below in case one scrolls down doesn't harm anything, but speed...

For handling the back image calculation order, I wonder if starting the tile updates far enough offscreen to the top left (and not marking them as prepared since some might be wrong), if the calculations would self-correct by the time they became onscreen. I'd guess 'far enough' would be the distance required to go down (up?) a max height mountain?

Performance wise, preliminarily this doesn't look good. >20% increase in frame times at normal zoom, and still a >10% framerate hit at max zoom out pak64. Needs more testing, but needs to produce the correct results first before worrying about speed.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #14 on: July 18, 2018, 03:04:33 AM »
Quote
The update_underground_intern() routine removed by the patch updates the entire map top left to bottom right ensuring the back images are calculated in the dependent order. I don't see how the patch is handling this dependance...
I do not see how this dependency was handled before seeing how update_underground_intern was called by multiple threads covering separate map areas. Unless you are trying to say there was some kind of overlap hoping that the result was consistent enough to work?

I think the bug would be that calc_back_image() depends on the back images of other tiles.
Quote
In the patched main_view_t::display(), //prepare view area, the tiles are being updated in screen coord order (diagonal across the map). What was the rationale for that?
To know what tiles needed to be computed and only compute them. Otherwise it would have to prepare tiles that will not be drawn.
Quote
Updating in tile order, and hence accessing the preparation map sequentially would be quicker (and even better if the preparation map were packed bits instead of a bool array - see the dirty tile packed bits setup. bool array was tried there to a good slowdown.)
Packed bits would potentially be faster for pure testing, but might be slower for setting (both read and write) as well as for clearing areas (both read and write 1 byte at a time, no bulk memory set operation).

At the time I designed the structure I was considering that updating could possibly be performed multi threaded. Packed bits do not support multi threaded updating due to the underlying memory model.
Quote
How were the iteration screen coord limits chosen?
Tall mountains at the bottom of the screen must be prepared in case they have cliffs on them. Possibly an early exit clause could be added to only prepare what is visible and skip preparing tiles with signs that extend above.
Quote
For handling the back image calculation order, I wonder if starting the tile updates far enough offscreen to the top left (and not marking them as prepared since some might be wrong), if the calculations would self-correct by the time they became onscreen. I'd guess 'far enough' would be the distance required to go down (up?) a max height mountain?
As mentioned above, I think the problem is that calc_back_image() depends on the back images of other tiles. This was safe in the old days with single threads however even before with multiple threads I do not see how the results could be consistent at seam areas.
Quote
Performance wise, preliminarily this doesn't look good. >20% increase in frame times at normal zoom, and still a >10% framerate hit at max zoom out pak64.
Under what conditions was this tested?

A problem is the preparation testing for the map has to be performed single threaded because multiple threads overlap tiles at seams during parallel draw from what I can tell. Before parallel drawing is performed it must prepare the required area, which adds a bottleneck to the process.

If the camera is panning or if the mappings were invalidated then all tiles have to be rechecked. However if the camera is not panning and everything is still valid then nothing technically has to be checked so the entire process could be skipped for most frames of a stationary camera except for when underground slice/mode is changed. The reason I do not perform this optimization is there is no easy way to perform it currently as camera position is defined by 4 or was it 6 different values and that does not include the actual view area changing itself due to window resize. I could add this optimization and it should mostly remove the overhead when the camera is not panning or invalidated.

Ok enough of what was done, now for what could be done.

There is the above mentioned optimization. No need to check tiles to prepare unless the covered preparation area has changed.

Now what I could do is stop trying to prepare only visible tiles and instead define a rectangular region of preparable tiles that includes the camera view as a subset. The advantage of this is it makes preparing a bulk operation that can be performed and tested on the map in natural order and potentially even multi threaded for the case of an invalidation. The entire preparation map could be discarded as only the current area subtracted by the previous area needs to be prepared, but this would come at the cost of losing potential caching behaviour the map brings for when panning around an area. One could even solve the silly coupling of calc_back_image to some extent by adding additional tiles at the boarder of the area and re-preparing the dependant rows if panned that direction. The disadvantage is that potentially unnecessary tiles are prepared that cannot be visible due to being off the side edges of the display hence considerably more calls to calc_back_image would be made before drawing can begin in the case of a view jump.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1320
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #15 on: July 18, 2018, 07:24:08 PM »
I do not see how this dependency was handled before seeing how update_underground_intern was called by multiple threads covering separate map areas. Unless you are trying to say there was some kind of overlap hoping that the result was consistent enough to work?
update_underground_intern is called from world_xy_loop with the SYNC_X flag set. That ensures thread 1 is finished with tiles that thread 2 is dependent on before thread 2 can proceed.


In the patched main_view_t::display(), //prepare view area, the tiles are being updated in screen coord order (diagonal across the map). What was the rationale for that?
To know what tiles needed to be computed and only compute them. Otherwise it would have to prepare tiles that will not be drawn.
Simply using the estimated_min and _max values that are used to setup the preparation map for the iteration range seems to work ok; There are extra offscreen tiles checked, but the speedup from sequential access seems to make up for that. The iteration could be easily skewed into parallelogram reducing the extra tiles since the camera is at a fixed angle.


Packed bits would potentially be faster for pure testing, but might be slower for setting (both read and write) as well as for clearing areas (both read and write 1 byte at a time, no bulk memory set operation).

At the time I designed the structure I was considering that updating could possibly be performed multi threaded. Packed bits do not support multi threaded updating due to the underlying memory model.
With the dirty tiles, it was found a bool array is faster if you look only at the time to perform operations on it. Once you look bigger picture, the larger memory footprint results in an overall slowdown to the screen update. I expect the same holds true here. Bulk operation are way faster with packed bits, write one word and 32 tiles cleared/set all at once...
Packed bits support multithreading as well as bool arrays. Even with bools the threads need to be spaced far enough apart that they're not working on the same cachelines anyways. Being lazy and relying on the cpus cache coherency logic is way too slow.  But, updating such a small area is not a good candidate for multithreading anyway - the main thread would be done it's work before the helpers even got woke up to start. Would be nice if operating systems would expose some nice low latency way to wake threads. Then these high core count cpus might be worth something (for game logic type stuff, not e.g. image processing which is trivially parallelizable).


Under what conditions was this tested?
Still camera zoomed out. i.e. This is the slow down from just checking the prepartion map. Panning is subjectively more jittery than before (the buggy frametiming routine doesn't help here...), not sure how to setup a repeatable test to properly measure. But not going to until the dependance issue is solved.


I could add this optimization and it should mostly remove the overhead when the camera is not panning or invalidated.
Sensible. No need to check what can't have changed.

---
Another option is to just size the preparation map to cover then entire map. If packing the bits, you'd end up with 2M to cover a 4k x 4k map.

---
Just because the back image dependance spoils your plan doesn't make it a bug. You clearly don't like it, but do you have some other algorithm in mind?  How is a tile supposed to know if it's blocking the view of a tile behind if it doesn't know what's on the tiles behind?

----
Doesn't the preparation map need to be invalidated upon map rotation? (or rotated along with the tiles)

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #16 on: July 18, 2018, 07:55:51 PM »
Would be nice if operating systems would expose some nice low latency way to wake threads.
I know at least Windows has gotten something that is at least more light-weight than the old ones. However, the biggest hurdle is probably that you need to wait for the scheduler to reschedule. Running the scheduler too often is also bad for performance (and power consumption for idle cores). Maybe thread pools work rather well in this regard, as they can pick a new task when the previous one ends, in user space. But you need to keep them fed all the time, or the thread goes to sleep or is even killed, which means it takes time to get it going again. However, a thread pool that is willing to trade CPU cycles for performance could always busy-wait.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #17 on: July 19, 2018, 12:13:26 AM »
Quote
update_underground_intern is called from world_xy_loop with the SYNC_X flag set. That ensures thread 1 is finished with tiles that thread 2 is dependent on before thread 2 can proceed.
Yet...
Quote
Every now and then trunk also displays the chopped portal.
So honestly I am not sure. Still if one were to switch to a pure rectangular area based update and take the 50% more tile updates one could probably factor in an extra column/row to avoid this and iterate in correct order.
Quote
Simply using the estimated_min and _max values that are used to setup the preparation map for the iteration range seems to work ok; There are extra offscreen tiles checked, but the speedup from sequential access seems to make up for that. The iteration could be easily skewed into parallelogram reducing the extra tiles since the camera is at a fixed angle.
Problem is that elevation changes what tiles are visible on screen so unless one does what is currently done or take the extreme bounds one will end up potentially missing tiles. If one keeps it square one can also remove the need for the prep map entirely and instead use a prep area and only update the difference. More tiles prepared but very little overhead if drawing static terrain.

Quote
Bulk operation are way faster with packed bits, write one word and 32 tiles cleared/set all at once...
Except when processing with multiple threads since the memory model requires whole bytes be modified, or worse. Or when dealing with partial clears since then both a read and write are required with some masking logic. Also I would imagine that the memory bandwidth used of ~64kb per frame read when far zoomed out with pak64 was trivial compared with the potentially 4MB+ written and read to draw the screen in case of a dirty screen at any zoom level.

Quote
Packed bits support multithreading as well as bool arrays. Even with bools the threads need to be spaced far enough apart that they're not working on the same cachelines anyways.
Except that the initial idea of having the draw threads process them would not result in a single thread processing 32, 64 or however many tiles next to each other. Hence why they were avoided. Until I found out that idea would not be viable.
Quote
Another option is to just size the preparation map to cover then entire map. If packing the bits, you'd end up with 2M to cover a 4k x 4k map.
Which goes back to the original problem of coupling a visual operation to map dimensions. Performance for changing to underground mode or slice should not directly depend on map size.

Quote
Just because the back image dependance spoils your plan doesn't make it a bug. You clearly don't like it, but do you have some other algorithm in mind?  How is a tile supposed to know if it's blocking the view of a tile behind if it doesn't know what's on the tiles behind?
It is doing any of this to start with? Look at underground slice view and you can see objects behind what is meant to be solid wall... For example place a 4 high wall of artificial wall in front of a rail line with trains and change to a slice just below the surface of the wall. You will see both rail line and trains through the wall.

I tried to find an example of it hiding something as it did say it did so with the one test of the calculate back wall method. However I did not even notice visual changes in a fairly complex map when the entire test was disabled from calculate back wall method. I am guessing that pak128 tunnel might mechanically require it, but I am unsure as to what it exactly is.

As mentioned before the entire draw process is very arcane and cryptic at times. For example why does it even need to pre-compute back walls when it still has to compute them at draw time? The pre computed ones only cover 2 height, with any extra height, eg to support 4 high artificial wall, having to come from the draw time computation which is drawn below the pre-computed back walls. Computing these back walls at draw time is not trivial either as it still has to look up corners of at least 2 other tiles which likely are not sequential memory reads.
Quote
Doesn't the preparation map need to be invalidated upon map rotation? (or rotated along with the tiles)
Not entirely sure. As far as I am aware multiplayer maps cannot be rotated. Single player maps when rotated should be recomputing the entire map as before so technically all tiles are prepared afterwards. This change was specifically targeting underground mode performance for now because that is a huge problem with Simutrans Extended which uses the same/similar code for changing underground mode and level.

I will look into moving the system to a purely prepared area based system. This should almost completely remove the memory bandwidth usage outside of actual tile preparation. One can also add sufficient buffer tiles that hopefully the masking occurs correctly. This may mean preparing more tiles but otherwise it would have considerably lower overhead and do so more efficiently at times.
« Last Edit: July 19, 2018, 08:52:02 AM by DrSuperGood »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9353
  • Languages: De,EN,JP
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #18 on: July 19, 2018, 02:08:22 PM »
Hmm, I wonder how could this work with more than one level, since all information is passed as coord 2d, e.g elevated ways and cliffs.

The is_halt_flag in flags in grund.h is obsolete, since a test for halthandle != 0 is even faster (no need to sort out bits). Maybe repurpose this flag as clean (or dirty) flag for the recalculation.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #19 on: July 19, 2018, 03:54:07 PM »
Quote
Hmm, I wonder how could this work with more than one level, since all information is passed as coord 2d, e.g elevated ways and cliffs.
It currently is done tile by tile. This means all grounds of a tile are refreshed as has always happened. The patch above does suffer from bad ordering however, something my next attempt might do better.
Quote
The is_halt_flag in flags in grund.h is obsolete, since a test for halthandle != 0 is even faster (no need to sort out bits). Maybe repurpose this flag as clean (or dirty) flag for the recalculation.
The next approach I am trying does not even need to use that flag as it keeps track of an area rather than individual tiles. If I must fall back to the previous approach I might try using that instead.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #20 on: July 26, 2018, 03:35:47 AM »
Attached is new patch which is hopefully more efficient. It uses a rect approach to prepare areas as described earlier. When panning the rect is fragmented before processing so only the new tiles get prepared. A boarder of 5 tiles is added to the bottom and left of the logical tile map so that back wall obscuring tests will be correct. When the camera is stationary only the view bounds are tested unlike before.

I have not been able to resolve the graphical error with tunnels in sliced view. Honestly I am not even sure what is causing it since I do prepare tiles in the correct order and even prepare extra tiles to accommodate obscuring needing the backwall information. In any case the graphical glitch is not present when not in underground view which for now could be considered good enough.

The graphical error appears to be coming from the accurate grund_t object draw routine, display_obj_all. If the inaccurate one display_obj_all_quick_and_dirty is used, eg during fast forward, the tunnels appear perfectly. Hence I have some reason to believe that their buggy display may not be directly related to the tile preparation system. Specifically it appears to be an issue with clipping as the tunnel entrance roof is drawn but partly masked so it fails to draw completely. This only occurs when the underground view is at the same slice as the tunnel entrance way. To be honest I do not really have any idea what may be causing this at the moment so I cannot be sure that it is not something I introduced by accident.
« Last Edit: July 26, 2018, 07:06:08 AM by DrSuperGood »

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #21 on: August 04, 2018, 07:11:08 AM »
Any feedback on the patch?

I think the graphic issue shown earlier was because the working screenshot was taken at the slice above the road while the broken graphic one was at the road level. It seems pretty reliably reproduceable and as far as I can tell has to do with the clipping logic in the accurate draw method. If one fast forwards the graphics appear as should as that uses a more speed orientated draw function. The only other possible cause would be the "optimizations" and "code quality improvements" I made early but those were already commited.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9353
  • Languages: De,EN,JP
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #22 on: August 04, 2018, 04:32:04 PM »
Sorry, I am on holiday and thus time for code review is even more limited. I will see, what I can do.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #23 on: August 04, 2018, 06:18:37 PM »
Quote
Sorry, I am on holiday and thus time for code review is even more limited. I will see, what I can do.
No problem. I was just worried the updated patch was overlooked due to a lack of replies and hence why I bumped it just incase.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4533
  • Languages: EN, DE, AT
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #24 on: August 07, 2018, 01:36:47 PM »
Split topic, tunnel clipping problems can be discussed here https://forum.simutrans.com/index.php/topic,18408.0.html

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9353
  • Languages: De,EN,JP
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #25 on: August 24, 2018, 08:40:56 AM »
I did some profiling and I think this can go in as it is. No change oberved for normal play. Just do not include premap in simworld ...

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #26 on: August 26, 2018, 08:35:30 PM »
Quote
Just do not include premap in simworld
That include was left in by accident and has been removed. It did not throw a compile error when I built because I kept the file from the previous attempt. Anyway removed.

This has now been merged into main. Enjoy faster underground mode/slice changes!

Offline ACarlotti

  • *
  • Posts: 299
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #27 on: September 01, 2018, 08:42:21 PM »
This patch seems to have introduced CRLF line endings in dataobj/rect.cc, whereas LF seems to be the norm. (At least, this is the impression I get from the Git mirror, accessed from Linux).

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #28 on: September 01, 2018, 09:48:09 PM »
Quote
This patch seems to have introduced CRLF line endings in dataobj/rect.cc, whereas LF seems to be the norm. (At least, this is the impression I get from the Git mirror, accessed from Linux).
The files are inconsistent from what I can tell. MSVC seems to default to the one type but some files use the other.

I still wonder why there are different line endings in 2018...

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #29 on: September 01, 2018, 10:31:33 PM »
I still wonder why there are different line endings in 2018...
Because there are still legacy applications. Each committer must therefore ensure that svn:eol-style is set correctly for all new files. And not mess up for existing files. And someone must have introduced this property for all files initially.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #30 on: September 02, 2018, 02:50:07 AM »
So what new line style is the standard for Simutrans source code?

Offline ACarlotti

  • *
  • Posts: 299
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #31 on: September 02, 2018, 08:00:04 AM »
With the exception of rect.cc, every single other code file currently uses LF line endings (i.e. Unix style \n). There are also a small number of other text files that use CRs - a complete list (which may include some that aren't from the source repository) is:

Code: [Select]
LICENSE.txt
dataobj/rect.cc
themes.src/white blue/themes-condensed.tab
themes.src/white blue/themes-large.tab
simutrans/pak/config/menuconf.tab
simutrans/pak/config/simuconf.tab
simutrans/pak/config/speedbonus.tab
simutrans/pak/text/citylist_ru.txt
simutrans/text/cn/baum_build.txt
simutrans/text/cn/color.txt
simutrans/text/en/jump_frame.txt
simutrans/text/en/underground.txt
simutrans/text/id/general.txt
simutrans/text/id/simutrans.txt
simutrans/text/ja/enlarge_map.txt
simutrans/text/ja/load_relief.txt
simutrans/text/ja/monorailtools.txt
simutrans/text/ja/password.txt
simutrans/text/ja/privatesign_info.txt
simutrans/text/ja/removal_tool.txt
simutrans/text/ja/scenario.txt
simutrans/text/ja/signal_spacing.txt
simutrans/text/ja/simutrans.txt
simutrans/text/ja/trafficlight_info.txt
simutrans/text/ru/airtools.txt
simutrans/text/ru/baum_build.txt
simutrans/text/ru/citybuilding_build.txt
simutrans/text/uk/airtools.txt
simutrans/text/uk/baum_build.txt
simutrans/text/uk/bridges.txt
simutrans/text/uk/citybuilding_build.txt
simutrans/text/zh/baum_build.txt
simutrans/text/zh/color.txt

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #32 on: September 02, 2018, 08:14:24 AM »
I checked a couple of central files, and they all had svn:eol-style=native. That causes files to get whatever line separators that particular system uses when checking out. Inside the repository, they will be normalized to LF. That seems like a sensible choice, but I can't find anything stating that it is official.

Of cource, svn:eol-style should only be put on text files. Other files should have svn:mime-type set.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2558
  • Languages: EN
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #33 on: September 02, 2018, 03:01:11 PM »
Urg was not aware one had to set the EoL setting for new files. Turns out only if one creates appropriate auto properties will one not have to do this.

I pushed a commit which should fix the EoL for that file. Although many of the program code files do have the EoL style property set, many others of them do not. I recommend that a commit be made placing native EoL on all *.c and *.h files. One might as well also set their mime to text while at it, although it appears SVN is doing a good job at auto detecting that.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5419
  • Languages: EN, NO
Re: Terrible performance changing undergroud view mode or slice level.
« Reply #34 on: September 02, 2018, 04:29:18 PM »
Text is likely the default. It is the most common thing to version control. In fact, support for anything else is pretty minimal. (Although TortoiseSVN comes with a nice bundle of merge tools for various binary file types.)