News:

Simutrans Wiki Manual
The official on-line manual for Simutrans. Read and contribute.

Thread sanitiser results

Started by jamespetts, May 27, 2024, 11:19:17 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

jamespetts

Thanks to Ceeac and Neroden for pointing me to the useful resource that is Thread Sanitiser and then helping me to get it working. Investigating with this tool, I have found the cause of a large number of data races. How to fix those data races is a much more difficult topic, and I should be grateful for any input that people may have.

The main problem seems to be in the private car route finder. Here is an example of the relevant output from Thread Sanitiser:

WARNING: ThreadSanitizer: data race (pid=175500)
  Write of size 1 at 0x7b34000a6c4f by main thread:
    #0 obj_t::set_flag(obj_t::flag_values) vehicle/../boden/../dataobj/../obj/simobj.h:119 (simutrans-extended+0x13a746)
    #1 objlist_t::set_all_dirty() dataobj/objlist.cc:647 (simutrans-extended+0x1b4aea)
    #2 grund_t::set_all_obj_dirty() vehicle/../boden/grund.h:310 (simutrans-extended+0x673734)
    #3 vehicle_t::hop(grund_t*) vehicle/vehicle.cc:1729 (simutrans-extended+0x7272ab)
    #4 air_vehicle_t::hop(grund_t*) vehicle/air_vehicle.cc:1301 (simutrans-extended+0x6f1552)
    #5 vehicle_base_t::do_drive(unsigned int) vehicle/vehicle.cc:334 (simutrans-extended+0x71b46d)
    #6 convoi_t::sync_step(unsigned int) /home/james/Documents/Simutrans/simutrans-extended/simconvoi.cc:1351 (simutrans-extended+0x5f0af5)
    #7 karte_t::sync_list_t::sync_step(unsigned int) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:4203 (simutrans-extended+0x6b9528)
    #8 karte_t::sync_step(unsigned int, bool, bool) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:4276 (simutrans-extended+0x6b9847)
    #9 interrupt_check(char const*) /home/james/Documents/Simutrans/simutrans-extended/simintr.cc:113 (simutrans-extended+0x64fcc4)
    #10 karte_t::step() /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:5163 (simutrans-extended+0x6d3604)
    #11 karte_t::interactive(unsigned int) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:10960 (simutrans-extended+0x6d7fb6)
    #12 simu_main(int, char**) /home/james/Documents/Simutrans/simutrans-extended/simmain.cc:1729 (simutrans-extended+0x65d5f8)
    #13 sysmain(int, char**) sys/simsys.cc:1105 (simutrans-extended+0x6e490b)
    #14 main sys/simsys_s2.cc:1108 (simutrans-extended+0x773ac1)

  Previous read of size 1 at 0x7b34000a6c4f by thread T6:
    #0 obj_t::get_owner() const obj/simobj.cc:138 (simutrans-extended+0x21c3de)
    #1 private_car_destination_finder_t::check_next_tile(grund_t const*) const /home/james/Documents/Simutrans/simutrans-extended/simcity.cc:5990 (simutrans-extended+0x5c4f91)
    #2 route_t::find_route(karte_t*, koord3d, test_driver_t*, unsigned int, unsigned char, unsigned int, int, unsigned int, unsigned int, bool, route_t::find_route_flags) dataobj/route.cc:811 (simutrans-extended+0x1e1e09)
    #3 stadt_t::check_all_private_car_routes() /home/james/Documents/Simutrans/simutrans-extended/simcity.cc:2756 (simutrans-extended+0x5d6f51)
    #4 check_road_connexions_threaded(void*) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:1610 (simutrans-extended+0x6bb9b8)

  Location is heap block of size 208 at 0x7b34000a6c40 allocated by main thread:
    #0 operator new(unsigned long) ../../../../src/libsanitizer/tsan/tsan_new_delete.cpp:64 (libtsan.so.0+0x8f162)
    #1 grund_t::rdwr(loadsave_t*) boden/grund.cc:341 (simutrans-extended+0x198292)
    #2 boden_t::boden_t(loadsave_t*, koord) boden/boden.cc:23 (simutrans-extended+0x18c960)
    #3 planquadrat_t::rdwr(loadsave_t*, koord) /home/james/Documents/Simutrans/simutrans-extended/simplan.cc:273 (simutrans-extended+0x67214f)
    #4 karte_t::rdwr_gamestate(loadsave_t*, loadingscreen_t*) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:8538 (simutrans-extended+0x6cf6fe)
    #5 karte_t::load(loadsave_t*) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:9204 (simutrans-extended+0x6d4687)
    #6 karte_t::load(char const*) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:8952 (simutrans-extended+0x6d670d)
    #7 simu_main(int, char**) /home/james/Documents/Simutrans/simutrans-extended/simmain.cc:1581 (simutrans-extended+0x65cf08)
    #8 sysmain(int, char**) sys/simsys.cc:1105 (simutrans-extended+0x6e490b)
    #9 main sys/simsys_s2.cc:1108 (simutrans-extended+0x773ac1)

  Thread T6 (tid=176434, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x605b8)
    #1 karte_t::init_threads() /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:2120 (simutrans-extended+0x6c3751)
    #2 karte_t::load(loadsave_t*) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:9525 (simutrans-extended+0x6d5888)
    #3 karte_t::load(char const*) /home/james/Documents/Simutrans/simutrans-extended/simworld.cc:8952 (simutrans-extended+0x6d670d)
    #4 simu_main(int, char**) /home/james/Documents/Simutrans/simutrans-extended/simmain.cc:1581 (simutrans-extended+0x65cf08)
    #5 sysmain(int, char**) sys/simsys.cc:1105 (simutrans-extended+0x6e490b)
    #6 main sys/simsys_s2.cc:1108 (simutrans-extended+0x773ac1)

SUMMARY: ThreadSanitizer: data race vehicle/../boden/../dataobj/../obj/simobj.h:119 in obj_t::set_flag(obj_t::flag_values)
==================
==================

What appears to be happening is this: the private car route checker is running simultaneously with sync_step(), as is intended. I had not anticipated that this would cause any problems: the private car route checker needs only to look at ways, not things using the ways, and sync_step() and the things in it do not ever need to change ways, as opposed to vehicles, pedestrians, etc., on them.

However, this is not, apparently, how the code sees it. Each ground tile has a list of objects. The ways and vehicles on the ways are in that list of objects. Thus, when, for example, a pedestrian or vehicle moves onto or off a tile, that triggers a write operation on that list of objects. This can occur simultaneously with a read operation on that list of objects for the private car route finder. This, in turn, creates a data race.

I am not entirely confident in re-engineering something as fundamental as the list of objects on a tile, as that is likely to have ultra-complex ramifications in extremely large numbers of parts of the code about which I know little or nothing. I should be very grateful for any thoughts that anyone might have as to how to overcome this issue effectively without losing the multi-threading benefit of the private car route finder.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

jamespetts

On a different issue, I am also getting a lock order inversion warning which might explain the thread deadlocks. This occurred as I was shutting down the game at the end of testing:

WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=175500)
  Cycle in lock order graph: M481 (0x7b0c00000210) => M482 (0x7b0c00000420) => M524 (0x7b0c00002a30) => M481

  Mutex M482 acquired here while holding mutex M481 in main thread:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4240 (libtsan.so.0+0x53908)
    #1 dbus_bus_register <null> (libdbus-1.so.3+0x145a9)
    #2 simu_main(int, char**) /home/james/Documents/Simutrans/simutrans-extended/simmain.cc:947 (simutrans-extended+0x65a928)
    #3 sysmain(int, char**) sys/simsys.cc:1105 (simutrans-extended+0x6e490b)
    #4 main sys/simsys_s2.cc:1108 (simutrans-extended+0x773ac1)

    Hint: use TSAN_OPTIONS=second_deadlock_stack=1 to get more informative warning message

  Mutex M524 acquired here while holding mutex M482 in main thread:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4240 (libtsan.so.0+0x53908)
    #1 dbus_connection_send_with_reply_and_block <null> (libdbus-1.so.3+0x14363)
    #2 simu_main(int, char**) /home/james/Documents/Simutrans/simutrans-extended/simmain.cc:947 (simutrans-extended+0x65a928)
    #3 sysmain(int, char**) sys/simsys.cc:1105 (simutrans-extended+0x6e490b)
    #4 main sys/simsys_s2.cc:1108 (simutrans-extended+0x773ac1)

  Mutex M481 acquired here while holding mutex M524 in main thread:
    #0 pthread_mutex_lock ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4240 (libtsan.so.0+0x53908)
    #1 <null> <null> (libdbus-1.so.3+0x12d98)

SUMMARY: ThreadSanitizer: lock-order-inversion (potential deadlock) (/lib/x86_64-linux-gnu/libdbus-1.so.3+0x145a9) in dbus_bus_register
==================

Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

The count of vehicles passed on a way could be different which (in standard) slightly affects the route cost for that route.

jamespetts

Quote from: prissi on May 28, 2024, 06:28:57 AMThe count of vehicles passed on a way could be different which (in standard) slightly affects the route cost for that route.
Thank you for that - that does perhaps explain why I have been unable to have multiple threads of this running stably in a network game. What I am thinking of doing, unless anyone has any better ideas, is to make this cease to be concurrent with sync_step, but to allow several threads to run in parallel (which should hopefully be possible in a network game if it should cease to be concurrent). This will adversely affect performance - to an extent, this can be compensated for by making each thread process fewer routes per step, but this is still not ideal.

If anyone can think of any way of being able to have this algorithm run concurrently without data races, that would be helpful. If it is apparent that there are no realistic ways of doing this by the time that I get back to my normal development environment and have time to do this, I will have to modify this code to make it non-concurrent as set out above.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

If you ignored congestion maluses (don't care for previously passed cars or then number of cars on that route), then you should be able to run it in parallel. Assume your drivers are in the 90ies, when the announcement of congestion lagged behind the actual situation ...

jamespetts

Quote from: prissi on May 28, 2024, 01:05:19 PMIf you ignored congestion maluses (don't care for previously passed cars or then number of cars on that route), then you should be able to run it in parallel. Assume your drivers are in the 90ies, when the announcement of congestion lagged behind the actual situation ...
Thank you for that. We would have to do it based on a previous month's archived congestion - I cannot immediately recall whether those data are stored.

But even if this were done, surely there would still be data races because of the use of objlist as described above?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

neroden

Hmm.  While I'm thinking about it, where is the work done to alter ways (deletion or creation of ways, building or removal of stops on ways)?  If any of this ends up running in parallel with sync_step or the private car routefinder, you'd get the sometimes-but-not-always reproduceable desyncs we've been getting.

Yeah, I think you'll need to make this not run in parallel with sync_step.  Changing the list of objects on a tile to two separate lists would be a big lift, and it certainly isn't going to be quick.  I suspect there would still be issues even if you tried to do that.

In the long run you could probably do it with additional encapsulation but that's going to take years.

jamespetts

Quote from: neroden on June 04, 2024, 02:21:25 PMHmm.  While I'm thinking about it, where is the work done to alter ways (deletion or creation of ways, building or removal of stops on ways)?  If any of this ends up running in parallel with sync_step or the private car routefinder, you'd get the sometimes-but-not-always reproduceable desyncs we've been getting.

Yeah, I think you'll need to make this not run in parallel with sync_step.  Changing the list of objects on a tile to two separate lists would be a big lift, and it certainly isn't going to be quick.  I suspect there would still be issues even if you tried to do that.

In the long run you could probably do it with additional encapsulation but that's going to take years.
The player interaction alteration of ways is done in step(), which is generally not threaded with things that make use of ways, so I do not think that this - by itself - is the issue.

The objlist design is unfortunate. I can see that major surgery would be required to get iterating over and testing ways working without actually writing data to them.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

TurfIt

Quote from: jamespetts on June 04, 2024, 08:50:34 PMThe objlist design is unfortunate. I can see that major surgery would be required to get iterating over and testing ways working without actually writing data to them.
That seems unnecessary overkill to try changing something as fundamental as the objlist. While having free-running threads would be most performant, I don't think it'd take too much to properly synchronize them.

I believe extended is using last months data for congestion avoidance pathing (standard uses current month), and assuming nothing is actually changed by sync_step() that changes the private car routing (you would need to check carefully), then a mutex in the right spot should work...

A rw_mutex write locked in sync_step() in the hop() routines, and I think for vehicle make_smoke() creation should be the only things modifying the objlist (again need to check carefully).  No need for mutex on the read ops in sync_step() if only a single thread.  Private car thread(s) can do the read lock wherever it accesses the objlist.

That should be a light enough mutex to not tank performance.

prissi

Furthermore, the ways stay always at the same index in obj_list. So accessing that statistic from the last month should be thread safe, no matter what write access happened from cars and pedestrians in between.

I suspect some other conditions. How many threads allows extended? The drawing code (in standard) will write out of bounds if more than 12 are used. I just recently fixed that. (But that would not crash a server.)

TurfIt

Quote from: prissi on June 05, 2024, 05:03:50 AMFurthermore, the ways stay always at the same index in obj_list. So accessing that statistic from the last month should be thread safe, no matter what write access happened from cars and pedestrians in between.
I'm not really following this...  The objlist is in the grund_t object, so an empty ground tile has an empty objlist.  If the ground tile has a single way on it, then the objlist has capacity=1, and the obj.one union member is valid pointing directly at the way obj.  When sync_step() has a vehicle hop() into the tile, objlist capacity=2, and the objlist has to convert to using the obj.some member with memory allocated on the freelist for the array of objs.  Can't have the private car thread reading obj.one while the sync_step() thread is writing obj.some???

prissi

@TurfIt
Oh, yes, ugly. This happens only the first time a vehicle is jumping on that tile and everytime after a save/reload, so indeed, that will be a problem.
One could remove the objlist union for extended.

jamespetts

I have made an attempt at using a mutex to address these problems. This is more complex than it looks, since there are huge numbers of calls to the methods that add and remove objects to and from objlist, and I do not know whether I have got all of the relevant ones, since it is very difficult to tell without an extreme amount of work for which I do not have time at present specifically which ones are called from sync_step. Even moving the cursor over a tile seems as though it might trigger this, although I have not tested it exhaustively.

Unfortunately, as my development computer is not the same one as that on which I can run Thread Sanitizer, I have not yet been able to test whether this has reduced or eliminated data races, and I do not now have time to test for that this evening.

Edit: Incidentally, even if it were not for the union, surely the problem would be the same, as sync_step might be adding/deleting a cursor/highlight object, some smoke, a pedestrian, etc., at the same point as the thread running the private car route check is reading the objlist, giving rise to undefined behaviour?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

Yes, and adding something for the second time (or removing it). Mutexes will slowdown things to a crawl, and single-threaded code will likely performs better (using that many mutex will make the code efficiently single threaded).

If you do not care about memory consumption too much, just use the miniarray as objectlist and remove the union. Unless you a handling an object that is destroyed while accessing it (which is not needed to access way statistics), the objlist at position
  • will remain the same.

neroden

James: I would advise dropping your attempts to implement a mutex and reverting to the last version before you tried this.

I think we should switch to removing the union and implementing everything with the objectlist (including the one-item case) for reasons of cleanliness anyway.  We don't care about memory usage much.  In terms of speed, we replace an if-then on every check with an extra memory reference in the one-object case; I suspect this won't be a significant difference but it's worth benchmarking.

This eliminates the actual write-write conflict.

Then we are left with a write-read conflict, but at least this reduces the issue substantially.  I think we need to make a much simpler write lock which only locks when actually changing the objlist, and the private car code needs to check that to avoid data corruption.

The objlist approach is kind of suboptimal.  Ways are a very big deal in Simutrans; arguably they should have their own list.

jamespetts

For reference, the additional mutexes were all added in this commit.

This has not solved all the losses of synchronisation (albeit I suspect these to be multi-causal), but we have not had any thread deadlocks since implementation. We have had a report of a transient freezing issue, but this does not appear to be deadlock in the sense that the server recovers. Neroden suspects that this is related to the mutexes.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

neroden

Thanks for the reference -- at least they're all in one commit.  I don't think this is at all the right approach to the problem, not even slightly, but it makes it easy to revert if they're in one commit.

jamespetts

Quote from: neroden on June 23, 2024, 09:11:53 AMThanks for the reference -- at least they're all in one commit.  I don't think this is at all the right approach to the problem, not even slightly, but it makes it easy to revert if they're in one commit.
Yes - if we need to, we can just revert the commit.

I have noticed that the server has not unpaused since you reconnected a moment ago. Is this what you were describing on the server's thread as what you believe to be a symptom of the mutex commit?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

neroden

That's what's been happening today.  It may be totally unrelated to the mutexes.  That was my second suspicion, because "throwing in random mutexes" is a good way to freeze programs.

I don't know why it's not unpausing.  That's definitely what's happening (the log is showing it).
 
There was a previous bug involving the server not unpausing, which was solved, so my *first* suspicion was that some recent patch had accidentally reverted the solution to that.... but you said there hasn't been a patch in the last 24 hours.

jamespetts

Quote from: neroden on June 23, 2024, 09:17:49 AMThat's what's been happening today.  It may be totally unrelated to the mutexes.  That was my second suspicion, because "throwing in random mutexes" is a good way to freeze programs.

I don't know why it's not unpausing.  That's definitely what's happening (the log is showing it).
 
There was a previous bug involving the server not unpausing, which was solved, so my *first* suspicion was that some recent patch had accidentally reverted the solution to that.... but you said there hasn't been a patch in the last 24 hours.
Hmm - there's a difference between the server not unpausing and the server having got into a thread deadlock. If it got into a thread deadlock, it would remain frozen until restarted manually, albeit the client might see the same behaviour (i.e., the client not unpausing because the server is in a thread deadlock at the point when it should be telling the client to unpause).

Are you still sitting in the client with the server paused, or have you been able to reconnect?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

neroden

The client sees a frozen server and cannot reconnect.  The server does not respond.  Eventually I quit, and attempt to relaunch, and the server does not respond.  After several minutes, the server may finally start moving again.

I think you aren't facing a *proper* deadlock where the threads *never* resolve; what you're facing is a *serialization* where the entire server hangs for ten minutes waiting for one really slow thread to finish before it can keep going.  You can easily make such a situation by sloppy addition of mutexes.

neroden

So, this is the approach I would take to fixing the problem (after reverting James's commit):

(1) Eliminate the obj_one thing and have a full list for every grund_t.  Test to make sure the speed and memory of this remains acceptable.

(2) Add a "peek" routine which looks at the ways on a grund_t, without modifying them, and is robust to modifications happening while it's looking.  (It should "fail safe").  Rewrite the private car pathfinder to use this "peek" routine rather than the standard access routines.  I'm pretty sure it's absolutely necessary to do step #1 first, because I don't know how to do a "peek" when the internal data format could be either of two formats and might change underneath us.

neroden

Hmmm.  So what we want is a situation where stuff can be added to or removed from the objlist, and the "peek at ways" routine behaves itself (and isn't reading undefined memory).  This is rather dependent on the objlist's internals.  There's absolutely no way it can handle the addition or deletion of ways, of course.

The guarantee we require is that we can follow the various pointers to the relevant element for the array which is the way, without any of those changing.  If the ways are always in position 0 and 1 in the array, *and the array is never relocated, grown, or shrunk*, then we're OK, but if it's relocated by being grown or shrunk, we're sunk.

I believe the only time the objlist is gone through for all the objects, treating ways as Just Like Anything Else, is for display purposes.  Is this right? 

I'm actually sort of tempted to redefine the way the objlist works to have two special slots for ways.  It would use more memory (two pointers!) for any objlist which doesn't have ways.  (That could be half a gig in an 8192x8192 map.)  Everyone has lots of memory now, right?  It would also involve some logic and arithmetic in routines like bei() -- check each of the way pointers to see if it's null or not, and offset the index into the internal array by 0, 1, or 2, respectively.  That's cheap.

jamespetts

Quote from: neroden on June 23, 2024, 09:30:33 AMThe client sees a frozen server and cannot reconnect.  The server does not respond.  Eventually I quit, and attempt to relaunch, and the server does not respond.  After several minutes, the server may finally start moving again.

I think you aren't facing a *proper* deadlock where the threads *never* resolve; what you're facing is a *serialization* where the entire server hangs for ten minutes waiting for one really slow thread to finish before it can keep going.  You can easily make such a situation by sloppy addition of mutexes.
That makes sense - thank you for that explanation. The private car route threads are indeed slow, so this would explain what was happening. Given that this is what the problem appears to have been, I have reverted the mutex commit as we obviously need something more sophisticated here.

I do not know the answer to your question about how often that objlist is gone through for all objects (although note that display routines are multi-threaded*), as I have not really investigated objlist's internals, although I believe that it is also gone through for all objects when destroying the world (which happens whenever a player loads a new game, and for the joining player in a network game).

In terms of performance, we will have to test. What we do know is that memory bandwidth is critical, whereas absolute memory usage is less critical. A large map will take a lot of memory. The current Bridgewater-Brunel map, for example, has 8,448 x 2,004 = 25,293,312 tiles, which, if each required 16 more bytes (from an additional 2x 64-bit pointers), would require an additional 385 megabytes of memory. The current game in its current state takes 218.8 megabytes of memory, so, for a relatively lightly developed map, this would be a significant increase. However, the previous Bridgewater-Brunel saved game in its September 2023 (game date November 1993) takes 6.5 gigabytes, so this would be far less significant in that context. I suspect that increasing the memory usage in this way and adding a small computational overhead is likely to be worthwhile if we can really fix these issues as a result of this.

One other thing that we should also be able to do if we can fix this is let the private car route finding algorithm run in parallel as well as concurrently, which should speed up private car route finding, making the game more responsive. Currently, while players are connected, the game only runs one private car route finding thread in the background because running any more will give rise to losses of synchronisation for reasons that I was previously not able to establish but which I now strongly suspect are linked to this issue. In a network game when no clients are connected and the game is paused, multiple private car route finding threads will run in parallel, but it would be good to be able to do this while the game is running too.


* Simutrans uses entirely software rendering. Within the last year, from what I have read somewhere (I forget precisely where), some people have managed to get GL rendering mostly working in Simutrans, except that it does not render the special colours for night and player colours. I do wonder whether those might be features worth dropping to get GL rendering, especially since Pak128.Britain uses only some of those features, and those that it does use (player colour) are used intermittently and are not fully compatible with our rendering workflow, and these old features are not properly compatible with alpha blended anti-aliasing.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

As the server does not render anything, rendering is not the most CPU intensive part. Also there are three renderer (assuming extended has taken this parts) the simple one with clipping errors on bridges and long vehicles, and the ~5 times slower one with directional clipping. If CPU is lacking, it falls back to the simple renderer (also used on fast forward or when catching up on a server).

I would just use a more clever algorithm:

You could have also predefined routes for every town to all the next towns without passing another city first. That is very fast, like a flood fill of ways. Then let the citycars follow a route to a town centre and from there to the next city. The citycar would have just a list of cities needed to pass (and the current part of the route). This way the routes only need recalculation should a vehicle hit a missing road, and only for that city. And probably once a month for a city.

neroden

Soooo, while this would increase the memory usage, most of the time that memory would be blank.
way_primary
way_secondary
rest_of_list
Here are the cases:
-- tile with nothing on it: nullptr, nullptr, nullptr
-- tile with one way and nothing else: pointer to way, nullptr, nullptr
-- tile with two ways and nothing else: pointer to way, pointer to way, nullptr, nullptr
-- tile with no ways but other stuff (trees, buildings): nullptr, nullpter, pointer to rest of list
-- tile with one way and other stuff (vehicle, signal, sign, stop, pedestrians): pointer to way, nullptr, pointer to rest of list
-- tile with two ways and other stuff (vehicle, signal, sign, stop, pedestrians): pointer to way, pointer to way, pointer to rest of list

...that's a lot of null pointers. 

I don't think this would affect the bandwidth for network transfers at all.

I believe we send the game in save-game format when we send it from server to client.  We don't need to change the save-game format at all -- only the in-memory format.  It looked like this would require some work in objlist_t::rdwr, but I'm looking at it... and it already doesn't save ways.  They're saved in grund_t separately (probably a historical reason) so most of the work is already done.

neroden

While it would seem straightforward enough to make a "peek" routine which failed safe once this restructure was done, I'm thinking through possible gotchas.

During sync_step, if I'm understanding this correctly:

We're guaranteed that the grund_t object won't be deleted or created (or relocated). 
We're guaranteed that it won't add or delete a way_t object.
We're guaranteed that those way_t objects won't be changed (is this correct?)
So the way_primary and way_secondary pointer slots should be unchanging:
-- if NULL, stay NULL
-- if not null, continue pointing to the same thing
And the things they are pointing to (the way objects) should be unchanging too.

They should be effectively read-only from the start of one sync_step to the end of it.

To verify this, we would want to add a verification lock, the "don't change the ways" verification lock, which is locked at the start of sync_step and unlocked at the end of sync_step.  Then in the functions on grund_t which add or remove ways, or functions on way_t which alter them, check the lock, and if it's set (for now) throw an error into the log, so that we can spot any violations of this constant.  This may require a little bit of retooling of the mutator function interface, or it might not require much at all.

prissi

You could also introduce a new ground with ways. Somehow, we have this already for elevated ways, tunnels, bridges, and water, so it would be consequent to have a ground with ways ...