Crash when server force-syncs while client edits schedule

Started by freddyhayward, May 19, 2020, 12:17:40 AM

freddyhayward

This crashes the server and desyncs the clients, and it happened on bridgewater-brunel recently. I don't currently have time to set up a debug build, but here are the steps to reproduce:
1. start local simutrans server
2. open simutrans client
3. connect client to server
4. open line dialog on client, select truck lines, click 'new line'
5. select at least two road tiles as stops - actual stops or waypoints will do
6. open second client
7. connect second client to server
result: server crashes, both clients get desync messages.

jamespetts

I have just been looking into this. This is a very bizarre problem: it appears to occur as a result of memory corruption of a sort that I find entirely inexplicable. The crash occurs in line 2597 of simworld.h, which is the getter method for sync_steps. This is a read access violation, the problem being that the memory address of "this" (i.e. the karte_t object) is invalid.

The getter is called from line 639 of network_cmd_ingame, which receives a pointer to "welt" (the karte_t object representing the world) from its caller, network_broadcast_world_command_t::execute. That in turn receives the pointer from line 10418 of karte_t::process_network_commands(), which simply passes its "this" pointer. For reasons that I do not understand, nothing of the call stack above this is available, so I cannot see the memory address of anything further up.

The "this" pointer that is passed is the invalid memory location. However, karte_t::world is a different memory address, which is valid and is the correct memory address for the world.

This suggests some fundamental memory corruption of the sort that cannot be traced by a debugger alone. Unfortunately, the tool that I would normally use in such situations, Dr. Memory, currently itself has a bug that stops it from working at all. I am thus mystified as to how even to begin trying to fix this.
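
One cheap diagnostic that follows from the observation above would be to compare the pointer handed down the call chain against the static world pointer at each hop, and log the first place where they differ. The following is only a minimal sketch under stated assumptions: `World` and `World::current` stand in for karte_t and karte_t::world, and `is_current_world` and `checked_world` are hypothetical helpers that do not exist in the Simutrans source.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for karte_t; only what the check needs.
class World {
public:
    static World* current;  // plays the role of karte_t::world

    explicit World(unsigned steps) : sync_steps(steps) {}
    unsigned get_sync_steps() const { return sync_steps; }

private:
    unsigned sync_steps;
};

World* World::current = nullptr;

// Compares addresses only; the pointer under test is never dereferenced,
// so this is safe to call even when `passed` is garbage.
bool is_current_world(const World* passed) {
    return passed != nullptr && passed == World::current;
}

// Returns the pointer to use. If `passed` does not match the static
// pointer, the mismatch is logged and the static pointer is returned.
World* checked_world(World* passed, const char* where) {
    assert(World::current != nullptr);  // precondition: a world exists
    if (!is_current_world(passed)) {
        std::fprintf(stderr, "%s: world pointer %p differs from current %p\n",
                     where, static_cast<void*>(passed),
                     static_cast<void*>(World::current));
        return World::current;
    }
    return passed;
}
```

Falling back to the static pointer would only hide the symptom, so the log line is the useful part: it shows which call site first sees the bad address.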
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

freddyhayward

I wonder if an interim fix could be done, that does not address the problem of memory corruption but instead forcibly closes any schedule dialogs that might be open just before a force-sync. That would of course be an inconvenience, but better than the alternative of server crashes and desyncs.
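
To make the idea concrete, here is a rough sketch of what such an interim workaround might look like. Everything in it is an assumption for illustration: `WindowKind`, `WindowManager`, and `prepare_for_force_sync` are hypothetical names, not the actual Simutrans GUI classes, and where such a hook would really belong in the sync code is not something this sketch settles.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical window kinds; the real GUI uses its own identifiers.
enum class WindowKind { Schedule, LineManagement, Map, Other };

// Hypothetical stand-in for the list of currently open windows.
class WindowManager {
public:
    void open(WindowKind kind) { windows.push_back(kind); }
    std::size_t count() const { return windows.size(); }

    bool has_schedule_editor() const {
        return std::find(windows.begin(), windows.end(),
                         WindowKind::Schedule) != windows.end();
    }

    // Closes every schedule editor, discarding any in-progress edit,
    // and returns how many windows were closed.
    std::size_t close_schedule_editors() {
        const std::size_t before = windows.size();
        windows.erase(std::remove(windows.begin(), windows.end(),
                                  WindowKind::Schedule),
                      windows.end());
        return before - windows.size();
    }

private:
    std::vector<WindowKind> windows;
};

// Intended to run immediately before a force-sync begins.
std::size_t prepare_for_force_sync(WindowManager& wm) {
    const std::size_t closed = wm.close_schedule_editors();
    assert(!wm.has_schedule_editor());  // postcondition of the workaround
    return closed;
}
```

The returned count could be used to tell the player that an unsaved schedule was discarded, which would soften the inconvenience somewhat.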

ceeac

The crash should be fixed now: #180. However, the client editing the schedule still gets desynced.

Quote from: jamespetts on May 23, 2020, 10:59:41 PM
This suggests some fundamental memory corruption of the sort that cannot be traced by a debugger alone. Unfortunately, the tool that I would normally use in such situations, Dr. Memory, currently itself has a bug that stops it from working at all. I am thus mystified as to how even to begin trying to fix this.

If you use Linux/mingw64/WSL, you can use clang's address sanitizer feature; just add -fsanitize=address to both compiler and linker flags. See also here.
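
When adding the flag to an existing build system, it is easy to miss a target, so it can help to have the binary report whether it was actually instrumented. A small sketch of one way to do that: clang exposes the sanitizer through `__has_feature(address_sanitizer)` and GCC defines `__SANITIZE_ADDRESS__`; the names `BUILT_WITH_ASAN`, `built_with_asan`, and `asan_status` are made up for this example.

```cpp
// Clang reports sanitizers through __has_feature; GCC defines
// __SANITIZE_ADDRESS__ when -fsanitize=address is in effect.
#if defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define BUILT_WITH_ASAN 1
#  endif
#endif
#if !defined(BUILT_WITH_ASAN) && defined(__SANITIZE_ADDRESS__)
#  define BUILT_WITH_ASAN 1
#endif
#ifndef BUILT_WITH_ASAN
#  define BUILT_WITH_ASAN 0
#endif

// True only if this translation unit was compiled with ASan enabled.
constexpr bool built_with_asan() { return BUILT_WITH_ASAN != 0; }

// Human-readable status, e.g. for printing once at startup.
const char* asan_status() {
    return built_with_asan() ? "AddressSanitizer: enabled"
                             : "AddressSanitizer: not enabled";
}
```

Printing `asan_status()` once at startup would confirm that the flag reached the compiler for that file. Adding `-g` alongside `-fsanitize=address` is usually worthwhile so that the reports carry file and line information.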

jamespetts
