Server crashes

Started by Vladki, April 24, 2020, 08:52:05 PM

Vladki

These crashes happen quite often on stephenson-siemens. Unfortunately, I have not found any particular scenario that triggers them, but they may be related to the frequent desyncs. They usually happen at a "desync party", when players get desynced within seconds of connecting.

I suspect the private car routing code, and that it does not behave consistently on server and clients. Debug output:


Message: network_command_t::rdwr:       write packet_id=16, client_id=0
Message: packet_t::send:        sent 18 bytes to socket[6]; id=16, size=18
Message: packet_t::send:        sent 18 bytes to socket[7]; id=16, size=18
Message: network_command_t::rdwr:       write packet_id=16, client_id=0
Message: packet_t::send:        sent 18 bytes to socket[6]; id=16, size=18
Message: packet_t::send:        sent 18 bytes to socket[7]; id=16, size=18
Message: void convoi_t::hat_gehalten(halthandle_t halt):        Convoy (2108) Metrolink T68 departing from stop Mapleinghall Village Stop at step 20233. Its departure time is calculated as
Message: void convoi_t::hat_gehalten(halthandle_t halt):        Convoy (1241) MCW Metrorider (long) departing from stop Cheppike Manga Fields Stop at step 20233. Its departure time is calculated as
ERROR: bool haltestelle_t::fetch_goods():       A convoy's arrival time is not in the database
For help with this error or to file a bug report please see the Simutrans forum:
http://forum.simutrans.com
Message: void convoi_t::hat_gehalten(halthandle_t halt):        Convoy (1177) BHC AP1-88 100 series hovercraft departing from stop Peache Old Railway Station & Harbour at step 20233. Its departure time is calculated as
Message: network_command_t::rdwr:       write packet_id=16, client_id=0
Message: packet_t::send:        sent 18 bytes to socket[6]; id=16, size=18
Message: packet_t::send:        sent 18 bytes to socket[7]; id=16, size=18
Message: network_command_t::rdwr:       write packet_id=16, client_id=0
Message: network_command_t::rdwr:       read packet_id=8, client_id=3
Warning: nwc_tool_t::rdwr:      rdwr id=8 client=0 plnr=10 pos=977,969,-13 tool_id=8217 defpar=g,257,7,0,0,12,0|2|1000,758,-16,0,0,60,-1,1|989,940,-10,0,0,60,-1,0|989,947,-10,0,0,60,-1,0|985,962,-11,0,0,0,-1,0|977,971,-15,0,0,0,-1,0|967,969,-14,0,0,66,-1,1|979,970,-15,0,0,0,-1,0|985,960,-11,0,0,0,-1,0|989,945,-10,0,0,60,-1,0|989,938,-10,0,0,60,-1,0| init=1 flags=0
Warning: network_check_activity():      received cmd id=8 nwc_tool_t from socket[7]
Warning: karte_t::process_network_commands:     kicking client due to checklist mismatch : sync_step=161826 server=[ss=0 st=0 nfc=0 rand=0 halt=0 line=0 cnvy=0 ssr=0,0,0,0,0,0,0,0 str=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 exr=0,0,0,0,0,0,0,0 sums=0,0,0,0,0,0,0,0initiator=[ss=161826 st=20228 nfc=2 rand=1992861996 halt=3459 line=1 cnvy=2049 ssr=1293004503,1992861996,0,0,0,0,0,0 str=3187586794,3187586794,3187586794,3187586794,3187586794,3187586794,3187586794,3187586794,3187586794,3187586794,3187586794,1875258805,3684296852,61809096,5705560,3187586794 exr=0,0,0,0,0,0,0,0 sums=3754686143,1236061344,0,0,0,0,0,0
Message: socket_list_t::remove_client:  remove client socket[7]
Message: packet_t::send:        sent 18 bytes to socket[6]; id=16, size=18
*** stack smashing detected ***: <unknown> terminated

jamespetts

Thank you for the report. This is going to be exceedingly difficult to fix without a reliable reproduction case. "Stack smashing" is related to buffer overflows, but where and why they occur will need to be tracked down very precisely before I will be able to fix this, and I will need a reliable reproduction case to be able to do this.
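
To illustrate the class of bug in question: the "stack smashing detected" message is printed when the compiler's stack protector finds that its guard value next to a stack buffer has been overwritten, which typically means something wrote past the end of a fixed-size local array. The following is only a minimal sketch of a length-checked copy; `copy_bounded` is a hypothetical helper and is not taken from the Simutrans code, and nothing here identifies where the actual overflow is.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the Simutrans sources): copy a C string into
// a fixed-size buffer only if it fits, including the terminating '\0'.
// An unchecked strcpy() into a too-small stack buffer is the classic way to
// trigger "stack smashing detected".
bool copy_bounded(char *dst, std::size_t dst_size, const char *src)
{
    const std::size_t len = std::strlen(src);
    if (dst_size == 0 || len >= dst_size) {
        return false; // would overflow: refuse rather than write past the end
    }
    std::memcpy(dst, src, len + 1); // len + 1 also copies the terminator
    return true;
}
```

A caller would check the return value and log or truncate instead of writing beyond the buffer.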
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

jamespetts

I have pushed a fix for a memory corruption error. I do not know whether this will solve this problem, but I should be grateful if you could re-test after to-morrow's nightly build and confirm whether you can still reproduce this.

Vladki

Crashed just a while ago

jamespetts

Quote from: Vladki on May 01, 2020, 08:50:35 PM
Crashed just a while ago

Thank you for letting me know. Without knowing a lot more about the circumstances, it is extremely difficult to tell whether this is related to the original problem. As previously written, without a reliable reproduction case (and without fixing this problem as a side effect of fixing another problem), this will be fantastically difficult to fix.

Are you (or is anyone) able to give any information about the circumstances in which this occurs, or a backtrace?

Vladki

Do I need a special build for the backtrace?

It usually happens during a desync party. Just about the time when I give up and stop playing, I find out that the server has crashed.

Mariculous

From my observations it happens immediately after connecting.

Quote from: Vladki on May 02, 2020, 10:36:01 AM
Do I need a special build for the backtrace?

Generally no, but a backtrace from a build without debug symbols won't be very useful (effectively useless, except for masochists who like to read plain assembly operating on raw memory). However, if you have a core dump from a build without debug symbols and you know how it was compiled, you can compile it again with debug symbols, extract the debug symbols from the binary, and load the core dump with these debug symbols.
Once loaded, you can generate a backtrace or even inspect the program's state at that time.

Beware that core dumps can be quite big, as a core dump is simply the program's main memory at a point in time dumped to disk.

Vladki

Is it enough to compile with DEBUG=1 in config.default?

Mariculous

Yes. A backtrace generated from such a build might be useful.

In detail, DEBUG=1 (or greater) will add the -g flag, which tells gcc to generate debug symbols.

Vladki

Server started with a debug build; let's have a desync party.

jamespetts

I should be grateful if people could test whether Ceac's fix, now incorporated (and which will be present from to-morrow's nightly build onwards) fixes this issue.

jamespetts

May I ask whether there have been any more server crashes since the 4th of May 2020?

Vladki

Yes, there were some. Unfortunately I was unable to configure the server to leave a coredump...
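
For reference, one possible way around the configuration problem on a Linux/POSIX server is to let the process raise its own core file size limit at startup. This is only a sketch under that assumption: `enable_core_dumps` is a hypothetical helper, not an existing Simutrans function, and it only helps if the hard limit is non-zero. Where the dump is written still depends on the system's core pattern setting.

```cpp
#include <sys/resource.h> // getrlimit, setrlimit, RLIMIT_CORE (POSIX)
#include <cstdio>         // std::perror

// Hypothetical helper (not part of Simutrans): raise the soft core-file size
// limit to the hard limit so that the kernel is allowed to write a core dump
// when the process crashes. Returns true on success.
bool enable_core_dumps()
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        std::perror("getrlimit");
        return false;
    }
    // An unprivileged process may raise its soft limit up to the hard limit.
    lim.rlim_cur = lim.rlim_max;
    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        std::perror("setrlimit");
        return false;
    }
    return true;
}
```

Calling this once early in the server's startup would have the same effect as `ulimit -c` set to the hard limit in the launching shell, without depending on how the server is started.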