[BUG] Server crashes when client aborts connection

Started by freddyhayward, May 31, 2019, 03:36:14 AM

freddyhayward

Steps to reproduce, case 1:
* open two clients
* connect to server using client A
* while client A reads "server preparing game" or "transferring game": stop client A using task manager / system monitor
* attempt to connect to server using client B

Expected behaviour:
* client B should be able to connect to server

Actual behaviour:
* client B reads "server did not respond!"

Steps to reproduce, case 2:
* open two clients
* connect to server using client A
* attempt to connect to server using client B
* while client B reads "server preparing game" or "transferring game": stop client B using task manager / system monitor

Expected behaviour:
* client A should remain connected

Actual behaviour:
* client A reads "Lost synchronisation with server"

jamespetts

Thank you for your report.

I am afraid that I have been unable to reproduce this with a local server in either case 1 or case 2 outlined above.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

DrSuperGood

In case 1 there should be a ~30 second delay before Client B will see the server. This is due to TCP timeout, assuming the server can cope with players leaving while joining.

Mariculous

I could reproduce case 2 under Linux, using pkill -9 to kill the client while it was in the "server preparing game" state.
In this case the server terminated without logging any useful information, even with -debug 2.

Stderr:
Tue Aug 20 17:40:43 CEST 2019: Warning: network_check_activity():       received cmd id=4 nwc_join_t from socket[8]
Tue Aug 20 17:40:43 CEST 2019: Warning: nwc_sync_t::do_command: sync_steps 16593

Obviously after this, trying to connect to the server (case 1) will show "server did not respond!" because it is not running anymore.

In addition: I encountered a similar problem a few days ago, when I was trying to connect over a slow network connection.
After a long wait I got the message "not enough bytes transfered" and the server also terminated.
I could not reproduce this, so I did not start a bug report.

jamespetts

Thank you for testing this. Can I ask you to try running a debug build with gdb to see if you can get a backtrace when running a server locally and connecting to it with a client on the loopback interface (127.0.0.1)?

Mariculous

Sure, I can, but it may take a few days.
Also, I am not a C++ dev, so gdb is new to me.

jamespetts


Mariculous

I compiled a debug build and reproduced scenario 2 again.
It seems to terminate without an exception, so I don't know how to get a backtrace out of gdb.
Are the compile settings below correct, and how can I create a backtrace when it terminates without an exception?

I used these settings for the compile:
COLOUR_DEPTH = 16
OSTYPE = linux
DEBUG = 1
MSG_LEVEL = 4
OPTIMISE = 1
PROFILE = 2
MULTI_THREAD = 1
VERBOSE = 1
FLAGS = -DUSE_C -fno-delete-null-pointer-checks -fno-strict-aliasing -std=c++11


dome@tuxigIII:~/games/simutrans-ex> ./simutrans-extended-dev -use_workdir -singleuser -debug 5  -objects Pak128.Britain-Ex -server &> server.log&
[1] 32520
dome@tuxigIII:~/games/simutrans-ex> sleep 1
dome@tuxigIII:~/games/simutrans-ex> ./simutrans-extended-dev -use_workdir -singleuser -debug 5  -objects Pak128.Britain-Ex -load net:localhost &> client.log&
[2] 32527
dome@tuxigIII:~/games/simutrans-ex> kill -9 $!
[1]-  Broken pipe                 ./simutrans-extended-dev -use_workdir -singleuser -debug 5 -objects Pak128.Britain-Ex -server &> server.log
[2]+  Killed                ./simutrans-extended-dev -use_workdir -singleuser -debug 5 -objects Pak128.Britain-Ex -load net:localhost &> client.log


The logs produced were too large to attach, so you can find them here:
https://dome.xileks.de/simutrans/server.log
https://dome.xileks.de/simutrans/client.log

However, they don't seem to be very useful.

jamespetts

You need to run Simutrans-Extended using GDB (see here for a general tutorial for GDB).

That means that you must type "gdb ./simutrans-extended" and then, at the GDB command prompt, type "run -server". This will start Simutrans-Extended in server mode. You can then connect to it with another client on the same machine to test for the error. When the error arises on the version running in GDB with a server, it will generate a backtrace. It would help if you could copy and paste that into this forum.

Mariculous

Wooooow, gdb seems to be just as simple as Java or Scala debugging using NetBeans.

Thread 1 "simutrans-exten" received signal SIGPIPE, Broken pipe.
0x00007ffff7bca51a in send () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00007ffff7bca51a in send () from /lib64/libpthread.so.0
#1  0x00000000005fe25d in network_send_data (dest=12, buf=0x2d444cb1 "!", size=33, count=@0x7fffffffa43a: 0, timeout_ms=250) at network/network.cc:666
#2  0x000000000060d1c5 in packet_t::send (this=0x2d444ca0, s=12, complete=true) at network/network_packet.cc:118
#3  0x00000000005ff91f in network_command_t::send (this=0x7fffffffa490, s=12) at network/network_cmd.cc:70
#4  0x00000000006030d5 in nwc_sync_t::do_command (this=0x1ec95ad0, welt=0x23e6b340) at network/network_cmd_ingame.cc:776
#5  0x00000000007c5926 in karte_t::do_network_world_command (this=0x23e6b340, nwc=0x1ec95ad0) at simworld.cc:10320
#6  0x00000000007c535e in karte_t::process_network_commands (this=0x23e6b340, ms_difference=0x7fffffffb38c) at simworld.cc:10266
#7  0x00000000007c600c in karte_t::interactive (this=0x23e6b340, quit_month=2147483647) at simworld.cc:10424
#8  0x000000000075c62f in simu_main (argc=4, argv=0x7fffffffdc08) at simmain.cc:1382
#9  0x000000000076f6f0 in sysmain (argc=4, argv=0x7fffffffdc08) at simsys.cc:825
#10 0x000000000083abb3 in main (argc=4, argv=0x7fffffffdc08) at simsys_s.cc:729

TurfIt

SIGPIPE would be correct for this case - the client was killed after all, forcibly closing the socket.
Note by default gdb will see the signal and stop the program, even if the program would've properly handled it. So gdb needs to be configured to not interfere to debug such programs.

However, I actually don't see anywhere in Simutrans where SIGPIPE is being handled, set to ignore, or set to not raise... (whichever method is supported on a particular OS.)
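
To illustrate the "set to ignore" option, here is a minimal sketch for POSIX systems only. Both function names are hypothetical and are not part of Simutrans; the socket pair merely stands in for a server socket whose client was killed. With SIGPIPE ignored, send() should fail with EPIPE instead of terminating the process:

```cpp
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (not a Simutrans function): ignore SIGPIPE for the
// whole process, so a write to a closed socket reports an error instead of
// killing the program.
bool ignore_sigpipe()
{
	return std::signal(SIGPIPE, SIG_IGN) != SIG_ERR;
}

// Demonstration only: build a connected socket pair, close one end (this
// stands in for the killed client) and send to the other end.
// Returns the errno set by send(), 0 if send() unexpectedly succeeded,
// or -1 if the socket pair could not be created.
int send_to_closed_peer()
{
	int fds[2];
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
		return -1;
	}
	close(fds[1]);

	const char msg[] = "!";
	const ssize_t sent = send(fds[0], msg, sizeof(msg), 0);
	const int err = (sent < 0) ? errno : 0;

	close(fds[0]);
	return err;
}
```

Calling ignore_sigpipe() once at startup and then treating EPIPE from send() like any other dropped connection would be one way to use this. Note that ignoring the signal process-wide also affects writes to pipes and anything else that can raise SIGPIPE.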

Mariculous

Oh well, so that's maybe not the error we are looking for. It's just some kind of automatic breakpoint, if I read the docs correctly :/
I'll have a look at it again tomorrow, it's pretty late.

jamespetts

Thank you for this: that is most helpful.

Can I check that you are running gdb on the server, not the client?

Mariculous

Yep, it is the server, not the client.
run -server -singleuser -use_workdir

jamespetts

This is very odd. This is not something that can be investigated in this remote way, I think. Also, it is not clear that this is actually an Extended specific problem, since the SIGPIPE is very likely to be related to networking code, and the networking code is unchanged from Standard.

Can anyone confirm whether this can be reproduced in Standard?

Mariculous

Sadly, for some unknown reason, I currently can't start a Simutrans Standard server. It just freezes after loading the map.

However, I asked players on my server whether this bug had happened in the past, when we were playing Simutrans Standard, and one of them confirmed it.

I know this is not a very reliable source, so I will try to reproduce this myself, but don't expect that to happen before Monday.

jamespetts

That is very helpful - thank you. I will transfer this to the Standard bug reports forum as this appears to be reproducible in Standard. (If that turns out to be an error, it can be transferred back here).

Ters

Quote from: TurfIt on August 21, 2019, 11:05:44 PM
However, I actually don't see anywhere in Simutrans where SIGPIPE is being handled, set to ignore, or set to not raise... (whichever method is supported on a particular OS.)

The best method, in my opinion, is to pass the flag MSG_NOSIGNAL to send. There may be other sources of SIGPIPE that it might be best not to ignore. MSG_NOSIGNAL isn't supported on Windows (Winsock doesn't use signals). I could not find information about macOS. (Linux has supported it only since 2.2, but that is almost as old as Simutrans.)
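
A rough sketch of what that could look like on a POSIX system follows. The names send_no_sigpipe and errno_after_peer_close are made up for illustration and are not the actual network_send_data() code; the fallback define is also only an assumption about how one might keep the call compiling on platforms that lack the flag:

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Assumption for this sketch: where MSG_NOSIGNAL is missing, fall back to 0
// so the call still compiles. That fallback does NOT suppress SIGPIPE, so
// such a platform would need some other mechanism.
#ifndef MSG_NOSIGNAL
#define MSG_NOSIGNAL 0
#endif

// Hypothetical wrapper (not the real network_send_data()): send without
// raising SIGPIPE if the peer has gone away.
ssize_t send_no_sigpipe(int fd, const void *buf, std::size_t len)
{
	return send(fd, buf, len, MSG_NOSIGNAL);
}

// Demonstration only: close one end of a socket pair (standing in for the
// killed client) and send to the other end.
// Returns the errno set by send(), 0 if send() unexpectedly succeeded,
// or -1 if the socket pair could not be created.
int errno_after_peer_close()
{
	int fds[2];
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
		return -1;
	}
	close(fds[1]);

	const char msg[] = "!";
	const ssize_t sent = send_no_sigpipe(fds[0], msg, sizeof(msg));
	const int err = (sent < 0) ? errno : 0;

	close(fds[0]);
	return err;
}
```

On Linux the send should fail with EPIPE and the process should keep running, without touching the process-wide signal disposition. As far as I know, macOS and the BSDs offer a per-socket option, SO_NOSIGPIPE, set via setsockopt, but I have not tested that here.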