
[11.35] Disconnection

Started by Sarlock, August 18, 2014, 05:07:12 PM


Sarlock

I think I've figured something out, and this likely pertains to both Standard and Experimental, though I haven't played a Standard server to verify.

I often get disconnected when someone else joins the game, or when the server does a periodic game save.  The key element is that when I am not doing anything at the time, the disconnections don't seem to occur (or at least very infrequently).  They seem to occur when I am in the middle of performing a task when someone else connects or the game saves.  Once the other player's connection (or the save) completes, I get disconnected.  It has taken me months to establish a possible link... could this be a possibility?

I find that usually there are several operations that I had sent to the server that were not completed on the server side when I reconnect... so it is seemingly an issue related to the sending of commands from the client to the server, then those commands are "lost" because by the time the server receives them it is beginning the connection process.

Or perhaps this isn't the cause at all :)  Just a working theory.

EDIT:
Here is the log from one:

Message: packet_t::send: sent 165 bytes to socket[492]; id=7, size=165
Message: packet_t::send: sent 43 bytes to socket[492]; id=16, size=43
Warning: network_receive_data: error 10054 while receiving from [492]
Message: socket_list_t::remove_client: remove client socket[492]
Warning: karte_t::interactive: lost connection to server
Warning: karte_t::network_disconnect(): Lost synchronisation with server. Random flags: 0
Warning: nwc_routesearch_t::reset: all static variables are reset
Current projects: Pak128 Trees, blender graphics

jamespetts

Interesting. I do not know the finer details of the network architecture: I do know that the game state is saved into a file by the server when a player joins: this file is then loaded by the joining client, whilst each already joined client saves and loads its own file without receiving game state data from the server.

The games are then kept in sync with each other by being deterministic based on the data transmitted by the server. User commands are sent to the server and are executed on all clients in the same designated time slot. This means that a client that is lagging behind the server will send its commands to the server to execute in the future as far as that client is concerned, meaning a delay between the time of the command being sent to the server and the command being executed on the client (as the client will only execute commands when told to do so by the server, to ensure that it is done at exactly the same game time on all clients). The further that the client lags behind the server, the longer the delay, which is why people running on slow computers can get very long delays indeed if they cannot keep up with the server.
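To make the timing argument concrete, here is a minimal sketch of lockstep command scheduling in Python. It is not Simutrans code: the class names, the `lead` parameter and the step numbers are all hypothetical, and it only assumes what is described above, namely that the server assigns each command a time slot and every client executes it in exactly that slot.

```python
from dataclasses import dataclass, field


@dataclass
class Command:
    name: str
    execute_at: int  # time slot (sync step) chosen by the server


@dataclass
class Server:
    step: int = 0
    lead: int = 2  # hypothetical: how far ahead of itself the server schedules

    def schedule(self, name: str) -> Command:
        # The server, not the client, decides when the command runs.
        return Command(name, self.step + self.lead)


@dataclass
class Client:
    step: int = 0
    pending: list = field(default_factory=list)
    executed: list = field(default_factory=list)

    def receive(self, cmd: Command) -> None:
        self.pending.append(cmd)

    def advance(self) -> None:
        # Move one step forward and run only the commands due in this slot.
        self.step += 1
        for cmd in [c for c in self.pending if c.execute_at == self.step]:
            self.executed.append((self.step, cmd.name))
        self.pending = [c for c in self.pending if c.execute_at != self.step]


def steps_until_executed(client: Client, cmd: Command) -> int:
    # The delay a player perceives grows with how far the client lags.
    return cmd.execute_at - client.step


server = Server(step=100)
fast, slow = Client(step=99), Client(step=60)
cmd = server.schedule("build_road")      # scheduled for step 102
delay_fast = steps_until_executed(fast, cmd)   # 3 steps
delay_slow = steps_until_executed(slow, cmd)   # 42 steps
```

Both clients end up executing the command at step 102, so their game states stay identical; the only difference is how long each player waits to see the result.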

If, at the time of somebody joining, there are commands queued on the server waiting to be executed, what ought to happen is that these commands are kept in the queue and executed for everybody after the client joins. I do not know whether this in fact happens, however, nor how the game actually handles this situation: the commands might be thrown away, which is not ideal, but should not of itself cause a disconnexion. Quite why this is happening I am not sure: I should be interested in the views of anyone who knows more about this than I do as to whether this does happen in Standard, and why it happens at all (if indeed it does).
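The two possibilities above (queue kept versus queue thrown away) can be sketched as follows. This is a toy model in Python under stated assumptions, not the actual Simutrans behaviour: the function name, the `keep_queue` flag and the command names are hypothetical, and the game state is reduced to a list of applied commands.

```python
def run_join_cycle(queued, keep_queue):
    """Simulate a save/reload cycle triggered by a joining client.

    Returns (server_state, client_state) after the cycle.
    """
    saved_state = ["existing_track"]  # world as written to the save file

    # Every participant restarts from the same saved state.
    server_state = list(saved_state)
    client_state = list(saved_state)

    # Either carry the unexecuted commands over the reload, or discard them.
    carried = list(queued) if keep_queue else []
    for cmd in carried:
        # Same command, same order, on both sides.
        server_state.append(cmd)
        client_state.append(cmd)
    return server_state, client_state


queue = ["build_station", "add_line"]
kept_server, kept_client = run_join_cycle(queue, keep_queue=True)
lost_server, lost_client = run_join_cycle(queue, keep_queue=False)
```

In this model the states match in both cases: dropping the queue loses the player's work but does not by itself put client and server out of step, which is consistent with the reasoning that lost commands alone should not cause a disconnection.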

Sarlock

Indeed.  This is not a checklist mismatch disconnection, so the client and server are seemingly handling this correctly -- the extra client commands would be discarded with the save/load cycle and both client and server should be operating from an identical situation, albeit that the client's last few commands were lost.

This particular logged disconnection was just with a regular server backup/save cycle.  I was in the process of building and lost around 5-6 commands that were queued.  Presumably this is a rare occurrence on most servers, as they don't have as much map-related lag on the client and server to create such large time gaps (response time between performing an action and seeing the result is typically 3-5 seconds with this server).  As such we may only be seeing this on a regular basis now because of the size of the game.

I suspect it's something simple... finding what that simple thing is, however, is less simple.

DrSuperGood

This only happens in Experimental. The RC version of Standard and even the Dev version do not suffer from this (I have never noticed it). Or maybe they are just less prone to it, so that it is improbable?

My guess is that any orders being carried out at the exact time of the save are corrupted or lost, so that they are not executed, or not executed in the same order, on clients and server. I am often in the middle of things when the game saves in Experimental, and that is an instant disconnect afterwards. In Standard this happens a lot as well, but I am not disconnected (the order is lost instead).

Also of note is that any schedule modifications at the time of the save are lost in Experimental, whereas in Standard the window remains open and even the schedule modifications remain intact. I am guessing Standard has had some major code overhauls relating to orders in general that have not been ported to Experimental.

prissi

Different orders due to closing windows would indeed lead to desync. But a desync also results if the server sends out commands and issues the reload command while the client is already ahead. Imagine building a longer road (the most demanding command for a server). Just when this is finished, the server is lagging and gets a connection request. It handles the request and sends out a reload request to the clients. The clients, however, are already past that point => desync from executing in the past (or something along those lines in the log).
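The "executing in the past" case can be illustrated with a small check. This is a sketch in Python under the assumption that a server message carries the step it is meant for; whether Simutrans stamps reload requests this way is not confirmed here, and the names `Outcome` and `handle_server_message` as well as the step numbers are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    DESYNC = "desync"


def handle_server_message(client_step: int, target_step: int) -> Outcome:
    """A message stamped for target_step can only be honoured
    if the client has not already simulated past that step."""
    return Outcome.OK if client_step <= target_step else Outcome.DESYNC


# A lagging server issues a reload stamped for its own step, 500.
reload_step = 500
behind = handle_server_message(client_step=498, target_step=reload_step)
ahead = handle_server_message(client_step=503, target_step=reload_step)
```

Here the client at step 498 can still act on the request, while the client at step 503 has already passed the point the server is referring to and has no consistent way to apply it.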