News:

Simutrans Chat Room
Where cool people of Simutrans can meet up.

Network suggestion - better handling of desyncs for player

Started by Ashley, January 17, 2011, 12:38:09 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Ashley

From playing Simutrans online a fair bit I have found one thing which is especially annoying. Sometimes the game de-syncs, which is fine actually, I think the way Simutrans handles this by letting you just re-join is very good. However the process of re-joining could be a lot smoother. When you get de-synced you have to click "Play Online" again, then select or type in your server name, then click to join the game again. Would it be possible to have the game display a "reconnect to server?" dialog when you get de-synced which will reconnect you to the last server you were joined to?

I don't think automatic re-joining is a good idea given that it is disruptive to other players if a client is constantly getting de-synced with the load/save on joining occurring. Some slightly quicker way to get back into a game would be very nice though.

Oh and would it also be possible for the game client to remember which player you were playing as last, and automatically switch to that player upon re-joining the server? I always end up accidentally building some stuff as the "human" (player 1) before realising and switching back.

Thanks :)
Use Firefox? Interested in IPv6? Try SixOrNot the IPv6 status indicator for Firefox.
Why not try playing Simutrans online? See the Game Servers board for details.

prissi

Of course I though about that too, but:

desyncs are programm errors and are not supposed to happen. Thus they should (and actually do) occur only rarely. People should rather report, if they found something, which might trigger a desync. (They are unfourtunately very hard to debug.) If there is a reconnect dialogue, it would take out the incentive to find the cause.

Since you can join a game with a non-matching version or a non-matching pak by explicitely specifing the server or using the command line (which is fine, so some people can connect to their server with a shortcut), an immeadiate reconnect dialogue would just hide the cause.

Furthermore after desync, the game state is "resetted" and thus the player is player 1. The game (at the moment) at least does not know as which player you were connected.

Ashley

Hm, I see your logic Prissi, you're right that people should feel motivated to report these errors or we'll never fix them. One thing I was thinking might be useful would be for the game to automatically log at debug level 3 to some kind of internal cache. When a desync or other error happens it could write the last 5 minutes of logging out to a sort of dump file, along with any other game state information which might be useful for debugging. An in-game dialog could then prompt people to send this crash data for analysis by the dev team. Kind of like the error reporting mechanisms built into other software products. This would be a very useful feature for a server too since it's sometimes hard to pin down exactly when such problems occur and find the right bit of the log file. Also logging at debug 3 all the time produces big logs.

In the case of being disconnected due to incorrect game version/pak mismatch the reconnect dialog could be disabled since the client had never successfully connected to begin with.
Use Firefox? Interested in IPv6? Try SixOrNot the IPv6 status indicator for Firefox.
Why not try playing Simutrans online? See the Game Servers board for details.

prissi

At no place during connection to a server a pak version is queried. This check only happens via game info. THus the motivation to go via game info.

About debugging desync: These are some nice ideas, but for debugging a desync the server logs and client logs are not very useful, since those very unlikely show the cause. Rather a vertain gaming event or a certain map would trigger frequent desyncs.

And for sending them in: I think we are way too few people for log file analysis. (And they do not tell anything about desyncs, unfourtunately.)

Ashley

I think perhaps writing a guide for people to report desyncs and the kind of information that is useful to debug them might be a good idea - I will put together a sticky post. I'll also try and report more on specific game events/map combinations which cause them to occur frequently since I do see a few of these.

Maybe as a middle-ground, could the client at least remember the last server it connected to so that when you go into "Play Online" the servers list has highlighted (and automatically fetches the game info for) the last server you joined? That would reduce the number of clicks by one at least. :)
Use Firefox? Interested in IPv6? Try SixOrNot the IPv6 status indicator for Firefox.
Why not try playing Simutrans online? See the Game Servers board for details.

prissi

OK, the server should be kept during reconnect to the previous connected server.

sdog

Quote from: prissiThus they [desyncs] should (and actually do) occur only rarely. People should rather report, if they found something, which might trigger a desync. (They are unfourtunately very hard to debug.)

Perhaps you should have made this a bit more clear. :-)
How often is rarely? At some points it happened quite rarely for me (less than once in 15 min), but typically it used to be in the range of about once every two to three minutes.

Typical things causing desyncs in order of decreasing likelyness (subjective estimate of likelyness):
renaming lines (almost 1)
sending convois out of a depot (>0.8%)
changing schedules
opening schedule dialogues
building ways
removing ways

quite often without apparent cause

played on timothy's pak128 server with
r4184  (openpak128 1124)

i used the server 'transatlantically' (never bothered to ping the server). I've been alone on the server with the exception of a brief period with another player. Tested over about 5 hours dispersed over the last week.

paco_m

Our experience (3rd network game with around 5 players) is that you can change a lot with the simuconf.tab parameters

Quoteserver_frames_ahead = 30
We had this set to 8 before playing during hours without desyncs, however some of our players were not able to play at all with this configuration. Values between 50 and 100 caused very frequent desyncs (2-3min), the current setting works quite well with some desyncs - can be 2 within 10 minutes and the next two hours none.

prissi

server_frames_ahead can be only set on the server. Default is 4.

desync occur typically every few hours if I play on the gift (which, btw. runs with frames_per_second=15, which I recommend to enable also netbooks to join.) I you have much more desyncs, your connection is either very slow (long ping) or changing latency very strongly or you are using paks/setting in simuconf.tab that are different but we fail to transfer per savegame yet.

If you have a plain vanilla installation and there are many desync, then you should try to increase "additional_client_frames_behind" to something like 5 or so (default 0). Your commands will take longer to be executed, but should be more stable.

(If the desync happen with experimental, then do debugging there. A desync of the game engine could have various reasons. Especially with the many more parameters settings of experimental. They all must be transferred via savegame or the game will desync immeadieately.)

@sdog: Could you post your simuconf.tab to see if something is very different in your setup?

TurfIt

My experience playing on gift.physik.tu-berlin.de largely matches sdog for the trouble areas. Ping ~135ms, min 127, max 160 is typical. Mostly virgin Win_SDL nightly/pak 64 nightly version to match the server - Setting single user mode, pak selection, water animation off, show month = 4, show tree info off.

If I'm the only one connected, almost no desyncs - no real pattern to those that occur. If others are connected:
Line Management - 60% desync chance for any action (new line, update line, delete line). Desync cause - sim_rand mismatch.
Vehicle Depot - 50% desync chance for any action involving lines/schedules. Desync cause - sim_rand.
Both above dialogs open increase to 99% chance.
Way building/removing - ~80% chance to desync atleast one other player, or get desynced if they are the ones building. Desync cause - wanted to execute in past. (interestingly it's *always* 2 sync_steps in the past)

Further - way building at times results in very different results between the client and server. i.e. everything looks good on client, vehicles dispatched using the new way. A desync eventually occurs that can't be traced to any action (sim_rand usual cause). Upon rejoining the way just built bears no resemblance to what the client seen pre desync - adjacent tiles not connected, strange 'spurs' off the built way, way taking a completley different route., etc. The vehicles that the client seen running along the way just fine are likely stuck with 'no route' indicators.

Updating lines is also an extremely risky action. If while the schedule dialog is open, another client joins forcing the dialog to close, that specific line is now 'poisoned '. Even selecting the line from the list in the Line Management dialog results in an immediate sim_rand desync. This persists until the server is restarted. I've also had the line take on the complete schedule from the next line in the list. eg. I'm updating '(2) Line', somebody joins, (2) Line now has the same stops as (3) Line did. The trucks on (2) line are now trying to head across the water as (3) Line is for boats!.

Once any line has become 'poisoned', Assigning lines to vehicles in the depot quite often results in the wrong line being assigned. eg. I assign (4) Line (not necessarily the poisoned line), start the vehicle, desync (sim_rand), rejoin, and find (5) Line was assigned by the server instead of (4) Line by my client.


Ashley

I've seen the same line "poisoning" behaviour also. It seems that I get almost no desyncs when playing on a server with nobody else, but then they get more frequent the more people join.
Use Firefox? Interested in IPv6? Try SixOrNot the IPv6 status indicator for Firefox.
Why not try playing Simutrans online? See the Game Servers board for details.

prissi

A desync it often discovered too late, since the random_counter is not compared that often. Thus a line may have been assigned to a certain vehicle, but on the server this has a different number. Also a different result of way building means the random counter differed even before.

Please try the 4196 or higher, as in those more paremeter are transferred to ensure we did not forget something.

Dwachs

I would love to see a server-log and server-savegame right after such poisoning has happened.

Is there an easy way to reproduce that on a local machine? Eg server on localhost and several clients with a small map.

@prissi and Timothy: what versions are your server running (sdl, gdi, allegro, etc) ?
Parsley, sage, rosemary, and maggikraut.

paco_m

I had this also twice with R4175 - to reproduce it just modify a line when another player is joining.
Server: linux 64bit, self compiled without GUI (posix option)

prissi

QuoteI had this also twice with R4175 - to reproduce it just modify a line when another player is joining.

THis sound like a bug. Please, everyone report such things as a bug. It is very difficult to test all combinations.

gift runs on self-compiled FreeBSD (from 2004, thus only IPv4) with posix backend and manually set revision (since lack of svn there).

Ashley

Quote from: Dwachs on January 19, 2011, 09:39:47 AM
I would love to see a server-log and server-savegame right after such poisoning has happened.

Is there an easy way to reproduce that on a local machine? Eg server on localhost and several clients with a small map.

@prissi and Timothy: what versions are your server running (sdl, gdi, allegro, etc) ?

My server is compiled with the posix backend, so no graphics at all.
Use Firefox? Interested in IPv6? Try SixOrNot the IPv6 status indicator for Firefox.
Why not try playing Simutrans online? See the Game Servers board for details.

Dwachs

Quote
Further - way building at times results in very different results between the client and server. i.e. everything looks good on client, vehicles dispatched using the new way. A desync eventually occurs that can't be traced to any action (sim_rand usual cause). Upon rejoining the way just built bears no resemblance to what the client seen pre desync - adjacent tiles not connected, strange 'spurs' off the built way, way taking a completley different route., etc. The vehicles that the client seen running along the way just fine are likely stuck with 'no route' indicators.

That should work better now (server and client needs update, no backward compatibility in network transfer).

If you come across the line bug again, please leave a note (a log, a sve ).
Parsley, sage, rosemary, and maggikraut.

TurfIt

Other player joining and starting a new convoi from his depot knocks my rand out of sync with server.
Was connected by myself >1 hr for no problem prior. i.e. server with only one client is fine.



Message: network_command_t::rdwr: read packet_id=7, client_id=0
Warning: network_check_activity(): received cmd id=7 nwc_check_t from socket[612]
Message: NWC_CHECK: time difference to server 17
Message: network_world_command_t::execute: do_command 7 at sync_step 279809 world now at 279800
Message: gui_textarea_t::recalc_size(): reset size to 346,99
Warning: karte_t::interactive: sync_step=279808  server=[rand=3939584664 halt=202 line=45 cnvy=157] client=[rand=3939584664 halt=202 line=45 cnvy=157]
Message: gui_textarea_t::recalc_size(): reset size to 368,33
Message: gui_textarea_t::recalc_size(): reset size to 389,77
Message: gui_textarea_t::recalc_size(): reset size to 11,0
Message: gui_textarea_t::recalc_size(): reset size to 389,154
Message: gui_textarea_t::recalc_size(): reset size to 389,143
Message: gui_textarea_t::recalc_size(): reset size to 389,121
Message: network_command_t::rdwr: read packet_id=7, client_id=0
Warning: network_check_activity(): received cmd id=7 nwc_check_t from socket[612]
Message: NWC_CHECK: time difference to server -16
Message: network_world_command_t::execute: do_command 7 at sync_step 280065 world now at 280057
Message: gui_textarea_t::recalc_size(): reset size to 352,33
Warning: karte_t::interactive: sync_step=280064  server=[rand=1460483625 halt=202 line=45 cnvy=157] client=[rand=1460483625 halt=202 line=45 cnvy=157]
Message: gui_textarea_t::recalc_size(): reset size to 368,44
Message: gui_textarea_t::recalc_size(): reset size to 11,0
Message: gui_textarea_t::recalc_size(): reset size to 389,132
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=211,387,-1 wkzid=8219 defpar=D07B16C730C74731B1D2CBBF049322753B311B96 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280142 current sync_step=280133  wkz=8219 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280142 world now at 280133
Message: nwc_tool_t::do_command: steps 280142 wkz 8219 init
Message: nwc_tool_t::do_command: id=27 init=1 defpar=D07B16C730C74731B1D2CBBF049322753B311B96 flag=0
Message: gui_textarea_t::recalc_size(): reset size to 346,99
Message: gui_textarea_t::recalc_size(): reset size to 389,121
Message: gui_textarea_t::recalc_size(): reset size to 346,88
Message: gui_textarea_t::recalc_size(): reset size to 389,132
Message: gui_textarea_t::recalc_size(): reset size to 389,121
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=216,97,0 wkzid=8218 defpar=a,243,94,1,0,T21_bus init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280310 current sync_step=280302  wkz=8218 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280310 world now at 280302
Message: nwc_tool_t::do_command: steps 280310 wkz 8218 init
Message: nwc_tool_t::do_command: id=26 init=1 defpar=a,243,94,1,0,T21_bus flag=0
Message: depot_t::buy_vehicle(): T21_bus
Message: depot_t::buy_vehicle(): vehiclebauer 07CACDF8
Message: depot_t::buy_vehicle(): appended 1 vehicle
Message: convoi_t::add_vehikel(): at pos 0 of 4 totals.
Message: automobil_t::set_convoi(): 07C6D780
ERROR: vehikel_t::set_convoi(): pos_prev is illegal of convoi 157 at 243,94,1
Please report all errors to
team@64.simutrans.com
Message: convoi_t::add_vehikel(): now 1 of 4 total vehikels.
Message: network_command_t::rdwr: read packet_id=7, client_id=0
Warning: network_check_activity(): received cmd id=7 nwc_check_t from socket[612]
Message: NWC_CHECK: time difference to server 12
Message: network_world_command_t::execute: do_command 7 at sync_step 280321 world now at 280312
Message: gui_textarea_t::recalc_size(): reset size to 329,99
Warning: karte_t::interactive: sync_step=280320  server=[rand=3293087439 halt=202 line=45 cnvy=158] client=[rand=3293087439 halt=202 line=45 cnvy=158]
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=216,94,0 wkzid=8216 defpar=l,157,9 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280366 current sync_step=280358  wkz=8216 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280366 world now at 280358
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=214,94,-1 wkzid=8218 defpar=b,243,94,1,157 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280373 current sync_step=280365  wkz=8218 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280373 world now at 280365
Message: nwc_tool_t::do_command: steps 280366 wkz 8216 init
Message: nwc_tool_t::do_command: id=24 init=1 defpar=l,157,9 flag=0
Message: nwc_tool_t::do_command: steps 280373 wkz 8218 init
Message: nwc_tool_t::do_command: id=26 init=1 defpar=b,243,94,1,157 flag=0
Message: convoi_t::start(): Convoi (157) Empire T-21 Bus wechselt von INITIAL nach ROUTING_1
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=217,97,0 wkzid=8218 defpar=a,243,94,1,0,T21_bus init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280384 current sync_step=280376  wkz=8218 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280384 world now at 280376
Message: nwc_tool_t::do_command: steps 280384 wkz 8218 init
Message: nwc_tool_t::do_command: id=26 init=1 defpar=a,243,94,1,0,T21_bus flag=0
Message: depot_t::buy_vehicle(): T21_bus
Message: depot_t::buy_vehicle(): vehiclebauer 07CACD38
Message: depot_t::buy_vehicle(): appended 1 vehicle
Message: convoi_t::add_vehikel(): at pos 0 of 4 totals.
Message: automobil_t::set_convoi(): 07C6DB08
ERROR: vehikel_t::set_convoi(): pos_prev is illegal of convoi 158 at 243,94,1
Please report all errors to
team@64.simutrans.com
Message: convoi_t::add_vehikel(): now 1 of 4 total vehikels.
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=216,94,0 wkzid=8216 defpar=l,158,9 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280396 current sync_step=280388  wkz=8216 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280396 world now at 280388
Message: gui_textarea_t::recalc_size(): reset size to 389,110
Message: nwc_tool_t::do_command: steps 280396 wkz 8216 init
Message: nwc_tool_t::do_command: id=24 init=1 defpar=l,158,9 flag=0
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=214,94,-1 wkzid=8218 defpar=b,243,94,1,158 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280407 current sync_step=280399  wkz=8218 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280407 world now at 280399
Message: nwc_tool_t::do_command: steps 280407 wkz 8218 init
Message: nwc_tool_t::do_command: id=26 init=1 defpar=b,243,94,1,158 flag=0
Message: convoi_t::start(): Convoi (158) Empire T-21 Bus wechselt von INITIAL nach ROUTING_1
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=216,97,0 wkzid=8218 defpar=a,243,94,1,0,T21_bus init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280418 current sync_step=280410  wkz=8218 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280418 world now at 280410
Message: nwc_tool_t::do_command: steps 280418 wkz 8218 init
Message: nwc_tool_t::do_command: id=26 init=1 defpar=a,243,94,1,0,T21_bus flag=0
Message: depot_t::buy_vehicle(): T21_bus
Message: depot_t::buy_vehicle(): vehiclebauer 07CACF18
Message: depot_t::buy_vehicle(): appended 1 vehicle
Message: convoi_t::add_vehikel(): at pos 0 of 4 totals.
Message: automobil_t::set_convoi(): 097404B8
ERROR: vehikel_t::set_convoi(): pos_prev is illegal of convoi 159 at 243,94,1
Please report all errors to
team@64.simutrans.com
Message: convoi_t::add_vehikel(): now 1 of 4 total vehikels.
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=216,94,0 wkzid=8216 defpar=l,159,9 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280430 current sync_step=280422  wkz=8216 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280430 world now at 280422
Message: gui_textarea_t::recalc_size(): reset size to 389,121
Message: nwc_tool_t::do_command: steps 280430 wkz 8216 init
Message: nwc_tool_t::do_command: id=24 init=1 defpar=l,159,9 flag=0
Message: network_command_t::rdwr: read packet_id=6, client_id=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=1 plnr=4 pos=213,94,-1 wkzid=8218 defpar=b,243,94,1,159 init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[612]
Message: nwc_tool_t::execute: append sync_step=280439 current sync_step=280431  wkz=8218 init
Message: network_world_command_t::execute: do_command 6 at sync_step 280439 world now at 280431
Message: nwc_tool_t::do_command: steps 280439 wkz 8218 init
Message: nwc_tool_t::do_command: id=26 init=1 defpar=b,243,94,1,159 flag=0
Message: convoi_t::start(): Convoi (159) Empire T-21 Bus wechselt von INITIAL nach ROUTING_1
Message: gui_textarea_t::recalc_size(): reset size to 389,110
Message: gui_textarea_t::recalc_size(): reset size to 375,44
Message: gui_textarea_t::recalc_size(): reset size to 389,44
Message: gui_textarea_t::recalc_size(): reset size to 359,33
Message: gui_textarea_t::recalc_size(): reset size to 346,88
Message: gui_textarea_t::recalc_size(): reset size to 11,0
Message: gui_textarea_t::recalc_size(): reset size to 346,77
Message: network_command_t::rdwr: read packet_id=7, client_id=0
Warning: network_check_activity(): received cmd id=7 nwc_check_t from socket[612]
Message: NWC_CHECK: time difference to server -17
Message: network_world_command_t::execute: do_command 7 at sync_step 280577 world now at 280569
Warning: karte_t::interactive: sync_step=280576  server=[rand=915615714 halt=202 line=45 cnvy=160] client=[rand=2240997186 halt=202 line=45 cnvy=160]
Warning: karte_t::interactive: disconnecting due to checklist mismatch
Warning: karte_t::network_disconnect(): Lost synchronisation with server.

prissi


TurfIt

Current weeks gift.physik.tu-berlin.de nightly version.
Message: simmain::main():   Version: 102.3-4229  Date: Jan 27 2011