The International Simutrans Forum

 

Author Topic: Testing Simutrans-Experimental with online play  (Read 10375 times)

0 Members and 1 Guest are viewing this topic.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Testing Simutrans-Experimental with online play
« on: December 19, 2010, 12:29:18 AM »
It has been a few months now since Simutrans-Standard became compatible with playing over the internet, which has added immeasurably to the game experience. All of the networking code from Standard has been incorporated into Experimental, and painstaking efforts made to ensure that Experimental is compatible with online play. However, it has not, to my knowledge, yet been tested in online mode. Ideally, testing with the online mode would involve multiple people, at least one of whom is able to set up a server for these purposes.

I should be very interested in any volunteers (for the time being, preferably those with some experience of developing Simutrans-Experimental or at least Simutrans-Standard; ordinary play testing will come later) to test the online capabilities of Simutrans-Experimental with 9.0. Anyone interested in assisting, or who has any suggestions or thoughts about testing Simutrans-Experimental with online play should post a reply to this thread. I shall look forward to looking into this.

Offline ӔO

  • Devotees (Inactive)
  • *
  • Posts: 2345
  • Hopefully helpful
  • Languages: en, jp
Re: Testing Simutrans-Experimental with online play
« Reply #1 on: December 19, 2010, 06:16:16 PM »
I'm willing to try it.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #2 on: December 19, 2010, 06:27:00 PM »
That is very helpful. Could you look into how to set up a server?

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #3 on: December 19, 2010, 09:47:01 PM »
As a beginning, local testing should be done. I.e. start the server with "-server" and the client should join to localhost. If that works, chances are good that it will work over the network.

Offline ӔO

  • Devotees (Inactive)
  • *
  • Posts: 2345
  • Hopefully helpful
  • Languages: en, jp
Re: Testing Simutrans-Experimental with online play
« Reply #4 on: December 19, 2010, 10:44:38 PM »
some preliminary testing gives errors.

on connection I get
Code: [Select]
FATAL ERROR:
sim_new_handler()
OUT OF MEMORY or other
error allocating new object
PRESS ANY KEY

I was able to run the game in server mode for the first few tries, but they all failed to connect. Then after a while, the game simply refuses to start up in server mode with the same error message.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #5 on: December 19, 2010, 11:19:15 PM »
Thank you for that. I am going to have to try running this in debug mode, I think.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #6 on: December 20, 2010, 01:10:09 AM »
Fixed that error (see the -devel branch on Github), but I now get a rather harder to fix error on the server as soon as I start a game relating to the function restoring active windows. Any assistance in fixing that one would be appreciated!

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #7 on: December 20, 2010, 08:34:57 AM »
Just comment out saving windows for testing seems ok.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #8 on: December 20, 2010, 11:10:45 AM »
Thank you for the suggestion, Prissi. I've tried that, and it will now allow me to connect a client to a (local) network game. Now I get other difficulties: starting a vehicle in a depot does not synchronise: I start it on one client, and it still shows as being in the depot for all other clients. Adding a stop on a tramway (where that stop is owned by another player, using the new feature allowing joint use of private stops on public ways) causes both clients to lose synchronisation with the server.

I think that some heavyweight work is going to be required to make Experimental work well over a network. Volunteers to assist will be much appreciated!

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #9 on: December 20, 2010, 12:17:42 PM »
All you actions need to be caused by calling a werkzeug_t, see wimwerkz.* for the definitions. Here are also the routines to be called, when convois starts. Just look at how standard alters a schedule or renames a stop for simple actions.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #10 on: December 20, 2010, 10:21:20 PM »
Hmm - I'd already incorporated all of these things into Experimental. I just went through very carefully with the MSVC++ debugger with Experimental and Standard side-by-side all the code that is executed when, running as a server, both Standard and Experimental starts a vehicle in a depot: both executed identically.

Edit: Even when I don't change anything, on a very simple map, there are very frequent desyncs. What's the best way of diagnosing network problems; is it better, for example, to run the debugger on the client or the server (assuming that the "client" and the "server" are both on one machine for testing purposes using 127.0.0.1 as you suggested earlier)?
« Last Edit: December 20, 2010, 10:36:17 PM by jamespetts »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #11 on: December 21, 2010, 10:40:45 AM »
Debugger would not help you, I am afraid, only brain helps you ... Look for all stuff that uses a timer and not the ticks value of game time. A sync step and a step will have always exactly the same amount of gam ms time, no matter how long it took for their execution. This is needed, as the real time on server and client might be very different.

Offline Combuijs

  • Web Team
  • Devotee
  • *
  • Posts: 1394
  • Maintainer of maps.simutrans.com
    • Combuijs
  • Languages: EN, NL
Re: Testing Simutrans-Experimental with online play
« Reply #12 on: December 21, 2010, 10:54:06 AM »
Quote
only brain helps you

 ;D

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #13 on: December 21, 2010, 11:29:03 AM »
Use my brain? That's what I was afraid of.

Can you confirm - if the server and client are out of sync (but the client has not yet detected this and disconnected), might that result in commands executed on one not being processed on the other? I have found that, when I start a convoy in a depot on the server, this sometimes will and sometimes will not be executed on the client end: the nwc_tool_t::do_command(karte_t* welt) will always be executed, but, when the game has been connected for a while, it will not have an effect on the client. I have not yet succeeded in getting a client starting of convoy in a depot to be recognised on a server, but that may just be because I have not tried it early enough, before desynchronisation. What I do know is that infrastructure building and deleting (roads, etc.) always seems to work reliably.

As to timer functions: I know that Knightly used what he called a "performance counter" when implementing his new routing algorithm, and he was at the time unsure whether it would be compatible with network mode. He said at the time that he would test it when the network mode came to be implemented, but he has since left the forum and I cannot get in touch with him. I do not know whether this is the problem (either wholly or in part), which is part of why it would be helpful to know whether desynchronisation can cause a failure to respond to commands such as starting a convoy from a depot. Unfortunately, I know little about Knightly's highly sophisticated routing system, so may have difficulties in making it compatible with networking if that did turn out to be the problem. Any assistance from anyone in that regard would indeed be welcomed!

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Testing Simutrans-Experimental with online play
« Reply #14 on: December 21, 2010, 11:55:28 AM »
I think we will be forced to do goods route calculations only on server. And it must transfer changes in connexions table to clients.


Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #15 on: December 21, 2010, 12:11:04 PM »
Inkelyad - interesting idea. How easy would this be to implement, do you think? One problem that I foresee is that the amount of data necessary to communicate the various connexions might be too much for a reasonable connexion if the map is very large (each station's reconnexion to each other station each month or each time that the network is edited, whichever is more frequent). Although I think that we need first to discover whether there is an incompatibility with the routing/performance counter in the first place.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #16 on: December 21, 2010, 03:11:29 PM »
Remember: Nowadays the server is the machine with the lowest computer power available. Even a netbook has more CPU power available than a cheap virtual server.

But to the questions: Since the random counter is submitted with every command, a desynced command will be dropped. For debugging, you can activate the random mode. (See current submit.)

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4608
  • Languages: EN, DE, AT
Re: Testing Simutrans-Experimental with online play
« Reply #17 on: December 21, 2010, 07:11:47 PM »
I think we will be forced to do goods route calculations only on server. And it must transfer changes in connexions table to clients.
The current implementation of the network mode is actually the other way round: Everything is done on every connected client as well as on the server at the same in-game time in precisely the same order.

Some random things one has to ensure that the above paradigm works:
== make sure all clients run under the same settings: all values in simuconf.tab's, the pak files, etc that influence the simulation have to be identical. Thats why (a) all settings are saved in savegame and transfered, (b) checksums about used pakset are exchanged before allowing connection.
== ensure that all variables are read or resetted during loading a game.
== do not rely on the local time to control the simulation (ie measuring time with calls to dr_time() )
== all user interaction that influences the game (building etc) has to be sent to server and sent back to clients.
== calls to random number generator have to happen in identical order on all clients. No user interaction should call the random number generator. There was also a nasty bug related to this: if you call 'some_function( simrand(10), simrand(10) )' the order of the evaluation of the arguments is compiler-dependent, leading to unexpected differences.
== make sure all lists you use are sorted, such that the order of elements in this list is the same on every client. Otherwise taking a random element out of this list will give you different elements on different clients.

Edit: added something.
« Last Edit: December 21, 2010, 07:22:06 PM by Dwachs »

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4608
  • Languages: EN, DE, AT
Re: Testing Simutrans-Experimental with online play
« Reply #18 on: December 21, 2010, 07:19:44 PM »
I am afraid, only brain helps you ...
Use my brain? That's what I was afraid of.
:D :D

James, the debugging of this network stuff is much harder than debugging a 'simple' crash, where one runs the debugger, which is able to catch the moment the crash happens. With network communication it is much harder to identify the point, where something starts to go wrong. Things that crash server and/or clients can be catched with debugger (run client or server in the debugger depending on what you are going to debug).

But there are now other types of bugs: command that are not executed (or magically get messed up). Synchronisation loss of random number generator. Or timing problems: client is too fast or too slow.

For debugging such network related bugs, I used debug output and compare that of server and client with the expected situation. Thats why there are so many debug-messages in the network code. Adding debug output is the way to go. There is no general receipe :(

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Testing Simutrans-Experimental with online play
« Reply #19 on: December 21, 2010, 09:04:11 PM »
Everything is done on every connected client as well as on the server at the same in-game time in precisely the same order.
How time synchronisation works then? What is relationship between real-world ms, in-game ticks and steps?

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #20 on: December 21, 2010, 09:14:20 PM »
Every ticks of a frame is called with the time is should take: I.e. 25 frames_per_second => sync_step is ALWAYS called with delta_ms of 40. The ratio step to sync_step to step is 1:4, thus every step is called with 160ms. The pause between sync_step is adjusted in such a way, that the are nearly every 40 ms. However, also the delay from the server is measured, and if the ping to the server gets longer, the client tries to gradually run faster and vice versa.

However, SE does not need to bother with anything of this, as long as it never uses dr_time() within sync_step or step() but uses get_zeit_ms(). (But of course all things within sync_step will get the same number.)

Ptrhashtables are evil and must not be used for random access, as the objects are likely in a completely different order due to being at different addresses. Also any references using pointers as indexes will give different results.

Any thing that affects the game state should either go into step or sync_step, or should be carried out a werkzeug_t. Any voilation of this rule will lead to fast desyncs.

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Testing Simutrans-Experimental with online play
« Reply #21 on: December 21, 2010, 10:13:48 PM »
Ptrhashtables are evil and must not be used for random access, as the objects are likely in a completely different order due to being at different addresses. .

simhalt.h:
Code: [Select]
quickstone_hashtable_tpl<haltestelle_t, connexion*> **connexions;
tpl/quickstone_hashtable_tpl.h:
Code: [Select]
class quickstone_hash_tpl {
public:
        static uint32 hash(const quickstone_tpl<key_t> key)
        {
                return (uint32)(size_t)key.get_rep();
        }
:-[

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #22 on: December 21, 2010, 11:01:00 PM »
As long as there are no random calls but just faster access to the tables, this may be ok. But it looks indeed suspicious. But in this case you could simply use the entry instead of the rep. It will be the same on all clients (if there are no other errors).

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #23 on: December 22, 2010, 01:19:58 PM »
Prissi,

thank you very much for that. That is most helpful. Here, for reference, are the results of a brief audit of the occurrences of the various potentially problematic features:

1. dr_time() inside step() or sync_step()

Code: [Select]
void karte_t::step()
{
DBG_DEBUG4("karte_t::step", "start step");
unsigned long time = dr_time();
...
last_step_nr[steps%32] = dr_time();

This is the same in Standard, however, so I am assuming that this is not a problem.

Some occurrences of the following type in karte_t::laden():

Code: [Select]
#ifdef DEBUG
DBG_MESSAGE("rebuild_destinations()","for all haltstellen_t took %ld ms", dr_time()-dt );

// reroute goods for benchmarking
dt = dr_time();
#endif

This seems to be Experimental specific, but the value dt is used only for outputting debugging information, so it would not seem to be a potential sync issue.

In the constructor for the haltestelle_t class:

Code: [Select]
// Added by : Knightly
inauguration_time = dr_time();

This looks suspicious.

In path_explorer_t (a class specific to Simutrans-Experimental, one of Knightly's routing classes)

Code: [Select]
refresh_start_time = dr_time();

...

start = dr_time(); // start timing

...

diff = dr_time() - start; // stop timing

This pattern occurs more than once. This also looks suspicious.

There are some other instances of dr_time() in some of the networking code and some of the GUI code, but none of this has been touched by Experimental and seems to be outside step() or sync_step() and so is not likely a sync issue.

Conclusion: This issue is a potentially live problem for synchronisation in Experimental.

2. Pointer hash tables

In gui_convoy_assembler.h

Code: [Select]
/**
* A helper map to update loks_vec and waggons_Vec. All entries from
* loks_vec and waggons_vec are referenced here.
* @author Volker Meyer
* @date  09.06.2003
*/
ptrhashtable_tpl<const vehikel_besch_t *, gui_image_list_t::image_data_t *> vehicle_map;

Although gui_convoy_assembler_t is a Simutrans-Experimental specific class, the above code is identical to that in depot_frame_t in Standard, so this should be safe.

In simcity.h

Code: [Select]
// Hashtable of all cities/attractions/industries connected by road from this city.
// Key: city pointer.
// Value: journey time per tile (equiv. straight line distance)
// (in 10ths of minutes); 65535 = unreachable.
// @author: jamespetts, April 2010
ptrhashtable_tpl<stadt_t*, uint16> connected_cities;
ptrhashtable_tpl<const fabrik_t*, uint16> connected_industries;
ptrhashtable_tpl<const gebaeude_t*, uint16> connected_attractions;

This is specific to Experimental and looks suspicious.

In powernet.h, there is:

Code: [Select]
static ptrhashtable_tpl<powernet_t *, powernet_t *> loading_table;

But this code is unchanged from Standard, so I am assuming that this is clean.

There is also the quickstone hash issue pointed out by Inkelyad in the previous post, but I am not sure whether Quickstone hashes are as problematic as pointer hashes in this context - Prissi mentioned something about using the entry instead of the rep for these (as the "rep" is the memory address itself, which, of course, will be different on different clients). This will need further investigation, too.

Conclusion: This issue is a potentially live problem for synchronisation in Experimental.

3. Interaction

I have not conducted a full audit on this at this juncture, but I think that I have already made all interaction network safe some time ago when the network code for interaction was first introduced: certainly vehicle replacing (the most significant new sort of interaction for Experimental) has a network safe interface in the style of the other network based interactions.

Conclusion: This issue is a probably not an issue in Experimental.

Edit 4. (Addendum) quickstone_hashtable_tpl

In halt_detail.cc

Code: [Select]
for (uint i=0; i<warenbauer_t::get_max_catg_index(); i++)
{
const quickstone_hashtable_tpl<haltestelle_t, haltestelle_t::connexion*> *connexions = halt->get_connexions(i);

if(!connexions->empty())
{
buf.append("\n");
offset_y += LINESPACE...

Although some of this code is Experimental specific, it is GUI code only, so is probably clean.

In path_explorer.cc

Code: [Select]
// Save the halt list in an array first to prevent the list from being modified across steps, causing bugs
for (uint16 i = 0; i < all_halts_count; ++i)
{
halt_iter.next();
all_halts_list[i] = halt_iter.get_current();

// create an empty connexion hash table if the current halt does not already have one
if ( connexion_list[ all_halts_list[i].get_id() ].connexion_table == NULL )
{
connexion_list[ all_halts_list[i].get_id() ].connexion_table = new quickstone_hashtable_tpl<haltestelle_t, haltestelle_t::connexion*>();
}
}

This is Experimental-specific and ultimately called from step(), so this is potentially an issue; similarly in path_explorer.h

Code: [Select]
// structure for storing connexion hashtable and serving transport counter
struct connexion_list_entry_t
{
quickstone_hashtable_tpl<haltestelle_t, haltestelle_t::connexion*> *connexion_table;
uint8 serving_transport;
};

In simhalt.cc (constructor):

Code: [Select]
// Knightly : create the actual connexion hash tables
for(uint8 i = 0; i < max_categories; i ++)
{
connexions[i] = new quickstone_hashtable_tpl<haltestelle_t, connexion*>();
}

and also:

Code: [Select]
// Modified by : Knightly
// Purpose   : To withhold creation of pathing data structures if they are not needed
if (welt->get_einstellungen()->get_default_path_option() == 2)
{
paths = NULL;
open_list = NULL;

search_complete = NULL;
iterations = NULL;

has_pathing_data_structures = false;
}
else
{
paths = new quickstone_hashtable_tpl<haltestelle_t, path* >[max_categories];
open_list = new binary_heap_tpl<path_node*>[max_categories];

search_complete = new bool[max_categories];
iterations = new uint32[max_categories];
for(uint8 i = 0; i < max_categories; i ++)
{
search_complete[i] = false;
iterations[i] = 0;
}

has_pathing_data_structures = true;
}

and also:

Code: [Select]
waiting_times = new koordhashtable_tpl<koord, waiting_time_set >[max_categories];
connexions = new quickstone_hashtable_tpl<haltestelle_t, connexion*>*[max_categories];

and also in void haltestelle_t::prepare_pathing_data_structures():

Code: [Select]
if (!current_halt->has_pathing_data_structures)
{
// create data structures
current_halt->paths = new quickstone_hashtable_tpl<haltestelle_t, path* >[max_categories];
current_halt->open_list = new binary_heap_tpl<path_node*>[max_categories];

current_halt->search_complete = new bool[max_categories];
current_halt->iterations = new uint32[max_categories];

current_halt->has_pathing_data_structures = true;
}

for (uint8 i = 0; i < max_categories; i++)
{
// reset all flags to trigger recalculations
current_halt->search_complete[i] = false;
current_halt->iterations[i] = 0;
current_halt->paths_timestamp[i] = 0;
current_halt->connexions_timestamp[i] = 0;
current_halt->reschedule[i] = true;
current_halt->reroute[i] = true;
}

I imagine that there are many more references to those quickstone hashtables in the body of simhalt.cc and path_explorer.cc. This looks problematic, in so far as quickstone hashes are a problem.

Conclusion: This is potentially an issue for Experimental, if the use of Quickstone hashtables is itself a problem even if the get_rep() call is avoided.
« Last Edit: December 22, 2010, 01:36:49 PM by jamespetts »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #24 on: December 22, 2010, 01:42:56 PM »
You can use any hashtables, as long as the key is the same on all clients. WIth enry (resp get_id() this is ensured for all quickstone_hashtables.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #25 on: December 22, 2010, 02:16:24 PM »
Ahh, so quickstone hashtables are all right as long as I don't use .get_rep()?

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #26 on: December 22, 2010, 02:26:21 PM »
Yes, the tool get the convoi/halt ids as intergers from the parameters. THus all convois and halts use the same quickstone entry.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #27 on: December 22, 2010, 03:09:13 PM »
Ahh,
Yes, the tool get the convoi/halt ids as intergers from the parameters. THus all convois and halts use the same quickstone entry.

So I should use an inthashtable of the [type]handle_t objects using the [type]handle_t.get_id() function instead of using a quickstone hashtable?

Offline inkelyad

  • Devotees (Inactive)
  • *
  • Posts: 482
  • Languages: EN, RU
Re: Testing Simutrans-Experimental with online play
« Reply #28 on: December 22, 2010, 03:19:31 PM »
Too complicated. Just replace get_rep in quickstone_hash_tpl::hash to get_id.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #29 on: December 22, 2010, 05:33:25 PM »
Inkelyad,

ahh - I only recently realised that quickstone hashtables were my invention in the first place (and so are specific to Simutrans-Experimental)! I have now changed .get_rep() to .get_id() in the hashtables, and nothing seems broken so far, so this is a step forward. Thank you for that tip!

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #30 on: December 23, 2010, 04:47:14 PM »
To update those following the thread here, I have had a reply from Knightly, who says:

Quote
The dr_time() calls in path_explorer_t are for timing the duration it takes to run a certain number of iterations in a single phase, in a single step. Obviously, if different clients (or the server) perform different number of iterations on the same step, changes to the world (karte_t) will be out of sync. The only feasible way to sync the path explorer is to sync the number of iterations performed on each step, for all the phases in the path explorer. That means, much like frames per second, the slowest computer will determine how many iterations to run per step, for each phase.

Inkelyad's suggestion to perform route search only on server and have the server send the search results back to the clients is not very practical IMHO. For small maps with a small number of stops, that's fine; but the network traffic load would be rather heavy for large maps with over a thousand of stops. (You know, the number of paths grow quadratically with the number of stops.)

I have not yet fully understood how the network mode handles joining, client leaving, etc. There has to be an array of lists on the server to keep track of the phase iteration counts per client (including the server itself). Whenever a client wants to adjust a phase iteration count, it has to be sent to the server for comparison (find the min); if change is needed, the server will broadcast the change to all clients.

It's not an easy task, and requires a very thorough understanding of how the network mode works. I may look into the network code again, but there is no guarantee that I can come up with a feasible solution -- what I have said above should work theoretically, but one needs to research all relevant code to confirm whether it is really technically feasible.

And later:

Quote
The mechanism of path explorer relies on real world time, not in-game time returned by get_zeit_ms() which is scaled with game speed. Network games can also be set to run at different game speeds, thus such simple replacement doesn't work.

But of course, you may use the distributed search instead -- that should be network safe.

(The distributed search is much slower than the centralised search, which is why the centralised search was written in the first place).

Edit: Updates. Firstly, Knightly has further replied as follows:

Quote
I would also be glad, if I manage to make centralised search works under network mode. I have never ventured to write anything related to network mode, and all network code is written by Prissi and Dwachs. I will let you know later on whether I succeed or not.

But just to warn you of this -- most probably, you should be aware that, client joining (even for spectators) will involve game saving and reloading for all connected clients. That means, using centralised search will have the added penalty of recalculating all paths upon client joining. For small and sparsely connected games, it might not be a serious nuisance; but for large and heavily connected games, this might annoy the active players. There will not be any solution for this problem.

So, there may well be some progress on the centralised search front. Meanwhile, I have managed to get the distributed search working again by defining the USE_INDEPENDENT_PATH_POOL preprocessor directive when compiling, which uses a memory pool that I wrote for the pathing rather than the one built into Simutrans (which is faster, I understand, and used to work fine; I cannot understand why it works no longer). Happily, even on SDog's enormous map (here), the speed seems reasonable, so this may be a feasible interim solution.
« Last Edit: December 23, 2010, 06:05:07 PM by jamespetts »

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #31 on: December 29, 2010, 02:41:00 PM »
Simutrans-Experimental 9.1  has been released, and, with it come a number of improvements relating to networking. A number of basic problems have been fixed and I have managed to run a very simple game with a single 'bus route in a town connected using the network mode without interaction for several hours without desyncs, so it is working at a very basic level (whereas it was not in version 9.0). For the time being, the distributed pathing system is used in network mode (and the centralised system disabled by default), but Knightly is looking into making the centralised system compatible with network mode in the future.

However, there are still frequent desyncs, especially when there is player interaction or a more developed map is used. I am not yet experienced in tracking down desyncs. Having eliminated the issues helpfully identified by Prissi above, there still seem to be a number of desync problems, so the issues must go further than that. Reading the German forum (with the help of Google translator), Prissi mentioned something there about why he thought that Experimental may have trouble with networking. He wrote:

Quote from:
Prissi
Mal abgesehen, dss der Netzwerkmodus mit Simutrans-Experimetnal wohl auf absehbare Zeit nicht funktionieren wird, da dort nicht auf saubere Trennung zwischen Sync-Steps und Steps geachtet wurde und mit lokalen Zeiten hantiert wird (oder wurde).

Apart, dss the network mode with Simutrans Experimetnal will probably not work for the foreseeable future because there has not been paid to clean separation between Sync-steps and steps and handles local time is (or was).

(The latter paragraph is the Google translation of the former). However, because this is roughly translated with Google, I am not sure that I fully understand the sense of it. Perhaps Prissi can help - does this relate to the dr_time issue that you described above, or is this a separate problem in relation to separation between sync_step() calls (and methods called from within sync_step()) and things that rely on an individual client's timing in some other way than by use of dr_time()?

As to the logs, they are not quite as helpful as I hoped that they might be, and I am wondering whether I have the logging level set too low. Currently, my client log output looks like this:

Code: [Select]

... (messages during initialisation and loading omitted) ...

Warning: network_check_activity(): received cmd id=5 nwc_ready_t from socket[316]
Warning: nwc_ready_t::execute: set sync_steps=2754 map_counter=2930317
Warning: karte_t::network_game_set_pause: steps=688 sync_steps=2754 pause=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 1 clients connected. init=1 exec=1 flags=0
Warning: network_check_activity(): received cmd id=6 nwc_tool_t from socket[316]
Warning: nwc_tool_t::execute: append sync_step=2757 current sync_step=2754  wkz=8224 init
Warning: network_world_command_t::execute: do_command 6 at sync_step 2757 world now at 2754
Warning: nwc_tool_t::do_command: steps 2757 wkz 8224 init
Warning: command: id=32 init=1 defpar=Now 1 clients connected. flag=0
Warning: network_check_activity(): received cmd id=7 nwc_check_t from socket[316]
Warning: network_world_command_t::execute: do_command 7 at sync_step 2817 world now at 2799
Warning: karte_t::interactive: client: sync=2817  rand=1033484102, server: sync=2816  rand=1033484102
Warning: network_check_activity(): received cmd id=7 nwc_check_t from socket[316]
Warning: network_world_command_t::execute: do_command 7 at sync_step 3073 world now at 3063
Warning: karte_t::interactive: client: sync=3073  rand=-1319591947, server: sync=3072  rand=1114866420
Warning: karte_t::interactive: random number generators have different states
Warning: karte_t::network_disconnect(): Lost synchronisation with server.
ERROR: convoi_t::refresh_routing(): Halt in schedule does not exist
Please report all errors to
team@64.simutrans.com

while the server logs look like:

Code: [Select]

... (messages during initialisation and loading omitted) ...

Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 0 clients connected. init=1 exec=0 flags=0
Warning: nwc_tool_t::execute: send sync_steps=5  wkz=8224 init
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 0 clients connected. init=1 exec=1 flags=0
Warning: nwc_tool_t::execute: append sync_step=5 current sync_step=4  wkz=8224 init
Warning: network_world_command_t::execute: do_command 6 at sync_step 5 world now at 4
Warning: nwc_tool_t::do_command: steps 5 wkz 8224 init
Warning: command: id=32 init=1 defpar=Now 0 clients connected. flag=4
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (24,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (69,94)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (184,69)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (95,218)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (9,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (106,65)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (125,52)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (63,199)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (16,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (399,89)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (255,118)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (295,115)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,82)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,75)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (409,74)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (414,91)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (380,170)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,125)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (359,257)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (354,262)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (348,255)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (354,255)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (87,219)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (91,211)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (494,463)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (404,334)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (389,349)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (502,458)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (500,465)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (191,69)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (182,61)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (178,71)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (402,355)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (389,342)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (414,315)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (423,317)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (423,324)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,323)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,330)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (71,87)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (76,93)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (84,93)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (77,101)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (77,84)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (249,129)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (256,127)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (242,124)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (500,450)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (416,76)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,83)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,99)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,108)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (407,109)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (398,109)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (397,102)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,97)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,83)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,92)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,100)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (402,105)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (31,206)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (88,204)!
Warning: convoi_t::laden_abschliessen(): convoi ((3) MR 2441 Class 0-6-0) is broken => realign
Warning: convoi_t::laden_abschliessen(): cnv 3 is currently too long.
Warning: convoi_t::laden_abschliessen(): convoi ((13) RCH 10t Bulk Wagon) is broken => realign
Warning: convoi_t::laden_abschliessen(): cnv 13 is currently too long.
Warning: convoi_t::laden_abschliessen(): convoi ((73) LMS Class 3F "Jinty" 0-6-0T) is broken => realign
Warning: convoi_t::laden_abschliessen(): cnv 73 is currently too long.
Warning: network_check_activity(): received cmd id=1 nwc_gameinfo_t from socket[3804]
Warning: network_check_activity(): received cmd id=2 nwc_join_t from socket[3804]
Warning: network_world_command_t::execute: do_command 3 at sync_step 1334 world now at 1333
Warning: nwc_sync_t::do_command: sync_steps 1334
ERROR: convoi_t::refresh_routing(): Halt in schedule does not exist
Please report all errors to
team@64.simutrans.com
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (69,94)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (184,69)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (95,218)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (9,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (106,65)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (125,52)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (63,199)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (16,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (399,89)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (255,118)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (295,115)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,82)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,75)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (409,74)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (414,91)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (380,170)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,125)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (359,257)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (354,262)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (348,255)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (354,255)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (87,219)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (91,211)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (494,463)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (404,334)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (389,349)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (502,458)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (500,465)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (191,69)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (182,61)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (178,71)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (402,355)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (389,342)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (414,315)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (423,317)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (423,324)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,323)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,330)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (71,87)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (76,93)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (84,93)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (77,101)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (77,84)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (249,129)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (256,127)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (242,124)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (500,450)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (416,76)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,83)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,99)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,108)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (407,109)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (398,109)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (397,102)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,97)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,83)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,92)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,100)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (402,105)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (31,206)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (88,204)!
Warning: karte_t::network_game_set_pause: steps=333 sync_steps=1334 pause=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 1 clients connected. init=1 exec=0 flags=0
Warning: nwc_tool_t::execute: send sync_steps=1337  wkz=8224 init
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 1 clients connected. init=1 exec=1 flags=0
Warning: nwc_tool_t::execute: append sync_step=1337 current sync_step=1336  wkz=8224 init
Warning: network_world_command_t::execute: do_command 6 at sync_step 1337 world now at 1336
Warning: nwc_tool_t::do_command: steps 1337 wkz 8224 init
Warning: command: id=32 init=1 defpar=Now 1 clients connected. flag=4
Warning: network_receive_data: connection [3804] already closed
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 0 clients connected. init=1 exec=0 flags=0
Warning: nwc_tool_t::execute: send sync_steps=2061  wkz=8224 init
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 0 clients connected. init=1 exec=1 flags=0
Warning: nwc_tool_t::execute: append sync_step=2061 current sync_step=2060  wkz=8224 init
Warning: network_world_command_t::execute: do_command 6 at sync_step 2061 world now at 2060
Warning: nwc_tool_t::do_command: steps 2061 wkz 8224 init
Warning: command: id=32 init=1 defpar=Now 0 clients connected. flag=4
Warning: network_check_activity(): received cmd id=1 nwc_gameinfo_t from socket[3748]
Warning: network_check_activity(): received cmd id=2 nwc_join_t from socket[3748]
Warning: network_world_command_t::execute: do_command 3 at sync_step 2754 world now at 2753
Warning: nwc_sync_t::do_command: sync_steps 2754
ERROR: convoi_t::refresh_routing(): Halt in schedule does not exist
Please report all errors to
team@64.simutrans.com
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (184,69)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (95,218)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (9,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (106,65)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (125,52)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (63,199)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (16,197)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (399,89)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (255,118)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (295,115)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,82)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,75)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (409,74)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (414,91)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (380,170)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,125)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (359,257)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (354,262)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (348,255)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (354,255)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (87,219)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (91,211)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (494,463)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (404,334)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (389,349)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (502,458)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (500,465)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (191,69)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (182,61)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (178,71)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (402,355)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (389,342)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (414,315)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (423,317)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (423,324)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,323)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,330)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (71,87)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (76,93)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (84,93)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (77,101)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (77,84)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (249,129)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (256,127)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (242,124)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (500,450)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (416,76)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,83)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,99)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (415,108)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (407,109)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (398,109)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (397,102)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (401,97)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,83)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,92)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (408,100)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (402,105)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (31,206)!
Warning: planquadrat_t::set_halt(): assign new halt to already bound halt at (88,204)!
Warning: karte_t::network_game_set_pause: steps=688 sync_steps=2754 pause=0
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 1 clients connected. init=1 exec=0 flags=0
Warning: nwc_tool_t::execute: send sync_steps=2757  wkz=8224 init
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=koord3d invalid wkzid=8224 defpar=Now 1 clients connected. init=1 exec=1 flags=0
Warning: nwc_tool_t::execute: append sync_step=2757 current sync_step=2756  wkz=8224 init
Warning: network_world_command_t::execute: do_command 6 at sync_step 2757 world now at 2756
Warning: nwc_tool_t::do_command: steps 2757 wkz 8224 init
Warning: command: id=32 init=1 defpar=Now 1 clients connected. flag=4
Warning: network_receive_data: connection [3748] already closed
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=346,248,0 wkzid=8224 defpar=Now 0 clients connected. init=1 exec=0 flags=0
Warning: nwc_tool_t::execute: send sync_steps=3085  wkz=8224 init
Warning: nwc_tool_t::rdwr: rdwr id=6 client=0 plnr=255 pos=346,248,0 wkzid=8224 defpar=Now 0 clients connected. init=1 exec=1 flags=0
Warning: nwc_tool_t::execute: append sync_step=3085 current sync_step=3084  wkz=8224 init
Warning: network_world_command_t::execute: do_command 6 at sync_step 3085 world now at 3084
Warning: nwc_tool_t::do_command: steps 3085 wkz 8224 init
Warning: command: id=32 init=1 defpar=Now 0 clients connected. flag=4
ERROR: convoi_t::refresh_routing(): Halt in schedule does not exist
Please report all errors to
team@64.simutrans.com

Having tested a number of times, the desync message type always appears to be related to the random number generator having different states. If I understand things correctly, the random number generator is synchronised by means of synchronising its seed on joining a network game, the seed being the server's local time. Periodically, the program checks whether the internal state of the number generator is the same, and, if it is not, disconnects, having detected a desync. The random number generator having a different state, I assume, would be caused by being called either a different number of occasions, and/or with different parameters on the client and the server. Thus, this type of desync is a broad and rather generic type: without some way of knowing what methods have called on the random number generator between the last time that they were logged as being in sync and the desync, there is no easy way of telling what the problem is.

The question is: is there a different logging level somewhere in the code that will record this state of affairs? From those who have helped to develop networking for Standard: what sort of things (absent player interaction) would cause random number generators to record different states, and what is the most effective way of tracking down the cause(s)?

As Prissi has indicated, making Experimental network compatible is likely to be a fairly substantial task (although a great deal of the effort in making Simutrans generally work online has been undertaken by the tireless work of Prissi and Dwachs whose contribution deserves much credit). Any assistance from any Experimental developers (or any Standard developers who have had experience of network developing and who can offer some pointers) would be very welcome. Thank you to Prissi for all your help so far on this thread.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #32 on: December 29, 2010, 07:46:33 PM »
To make thing simple: Debug output will only help you a little. You must follow the flow of logic and spot simrand calls at wrong positions.

If the connection is stable but user interaction cause fast desyncs (you can set the checking intervall for desyncs to 2 fram to detect them faster), the I think the user interaction involves a call to simrandom. In principle set_random_mode should get those, if experimental still uses the standard random generator.

Another problem might be additional handles. Like the handle for a station is not created by werkzeug_t but before, getting the handles out of sync. Or, during reloading some initialisation is done using system time. The whole code needs to be looked at it carefully for such stuff.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Moderator
  • *
  • Posts: 18763
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Testing Simutrans-Experimental with online play
« Reply #33 on: December 29, 2010, 08:24:27 PM »
Thank you for that. What I find is that a client can stay connected to a (local) server for hours when there is no interaction and no (player) transport network. Even with some interaction (deleting of trees, turning on and off "allow city growth", sending messages, etc.), with no player transport network, the connection remains stable. More intensive interaction is somewhat impractical to test over long periods.

This suggests that Experimental specific features such as private car trip generation and industry density checking do not cause any network related problems.

With a player transportation network, a desync will occur in short time (a minute or two, sometimes shorter) of connexion; although this does not seem to happen with a very simple network (a single 'bus going between two stops), it does occur with only a slightly more complex network (two 'bus routes and about four or five stops).

I have noticed that the desyncs seem to occur shortly after the vehicle has stopped and loaded/unloaded; I notice that the dwell time for the vehicles appears different on the server and client when I have their respective windows open side by side. I wonder whether this could be related to the loading times feature in Experimental (where a number that is a hard-coded constant of 2000ms in Standard becomes set by a variable read from a .dat file in Experimental), but it seems odd that this should be so where a time value (in ms) is equally used in Standard (which, I presume, is calibrated from get_zeit_ms() and not dr_time(), and is, therefore, safe; has this code changed in Standard in anticipation of network play?).

I am not familiar with the set_random_mode name: is that a method name? If so, there is an aspect to Simutrans's handling of the random number generator with which I am unfamiliar (I have not changed the random number generator in Experimental, so what holds for Standard should hold for Experimental in this field). What does set_random_mode do, and when should it be called?

As for initialisations using system time, presumably that would be via calls to dr_time(), or is there another way in which reference to system time can creep in? Is this what you were referring to when you made reference to a lack of clean separation between sync_step() calls and references to local time in the German forum?

Finally, how do I go about altering the checking interval for desyncs such as you recommend above? This might well be most useful for spotting the cause more accurately.

In any event, thank you again very much for your help with this: it is much appreciated.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9595
  • Languages: De,EN,JP
Re: Testing Simutrans-Experimental with online play
« Reply #34 on: December 29, 2010, 08:43:18 PM »
Maybe it is lifetime of pedestrains. Try to switch them off at the beginning. Or it is the routing of passengers causing the desync. This would be indicated by the fact, that a slightly developed map desyncs when just joining happens.

There are network settings in simuconf.tab. One is the number frames when a sync is made without interaction. I think default is now 255.

The sync_step/step seperation is no longer important, as the time intervall those are called are exactly the same on all clients, despite the fact of them taking different times.