Listserver unavailability causes online game freezes

Started by Matthew, March 21, 2023, 10:45:39 PM

Matthew

This week the Simutrans.com/org domain, including in particular the Extended listserver, was unreachable for a day. During this period, there were frequent freezes on B-B. The pattern appeared to be that the online game operated for 15 minutes, then froze for approximately 2 minutes (I did not actually time it though, apologies), in a ~17-minute cycle. I believe that attempting to join the server during this period caused an unexplained failure (the client returned to the Main Menu without explanation), though my tests were inconclusive because at that point I was still thinking in terms of a 15-minute cycle.

My suspicion is that the server was frozen during a timeout waiting for a reply from the listserver. I guess that it could also have been the client, but I am not sure why the client would contact the listserver unless the Play Online dialogue was open, and I would expect a client-only freeze to cause a disconnect rather than a freeze-and-catch-up cycle.

To my mind, sending info from a game server to the listserver would seem to be a good use of a separate thread since it needn't interact with the game state. But James, I realize that it's now several years since you implemented multithreading and you may not want to wade into that particular swamp again!
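As a rough illustration of the separate-thread idea, here is a minimal sketch in Python (Simutrans itself is C++, so this is not its actual code). The names `Announcer`, `slow_announce` and `run_game_loop` are hypothetical, and `slow_announce` merely simulates a listserver request that hangs. The worker thread only ever receives a snapshot of the status, so it never touches the game state.

```python
import queue
import threading
import time


def slow_announce(status):
    """Hypothetical stand-in for a listserver request that hangs for a while."""
    time.sleep(0.2)  # simulate an unreachable listserver timing out
    return False


class Announcer:
    """Runs announcements on a worker thread so the game loop never waits."""

    def __init__(self, announce):
        self._announce = announce
        self._pending = queue.Queue(maxsize=1)  # at most one announcement waits
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, status):
        """Hand over a snapshot of the status; never blocks the caller."""
        try:
            self._pending.put_nowait(status)
        except queue.Full:
            pass  # an announcement is already waiting; skip this one

    def _run(self):
        while True:
            status = self._pending.get()
            try:
                self._announce(status)
            except OSError:
                pass  # announcements are non-critical, so failures are ignored


def run_game_loop(announcer, steps):
    """Toy game loop: returns the longest time any single step took."""
    worst = 0.0
    for step in range(steps):
        start = time.monotonic()
        announcer.submit({"step": step, "players": 3})
        worst = max(worst, time.monotonic() - start)
    return worst
```

Calling `run_game_loop(Announcer(slow_announce), 50)` should return a value far below the 0.2 s the fake announcement takes, because the loop only enqueues and moves on.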

prissi

The IP communication of Extended is at heart the same as in Standard. The whole IP communication is not multithreaded, since that would not work for clients. The listserver announcement could be threaded, as the information is not critical. But it involves a lot of work for almost no gain, because a server that cannot contact the list servers cannot be discovered anyway.

And one can always run a server without announcements ...

BUT in the current case, the list server was working; only the DNS was not. Once the IP had been queried, that connection should have stayed. Maybe the reverse lookup of the BB server failed.
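To make the "once the IP had been queried" point concrete, here is a small Python sketch of a resolver that falls back to the last address it saw when DNS stops answering. This is an assumption about how such a fallback could look, not a description of what Simutrans does; `CachedResolver` is a hypothetical name, and the lookup function is injectable only so the sketch can be tried offline.

```python
import socket


class CachedResolver:
    """Remembers the last address a hostname resolved to."""

    def __init__(self, lookup=socket.gethostbyname):
        self._lookup = lookup  # injectable so the sketch can be tested offline
        self._last_good = {}

    def resolve(self, hostname):
        """Return a fresh IP if DNS answers, else the last known one (or None)."""
        try:
            ip = self._lookup(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            return self._last_good.get(hostname)
        self._last_good[hostname] = ip
        return ip
```

Note that the lookup itself can still block for a while when DNS is down, so this only addresses which address gets used, not how long the attempt takes.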

ceeac

Using a thread is way overkill IMO. The main problem is that the timeout for TCP connections is a) blocking and b) way too large (I think longer than 1 minute by default?). Using non-blocking socket IO will reduce this to a more manageable level. For Standard, prissi implemented this last year - see r10459 and subsequent commits.
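For anyone unfamiliar with the technique, a non-blocking connect with a short, explicit timeout looks roughly like this. It is a generic Python sketch assuming POSIX socket behaviour, not the code from r10459; `connect_with_timeout` and its 2-second default are made up for illustration. It takes an IP address rather than a hostname, because name resolution is a separate step that would still block.

```python
import errno
import select
import socket


def connect_with_timeout(ip, port, timeout_s=2.0):
    """Return a connected TCP socket, or None if not connected within timeout_s."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)  # connect_ex now returns at once instead of waiting
    err = sock.connect_ex((ip, port))
    if err not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        return None  # failed immediately, e.g. connection refused
    # Wait until the socket is writable (connect finished) or the timeout expires.
    _, writable, _ = select.select([], [sock], [], timeout_s)
    if not writable or sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR) != 0:
        sock.close()
        return None  # timed out, or the connect attempt ended in an error
    return sock
```

The caller waits at most `timeout_s` for the connect itself, instead of whatever the operating system's default TCP connect timeout happens to be.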