News:

Simutrans Chat Room
Where cool people of Simutrans can meet up.

Server freezes

Started by jamespetts, December 20, 2020, 06:25:14 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

jamespetts

I am noticing that the server is sometimes becoming unresponsive. It appeared to happen earlier to-day and happened again this afternoon just as I attempted to join.
The log messages show as follows:
Message: check_activity():    Accepted connection from: [IP ADDRESS REDACTED]
Message: socket_list_t::add_client:    add client socket[20] at address 5247182cd
Message: network_command_t::rdwr:    read packet_id=4, client_id=0
Warning: network_check_activity():    received cmd id=4 nwc_join_t from socket[20]
Message: nwc_join_t::execute:   
Message: network_command_t::rdwr:    write packet_id=4, client_id=0
Message: packet_t::send:    sent 27 bytes to socket[20]; id=4, size=27
Message: network_command_t::rdwr:    write packet_id=5, client_id=0
Message: packet_t::send:    sent 26 bytes to socket[20]; id=5, size=26
Message: network_command_t::rdwr:    write packet_id=5, client_id=0
Message: packet_t::send:    sent 26 bytes to socket[13]; id=5, size=26
Message: packet_t::send:    sent 26 bytes to socket[9]; id=5, size=26
Message: packet_t::send:    sent 26 bytes to socket[20]; id=5, size=26
Message: network_command_t::rdwr:    write packet_id=9, client_id=0
Message: packet_t::send:    sent 201 bytes to socket[13]; id=9, size=201
Message: packet_t::send:    sent 201 bytes to socket[9]; id=9, size=201
Message: packet_t::send:    sent 201 bytes to socket[20]; id=9, size=201
Warning: nwc_sync_t::do_command:    sync_steps 150415
Message: network_command_t::rdwr:    write packet_id=6, client_id=0
Message: packet_t::send:    sent 14 bytes to socket[20]; id=6, size=14
Message: network_command_t::rdwr:    write packet_id=17, client_id=0
Warning: nwc_routesearch_t::rdwr:    rdwr limits=(1024, 247, 6170, 17978368, 304960) apply_limits=1
Message: packet_t::send:    sent 43 bytes to socket[20]; id=17, size=43
Warning: nwc_routesearch_t::transmit_active_limit_set:    transmit succeeded sync_step=150415 map_counter=19580638 limits=(1024, 247, 6170, 17978368, 304960)

I will restart the server again, but I post this here to have a record of the issue for future reference.
Edit: CPU usage is reported as 85%, so this looks like a possible infinite loop or similar rather than a thread deadlock.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Matthew

Here are the log entries before another freeze yesterday:

(Signature being tested) If you enjoy playing Simutrans, then you might also enjoy watching Japan Railway Journal
Available in English and simplified Chinese
如果您喜欢玩Simutrans的话,那么说不定就想看《日本铁路之旅》(英语也有简体中文字幕)。

jamespetts

This appears to have happened twice to-day. I have not taken the log files from those occurrences, but if anyone finds any situation where an infinite loop/freeze can be reproduced, please let me know.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

freddyhayward

To state the obvious, the last line of both logs is transmit_active_limit_set.

jamespetts

Quote from: freddyhayward on December 28, 2020, 09:53:36 PMtransmit_active_limit_set

That is part of the path explorer - but it's not a part of the code that I have ever looked into in any detail, having been written some years ago by Knightly, so its significance in this context is not immediately obvious and some detailed investigation will be required to ascertain the relevance of this.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

freddyhayward

#5
This has happened again - I wonder if you could write a script that automatically restarts the server if it has been inactive for over an hour?
edit: I was somehow able to start joining - I will see whether this is successful. edit: it was successful.

jamespetts

The freeze has occurred again, this time very shortly after a new player loaded in. The end of the log file is as follows:


Warning: nwc_routesearch_t::reset:   all static variables are reset
Message: karte_t::load():   Prepare for loading
Message: karte_t::load():   Time is now: 32349
Warning: karte_t::load:   File version: 120007, Extended version: 14, Extended revision: 32
Message: nwc_auth_player_t::init_player_lock_server:   new = 32767
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,1997)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,1997)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,1998)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,1998)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,1999)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,1999)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,2000)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,2000)!
Message: nwc_auth_player_t::init_player_lock_server:   new = 32767
Warning: karte_t::load():   loaded savegame from 4/1972, next month=-1690304512, ticks=-1693078205 (per month=1<<22)
Message: simu_main():   Running world, pause=0, fast forward=0 ...
Message: check_activity():   Accepted connection from: 79.73.134.94.
Message: socket_list_t::add_client:   add client socket[7] at address 4f49865ed
Message: network_command_t::rdwr:   write packet_id=9, client_id=0
Message: network_command_t::rdwr:   read packet_id=1, client_id=4
Warning: network_check_activity():   received cmd id=1 nwc_gameinfo_t from socket[7]
Message: nwc_gameinfo_t::execute:   
Message: network_command_t::rdwr:   write packet_id=1, client_id=0
Message: packet_t::send:   sent 14 bytes to socket[7]; id=1, size=14
Message: socket_list_t::remove_client:   remove client socket[7]
Message: check_activity():   Accepted connection from: 79.73.134.94.
Message: socket_list_t::add_client:   add client socket[7] at address 4f49865ed
Message: network_command_t::rdwr:   read packet_id=1, client_id=0
Warning: network_check_activity():   received cmd id=1 nwc_gameinfo_t from socket[7]
Message: nwc_gameinfo_t::execute:   
Message: network_command_t::rdwr:   write packet_id=1, client_id=0
Message: packet_t::send:   sent 14 bytes to socket[7]; id=1, size=14
Message: socket_list_t::remove_client:   remove client socket[7]
Message: check_activity():   Accepted connection from: 79.73.134.94.
Message: socket_list_t::add_client:   add client socket[7] at address 4f49865ed
Message: network_command_t::rdwr:   read packet_id=4, client_id=0
Warning: network_check_activity():   received cmd id=4 nwc_join_t from socket[7]
Message: nwc_join_t::execute:   
Message: network_command_t::rdwr:   write packet_id=4, client_id=0
Message: packet_t::send:   sent 26 bytes to socket[7]; id=4, size=26
Message: network_command_t::rdwr:   write packet_id=5, client_id=0
Message: packet_t::send:   sent 26 bytes to socket[7]; id=5, size=26
Message: network_command_t::rdwr:   write packet_id=5, client_id=0
Message: packet_t::send:   sent 26 bytes to socket[7]; id=5, size=26
Message: network_command_t::rdwr:   write packet_id=9, client_id=0
Message: packet_t::send:   sent 201 bytes to socket[7]; id=9, size=201
Warning: nwc_sync_t::do_command:   sync_steps 2
Message: network_command_t::rdwr:   write packet_id=6, client_id=0
Message: packet_t::send:   sent 14 bytes to socket[7]; id=6, size=14
Message: network_command_t::rdwr:   write packet_id=17, client_id=0
Warning: nwc_routesearch_t::rdwr:   rdwr limits=(1024, 98304, 65536, 16777216, 256) apply_limits=1
Message: packet_t::send:   sent 43 bytes to socket[7]; id=17, size=43
Warning: nwc_routesearch_t::transmit_active_limit_set:   transmit succeeded sync_step=2 map_counter=175262 limits=(1024, 98304, 65536, 16777216, 256)

CPU usage is at 99.6% and memory at 83.3%.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

ceeac

Does the server use zstd for save files? In Standard, there was a data race bug causing hangs on load in some cases. I fixed the bug in r9406 (commit 1f61ea1), so cherry-picking the commit for Extended should probably fix the bug there, too.

jamespetts

#8
Quote from: ceeac on January 10, 2021, 12:33:14 PM1f61ea1

Thank you for that. Yes, we do use zstd for saving/loading. Can I confirm that this is the commit in question?
Edit: I have looked into this but there are some very major merge conflicts when running an automatic cherry pick that are likely to take hours to resolve. I notice that this is a huge commit, not all of it (and probably only a very small part of it) relating to the bug fix. Can you point to the line(s) of code that are just the bug fix so that I can implement this within a reasonable amount of time?
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Ranran(retired)

Yes, it was a huge amount so I gave up once before. According to my memorandum, the following standard commit should follow from that commit.
9406, 9407, 9411, 9412, 9463, 9473, 9474
ひめしという日本人が開発者達の助言を無視して自分好みの機能をextendedに"強引に"実装し、
コードをぐちゃぐちゃにしてメンテナンスを困難にし(とりわけ道路と建物関連)、
挙句にバグを大量に埋め込み、それを知らんぷりして放置し(隠居するなどと言って)別のところに逃げ隠れて自分のフォーク(OTRP)は開発を続けている
その事実と彼の無責任さに日本人プレイヤーは目を向けるべき。らんらんはそれでやる気をなくした(´・ω・`)
他人の振り見て我が振り直せ。ひめしのようにならないために、らんらんが生み出したバグや問題は自分で修正しなくちゃね(´・ω・`)

ceeac

Quote from: jamespetts on January 10, 2021, 01:14:49 PMCan I confirm that this is the commit in question?
Yes.

Quote from: jamespetts on January 10, 2021, 01:14:49 PMCan you point to the line(s) of code that are just the bug fix so that I can implement this within a reasonable amount of time?
Not necessary - I have managed to apply the changes myself already. :) Pull request #340 includes the changes to the io/ folder in Standard so the code in the io/ folder should be equivalent between Standard and Extended. I have only tested the changes with demo.sve from pak128.britain-ex so I'm not 100% sure that there are no regressions.

jamespetts

Excellent, thank you: now incorporated.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

jamespetts

Unfortunately, the fix to this appears to have caused this issue in which all Linux clients crash on attempting to save any game using zstd. I am afraid that I will need to revert this patch unless a solution can be found, and I do not have the expertise in this area of the code (nor the time to acquire such expertise) to do this myself.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

This error appears, if the libzstd is compiled without ZSTD_MULTITHREAD i.e. only single thread support, according to some googling. Since zstd is only faster than zip with multithreading (otherwise it takes longer) probably best to compile the own libzstd for the nightly yourself.

Otherwise, there is libblosc, which claims to be even fast for binary data and supports zstd and zlib out of the box (together with other, probably useless formats). https://github.com/Blosc/c-blosc2

jamespetts

Thank you for this - that is helpful. It would be very odd if the version included with the Ubuntu 20.04LTS package did not have multi-threading support enabled.

Indeed, this page refers to this package being built with ZSTD_LEGACY_MULTITHREADED_API=1, suggesting that multi-threading is enabled, albeit that a legacy interface is used. Could this legacy interface itself be the problem?

In any case, I will be reverting the patch for the time being as I do not believe that I will be able to work out the intricacies of this extremely complex problem within any reasonable amount of time.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

Unfortunately, using ZSTD_LEGACY_MULTITHREADED_API does not say if ZSTD_MULTITHREADED is defined. If not, this is just ignored. Strange.