The International Simutrans Forum

 

Author Topic: Server freezes  (Read 688 times)

0 Members and 1 Guest are viewing this topic.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Server freezes
« on: December 20, 2020, 06:25:14 PM »
I am noticing that the server is sometimes becoming unresponsive. It appeared to happen earlier to-day and happened again this afternoon just as I attempted to join.
The log messages show as follows:
Code: [Select]
Message: check_activity():    Accepted connection from: [IP ADDRESS REDACTED]
Message: socket_list_t::add_client:    add client socket[20] at address 5247182cd
Message: network_command_t::rdwr:    read packet_id=4, client_id=0
Warning: network_check_activity():    received cmd id=4 nwc_join_t from socket[20]
Message: nwc_join_t::execute:   
Message: network_command_t::rdwr:    write packet_id=4, client_id=0
Message: packet_t::send:    sent 27 bytes to socket[20]; id=4, size=27
Message: network_command_t::rdwr:    write packet_id=5, client_id=0
Message: packet_t::send:    sent 26 bytes to socket[20]; id=5, size=26
Message: network_command_t::rdwr:    write packet_id=5, client_id=0
Message: packet_t::send:    sent 26 bytes to socket[13]; id=5, size=26
Message: packet_t::send:    sent 26 bytes to socket[9]; id=5, size=26
Message: packet_t::send:    sent 26 bytes to socket[20]; id=5, size=26
Message: network_command_t::rdwr:    write packet_id=9, client_id=0
Message: packet_t::send:    sent 201 bytes to socket[13]; id=9, size=201
Message: packet_t::send:    sent 201 bytes to socket[9]; id=9, size=201
Message: packet_t::send:    sent 201 bytes to socket[20]; id=9, size=201
Warning: nwc_sync_t::do_command:    sync_steps 150415
Message: network_command_t::rdwr:    write packet_id=6, client_id=0
Message: packet_t::send:    sent 14 bytes to socket[20]; id=6, size=14
Message: network_command_t::rdwr:    write packet_id=17, client_id=0
Warning: nwc_routesearch_t::rdwr:    rdwr limits=(1024, 247, 6170, 17978368, 304960) apply_limits=1
Message: packet_t::send:    sent 43 bytes to socket[20]; id=17, size=43
Warning: nwc_routesearch_t::transmit_active_limit_set:    transmit succeeded sync_step=150415 map_counter=19580638 limits=(1024, 247, 6170, 17978368, 304960)
I will restart the server again, but I post this here to have a record of the issue for future reference.
Edit: CPU usage is reported as 85%, so this looks like a possible infinite loop or similar rather than a thread deadlock.

Offline Matthew

  • *
  • Posts: 511
    • Japan Railway Journal
  • Languages: EN, some ZH, DE & SQ
Re: Server freezes
« Reply #1 on: December 21, 2020, 08:08:09 AM »
Here are the log entries before another freeze yesterday:


Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #2 on: December 28, 2020, 09:38:20 PM »
This appears to have happened twice to-day. I have not taken the log files from those occurrences, but if anyone finds any situation where an infinite loop/freeze can be reproduced, please let me know.

Offline freddyhayward

  • Devotee
  • *
  • Posts: 594
  • Languages: EN
Re: Server freezes
« Reply #3 on: December 28, 2020, 09:53:36 PM »
To state the obvious, the last line of both logs is transmit_active_limit_set.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #4 on: December 28, 2020, 10:04:05 PM »
transmit_active_limit_set

That is part of the path explorer - but it's not a part of the code that I have ever looked into in any detail, having been written some years ago by Knightly, so its significance in this context is not immediately obvious and some detailed investigation will be required to ascertain the relevance of this.

Offline freddyhayward

  • Devotee
  • *
  • Posts: 594
  • Languages: EN
Re: Server freezes
« Reply #5 on: December 29, 2020, 01:56:56 AM »
This has happened again - I wonder if you could write a script that automatically restarts the server if it has been inactive for over an hour?
edit: I was somehow able to start joining - I will see whether this is successful. edit: it was successful.
« Last Edit: December 29, 2020, 02:53:50 AM by freddyhayward »

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #6 on: January 01, 2021, 11:20:27 AM »
The freeze has occurred again, this time very shortly after a new player loaded in. The end of the log file is as follows:

Code: [Select]
Warning: nwc_routesearch_t::reset:   all static variables are reset
Message: karte_t::load():   Prepare for loading
Message: karte_t::load():   Time is now: 32349
Warning: karte_t::load:   File version: 120007, Extended version: 14, Extended revision: 32
Message: nwc_auth_player_t::init_player_lock_server:   new = 32767
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,1997)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,1997)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,1998)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,1998)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,1999)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,1999)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6888,2000)!
Warning: haltestelle_t::rdwr():   bound to ground twice at (6887,2000)!
Message: nwc_auth_player_t::init_player_lock_server:   new = 32767
Warning: karte_t::load():   loaded savegame from 4/1972, next month=-1690304512, ticks=-1693078205 (per month=1<<22)
Message: simu_main():   Running world, pause=0, fast forward=0 ...
Message: check_activity():   Accepted connection from: 79.73.134.94.
Message: socket_list_t::add_client:   add client socket[7] at address 4f49865ed
Message: network_command_t::rdwr:   write packet_id=9, client_id=0
Message: network_command_t::rdwr:   read packet_id=1, client_id=4
Warning: network_check_activity():   received cmd id=1 nwc_gameinfo_t from socket[7]
Message: nwc_gameinfo_t::execute:   
Message: network_command_t::rdwr:   write packet_id=1, client_id=0
Message: packet_t::send:   sent 14 bytes to socket[7]; id=1, size=14
Message: socket_list_t::remove_client:   remove client socket[7]
Message: check_activity():   Accepted connection from: 79.73.134.94.
Message: socket_list_t::add_client:   add client socket[7] at address 4f49865ed
Message: network_command_t::rdwr:   read packet_id=1, client_id=0
Warning: network_check_activity():   received cmd id=1 nwc_gameinfo_t from socket[7]
Message: nwc_gameinfo_t::execute:   
Message: network_command_t::rdwr:   write packet_id=1, client_id=0
Message: packet_t::send:   sent 14 bytes to socket[7]; id=1, size=14
Message: socket_list_t::remove_client:   remove client socket[7]
Message: check_activity():   Accepted connection from: 79.73.134.94.
Message: socket_list_t::add_client:   add client socket[7] at address 4f49865ed
Message: network_command_t::rdwr:   read packet_id=4, client_id=0
Warning: network_check_activity():   received cmd id=4 nwc_join_t from socket[7]
Message: nwc_join_t::execute:   
Message: network_command_t::rdwr:   write packet_id=4, client_id=0
Message: packet_t::send:   sent 26 bytes to socket[7]; id=4, size=26
Message: network_command_t::rdwr:   write packet_id=5, client_id=0
Message: packet_t::send:   sent 26 bytes to socket[7]; id=5, size=26
Message: network_command_t::rdwr:   write packet_id=5, client_id=0
Message: packet_t::send:   sent 26 bytes to socket[7]; id=5, size=26
Message: network_command_t::rdwr:   write packet_id=9, client_id=0
Message: packet_t::send:   sent 201 bytes to socket[7]; id=9, size=201
Warning: nwc_sync_t::do_command:   sync_steps 2
Message: network_command_t::rdwr:   write packet_id=6, client_id=0
Message: packet_t::send:   sent 14 bytes to socket[7]; id=6, size=14
Message: network_command_t::rdwr:   write packet_id=17, client_id=0
Warning: nwc_routesearch_t::rdwr:   rdwr limits=(1024, 98304, 65536, 16777216, 256) apply_limits=1
Message: packet_t::send:   sent 43 bytes to socket[7]; id=17, size=43
Warning: nwc_routesearch_t::transmit_active_limit_set:   transmit succeeded sync_step=2 map_counter=175262 limits=(1024, 98304, 65536, 16777216, 256)
CPU usage is at 99.6% and memory at 83.3%.

Offline ceeac

  • Devotee
  • *
  • Posts: 219
Re: Server freezes
« Reply #7 on: January 10, 2021, 12:33:14 PM »
Does the server use zstd for save files? In Standard, there was a data race bug causing hangs on load in some cases. I fixed the bug in r9406 (commit 1f61ea1), so cherry-picking the commit for Extended should probably fix the bug there, too.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #8 on: January 10, 2021, 01:14:49 PM »
1f61ea1

Thank you for that. Yes, we do use zstd for saving/loading. Can I confirm that this is the commit in question?
Edit: I have looked into this but there are some very major merge conflicts when running an automatic cherry pick that are likely to take hours to resolve. I notice that this is a huge commit, not all of it (and probably only a very small part of it) relating to the bug fix. Can you point to the line(s) of code that are just the bug fix so that I can implement this within a reasonable amount of time?
« Last Edit: January 10, 2021, 02:18:55 PM by jamespetts »

Offline Ranran

  • Devotee
  • *
  • Posts: 1412
  • Languages: ja
Re: Server freezes
« Reply #9 on: January 10, 2021, 02:23:25 PM »
Yes, it was a huge amount so I gave up once before. According to my memorandum, the following standard commit should follow from that commit.
9406, 9407, 9411, 9412, 9463, 9473, 9474

Offline ceeac

  • Devotee
  • *
  • Posts: 219
Re: Server freezes
« Reply #10 on: January 12, 2021, 04:02:46 PM »
Can I confirm that this is the commit in question?
Yes.

Can you point to the line(s) of code that are just the bug fix so that I can implement this within a reasonable amount of time?
Not necessary - I have managed to apply the changes myself already. :) Pull request #340 includes the changes to the io/ folder in Standard so the code in the io/ folder should be equivalent between Standard and Extended. I have only tested the changes with demo.sve from pak128.britain-ex so I'm not 100% sure that there are no regressions.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #11 on: January 12, 2021, 07:58:03 PM »
Excellent, thank you: now incorporated.

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #12 on: January 13, 2021, 09:44:50 AM »
Unfortunately, the fix to this appears to have caused this issue in which all Linux clients crash on attempting to save any game using zstd. I am afraid that I will need to revert this patch unless a solution can be found, and I do not have the expertise in this area of the code (nor the time to acquire such expertise) to do this myself.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10439
  • Languages: De,EN,JP
Re: Server freezes
« Reply #13 on: January 13, 2021, 01:18:06 PM »
This error appears, if the libzstd is compiled without ZSTD_MULTITHREAD i.e. only single thread support, according to some googling. Since zstd is only faster than zip with multithreading (otherwise it takes longer) probably best to compile the own libzstd for the nightly yourself.

Otherwise, there is libblosc, which claims to be even fast for binary data and supports zstd and zlib out of the box (together with other, probably useless formats). https://github.com/Blosc/c-blosc2

Offline jamespetts

  • Simutrans-Extended project coordinator
  • Administrator
  • *
  • Posts: 20648
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Server freezes
« Reply #14 on: January 13, 2021, 06:45:38 PM »
Thank you for this - that is helpful. It would be very odd if the version included with the Ubuntu 20.04LTS package did not have multi-threading support enabled.

Indeed, this page refers to this package being built with ZSTD_LEGACY_MULTITHREADED_API=1, suggesting that multi-threading is enabled, albeit that a legacy interface is used. Could this legacy interface itself be the problem?

In any case, I will be reverting the patch for the time being as I do not believe that I will be able to work out the intricacies of this extremely complex problem within any reasonable amount of time.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 10439
  • Languages: De,EN,JP
Re: Server freezes
« Reply #15 on: January 14, 2021, 01:39:20 AM »
Unfortunately, using ZSTD_LEGACY_MULTITHREADED_API does not say if ZSTD_MULTITHREADED is defined. If not, this is just ignored. Strange.