The International Simutrans Forum

 

Topic: network_init_server() - Unable to bind socket to IP address

Offline Michael 'Cruzer'

  • Devotee
  • *
  • Posts: 196
  • Founder of pak192.comic
    • Marktplatz für Parkplätze
  • Languages: DE, EN
network_init_server() - Unable to bind socket to IP address
« on: August 28, 2014, 04:29:49 PM »
Code:
For help with this error or to file a bug report please see the Simutrans forum at
http://forum.simutrans.com
FATAL ERROR: network_init_server() - Unable to bind socket to IP address: "0.0.0.0"
Aborting program execution...

I often get this error message when I stop a server and start it again shortly afterwards. Is there anything I can do to prevent this? It blocks the start; when I retry after ~120 seconds, everything works fine again.

I am using
Code:
kill $pid
(where $pid is the variable containing the correct process ID, of course). Is this the correct way to stop the server from a script? (The server is based on a Debian 7 minimal system.)

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5454
  • Languages: EN, NO
Re: network_init_server() - Unable to bind socket to IP address
« Reply #1 on: August 28, 2014, 04:48:30 PM »
Simutrans has no support for SIGTERM that I can find. (The only proper way to shut it down that I know of is through the GUI, but then I have never done any multiplayer stuff. Maybe the nettool has a way.) kill will therefore pull the rug out from under the process. It is possible that the resources held by the process, but never properly released, will be held in limbo for a while, although I've never seen such behaviour. Another possibility is that Simutrans does end somewhat gracefully, and that takes some time. Or that kill sends SIGTERM and, if the process doesn't take notice, moves on to more drastic measures after a grace period.

Offline Philip

  • *
  • Posts: 90
  • Languages: EN
Re: network_init_server() - Unable to bind socket to IP address
« Reply #2 on: August 28, 2014, 04:57:58 PM »
Quote
Simutrans has no support for SIGTERM that I can find. [...]

It's supported via SDL, if I'm reading the code correctly. The signal turns into SYSTEM_QUIT, which then sets env_t::quit_simutrans.

I suspect we could call network_core_shutdown a bit earlier after receiving the signal, though there will always be some delay while the signal works its way through the event queue.

It should unbind within a few seconds, though, not anything near the 120 seconds Michael hinted at. Something is seriously wrong if it takes that long.

Offline Michael 'Cruzer'

Re: network_init_server() - Unable to bind socket to IP address
« Reply #3 on: August 28, 2014, 05:00:18 PM »
Quote
The only proper way to shut it down that I know, is through the GUI

That's a pity, since there is no GUI in posix build.

Quote
Or that kill sends SIGTERM

According to the kill documentation (its man page), it should send only one signal, which is SIGTERM by default. You can make kill send any signal by passing the signal number as a parameter. I can try sending SIGKILL, but when an application has no handler implemented, SIGKILL and SIGTERM should do the same thing, as far as I know.
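A small POSIX sketch of that point (not Simutrans code; the function name is made up for illustration): a child process that installs no handler is terminated by whichever of the two signals it receives. The practical difference is only that SIGTERM can be caught and SIGKILL cannot.

```cpp
#include <cassert>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork a child without any signal handler, send it
// `signo`, and report which signal terminated it. Returns -1 on error or
// if the child was not terminated by a signal.
int signal_that_killed_child(int signo)
{
    assert(signo == SIGTERM || signo == SIGKILL);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        // Child: make sure SIGTERM has its default action, then just wait.
        signal(SIGTERM, SIG_DFL);
        for (;;) {
            pause();
        }
    }

    // Parent: deliver the signal and collect the child's exit status.
    if (kill(pid, signo) != 0) {
        return -1;
    }
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In both cases the child gets no chance to run any cleanup code, which is why a handler is needed if SIGTERM is supposed to trigger a graceful shutdown.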

Quote
It should unbind within a few seconds, though, not anything near the 120 seconds Michael hinted at. Something is seriously wrong if it takes that long.

It seems to take just a few seconds in most cases. But sometimes it takes very long (which blocks my restart as described). I am not sure, but it may depend on whether there is an active connection to the server when SIGTERM is triggered.

Offline Philip

Re: network_init_server() - Unable to bind socket to IP address
« Reply #4 on: August 28, 2014, 05:08:04 PM »
Quote
That's a pity, since there is no GUI in posix build.

I think termination via SIGTERM is an intended feature, and the right way to terminate a server. Sending a SIGKILL instead of SIGTERM will immediately terminate the server process, without doing any cleanup or saving anything. It's a bit like taking the battery out of your device, which is not a good way to shut things down.

Quote
[...] when an application has no handler implemented, SIGKILL and SIGTERM should do the same thing, as far as I know.

Again, we do have a SIGTERM handler, courtesy of SDL, or kill wouldn't work at all.

Quote
It seems to take just a few seconds in most cases. But sometimes it takes very long [...] it may depend on whether there is an active connection to the server when SIGTERM is triggered.

That sounds like it could do with some investigation. It's possible we wait for inactive clients to time out before unbinding our server socket, which we shouldn't do.

Offline Michael 'Cruzer'

Re: network_init_server() - Unable to bind socket to IP address
« Reply #5 on: August 28, 2014, 05:15:03 PM »
Had a look at simsys_s.cc and simsys_posix.cc, and yes, it seems there isn't any handler. But it also looks like a graceful shutdown can be done with a check like this in the event functions:

Code:
void GetEvents() // and also GetEventsNoWait()
{
    // Turn a pending SIGTERM into the same quit event the SDL backend produces.
    if (sigterm_received) {
        sys_event.type = SIM_SYSTEM;
        sys_event.code = SYSTEM_QUIT;
    }
}

a signal handler something like:

Code:
// Flag written by the handler and polled in GetEvents().
static volatile sig_atomic_t sigterm_received = 0;

void posix_sigterm(int signum)
{
    // Note: printf() is not async-signal-safe; setting the flag is the essential part.
    printf("Received SIGTERM, exiting...\n");
    sigterm_received = 1;
}

and

Code:
// inside main(); needs <signal.h> and <string.h>
struct sigaction action;
memset(&action, 0, sizeof(struct sigaction));
action.sa_handler = posix_sigterm;
sigaction(SIGTERM, &action, NULL);

but I don't know much about how the Simutrans code works internally. That's just what I see as the equivalent of the SDL implementation. I'll give it a try later.
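For reference, the three fragments above can be combined into one self-contained sketch. This is only an illustration of the flag-based approach, not the actual patch: quit_requested() is a hypothetical stand-in for the check in GetEvents(), and the handler deliberately does nothing except set the flag, since very little is safe to call from inside a signal handler.

```cpp
#include <cassert>
#include <signal.h>
#include <string.h>

// Written by the signal handler, read by the main loop.
static volatile sig_atomic_t sigterm_received = 0;

extern "C" void posix_sigterm(int /*signum*/)
{
    // Only set the flag here; logging is left to the main loop.
    sigterm_received = 1;
}

// Register posix_sigterm() as the SIGTERM handler.
void install_sigterm_handler()
{
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_handler = posix_sigterm;
    sigemptyset(&action.sa_mask);
    const int rc = sigaction(SIGTERM, &action, nullptr);
    assert(rc == 0); // sigaction() only fails here for invalid arguments
    (void)rc;
}

// Hypothetical stand-in for the check inside GetEvents():
// returns true once SIGTERM has been received.
bool quit_requested()
{
    return sigterm_received != 0;
}
```

The main loop would call quit_requested() once per iteration and start the normal shutdown path when it returns true.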

Offline Ters

Re: network_init_server() - Unable to bind socket to IP address
« Reply #6 on: August 28, 2014, 05:35:17 PM »
Quote
It's supported via SDL, if I'm reading the code correctly.

Quote
Again, we do have a SIGTERM handler, courtesy of SDL, or kill wouldn't work at all.

But SDL is not part of the game here. One doesn't normally use kill to terminate GUI programs.

Quote
Had a look at simsys_s.cc and simsys_posix.cc and yes it seems like there isn't any handler. But it also looks like a graceful shutdown can be done like [GetEvents() check, posix_sigterm() handler and sigaction() setup as posted in Reply #5]

I was thinking along the same lines. One might have to do some #ifdef-ing with alternate code for Windows, because I think it has its own way of doing shutdown handlers for console programs.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1323
Re: network_init_server() - Unable to bind socket to IP address
« Reply #7 on: August 28, 2014, 05:36:56 PM »
Quote
Is there anything I can do to prevent this? (Since it blocks the start. When retrying it after ~120 seconds everything works fine again.)

No. The OS is waiting for any sockets in the TIME_WAIT state to transition to fully CLOSED. Until they're all closed, an application can't rebind; 120s is a typical timeout for this to occur.
(Yes, there are socket options to force the bind, but 2 minutes is not a huge wait for things to be properly cleaned up, IMHO.)
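The socket option usually meant here is SO_REUSEADDR. The following is a minimal sketch of a listener that sets it before bind(), assuming a plain IPv4 TCP socket. It is not taken from the Simutrans network code, open_listener_reuseaddr() is a made-up name, and the exact effect of the option differs between operating systems.

```cpp
#include <cassert>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: open a TCP listening socket on 0.0.0.0:port with
// SO_REUSEADDR set. Returns the socket descriptor, or -1 on failure.
int open_listener_reuseaddr(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }

    // Must be set before bind(). It asks the OS to allow the bind even if
    // earlier connections on this port are still in TIME_WAIT.
    int on = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) != 0) {
        close(fd);
        return -1;
    }

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
        listen(fd, 8) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Whether skipping the wait is a good idea is the trade-off described above: TIME_WAIT exists so that stray packets from old connections cannot be mistaken for new ones.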


Quote
I am using kill $pid [...] is this the correct way to stop the server via script?

'nettool shutdown'

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2613
  • Languages: EN
Re: network_init_server() - Unable to bind socket to IP address
« Reply #8 on: August 28, 2014, 05:45:26 PM »
In a graceful shutdown situation you should send the process a signal (via a command line driver or something?) which then makes the process run through shutdown procedures which include closing of any communication sockets. Forceful process termination or other strange external shutdown procedures will always produce buggy results such as leaking live sockets (which are eventually closed).

If running a command line is too heavyweight, I would advise some form of open-ended pipe that lets you send signals to the server process from a separate command-line tool (run only when required), which gets the server to shut down gracefully.

If the server has frozen, crashed or otherwise become excessively unresponsive then forceful shutdown is the only really safe way. All open sockets and OS objects should eventually get cleaned up but this may take a while. For an automatic restart script I would recommend trying to start and if the sockets are not available then waiting 120 seconds before trying again (and repeat in a loop).
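The retry loop suggested above is described for a restart script; the same idea can be sketched in C++ as follows. This is an illustration under assumptions, not Simutrans code: try_listen() and listen_with_retry() are hypothetical names, and a real restart would pass a delay such as 120 seconds.

```cpp
#include <cassert>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Try once to bind and listen on 0.0.0.0:port. Returns the descriptor or -1.
static int try_listen(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        return -1;
    }
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr)) != 0 ||
        listen(fd, 8) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Call try_listen() up to `attempts` times, sleeping `delay_seconds`
// after each failed attempt except the last. Returns the descriptor or -1.
int listen_with_retry(unsigned short port, int attempts, unsigned delay_seconds)
{
    assert(attempts > 0);
    for (int i = 0; i < attempts; ++i) {
        int fd = try_listen(port);
        if (fd >= 0) {
            return fd;
        }
        if (i + 1 < attempts) {
            sleep(delay_seconds);
        }
    }
    return -1;
}
```

A call such as listen_with_retry(port, 3, 120) would then cover the wait described in this thread without aborting on the first failed bind.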

Offline Michael 'Cruzer'

Re: network_init_server() - Unable to bind socket to IP address
« Reply #9 on: August 28, 2014, 05:52:33 PM »
Got it to work as I described above. :D

Quote
One might have to do some #ifdef-ing with alternate code for Windows

You are right. I only tested it on Mac and Linux. I don't have a Windows system available here, but according to this Stack Overflow question there isn't any SIGTERM available there: http://stackoverflow.com/questions/17566800/a-windows-equivalent-to-sigaction

I guess this code should work fine on any Unix-like system. So would it be correct to exclude it just from Windows, via:

Code:
#ifndef _WIN32
     // sigterm patch
#endif
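As a rough sketch of what such #ifdef-ing could look like: on Windows, console programs can register a control handler with SetConsoleCtrlHandler() instead of a SIGTERM handler. Everything below is an assumption for illustration (the names quit_requested, console_ctrl_handler and install_quit_handler are made up) and is not the patch linked in this post.

```cpp
#include <cassert>
#include <signal.h> // sig_atomic_t on both platforms, sigaction on POSIX
#ifdef _WIN32
#include <windows.h>
#else
#include <string.h>
#endif

// Set when the OS asks the process to stop; polled by the main loop.
static volatile sig_atomic_t quit_requested = 0;

#ifdef _WIN32
// Windows calls this on a separate thread. Only Ctrl+C and Ctrl+Break are
// handled here; for close/logoff/shutdown events Windows terminates the
// process shortly after the handler returns, so those would need more care.
static BOOL WINAPI console_ctrl_handler(DWORD ctrl_type)
{
    if (ctrl_type == CTRL_C_EVENT || ctrl_type == CTRL_BREAK_EVENT) {
        quit_requested = 1;
        return TRUE;
    }
    return FALSE;
}
#else
extern "C" void posix_sigterm(int /*signum*/)
{
    quit_requested = 1;
}
#endif

// Install the platform-specific handler; returns true on success.
bool install_quit_handler()
{
#ifdef _WIN32
    return SetConsoleCtrlHandler(console_ctrl_handler, TRUE) != 0;
#else
    struct sigaction action;
    memset(&action, 0, sizeof(action));
    action.sa_handler = posix_sigterm;
    sigemptyset(&action.sa_mask);
    return sigaction(SIGTERM, &action, nullptr) == 0;
#endif
}
```

Simply excluding the SIGTERM code with #ifndef _WIN32, as asked above, is the smaller change; the Windows branch here only shows where an equivalent could go.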

EDIT: *.diff file available at http://forum.simutrans.com/index.php?topic=13912.msg138076#msg138076
« Last Edit: August 28, 2014, 06:47:37 PM by Michael 'Cruzer' »

Offline Ters

Re: network_init_server() - Unable to bind socket to IP address
« Reply #10 on: August 28, 2014, 06:44:56 PM »
Quote
The OS is waiting for any sockets in the TIME_WAIT state to transition to fully CLOSED. Until they're all closed, an application can't rebind; 120s is a typical timeout for this to occur.

While this makes sense, and I've seen sockets hang out in TIME_WAIT for that long, I'm pretty sure I've restarted servers in less than two minutes many times. I would have expected that it is the individual sockets returned by accept() for each peer that are in TIME_WAIT, not the listening socket passed as an argument to accept().

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4564
  • Languages: EN, DE, AT
Re: network_init_server() - Unable to bind socket to IP address
« Reply #11 on: August 30, 2014, 11:35:39 AM »
My experience is that this depends on whether a client is connected or not: if the server is shut down (via the GUI) while a client is connected, I cannot restart the server immediately. If no client is connected during shutdown, there is no problem with restarting.

Offline Michael 'Cruzer'

Re: network_init_server() - Unable to bind socket to IP address
« Reply #12 on: August 30, 2014, 11:42:32 AM »
Yes, Dwachs, I agree with that.

After testing the patch linked above for a few days, I found that this issue isn't fixed by a graceful shutdown. (And, as you pointed out, it does not seem to be a failure of the patch, which does its job.)