
2025-10-13 Server Outage

Started by Isaac Eiland-Hall, October 13, 2025, 07:50:26 PM

Isaac Eiland-Hall

During the afternoon, CPU usage on the server slowly increased, causing sites to stop responding. A reboot of the server seemed to help temporarily, but within the hour, sites became unresponsive again.

Further investigation revealed more than 4.2 GB of log files. Those files have been rotated and are now set to rotate daily. Server CPU currently looks good, but I am monitoring it.
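For anyone wanting to check their own server for the same problem, here is a minimal sketch of how oversized log files could be found. It is not the procedure used on this server; the function name `find_large_logs` and any directory or size threshold passed to it are illustrative assumptions.

```python
import os


def find_large_logs(root, min_bytes):
    """Return (size, path) pairs for files under root of at least min_bytes, largest first."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # The file vanished or cannot be read; skip it.
                continue
            if size >= min_bytes:
                found.append((size, path))
    # Tuples sort by size first, so reverse order puts the biggest files on top.
    return sorted(found, reverse=True)
```

Calling something like `find_large_logs("/var/log", 100 * 1024 * 1024)` would list files of 100 MB or more; both the path and the threshold are placeholders to adjust for the machine in question.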

Isaac Eiland-Hall

Update: A different log file got overwhelmed in less than 24 hours.

It appears the server is seeing increased traffic, possibly from bots trying to find vulnerabilities. For now, I have paused AWStats and increased the number of Apache workers, and it seems stable. We did have a few minutes of downtime a couple of times this afternoon as I was working on it, in part because of two server restarts and in part because the server ran out of workers while it was overloaded by writes to huge log files.

Keeping an eye on it. Of course, if problems happen when I'm asleep or at dialysis, it may be a bit before I can respond, but I am actively monitoring.

Isaac Eiland-Hall

#2
Websites became briefly unavailable this evening; I have increased the workers again.

I suspect there's some underlying issue I haven't found yet, but for now things again appear stable.

edit: Diagnostics are not turning up anything. It appears the Japanese Forum *may* be a contributor, but I don't feel confident about that yet. Next time there's a problem, I will gather some information before restarting the server to see if I can figure out the actual cause. Note that we did have some large log files that were not being rotated; those are being rotated now, so whatever part that had to play, it no longer does.
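One kind of information that could be gathered before a restart is a tally of which clients are sending the most requests. The sketch below is only an illustration: the helper name `top_clients` is made up, and it assumes the client address is the first field of each access log line, as in Apache's Common Log Format. A custom log format that puts something else first would need a different field.

```python
from collections import Counter


def top_clients(lines, n=10):
    """Return the n most frequent first fields (assumed client addresses) with their counts."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)
```

It accepts any iterable of lines, so an open access log file can be passed in directly. A handful of addresses far ahead of the rest would support the bot theory, while a flat spread would point elsewhere.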