News:

SimuTranslator
Make Simutrans speak your language.

r7434 Linux 64-bit random crashes

Started by Ters, December 23, 2014, 12:46:48 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Ters

While testing how the most recent revisions work on my old Linux box, I have encountered some random crashes. They seem to come from sprintf or similar, but I have not managed to track down where in Simutrans this call is made. So far, it has only crashed when not running in the debugger. I had a fresh download of the English base translations, and a fresh check-out of pak64. I consistently had the display settings window open to see the statistics, and was zooming in and out when it happened. Since zooming in and out was all I ever did, that might be unrelated.

Update:
I managed to track it down to color_gui_t::draw, but not anything more precise yet (still not crashing in debugger). snprintf is however used consistently in that function, and I suspect that frame time might be so huge when the game starts lagging badly when fully zoomed out, that it simply overflows the buffer. Since the numbers being printed are 32-bit integers, increasing BUF_MAXLEN_MS_FORMAT in display_settings.h to 14 should be enough. If some of the values are signed, it needs to be 15. Going for 16 just to really be safe might not cost much, either.

DrSuperGood

The actual "correct" solution would be to use "snprintf" which takes the buffer length as an argument (so cannot cause a buffer overflow). Too bad that is a C++11 function... Microsoft has provided similar for years as well to mitigate the inherit safety problems of sprintf.

Something is not right here as for it to overflow it would need a time value of 5 decimal characters in length (~100 seconds) which is a bit excessive for rendering a frame (the process would be unresponsive for over a minute). Or it would need to update at more than 9,999 fames per second. Or have a very large sim loops rate or large simloops separator.

In any case there are 3 possible solutions.
1. Use a string stream of sorts appears the recommend C++ approach. (C++98? Not sure if Simutrans targets that)
2. Hardcode bounds checks. This could allow scaling of units and most useful output (changing from "ms" to "s" and ">X s"). Poor maintainability due to no coupling with buffer size.
3. Raise buffer size. Easiest solution but also will waste more memory in most cases (more memory to solve an uncommon issue).

Ters

#2
Quote from: DrSuperGood on December 23, 2014, 04:25:42 PM
The actual "correct" solution would be to use "snprintf" which takes the buffer length as an argument (so cannot cause a buffer overflow). Too bad that is a C++11 function... Microsoft has provided similar for years as well to mitigate the inherit safety problems of sprintf.

snprintf has actually been around for ages on Linux/GNU, possibly also others. Microsoft has had _snprintf for a long time as well, but without some of the crucial safety features of a proper snprintf.

In this case, the required buffer length is predictable, so I didn't suggest for Simutrans to create a pseudo-snprintf that either uses snprintf when present, or the Microsoft specific equivalent (not _snprintf, the other one I can't remember the name of).

Quote from: DrSuperGood on December 23, 2014, 04:25:42 PM
Something is not right here as for it to overflow it would need a time value of 5 decimal characters in length (~100 seconds) which is a bit excessive for rendering a frame (the process would be unresponsive for over a minute). Or it would need to update at more than 9,999 fames per second. Or have a very large sim loops rate or large simloops separator.

I'm puzzled by the huge number required for my hypothesis, but I haven't had time for running Simutrans with increased buffer size yet, so I could see what the numbers are. It does not need to frame time, it can be one of the other fields.

Update:
It's the idle time that briefly is a very large number, probably integer underflow.

TurfIt

Quote from: Ters on December 23, 2014, 04:43:34 PM
I'm puzzled by the huge number required for my hypothesis, but I haven't had time for running Simutrans with increased buffer size yet, so I could see what the numbers are. It does not need to frame time, it can be one of the other fields.

Update:
It's the idle time that briefly is a very large number, probably integer underflow.
I have seen idle time huge before - likely it's briefly negative, and with the timing change from signed to unsigned....  Never's crashed for me though, and I just assumed it was those numbers being the nonsense that they are.
Leaving only 5 chars to print a uint32 into is ridiculous, there's no reason to be so stingy on the buffer length here. Extended in r7437.

Ters

In a sense, it wasn't crashes, but a controlled shutdown by some buffer overrun check in glibc.

DrSuperGood

QuoteNever's crashed for me though, and I just assumed it was those numbers being the nonsense that they are.
It would only crash directly if it went into write blocked virtual address space. Otherwise it would corrupt a few bytes of whatever is directly after it which may or may not cause any damage depending on the struct arrangement.

Ters

Quote from: DrSuperGood on December 23, 2014, 11:38:50 PM
It would only crash directly if it went into write blocked virtual address space. Otherwise it would corrupt a few bytes of whatever is directly after it which may or may not cause any damage depending on the struct arrangement.

The idle time is in the middle of the structure, so it's not the former. Yet glibc, or perhaps some GCC functionality, somehow notices it's about to do the latter and simply aborts.

DrSuperGood

QuoteThe idle time is in the middle of the structure, so it's not the former. Yet glibc, or perhaps some GCC functionality, somehow notices it's about to do the latter and simply aborts.
VisualC++ compilers also can be made to do that (it is a standard compiler feature). It is a special compiler flag whereby it builds in array bound checks and asserts when they are violated during runtime. It is a feature intended only to be enabled for debug builds due to the obvious performance penalty (one can be certain that the bound checks are not free) and is very handy at spotting logical errors with array bounds (we all make them from time to time lol).

Ters

Quote from: DrSuperGood on December 24, 2014, 02:46:29 PM
It is a feature intended only to be enabled for debug builds due to the obvious performance penalty (one can be certain that the bound checks are not free)

But I was not using a debug build of Simutrans initially, and never one of glibc. It is possibly that buffer overruns are considered such a security risk these days, that the performance penalty is the lesser evil. Much software is also made with managed languages or similar, such as C# and Java, where bounds are always checked.

DrSuperGood

QuoteIt is possibly that buffer overruns are considered such a security risk these days
Not likely seeing how commonly used pointers still are (which are even more unsafe). They are purely there to spot errors such as you reported. Without this feature it would be very difficult to detect because the values it corrupts by overflowing might not cause any damage. In properly written C/C++ code you will explicitly bounds check when appropriate as a lot of array code is inherently safe if done correctly.

According to the GCC manual you need a special compiler command in the section of calling conventions to bound check but it appears not in use (directly at least).

Dwachs

There is already a custom implementation of vsnprintf in utils/cbuffer_t.cc, maybe we should replace all the sprintf calls by calls to this method?
Parsley, sage, rosemary, and maggikraut.

Ters

Quote from: DrSuperGood on December 25, 2014, 01:48:09 AM
Not likely seeing how commonly used pointers still are (which are even more unsafe). They are purely there to spot errors such as you reported. Without this feature it would be very difficult to detect because the values it corrupts by overflowing might not cause any damage. In properly written C/C++ code you will explicitly bounds check when appropriate as a lot of array code is inherently safe if done correctly.

The problem is that properly written C/C++ appears to be a rarity. So the strategy now seems to be to not trust any code.