News:

The Forum Rules and Guidelines
Our forum has Rules and Guidelines. Please, be kind and read them ;).

minivec_tpl overflow bug

Started by ACarlotti, February 06, 2018, 08:25:50 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

ACarlotti

At present, if you try adding an element to minivec_tpl when it already has 255 elements, count overflows to 0 (so the vector becomes empty). The diff fixes this (though the line numbers might be wrong for Standard).

I'm going to upload a commit for Extended fixing this once I've fixed a bug this was masking (and had a quick check for other such bugs).


diff --git a/tpl/minivec_tpl.h b/tpl/minivec_tpl.h
index c5b6e88e3..8c766b1b4 100644
--- a/tpl/minivec_tpl.h
+++ b/tpl/minivec_tpl.h
@@ -66,6 +66,9 @@ public:
         */
        void append(T elem, uint8 extend = 1)
        {
+               if (count == 255) {
+                       dbg->fatal("minivec_tpl<T>::append()", "already has 255 elements.");
+               }
                if (count >= size) {
                        resize( count > 255-extend ? 255 : count+extend);
                }
@@ -112,7 +115,10 @@ public:
        void insert_at(uint8 pos, T elem)
        {
                if (pos > count) {
-                       dbg->fatal("minivec_tpl<T>::append()", "cannot insert at %i! Only %i elements.", pos, count);
+                       dbg->fatal("minivec_tpl<T>::insert_at()", "cannot insert at %i! Only %i elements.", pos, count);
+               }
+               if (count == 255) {
+                       dbg->fatal("minivec_tpl<T>::insert_at()", "already has 255 elements.");
                }

                if (pos < count) {

Dwachs

Which kind of bug this was masking?
Parsley, sage, rosemary, and maggikraut.

ACarlotti

Use of append instead of append_unique in creating a list of transported classes. I've also found a probable issue in Extended's path explorer, relating to schedules with >128 entries set to be run also in reverse.

DrSuperGood

A better solution is to not clamp the resize. Resize takes uint so is 32bit (on most common platforms). The resize method also checks for overflow and throws the termination.

/**
* Resizes the maximum data that can be hold by this vector.
* Existing entries are preserved, new_size must be big enough to hold them
*/
void resize(uint new_size)
{
if (new_size > 255) {
dbg->fatal("minivec_tpl<T>::resize()", "new size %u too large (>255).", new_size);
}
// not yet used, but resize may be called anyway
if(size<=0) {
size = new_size;
data = new T[size];
return;
}

if (new_size <= size) return; // do nothing

T* new_data = new T[new_size];
for (uint i = 0; i < count; i++) new_data[i] = data[i];
delete [] data;
size = new_size;
data = new_data;
}

This prevents duplicate checks while still throwing fatal error if the problem is encountered.

Additionally the 255 constant should be replaced with a primitive type maximum value constant, magic numbers are bad.

In either case the result is a fatal error if the function is called. Cases that call the function should explicitly apply limits to 255 entries to stop this ever occurring.

Also the resize operation should take size in the form of size_t instead of int, as it is dealing with array indices.

TurfIt

Clamping the resize introduced the bug 5 years ago.... suggest revert - and make sure the resize calculation is not uint8...


Index: tpl/minivec_tpl.h
===================================================================
--- tpl/minivec_tpl.h   (revision 8378)
+++ tpl/minivec_tpl.h   (working copy)
@@ -64,10 +64,10 @@
         * Appends the element at the end of the vector.
         * if out of space, extend it
         */
-       void append(T elem, uint8 extend = 1)
+       void append(T elem, uint extend = 1)
        {
                if (count >= size) {
-                       resize( count > 255-extend ? 255 : count+extend);
+                       resize( (uint)count + extend );
                }
                data[count++] = elem;
        }



Quote from: DrSuperGood on February 06, 2018, 09:12:52 PM
Additionally the 255 constant should be replaced with a primitive type maximum value constant, magic numbers are bad.
If only there was one.... (that 'modern' C++ thingy again.)


Quote from: DrSuperGood on February 06, 2018, 09:12:52 PM
Also the resize operation should take size in the form of size_t instead of int, as it is dealing with array indices.
indicies in a special class limited to 8 bit. int is fine.

ACarlotti

You could remove the test in insert_at by casting count to uint32 (or similar) before adding 1. But it's not so straightforward in append, since it's designed to allow growing the memory allocation by a large amount in one go. Removing the clamping without adding anything else would mean that asking for 64 more entries of storage (for future expansion) when you've already used 192 entries will not work.

I also disagree about the use of a 'magic number' in this instance - it's a very small class whose size limit is not going to change (we have vector for larger stuff), and 255 is integral to the design, rather than some parameter which can be changed. This limit is also inherent in the uses of of uint8 - you're not going to call that a magic number too, are you?

And sure, calling cases should apply the correct limits. But they sometimes contain bugs, and it's better to fail properly here than to overflow and mask the bug elsewhere.

DrSuperGood

QuoteIf only there was one.... (that 'modern' C++ thingy again.)
Exactly... And it is not even modern C++, probably over a billion people have been born since it was added.
Quoteindicies in a special class limited to 8 bit. int is fine.
Fine but still kind of arbitrary as the platform limits are size_t.

Also kind of silly how it checks if an unsigned int is less than (or equal to) 0.
QuoteYou could remove the test in insert_at by casting count to uint32 (or similar) before adding 1. But it's not so straightforward in append, since it's designed to allow growing the memory allocation by a large amount in one go. Removing the clamping without adding anything else would mean that asking for 64 more entries of storage (for future expansion) when you've already used 192 entries will not work.
Should the API have that functionality tied to append? Generally vectors use power of two based expansion.

QuoteI also disagree about the use of a 'magic number' in this instance - it's a very small class whose size limit is not going to change (we have vector for larger stuff), and 255 is integral to the design, rather than some parameter which can be changed. This limit is also inherent in the uses of of uint8 - you're not going to call that a magic number too, are you?
Except it is technically the limit of uint8 so really should be in a constant named such. One might go as far as to say it should be declared as a limit for this class which gets its value from the basic type constant. It is also kind of useful for using the class so one can check out for the limit.
QuoteAnd sure, calling cases should apply the correct limits. But they sometimes contain bugs, and it's better to fail properly here than to overflow and mask the bug elsewhere.
Yes it should fail here. However users should never see it fail here as this is to catch bugs. Hence how this has likely been present for over 5 years as the bug is effectively with a validation layer rather than the API itself.

ACarlotti

So I've delved into the history of minivec_tpl, searching the forum, git history (i.e. svn history), and history.txt, and I believe the following are true.

minivec_tpl was introduced in around 2001 to reduce the space needed per square, apparently by ~8B. It does so by, among other things, using only a single byte to store the number of elements and size of memory (which saves 6B) for the object list.
By 2003, there was a microvec_tpl too. This differed from minivec in that a single element list was stored in what would otherwise be a pointer to the full vector.
By 2004 (the first available version of the code), simplan.cc was using microvec_tpl. It seems reasonable to assume that this change took place in 2003.
In 2006, microvec_tpl was removed. I think since then simplan.cc has handled its object list needs inhouse.

The only significant case I can see for using minivec_tpl instead of vector_tpl is if we are storing a very large number of very short vectors. This was clearly the case when it was storing an object list for every square, but I don't think it really justified in any other cases.

I rebuilt Extended with #define minivec_tpl vector_tpl (and a couple of other minor necessary changes) and tested the memory usage on my 640x512 game. The results (according to top) were:
vector_tpl: actual 1.055  virtual 3326
minivec_tpl: actual 1.051 virtual 3319

So a difference in memory usage of <0.5%. I think the best course of action is to just remove minivec_tpl.

Ters

Quote from: DrSuperGood on February 06, 2018, 10:04:57 PM
Generally vectors use power of two based expansion.

The name, and the 256 entry limit, suggest this is a specialization for small vectors. Specializations exist because the generic solution is not optimal. But as ACarlotti goes into, it might be a specialization that has been abused. I suggest testing memory usage on a bigger map that a mere 640x512, though.

Quote from: DrSuperGood on February 06, 2018, 10:04:57 PM
Except it is technically the limit of uint8 so really should be in a constant named such.

Perhaps, but the parameter type should still be uint8, since that is way more accessible when looking at the function declaration than a constant defined elsewhere, even if only a few lines away.

ACarlotti

I agree about testing on a larger map; I am somewhat limited though by only having 2GB of memory (I ought to get a new laptop soon), although I haven't explored how much larger I could go within this limit.

Dwachs

These minivec_tpl's are used in instances of halts, lines, convoys, schedules, and crossing-logic blocks. I do not expect that changing this to vector_tpl would increase memory consumption that much.
Parsley, sage, rosemary, and maggikraut.

jamespetts

Quote from: ACarlotti on February 07, 2018, 06:25:45 AM
I rebuilt Extended with #define minivec_tpl vector_tpl (and a couple of other minor necessary changes) and tested the memory usage on my 640x512 game. The results (according to top) were:
vector_tpl: actual 1.055  virtual 3326
minivec_tpl: actual 1.051 virtual 3319

So a difference in memory usage of <0.5%. I think the best course of action is to just remove minivec_tpl.

This is a very interesting test - however, many Simutrans-Extended games are much larger than this. I should be very interested to know what the results are with the current saved game on the Bridgewater-Brunel server, for instance.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

prissi

They were the foundation of dinglist, which is probably the only really perfomance critical of the lot. Changing this from 16 bit to 8 bit gained a lot (mostly due to better caching/memory footprint of tiles and throwing too many pedestrians away if there were too many added on a tile).

DrSuperGood

Quoteminivec_tpl was introduced in around 2001 to reduce the space needed per square, apparently by ~8B. It does so by, among other things, using only a single byte to store the number of elements and size of memory (which saves 6B) for the object list.
Be aware that the struct is only unaligned in GCC. It is aligned when built with MSVC.
T* data;
uint8 size;  ///< Capacity
uint8 count; ///< Number of elements in vector

This means in GCC it will have a size of 6B (x86) or 10B (x86-64). In MSVC it will have a size of 8B (x86) or 16B (x86-64). Note that if the GCC version is a member of another struct/class then the alignments of following non 1B or 2B members might add padding.

QuoteI rebuilt Extended with #define minivec_tpl vector_tpl (and a couple of other minor necessary changes) and tested the memory usage on my 640x512 game. The results (according to top) were:
vector_tpl uses...
T* data;
uint32 size;  ///< Capacity
uint32 count; ///< Number of elements in vector

GCC -> 12B (x86), 16B (x86-64)
MSVC -> 12B (x86), 16B (x86-64)

Memory usage is increased, but it really depends on the number of such objects that exist. One needs millions of them to even make a few MB difference.
QuoteSo a difference in memory usage of <0.5%. I think the best course of action is to just remove minivec_tpl.
Possibly a good idea. One might find that there is no noticeable performance impact as well as reading bytes is not much, if at all, faster than reading words.
QuoteI suggest testing memory usage on a bigger map that a mere 640x512, though.
Use the Experimental server game. That is excessively large and so memory usage differences are extremely noticeable.
QuotePerhaps, but the parameter type should still be uint8, since that is way more accessible when looking at the function declaration than a constant defined elsewhere, even if only a few lines away.
A static constant member of the minivec_tpl should define the maximum size/count limit. This could get its value from somewhere else, or it could be hard coded. The key is that it is now a named constant rather than 255 repeated multiple times around the header. This constant can then be used as part of the class API to prevent trying to overflow the vector instead of letting the game crash. Implementing limits this way would have the advantage that if one ever decides to change the type, eg to uint16 then it can be done transparently to usage as the API remains the same.

One could also consider breaking out the method definitions from the header into a .c file, since link time optimization inlinse such functions if appropriate anyway.
QuoteThese minivec_tpl's are used in instances of halts, lines, convoys, schedules, and crossing-logic blocks.
Which would be a few thousand instances in even a well developed game at most. Few KB extra memory usage is trivial so I agree.

However apparently they are used for other things in Experimental. So it might have a larger memory impact there than in standard.

prissi

Indeed, most of these are not used more than 100000 times at most, even in well developed games. One may indeed go with the normal vector. Or change to a std vector, following the discussion ih the other thread.

Ters

Quote from: DrSuperGood on February 07, 2018, 10:24:36 PM
Be aware that the struct is only unaligned in GCC. It is aligned when built with MSVC.
T* data;
uint8 size;  ///< Capacity
uint8 count; ///< Number of elements in vector

This means in GCC it will have a size of 6B (x86) or 10B (x86-64). In MSVC it will have a size of 8B (x86) or 16B (x86-64). Note that if the GCC version is a member of another struct/class then the alignments of following non 1B or 2B members might add padding.

I get a sizeof such a struct (a pointer followed by two byte-sized variables) to be 8 bytes with both GCC 7.2 and Microsoft C++ 17, so it appears GCC always pads the structure out to native word size (which in this case is 4 bytes), unless told otherwise. Although I didn't test minivec_tpl itself, I can't see it telling the compiler otherwise. I also think this is according to some specification. It might be the Windows ABI, as I haven't tested GCC on Linux.

Quote from: DrSuperGood on February 07, 2018, 10:24:36 PM
A static constant member of the minivec_tpl should define the maximum size/count limit. This could get its value from somewhere else, or it could be hard coded. The key is that it is now a named constant rather than 255 repeated multiple times around the header. This constant can then be used as part of the class API to prevent trying to overflow the vector instead of letting the game crash. Implementing limits this way would have the advantage that if one ever decides to change the type, eg to uint16 then it can be done transparently to usage as the API remains the same.

I'm not arguing against having a constant defined in one place over several literals scattered around, although 255 is a very obvious value for an 8-bit unsigned integer (as is 0). But when I see a function taking in a size_t, and the function is not documented to state otherwise, then I assume the implementation can work with the entire address space available on the platform. I won't go digging through the implementation looking for some constant that may or may not represent the maximum allowable argument value. Furthermore, if one goes the other way, from uint16 to uint8, you get at least warnings at compile time for passing in a too large value. It doesn't save you if the limit changes from 2048 to 1024, though.

DrSuperGood

Quoteso it appears GCC always pads the structure out to native word size (which in this case is 4 bytes), unless told otherwise.
It is being told otherwise... Or maybe it is not being told correctly otherwise?
QuoteI can't see it telling the compiler otherwise
It does...

private:
minivec_tpl(const minivec_tpl&);
minivec_tpl& operator=( minivec_tpl const& other );

T* data;
uint8 size;  ///< Capacity
uint8 count; ///< Number of elements in vector
} GCC_PACKED;

Which on GCC (not MSVC)...
# define GCC_PACKED    __attribute__ ((__packed__))
on MSVC the macro expands to nothing, hence normal alignment rules are used for MSVC.

prissi

For 32 bit I optimised sizes with GCC 3. (Using the -sizes flag) and they did not changed last time I checked (two years ago). Still sizes are most criutical for tiles and trees, buildings and vehicels, i.e. everything that ios plenty on the map. For crossings, stations, even convois, it should be not so critical.

Ters

Quote from: DrSuperGood on February 08, 2018, 09:09:07 PM
It is being told otherwise...

I just didn't see it at the time. Must have been still half asleep.

It only matters for schedules and halts, though, as padding is instead added on the outside to force correct alignment of subsequent fields in the other structures. simline_t can be packed better by putting state_color after goods_catg_index, but it is pointless as we all seem to agree. Half of the users have financial history, which would likely make the difference in structure size insignificant even in itself. crossing_logic_t, by having three minivec_tpls, will grow the most, for what that matters. (I don't have many crossings in my games.)

(Disclaimer: I might also now be still half asleep.)

ACarlotti

So it sounds like we're broadly in agreement about scrapping minivec_tpl - correct? The easiest (<10 min) way of doing this is to search/replace minivec_tpl with vector_tpl, try building it, and then fix the (very small) number of places where it won't build (usually by removing the final 'extend' argument to an append callm, such as in creating schedules; I think it's also necessary to add a boolean return value to one of the vector_tpl methods).

DrSuperGood

For standard at least that sounds reasonable for now. However I cannot comment if extended uses them in places which the size might matter.