ALLOCA query for those who have coding expertise

jamespetts · July 18, 2013, 12:11:25 AM

I am currently trying to modify the passenger generation code. One of the things that I am trying to do is increase the possible number of alternative destinations. Currently, the maximum is 15 (i.e., 16 destinations in total, a primary and 15 alternatives). I am keen to increase that to a much higher figure - several hundred at least. There are two particular places in the passenger generation code where I currently have fixed size arrays of objects (not native types) allocated on the stack, currently limited to 16, which arrays need to be at least as large as the number of destinations.

They are:

Code Select


minivec_tpl<halthandle_t> destination_list[16];

and

Code Select


destination destinations[16];

"destination" is a struct that looks like this:

Code Select


struct destination
    {
        koord location;
        uint16 type;
        destination_object object; 
    };

In order to have more than 16 destinations, I thought it a good idea to do away with fixed size arrays and allocate dynamically. Since both of these arrays are only used within a single method then discarded (albeit used multiple times in a while loop), I thought that it would be more efficient to use alloca than new/delete, so as to allocate them on the stack rather than heap (this is, after all, a somewhat performance critical part of the code).

I tried:

Code Select


size_t const n = max_destinations;
ALLOCA(destination, destinations, n);

inside the while loop (just for "destinations" initially), but found that it produced this error in MSVC++ repeatedly:

Quote
Stack area around _alloca memory reserved by this function is corrupted

Although this error was apparently not fatal (pressing "continue" allowed the program to continue to execute), this did not look good. I tried moving this code outside the loop (putting it before the loop), but the behaviour was unchanged (Dr. Memory, however, did not detect any errors). Adding then the code for the minivec to look like this:

Code Select


minivec_tpl<halthandle_t> destination_list[16];
size_t const n = max_destinations;
ALLOCA(destination, destinations, n);
ALLOCA(minivec_tpl<halthandle_t>, destination_list, n);

then caused an access violation as soon as the first attempt was made to access any data in destination_list[0].

Is there a sensible way around this, or am I better off avoiding ALLOCA here? The method in question is a new generate_passengers_and_mail() method called directly from karte_t::step().

Other possibilities include:

(1) much bigger fixed size arrays allocated on the stack inside the while loop;
(2) much bigger fixed size arrays allocated on the stack outside the while loop (allocated less often but allocated even when while loop is not entered at all during the particular method call);
(3) much bigger fixed size arrays that are members of the karte_t object (and thus heap allocated, but allocated only once);
(4) using new[]/delete to allocate these objects dynamically instead of using fixed size arrays; and
(5) using a minivec/vector allocated on the stack during the method instead of arrays (which would result in a nested minivec in one case).

Any thoughts on how best to deal with this would be much appreciated.

MCollett · July 18, 2013, 10:29:44 AM

Quote from: jamespetts on July 18, 2013, 12:11:25 AM
I thought that it would be more efficient to use alloca than new/delete,
...
Any thoughts on how best to deal with this would be much appreciated.

Don't reinvent the wheel, and don't optimise prematurely.

Simutrans is written in C++, not C: there is no need for freshly written code to go anywhere near malloc, alloc or any of that ilk. In any decent modern C++ implementation, new/delete will make use behind the scenes of memory arenas or other optimisation techniques to keep actual calls to allocate fresh heap memory to a minimum. Better still, avoid any explicit memory management by using a container from the standard C++ library.

IF you detect a real slowdown, and IF profiling shows that it is coming from this piece of code, then, and only then, you MIGHT consider some low-level memory tinkering.

Best wishes,
Matthew

jamespetts · July 18, 2013, 11:45:11 AM

Thank you for the suggestions - may I ask whether you recommend any container(s) in particular? (Would the existing Simutrans types such as minivec_tpl not suffice?) Do you think that new[]/delete[] here is better than using a single large array stored in karte_t?

Markohs · July 18, 2013, 01:54:16 PM

I think he refers to using standard already implemented bug free data structures from the STL.

http://www.cplusplus.com/reference/vector/vector/
http://www.cplusplus.com/reference/array/array/
http://www.cplusplus.com/reference/stl/

They are not used in simutrans for historic reasons, and to keep compatibility with old operating systems/compilers. But I think you can use them freely, they are supported on any non-prehistoric compiler you use nowadays, and are portable. If you require a very specific super-optimized data structure, you will probably need to write your own implementation, but for most of things, I'd just go STL, the implementations are robust and will save you lots of troubles.

jamespetts · July 18, 2013, 03:48:43 PM

Does this also mean that I should convert all existing uses of the Simutrans vectors to STL vectors? And does this also mean that you think that a vector here (including the nested vector to which I refer above) is the best solution to this issue rather than, for example, using new[]/delete[] or a much larger fixed size array?

MCollett · July 18, 2013, 09:40:02 PM

Quote from: jamespetts on July 18, 2013, 03:48:43 PM
Does this also mean that I should convert all existing uses of the Simutrans vectors to STL vectors? And does this also mean that you think that a vector here (including the nested vector to which I refer above) is the best solution to this issue rather than, for example, using new[]/delete[] or a much larger fixed size array?

In the long run, probably. I wouldn't set out to make the change for its own sake, but whenever you are rewriting code for some other reason (especially because of memory management issues), consider making the switch.
Yes. Using a vector will automagically handle all the memory management, probably more efficiently and certainly more robustly than doing it manually.

Best wishes,
Matthew

Markohs · July 18, 2013, 09:46:06 PM

Quote from: jamespetts on July 18, 2013, 03:48:43 PM
Does this also mean that I should convert all existing uses of the Simutrans vectors to STL vectors?

Well, no, I guess and hope someday simutrans standard will move some of those components to STL, maybe, I was just talking about experimental specific code. If you change Code that affects starndard, future merges will make your work way more difficult. I'd restrict STL to new structures your create. Current simutrans containers perform well enough, and there is not reason to change things that are already working good enough.

Quote
And does this also mean that you think that a vector here (including the nested vector to which I refer above) is the best solution to this issue rather than, for example, using new[]/delete[] or a much larger fixed size array?

A STL vector is able to be resized dynamically, as MCollet aready said. Wasn't that what you were asking for? I'd use it.

But ofc, this is just my personal opinion as always.

jamespetts · July 18, 2013, 10:40:36 PM

There are a great many Simutrans-Experimental specific data structures using the Standard collection classes. It seems to me highly undesirable to use multiple different types of vector in the same application at once, at least for more than a few weeks during some transitional phase. Would it not be better just to use minivecs? Is having nested minivecs a good idea, as will be necessary here?

Markohs · July 19, 2013, 07:40:38 AM

Do as you wish, james.

Having a look to minivec_tpl.h looks like that container is designed to handle up to 256 positions, since "count" it's a uint8 . It's also not going to perform very well asymptotically since most of its operations are linear in time.

I don't understand what do you refer by "nested minivecs" . Allocating the array via ALLOCA or some other similar operation it's not meant to be done, minivec has already routines to create/destroy the venctor, I don't understand why are you trying to allocate memory manually here.

I don't know the details of your implementation, just what you wrote so far, but think once you start manipulating structures with more than 200-300 elements, performance will suffer a lot if you just do linear searches. If you give me a few more details I might be able to help you more. Are destinations meant to be added/removed frequently? How often do you search in that array? Can you sort the destinations somehow or hash them using a id so access time is better?

Markohs · July 19, 2013, 07:46:08 AM

You can allways use vector_tpl, supports up to 2^32 elements

jamespetts · July 19, 2013, 10:11:09 PM

Ahh, apologies for not being a little clearer about how this is intended to be used and what I mean by "nested" here. The vectors/arrays are needed for the multiple destinations feature in passenger generation. The code first assigns the required number of destinations (randomised for each passenger packet between 1 and the maximum set in simuconf.tab, albeit that maximum is currently limited to 16) into the "destinations" array. Each destination is checked and a minivec built of all halts within walking distance of that destination. There is, currently, an array of minivecs (of size 16): for each of 16 destinations, there is an arbitrary number of stops that is reachable from that destination. This is what I meant by nested minivecs - if I were to cease to use the fixed size array of 16 destinations and use a vector or a minivec instead, I'd have to have a vector where each element in that vector is itself a vector (or minivec).

The list is then iterated until a suitable route from the origin building to the destination building can be found that is within the packet's journey time tolerance: the loop is then broken and the passenger/mail packet committed to its journey.

Does this give sufficient context to get a good idea of what might be a good solution? Thank you all for your help so far.

Markohs · July 20, 2013, 12:25:42 PM

Quote from: jamespetts on July 19, 2013, 10:11:09 PM
The list is then iterated until a suitable route from the origin building to the destination building can be found that is within the packet's journey time tolerance: the loop is then broken and the passenger/mail packet committed to its journey.

So this structures are just required to make a decision? All this data is destroyed when a route has been chosen?

jamespetts · July 20, 2013, 12:40:11 PM

Yes.

prissi · July 20, 2013, 08:21:36 PM

Just an addition to STL and templates. Simutrans vector explicitely forbid dangerous stuff like copy of vectors, which work in STL but are likely to result in errors later on (imagine a vector of pointers which copy its content). Furthermore simutrans lists are single linked, which saves a lot of memory for small types and is faster than STL; even more since it uses its own memory management.

If you number of alternative routes is fixed (and never lower than 16 or whatever), I would use either new []. This way the compiler can check out of range. If it can change, then some vector might be needed.

The problem with alloc is, that it will not call a destructor after the end of the function. Hence is is only suitable for simple arrays. However

Code Select


// just a new block for automagically destroyed minivec
{
  minivec_tpl<dext> bla[30];
 ...
}

will almost do the same and destroy the bla array properly at the end of a function.

jamespetts · July 20, 2013, 08:45:42 PM

The array of minivecs solution is what I use now, with a fixed size of 16. The number of alternative routes is not fixed, but is set randomly for each iteration of the passenger generation stepping from 0 to a maximum number specified in simuconf.tab. That maximum number currently cannot exceed 16, but the plan is to be able to make it considerably larger in consequence of removing distance ranges.

jamespetts · July 21, 2013, 10:02:36 PM

Thank you all for your help - I have just tried it with vectors and it seems to work well, with, according to my brief tests, no appreciable reduction in performance as compared to the previous fixed size arrays. (Minivecs did not work because the copy constructor was private).

Markohs · July 22, 2013, 03:46:50 PM

Quote from: prissi on July 20, 2013, 08:21:36 PM
Just an addition to STL and templates. Simutrans vector explicitely forbid dangerous stuff like copy of vectors, which work in STL but are likely to result in errors later on (imagine a vector of pointers which copy its content).

Well, you say "dangerous", I say it's versatile and more powerful. Ofc you can get bugs if you don't know what you are doing, but we are supposed to be programmers.

It's like saying adding "for" to a language when you already have "do" and "while" can lead to errors. Or adding references when you have pointers.

[

Quote from: prissi on July 20, 2013, 08:21:36 PM
Furthermore simutrans lists are single linked, which saves a lot of memory for small types and is faster than STL; even more since it uses its own memory management.

Well, single liked lists have their downsides also as you know, since insertion/removal operations can be tricky or linear in time instead of constant. Ofc if memory usage is your primary factor, single lists are the way to go, but that's not the general case. You can't transverse them efficiently in reverse order, something that can make pretty unefficient certain algorithms.

They have their uses, and are one of the most used structures, but single linked lists are not the best option *always*.

Also, STL also ship single linked lists. http://www.cplusplus.com/reference/forward_list/forward_list/

News:

ALLOCA query for those who have coding expertise