Sdog,

thank you for your input and clarification on Matthew's function: that is most thoughtful. However, I am somewhat doubtful that a logarithmic/exponential function is what is needed here rather than a positively skewed normal distribution. You refer to *waiting* times above, whereas what I am actually trying to model are journey time tolerances, which are not quite the same thing.

Marchetti's constant holds that people tend to have a fairly uniform travel time budget of about an hour, which strongly suggests that, for any given journey, more people should be willing to spend, say, half an hour travelling than are prepared to spend only three minutes travelling, but equally, more people should be prepared to spend half an hour travelling than three hours. Is there a particular datum on which you base your suggestion to the contrary?

**Edit**: Tests that I am in the process of carrying out are showing good results with compounding/recursion of the cubed version of Kieron's algorithm. Taking two results from the cubed version, multiplying them together and dividing by the maximum has produced a more satisfactory result, with 28 passengers being willing to travel in one game month of 6:24h with a minimum journey time of 4:18h travelling and 2:52h waiting, with 1,583 passengers recording "too slow" and 226 "no route". I will try again with a multiple of three and see whether that is better still. The questions are now, assuming this system proves suitable, how to calibrate it and how best to allow it to be customised in simuconf.tab, since there is no easy single exponent for scaling. Perhaps a few numbers with different modes, such as 0 or 1 for an even distribution, 2 or 3 for the squared or cubed skewed normal distribution, 4 for a double recursion of the cubed distribution and 5 for a triple recursion, or something of the sort?
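As a rough illustration of the compounding described in this edit, here is a minimal sketch. It assumes that the "cubed version" means the product of three uniform draws divided by the square of the maximum, and it uses `std::mt19937` as a stand-in for `simrand()`; the function names `uniform_draw`, `cubed_draw` and `compounded_draw` are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Stand-in for simrand(): a uniform integer on [0, max-1].
// Assumption: the real generator is not std::mt19937.
inline std::uint64_t uniform_draw(std::mt19937& rng, std::uint32_t max)
{
    assert(max > 0);
    std::uniform_int_distribution<std::uint32_t> dist(0, max - 1);
    return dist(rng);
}

// "Cubed" draw: three uniform draws multiplied together, then divided by
// max^2 so that the result stays on [0, max-1].
// The product only fits in 64 bits while max is below roughly 2.6 million.
inline std::uint64_t cubed_draw(std::mt19937& rng, std::uint32_t max)
{
    const std::uint64_t m = max;
    const std::uint64_t a = uniform_draw(rng, max);
    const std::uint64_t b = uniform_draw(rng, max);
    const std::uint64_t c = uniform_draw(rng, max);
    return (a * b * c) / (m * m);
}

// Compounded draw: two cubed draws multiplied together and divided by max.
inline std::uint64_t compounded_draw(std::mt19937& rng, std::uint32_t max)
{
    const std::uint64_t first = cubed_draw(rng, max);
    const std::uint64_t second = cubed_draw(rng, max);
    return (first * second) / max;
}
```

Because each factor is below the maximum, the compounded result also stays below the maximum, while its typical value is far smaller than that of a single cubed draw.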

**Edit 2**: A triple recursion does not seem to work well: dividing by two produces values higher than the maximum, whereas dividing by the square always produces zeros. The best results so far have been obtained by a single recursion.

**Edit 3**: I have found a way to parameterise it, I think, although I have not had a chance to do any serious performance testing yet or see whether this works well for extremely large numbers. The code is as follows:

    /**
     * Generates a random number on the [0, max-1] interval with a normal distribution.
     * See: http://forum.simutrans.com/index.php?topic=10953.0;all for details.
     * Takes an exponent, but produces unreasonably low values with any number
     * greater than 2.
     */
    #ifdef DEBUG_SIMRAND_CALLS
    uint32 simrand_normal(const uint32 max, uint32 exponent, const char* caller)
    #else
    uint32 simrand_normal(const uint32 max, uint32 exponent, const char*)
    #endif
    {
    #ifdef DEBUG_SIMRAND_CALLS
        uint64 random_number = simrand(max, caller);
    #else
        uint64 random_number = simrand(max, "simrand_normal");
    #endif
        if(exponent < 2)
        {
            // Exponents of 0 or 1 make this identical to the ordinary random number generator.
            return (uint32)random_number;
        }

        // Any number higher than 3 would produce integer overflows even with unsigned 64-bit integers,
        // so interpret higher numbers as directives for recursion.
        uint32 degrees_of_recursion = 0;
        uint32 recursion_exponent = 0;
        if(exponent > 3)
        {
            // 4: one squared recursion; 5: one cubed recursion;
            // 6 and above: (exponent - 4) cubed recursions.
            if(exponent == 4)
            {
                degrees_of_recursion = 1;
                recursion_exponent = 2;
            }
            else
            {
                degrees_of_recursion = exponent - 4;
                recursion_exponent = 3;
            }
            exponent = 3;
        }

        const uint64 abs_max = max == 0 ? 1 : max;

        // Multiply (exponent) independent draws together...
        for(uint32 i = 0; i < exponent - 1; i++)
        {
            random_number *= simrand(max, "simrand_normal");
        }

        // ...then divide by max^(exponent - 1) to bring the result back onto [0, max-1].
        // Squaring adj_max is only correct because exponent is at most 3 at this point.
        uint64 adj_max = abs_max;
        for(uint32 n = 0; n < exponent - 2; n++)
        {
            adj_max *= adj_max;
        }
        uint64 result = random_number / adj_max;

        for(uint32 i = 0; i < degrees_of_recursion; i++)
        {
            // The use of a recursion exponent of 3 or less prevents infinite loops here.
            const uint64 second_result = simrand_normal(max, recursion_exponent, "simrand_normal_recursion");
            result = (result * second_result) / abs_max;
        }
        return (uint32)result;
    }

The parameters are:

0 or 1: even distribution;

2: normal distribution with slight skew (squared);

3: normal distribution with large skew (cubed);

4: normal distribution with recursion skew (squared);

5: normal distribution with recursion skew (cubed);

6 and above: normal distribution with multiple recursion skew (cubed; these values produce so extreme a skew as may be of limited usefulness).
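To get a feel for how strongly each multiplication pulls the results towards the low end, the following sketch estimates the average of a scaled product of one, two or three uniform draws. For idealised continuous uniform draws the average of a product of k draws is the maximum divided by 2^k, so roughly a half, a quarter and an eighth of the maximum respectively. The names `scaled_product` and `sample_mean` are hypothetical, and `std::mt19937` stands in for `simrand()`.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Scaled product of `factors` uniform draws on [0, max-1]: factors = 1 is the
// even distribution, 2 the squared mode and 3 the cubed mode.
// Assumption: std::mt19937 replaces simrand() purely to make this runnable.
inline std::uint64_t scaled_product(std::mt19937& rng, std::uint32_t max, unsigned factors)
{
    assert(max > 0 && factors >= 1 && factors <= 3);
    std::uniform_int_distribution<std::uint32_t> dist(0, max - 1);
    std::uint64_t product = dist(rng);
    std::uint64_t divisor = 1;
    for (unsigned i = 1; i < factors; ++i) {
        product *= dist(rng);  // one more independent draw
        divisor *= max;        // one more power of max to divide out
    }
    return product / divisor;
}

// Average of `samples` scaled products, using a fixed seed for repeatability.
inline double sample_mean(std::uint32_t max, unsigned factors, unsigned samples, std::uint32_t seed)
{
    std::mt19937 rng(seed);
    std::uint64_t sum = 0;
    for (unsigned i = 0; i < samples; ++i) {
        sum += scaled_product(rng, max, factors);
    }
    return static_cast<double>(sum) / samples;
}
```

Each further cubed recursion multiplies the average by about another eighth, which is consistent with the higher settings producing very low values.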

Early experimentation appears to show that it might be possible to have a much higher value for the maximum level of tolerance (perhaps circa 1,728,000, representing four months' worth of tenths of minutes, or, if we were to start recording time in seconds rather than tenths of minutes*, 20 days) using the number 6, but the results can be somewhat erratic.

* The reason to allow some very high values is that I plan to introduce a "portals" feature, allowing for simplified/abstracted intercontinental travel without an increase in map size. This will entail extremely long journey times of multiple months, thus requiring me to use a 32-bit rather than a 16-bit integer type to store all journey time information. However, increasing the integer precision from 16-bit to 32-bit also allows timings to be stored in seconds rather than the current tenths of minutes. This changes the meaning of the numbers returned by the random number generator and thus the effect of the skew factor.
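The unit arithmetic above can be checked at compile time. This is only a sketch: the constant names are hypothetical, a month is assumed to be 30 days, and the 16-bit and 32-bit types are assumed to be unsigned.

```cpp
#include <cstdint>

// Hypothetical names for illustration; only the figure 1,728,000 comes from the text.
constexpr std::uint64_t max_tolerance = 1728000;
constexpr std::uint64_t tenths_per_day = 10 * 60 * 24;   // tenths of minutes in a day
constexpr std::uint64_t seconds_per_day = 60 * 60 * 24;  // seconds in a day

// The same number means 120 days in tenths of minutes but only 20 days in seconds.
static_assert(max_tolerance / tenths_per_day == 120, "four 30-day months");
static_assert(max_tolerance / seconds_per_day == 20, "twenty days");

// An unsigned 16-bit field tops out at 65,535 tenths of minutes, about 4.5 days...
static_assert(UINT16_MAX / tenths_per_day == 4, "16 bits cannot hold months");
// ...whereas an unsigned 32-bit field holds tens of thousands of days even in seconds.
static_assert(UINT32_MAX / seconds_per_day == 49710, "32 bits is ample");

// Three draws below max_tolerance can be multiplied without overflowing 64 bits.
constexpr std::uint64_t largest_draw = max_tolerance - 1;
static_assert(largest_draw <= UINT64_MAX / (largest_draw * largest_draw),
              "cube of the largest draw fits in 64 bits");
```

The last check suggests that a maximum of 1,728,000 leaves headroom for the cubed modes, although a maximum much above about 2.6 million would not.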

**Edit 5**: I have now implemented this fully and made it customisable in simuconf.tab, as well as adding an unskewed normal mode. The simuconf.tab comments should explain the operation of this system:

    # The following settings determine the way in which individual packets of passengers decide
    # what their actual journey time tolerance is, within the above ranges. The options are:
    #
    # 0 - Even distribution
    # Every point between the minimum and maximum is equally likely to be selected.
    #
    # 1 - Normal distribution (http://en.wikipedia.org/wiki/Normal_distribution)
    # Points nearer the middle of the range between minimum and maximum are more likely
    # to be selected than points nearer the ends of the range.
    #
    # 2 - Positively skewed normal distribution (squared) (http://en.wikipedia.org/wiki/Skewness)
    # Points near a point in the range that is not the middle, but nearer to the lower
    # end of the range, are more likely to be selected. The distance from the middle is the skew.
    #
    # 3 - Positively skewed normal distribution (cubed)
    # As with no. 2, but the degree of skewness (the extent to which the modal point in the range
    # is nearer the beginning than the end) is considerably greater.
    #
    # 4 - Positively skewed normal distribution (squared recursive)
    # As with nos. 2 and 3 with an even greater degree of skew.
    #
    # 5 - Positively skewed normal distribution (cubed recursive)
    # As with nos. 2, 3 and 4 with a still greater degree of skew.
    #
    # 6 and over - Positively skewed normal distribution (cubed multiple recursive)
    # As with nos. 2, 3, 4 and 5 with an ever more extreme degree of skew. Use with caution.

    random_mode_commuting = 2
    random_mode_visiting = 5

Thank you to all who have helped.