News:

Use the "Forum Search"
It may help you to find anything in the forum ;).

What formula is rescale_probability() implementing and how?

Started by userfriendly, October 23, 2018, 07:33:26 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

userfriendly

I would appreciate some help with understanding what this function does and what algorithm it's using.

// Knightly : determine the combined probability of 256 rounds of chances
uint16 rescale_probability(const uint16 p)
{
if(  p  ) {
sint64 pp = ( (sint64)p << 30 ) / 10000LL;
sint64 qq = ( 1LL << 30 ) - pp;
uint16 ss = 256u;
while(  (ss >>= 1)  ) {
pp += (pp * qq) >> 30;
qq = (qq * qq) >> 30;
}
return (uint16)(((pp * 10000LL) + (1LL << 29)) >> 30);
}
return p;
}


From what I gather, the above code calculates 1 - ( 1 - P ) ^ 256, a type of binomial distribution. Given N (256 in this case) trails of an event with probability P, what is the chance of it occurring at least once? I ran some tests and compared the results - the numbers seem to point to the above formula, but just wanted to make sure this is indeed the case.

Is the above code an optimization of the (1 - P)^256 calculation (instead of using C's pow(1 - P, 256))? Shifting to either side by 30 should be equivalent to multiplication or division by 2^30. I can't really understand how those series of shifts (and especially the ones in the while block) achieve the formula calculation.

PS: Is ss an optimized counter as well? It seems equivalent to i++ over 8 iterations. GCC 7.1 with -O3 seems to inline the whole while into 8 blocks.

Just be good to each other.

ACarlotti


Quote from: userfriendly on October 23, 2018, 07:33:26 PM
From what I gather, the above code calculates 1 - ( 1 - P ) ^ 256, a type of binomial distribution. Given N (256 in this case) trails of an event with probability P, what is the chance of it occurring at least once? I ran some tests and compared the results - the numbers seem to point to the above formula, but just wanted to make sure this is indeed the case.
That seems correct.

Quote
Is the above code an optimization of the (1 - P)^256 calculation (instead of using C's pow(1 - P, 256))? Shifting to either side by 30 should be equivalent to multiplication or division by 2^30. I can't really understand how those series of shifts (and especially the ones in the while block) achieve the formula calculation.
We don't use pow because it would involve using floating point arithmetic, and it is very hard to make floating point computation reproducible on different machines. Since multiplayer games on Simutrans involve everyone running the same simulation on their own computers, it is important that all computations are reproducible.

To account for the shifts:
The input p of the function, and the return value, are both the numerator of a fraction of the form p/10000, which represents a probability. This is converted to the form pp/(2^30), which allows the intermediate results to be computed to a greater precision (and therefore for the final rounded output to be more accurate). The shifts in the while loop are because pp and qq both represent a probability multiplied by 2^30, and hence the products represent a probability multiplied by 2^60. Since we want to store the new probability multiplied only by 2^30, we must divide by the unwanted extra factor of 2^30.

The return statement includes 1LL<<29, which is added to the numerator before division (i.e. truncation) to ensure correct rounding at this stage (this is more critical than the rounding within the while loop).

Quote
PS: Is ss an optimized counter as well? It seems equivalent to i++ over 8 iterations. GCC 7.1 with -O3 seems to inline the whole while into 8 blocks.
I have no idea why the loop counter uses shift rather than increments. Perhaps it arises from the idea that ss is the power we still need to raise pp/2^30 to.

The overall purpose of this code is indicated by the commit message and a comment added in r4602:

CHG: ensure a minimum interval between 2 consecutive expansions of a factory
// Knightly : chance to expand every 256 rounds of activities, after which activity count will return to 0 (overflow behaviour)