Possible additional calibration mechanism for passenger transport

Started by jamespetts, May 18, 2020, 12:21:06 PM


jamespetts

I notice that the number of passengers being transported on the Bridgewater-Brunel server seems higher than one might expect for the 1750s. I notice that this has been discussed in the in-game chat. Ideally, we would want realistic numbers of passengers being conveyed for realistic reasons: it would not do to vary the number of passengers wanting to travel just on the basis of the numbers on the year in a calendar; that, after all, is not how it works in reality. What is needed is a new mechanism that will affect the numbers of passengers travelling in a realistic way and will work equally realistically in all eras.

The best candidate mechanism so far is one that targets passengers' willingness to take long journeys (measured in time) to relatively unimportant destinations. This is only relevant to visiting passengers, since commuting passengers have much lower maximum journey time tolerances in any event.

Currently, visiting passengers are generated with a randomised travel time tolerance, which can be anything from 2 minutes to tens of hours (although it is more likely to be near the middle of that range, as a mechanism is used to replicate a normal distribution). They then pick at random a destination from the list of all visiting destinations, weighted by the visitor demand of that destination. If they can reach that destination within the travel time tolerance, they go there. If not, they pick another destination and keep going until their allocated number of alternative destinations runs out. The travel time tolerance is fixed at the beginning and is not affected by the destination.
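For illustration, the current behaviour described above could be sketched roughly as follows. This is a simplified stand-alone sketch and not the actual Simutrans-Extended code: `destination`, `pick_weighted`, `choose_destination` and the `journey_time` callback are all hypothetical names, and counting the first pick as one of the allowed attempts is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <vector>

// Hypothetical stand-in for a visitor destination; not the real game structure.
struct destination {
    uint32_t visitor_demand; // weight used when picking at random
};

// Pick an index at random, weighted by visitor demand.
inline size_t pick_weighted(const std::vector<destination>& dests, std::mt19937& rng)
{
    assert(!dests.empty());
    std::vector<double> weights;
    for (const destination& d : dests) {
        weights.push_back(d.visitor_demand);
    }
    std::discrete_distribution<size_t> dist(weights.begin(), weights.end());
    return dist(rng);
}

// Returns the index of the chosen destination, or -1 if no picked destination
// was reachable within the (fixed) tolerance after max_attempts picks.
inline int choose_destination(const std::vector<destination>& dests,
                              uint32_t tolerance,
                              uint32_t max_attempts,
                              const std::function<uint32_t(size_t)>& journey_time,
                              std::mt19937& rng)
{
    for (uint32_t attempt = 0; attempt < max_attempts; ++attempt) {
        const size_t idx = pick_weighted(dests, rng);
        if (journey_time(idx) <= tolerance) {
            return static_cast<int>(idx);
        }
    }
    return -1;
}
```

Note that the tolerance is passed in once and never changes with the destination, which is the property that the proposal below would alter.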

My current proposed revision is this: each visiting destination will have three new data:
(1) minimum destination journey time tolerance;
(2) maximum destination journey time tolerance; and
(3) destination journey time tolerance applicability percentage.

These will all default to 0 in the pakset, in which case, the system will behave as it does now. However, if these values be set, then the current system will be modified as follows. For each visitor destination that a given passenger checks, if that visitor destination has a journey time tolerance applicability percentage of > 0, there will be a chance, weighted by that percentage, that the passenger will use the destination journey time tolerances of that building. If this be set to 100, passengers bound for this building will always use the destination tolerances.

When a passenger uses the destination tolerance, a randomised destination journey time tolerance somewhere between the individual building's minimum and maximum will be generated. Passengers will then only travel to this building if it can be reached in a time equal to or less than the lesser of their general journey time tolerance and the destination specific journey time tolerance.
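A minimal sketch of the proposed rule, under stated assumptions: the struct and function names are hypothetical, times are in arbitrary units, and the destination tolerance is drawn uniformly between the minimum and maximum (the proposal only says "randomised", so another distribution could equally be used).

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical per-building data corresponding to the three proposed values.
struct destination_tolerance {
    uint32_t min_tolerance;         // (1) minimum destination journey time tolerance
    uint32_t max_tolerance;         // (2) maximum destination journey time tolerance
    uint32_t applicability_percent; // (3) 0 = never applies (current behaviour), 100 = always
};

// Effective tolerance for one passenger considering one destination.
inline uint32_t effective_tolerance(uint32_t general_tolerance,
                                    const destination_tolerance& d,
                                    std::mt19937& rng)
{
    assert(d.min_tolerance <= d.max_tolerance);
    assert(d.applicability_percent <= 100);

    // Roll 0..99; the destination tolerance applies only if the roll is below the percentage.
    std::uniform_int_distribution<uint32_t> percent(0, 99);
    if (percent(rng) >= d.applicability_percent) {
        return general_tolerance;
    }

    // Assumption: uniform draw between the building's minimum and maximum.
    std::uniform_int_distribution<uint32_t> range(d.min_tolerance, d.max_tolerance);
    return std::min(general_tolerance, range(rng));
}
```

With all three values left at 0 the function always returns the general tolerance, matching the stated default of behaving as the system does now.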

This is intended to simulate the fact that, not only do passengers have, for each potential trip, a maximum amount of time that they are able to travel, they also have a tolerance of how long that they think it worth travelling to reach any given destination, which may be less than the amount of time that they would be able to spend travelling.

This will mean that buildings such as parish churches, small town halls, local parks (and, in the later game, small shops, the smaller sports stadia and so forth) will, if set up with appropriate data, only attract relatively local passengers, whereas more important buildings, such as castles, cathedrals, large city halls (and, in the later game, attractions such as the larger sized sports stadia) will attract passengers from far away. Because it will be possible to configure the settings so that there are only a few buildings, which occur infrequently and only in larger towns, that attract a significant number of non-local passengers in this way in the early years, and allow later buildings to be more attractive to longer journeys, this will allow an increase in passengers' willingness to travel in later times. It will also mean that long distance travel tends to concentrate on the centres of larger towns, rather than being more evenly dispersed as it is now.

I should be very grateful for feedback on this. I note that it is proposed largely because it would potentially make a large difference to balance whilst requiring only a relatively small coding effort.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Ranran(retired)

I think that few people used public transport to go to work in earlier times. (The distances were also short.)

EDIT:
I looked at the passengers in my carriage, they are only visitors.
Maybe they have traveled too many times?
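A Japanese person called Himeshi ignored the developers' advice and "forcibly" implemented features to his own liking in Extended,
made a mess of the code and made it hard to maintain (especially the road and building parts),
and on top of that planted a large number of bugs, pretended not to notice and left them there (saying that he was retiring), then ran off and hid elsewhere while continuing to develop his own fork (OTRP).
Japanese players should face that fact and his irresponsibility. Ranran lost motivation because of it (´・ω・`)
Learn from the conduct of others and mend your own. So as not to become like Himeshi, Ranran has to fix the bugs and problems that Ranran created (´・ω・`)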

Vladki

@ranran - then it is ok, isn't it. In early years commuters will mostly walk (short distance), and only visitors going for longer distance use some vehicle.

@James: that sounds good, perhaps using both tolerances (traveller's and destination's) at the same time, and use smaller of them. As a traveller who can spend only 10 hours travelling, will not be able to travel to a destination 20 hours away, even if the destination's time tolerance would be even higher.

Also maybe revise my earlier suggestion on journey price tolerance, i.e. not strictly excluding passengers from using some means of transport due to being of a lower class than the vehicle, but doing it according to the total journey price. If the expensive vehicle is the only option, and walking is too slow, even a less wealthy traveller may be able to pay a higher fee for a short distance. But if the journey is too long, he would have to use a lower class vehicle, which may not be available at the moment. This would e.g. enable "very low" class to use "low" class trains to commute, but not to travel far away. Same with stagecoaches - wealthy people could use them to travel around the world, medium class only to visit the nearest market or church, and poor people not at all.

Mariculous

That's a difficult one, to be honest.

From my thoughts, I guess normal distribution is not a good choice.

I'd expect very many short distance visiting trips, e.g. to the local grocer or a neighbor, a fair amount of visiting trips to medium distances, few visiting trips to far-away destinations, but still also visiting trips to very far-away destinations, i.e. journey times of more than 24 hours, although these are quite rare for sure.

Normal distribution won't give us such a figure.
We have to keep in mind it's not the actual journey time itself that is normally distributed, it's only the maximum acceptable time that is normally distributed, so the distribution of actual journey times is closer to the above expectation.
Still there are too many medium distance journeys in the current figure.
From my observation, most passengers are generated to destinations between 3 and 4 hours away. That's definitely not what I would expect.
In any epoch, I'd expect quite local journeys to a local grocer or a nearby friend to be the most frequent ones.


About the suggestion: I do also think that journey times will be more of a matter for generic destinations than for unique destinations.
For example, a grocer or barnyard is usually not very unique. People will usually prefer the one that is closest, although there are for sure some differences, so they won't always do so.
Friends do also most often live nearby and if they don't people will usually visit them less frequently than another friend that lives nearby.
On the other hand, monuments are quite unique. People will travel longer distances to visit such. Cologne is full of Asians visiting the Cathedral and further locations.
If you ask them why they came there, it is very unlikely they will say "oh, I always wanted to go shopping in a German supermarket, so I travelled to Cologne."
Attractions are not that unique but are still likely to attract passengers from farther away.

I don't agree with your suggested three parameters, however, because that sounds quite complicated for any pakset author.
I guess a simpler solution like a uniqueness factor for buildings might be more sensible.
Let's think of the maximum journey time as the "willingness to travel long distances".
That means, if a maximum journey time of 2 hours was calculated and a destination with a uniqueness of 4 was selected, it's fine for this traveller to travel to that destination if the (expected) journey time does not exceed 8 hours.
We could set that factor to less than 1 to get some attractions only attract local passengers, just as in your suggestion.




To point this out again:
I think it would be a nice feature to somehow specify passengers' willingness to take long journeys to a specific building, but I do also think that the main cause for what we see on Bridgewater currently is the choice of the distribution function that is used to randomise maximum journey times and not the lack of such a feature.

freddyhayward

While large attractions are a better driver of tourism, smaller and generic attractions would still see a lot of long-distance travel. Of course it is difficult to simulate a holiday where tourists stay at a hotel and visit different local attractions. Perhaps the train chaining mechanic deals with this partially, but I don't know. In any case, a traveler to Rome might visit primarily to see the Colosseum, but would also want to experience restaurants and visit public squares, fountains and statues.

jamespetts

Ranran - in the 1750s, most commuting is done on foot, just as in reality. One would need more concentrated centres of employment for there to be any significant commuting by public transport. In reality, commuting by public transport was only really a thing from the mid 19th century onwards.

Vladki - the idea is indeed that it is the minimum of the two tolerances that would be used.

Freahk - how do you envisage a single integer "uniqueness factor" being used in an algorithm? I cannot immediately see a way of doing this at present. Also, what do you suggest as an alternative to a normal distribution algorithm? Normal distribution can only not be the best algorithm if there is a specific better algorithm.

Another thought, incidentally, is to allow buildings to be able to specify a visitor capacity, which would have the same algorithm as the employment capacity at present. This would prevent huge numbers of visitors wishing to visit the small local shops that we see in the pakset that are not industries, intended to represent something like a cobbler. The ultimate intention would be for local consumer industries to be the main driver of local visiting passengers, and thus to make freight transport important for passenger transport to be effective, allowing passenger transport success alone to be used as the metric for how much that towns should grow.

Mariculous

That's something I also thought about.
Effectively this would be a relation in between journey time and onward trips.
A passenger that started a long journey will do some onward trips before returning home.
Onward trips are randomised when a passenger is about to go home, so relating the onward trip likelihood to the time it takes to go home might work.
That would, however, not assign a fixed hotel to a passenger and will generally increase the number of journeys around destinations that attract people from far-away.
However, the first might be acceptable, the second even desirable.


Quote from: jamespetts on May 19, 2020, 12:45:41 AM
Freahk - how do you envisage a single integer "uniqueness factor" being used in an algorithm?
As described, calculate tolerable journey time, select a potential destination, calculate journey time to that destination, compare maximum_journey_time * destination.uniqueness < actual_journey_time

Don't think of maximum journey time as a definite time, rather think of it being a willingness to travel long times that, multiplied with uniqueness, results in an actual maximum journey time (to that specific destination)

A uniqueness of 1 would result in exactly the same behavior as currently, a uniqueness greater than 1 will increase the distance passengers travel to that destination, and values smaller than 1 will decrease it.
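The comparison described here could be sketched as below. This is only an illustration: `within_tolerance` is a hypothetical name, and expressing uniqueness as an integer percentage (100 meaning a factor of 1) is an assumption made to keep the arithmetic in integers, not something specified in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the passenger would accept the journey.
// uniqueness_percent: 100 = factor 1 (current behaviour), 400 = factor 4, 50 = factor 0.5.
inline bool within_tolerance(uint32_t max_journey_time,
                             uint32_t uniqueness_percent,
                             uint32_t actual_journey_time)
{
    // Assumption: a factor of zero is treated as a configuration error,
    // since it would make the destination unreachable for everyone.
    assert(uniqueness_percent > 0);

    // 64-bit intermediate so that large times cannot overflow when scaled.
    const uint64_t scaled = (static_cast<uint64_t>(max_journey_time) * uniqueness_percent) / 100u;
    return actual_journey_time <= scaled;
}
```

Using the earlier example in minutes, `within_tolerance(120, 400, 480)` accepts an 8 hour journey for a passenger with a 2 hour tolerance, while the same passenger would reject anything longer.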


Quote from: jamespetts on May 19, 2020, 12:45:41 AM
Ranran - in the 1750s, most commuting is done on foot, just as in reality. One would need more concentrated centres of employment for there to be any significant commuting by public transport. In reality, commuting by public transport was only really a thing from the mid 19th century onwards.
In some capitals on Bridgewater, such quarters exist, and allowing "very low" class people to use your public transport will attract quite a lot of commuters.
However, usually you don't want to do this. Especially with the very high passenger generation currently, you would rather reassign classes to higher levels instead of lower ones.


Quote from: jamespetts on May 19, 2020, 12:45:41 AM
Another thought, incidentally, is to allow buildings to be able to specify a visitor capacity, which would have the same algorithm as the employment capacity at present. This would prevent huge numbers of visitors wishing to visit the small local shops that we see in the pakset that are not industries, intended to represent something like a cobbler.
That might be a useful addition to the other features, but I don't expect it to work well on its own.
Further, when separating visiting trips from shopping trips, I guess this might be a good mechanism for the new city growth. It might, however, not work well at all with the current city growth.

The new city growth algorithm might build industries, shops and housing depending on how many commuting and shopping slots are actually used.
If few shopping or commuting slots are used, it will attempt to build housing nearby, although also ensuring it doesn't build this directly in the industry quarter.
If many shopping slots are used, it will attempt to build new shops near the most used shops.
If many commuting slots are used, it will attempt to build new industries or shops near the most used ones.

Quote from: jamespetts on May 19, 2020, 12:45:41 AM
Also, what do you suggest as an alternative to a normal distribution algorithm?
As mentioned, it's difficult, especially due to the way we attempt to find a destination within journey time. I guess a gamma distribution should work better, but I am unsure about the exact parameters.
The parameters might even differ from time to time.

Vladki

Take these as brainstorming ideas.

- I like the idea of visitor capacity similar to employment capacity. Actually that is the only thing about the building (whether res/com/ind/cur/mon/hq/factory/station/signalbox/... ) you can sensibly put into dat files.
- Uniqueness should be simply the number of the same objects on the current map. So in one game there may be only one Stonehenge, attracting people from all around the map, while in another it would spawn in every other village, and most people will be satisfied by visiting the local henge. On the other hand a unique villa (type=res) may appear somewhere, and attract architecture fans from far away, or the last open-air market (com/factory) offering some special local produce would attract far away customers. Or even futuristic high-level company headquarters or town hall... So the lower the number of same object, the higher the uniqueness
- For simplicity, uniqueness should only reduce the travel time tolerance. Maybe not simply by division (travel_time_tolerance / number_of_same_objects), but perhaps something that does not fall that fast - 1/sqrt ? (time / sqrt (uniqueness))
- So the original travel time tolerance would be a time a visitor is willing to travel to an absolutely unique destination
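A small sketch of the 1/sqrt idea from the list above, so that the fall-off can be seen in numbers. The function name is hypothetical, and counting the destination itself (so that the count is never below 1) is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Reduce the passenger's travel time tolerance according to how many
// objects of the same kind exist on the current map.
inline uint32_t tolerance_by_prevalence(uint32_t base_tolerance, uint32_t same_object_count)
{
    assert(same_object_count >= 1); // the destination itself always counts
    const double divisor = std::sqrt(static_cast<double>(same_object_count));
    return static_cast<uint32_t>(base_tolerance / divisor); // truncates towards zero
}
```

A single Stonehenge leaves a tolerance of 600 untouched, four of them halve it to 300, and a hundred cut it to 60, whereas plain division by the count would already give 6 in the last case.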

- The random distribution of travel time tolerance should IMHO be similar to the distribution of wealth. Although they are not equal, a poor man may be willing to travel for a long time (on foot), as may a rich man. They'll just cover different distances. But the distribution may be similar. Let's say that most people will be willing to travel for time X (the median), probably only very few people will not be willing to travel at all, or only for a very short time (that would be people with some disabilities). But on the other side there will always be someone willing to travel almost infinitely (definitely more than 2x). So the graph would have a sharp rise from zero to X, and then something like a logarithmic fall... Perhaps something like https://en.wikipedia.org/wiki/Gamma_distribution with k=2 and theta=2
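To get a feel for the suggested shape, the standard library's `std::gamma_distribution` can be sampled directly. This is a sketch outside the game code: it uses floating point and `std::mt19937` rather than the game's own random number functions, and `sample_tolerance_gamma` and its `scale` parameter are hypothetical.

```cpp
#include <cassert>
#include <random>

// Draw one travel time tolerance from a gamma distribution with
// shape k = 2 and scale theta = 2, multiplied by a caller-chosen scale.
inline double sample_tolerance_gamma(std::mt19937& rng, double scale)
{
    assert(scale > 0.0);
    std::gamma_distribution<double> dist(2.0, 2.0); // (shape k, scale theta)
    return scale * dist(rng);
}
```

For k=2 and theta=2 the mean is k*theta = 4 and the peak is at (k-1)*theta = 2, so more than half of all samples fall below the mean while a thin tail stretches far to the right, which is the "sharp rise, slow fall" shape described above.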

- Still thinking how to add the financial tolerance to the mix?

EDIT: on second thought, the uniqueness may be one of the most important factors in limiting the travel time tolerance. The journeys generated should only say the type of destination they are interested in:
- commuting (anything that accepts commuters)
- shopping (visitors to com/ind/factory, maybe also hq & town hall) - similar frequency as commuting
- leisure (visitors to monuments, attractions, and other friends = res) - lower frequency (weekends only - so about 1/7 or 2/7 frequency of commuting and shopping)
If the destination buildings are distributed as they are in real world, the time tolerance would be similar - there are many shops to choose from, so traveling too far does not make sense, and vice versa.

Mariculous

Quote from: Vladki on May 19, 2020, 04:26:12 PM
Uniqueness should be simply the number of the same objects on the current map.
I don't think so, although "uniqueness" might be a little misleading.

Let's take football stadiums as an example again, as Stonehenge should actually be a monument anyway if these were not restricted to cities.
At a specific time, people will prefer a specific football stadium and they will accept long journeys to get there.
That is what uniqueness is about. I'd also argue housing has a higher uniqueness than an average shop does, but there might be quite special shops that will again have a greater uniqueness.
It doesn't actually matter how many of these exist in the world, it's simply a measurement of "how important is visiting that specific destination to the average potential visitor."

Its effect, btw, will be roughly the same as James's suggestion.


However, your suggested variant of uniqueness is also not a bad one, I just don't think it would work well for housings.
Each single house is actually quite unique. However, there will be very many houses of the same type spread around the world, so long journeys from housing to housing will effectively never happen.


Thanks for agreeing with gamma distribution, I didn't pay too much attention to the stochastic lecture, although I passed that exam :D

Vladki

At football, usually more fans of the local team come to see the match than of the guest team. So the uniqueness must be somehow individualised. Someone will travel to far away stadium, someone not.

Mariculous

It is only the maximum tolerable time that is individually adjusted by the stadium in my suggestion.
There is no mechanism at all that prevents local passengers from going there.
Fans of the local team will also very likely consider longer journey times to get to their home stadium than they would consider to visit a fountain.
Especially larger stadiums, assuming the club is successful, have a rather huge "territory" where most inhabitants will be fans of that club.

Vladki

Quote from: Freahk on May 19, 2020, 06:17:22 PM
Especially larger stadiums, assuming the club is successful, have a rather huge "territory" where most inhabitants will be fans of that club.
That should be achieved by having more small/local/lower league football grounds - and a few big stadiums (premier league). The small, but omnipresent stadiums would reduce the travel time limit more, than the few big ones.

jamespetts

I should note that this thread was really intended to discuss some specific and limited changes to passenger generation that might have a large effect for a very small coding effort. Anything requiring major coding effort will have to go into the queue of major features and wait potentially many, many years before being implemented. The idea of having more than two different types of trips (visiting and commuting), for example, would require major re-engineering of the memory structures used to store visitor targets, and would require very significant and complex additional balancing/calibration work in the paksets. There would be likely to be very significant consequences in many other parts of the balance of passenger generation each one of which would have to be considered in great detail and dealt with, each means of dealing with which requires a currently entirely unknown amount of coding work which could in principle be very great indeed.

In relation to alternatives to a normal distribution, it is not immediately clear how a gamma distribution would be different to a normal distribution in practice (and I must confess that I do not fully understand Wikipedia's description of a gamma distribution, which seems to require an extremely high degree of mathematical knowledge to follow). May I ask why and how you think that a gamma distribution would be better?

For reference, here is the code that generates the normal function. As you will see, this can be skewed by simuconf.tab settings:


#ifdef DEBUG_SIMRAND_CALLS
uint32 simrand_normal(const uint32 max, uint32 exponent, const char* caller)
#else
uint32 simrand_normal(const uint32 max, uint32 exponent, const char*)
#endif
{
#ifdef DEBUG_SIMRAND_CALLS
    sint64 random_number = simrand(max, caller);
#else
    uint64 random_number = simrand(max, "simrand_normal");
#endif

    if(exponent == 0)
    {
        // Non-normalised random number.
        return random_number;
    }

    if(exponent == 1)
    {
        // Normalised but unskewed.
        random_number += simrand(max, "simrand_normal unskewed");
        random_number /= 2ll; // Averaging more numbers would create a stronger normal peak, but it would still be unskewed.
        return (uint32)random_number;
    }

    // Any higher number than 3 would produce integer overflows even with unsigned 64-bit integers
    // if the model was followed. Interpret higher numbers as directives for recursion.

    uint32 degrees_of_recursion = 0;
    uint32 recursion_exponent = 0;
    if(exponent > 3)
    {
        degrees_of_recursion = exponent - 4;
        if(degrees_of_recursion == 0)
        {
            recursion_exponent = 2;
        }
        else
        {
            recursion_exponent = 3;
        }

        exponent = 3;
    }

    const uint64 abs_max = max == 0 ? 1 : max;

    for(int i = 0; i < exponent - 1; i++)
    {
        random_number *= simrand(max, "simrand_normal");
    }

    uint64 adj_max = abs_max;

    if(exponent >= 3)
    {
        // This was originally a for loop, but doing this more than once
        // overflows even an unsigned 64-bit integer with largeish max values.
        // A single squaring can fit in a 64-bit integer even with max values
        // in the millions.
        adj_max *= adj_max;
    }

    uint64 result = random_number / adj_max;

    for(uint32 i = 0; i < degrees_of_recursion; i ++)
    {
        // The use of a recursion exponent of 3 or less prevents infinite loops here.
        const uint64 second_result = simrand_normal(max, recursion_exponent, "simrand_normal_recursion");
        result = (result * second_result) / abs_max;
    }

    return (uint32)result;
}
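To see what the exponent does to the shape, the first three paths of the listing above can be mirrored in a stand-alone sketch. Everything here is hypothetical scaffolding: `fake_simrand` stands in for the game's `simrand` and is assumed to return a uniform integer from 0 to max - 1, and `sketch_normal` reproduces only the exponent 0, 1 and 2 cases, not the recursion used for higher values.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical stand-in for simrand(): uniform integer in [0, max).
inline uint32_t fake_simrand(uint32_t max, std::mt19937& rng)
{
    assert(max > 0);
    return std::uniform_int_distribution<uint32_t>(0, max - 1)(rng);
}

// Mirrors only the exponent 0, 1 and 2 paths of the listing.
inline uint32_t sketch_normal(uint32_t max, uint32_t exponent, std::mt19937& rng)
{
    assert(exponent <= 2);
    uint64_t r = fake_simrand(max, rng);
    if (exponent == 0) {
        return static_cast<uint32_t>(r);     // flat: a single uniform draw
    }
    if (exponent == 1) {
        r += fake_simrand(max, rng);         // average of two draws: symmetric peak
        return static_cast<uint32_t>(r / 2);
    }
    r *= fake_simrand(max, rng);             // product of two draws: skewed towards low values
    return static_cast<uint32_t>(r / max);
}

// Average of n samples, for comparing the exponents.
inline double sample_mean(uint32_t max, uint32_t exponent, int n, std::mt19937& rng)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += sketch_normal(max, exponent, rng);
    }
    return sum / n;
}
```

With exponents 0 and 1 the sample mean sits at roughly half of max, whereas with exponent 2 it drops to roughly a quarter of max, because the product of two uniform draws piles results up near zero and leaves a long tail towards max.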


In relation to uniqueness, it is not clear why the proposed algorithm would work better than that which I suggest; this would allow journey time tolerances in excess of that generated by the existing system, requiring the existing system to be completely recalibrated, which would take this outside what can sensibly be done as a small project. Given the amount of work that has already gone into calibrating the existing journey time tolerances, there would have to be extremely clear evidence that an alternative is very significantly superior in order for it to be worthwhile discarding all of that work and starting again with at least equally and probably very significantly more work. The relationship between the uniqueness factor and the overall journey time would be potentially very complex in the uniqueness model, and therefore extremely difficult to calibrate clearly.

The reason that I proposed the model that I did is because it leaves entirely untouched the existing journey time tolerance calculation, and simply adds an additional factor that limits the tolerance to certain destinations. This has the particular advantage that it simulates the balancing of utility in expending travel time to go to a particular type of destination. I do not think that measuring uniqueness by testing how many similar buildings that there are really makes sense for this, as uniqueness per se is not what causes people to be willing to travel a very long distance to go somewhere. People might be willing to travel for 5 hours to an office building where the building itself is entirely unremarkable because the person has a business meeting with the specific company in that building, yet be entirely unwilling to travel more than 1 hour to visit the International Museum of Cloth Caps even though there is only one such institution in the world. Measuring uniqueness will also cause problems when new types of (very ordinary) buildings first appear.

As to onward journeys, these already exist in the passenger generation code and have done for some years.

Vladki

I'm no expert on random functions. I just think that the journey time tolerance in the real world is not so nicely symmetric as the normal distribution is. And the gamma distribution seems to fit better. But I have no hard data for that, and might be completely wrong. Anyway we can skip that, and change the function later if we find some data. IMHO the time tolerance is vaguely related to wealth - poor people cannot afford to travel long distances, so even if they'd be willing to, they can't...

I think that the first building of a kind that becomes more common later should have at least a slightly higher potential of attracting people from far away. The first "supermarkets" in post-communist countries surely did. So maybe not as absolute as I originally proposed, but maybe it could be used as one of the factors in your proposal. Have the applicability percentage in the dat file, and calculate the destination min/max from the number of occurrences, or vice versa. Or it could be used to alter the parameters of the random function.

Also counting the visitors in a similar way to employees could play nicely - no more visitors if the destination is full (especially stadiums).


Mariculous

Quote from: jamespetts on May 20, 2020, 11:33:00 PM
The idea of having more than two different types of trips (visiting and commuting), for example, would require major re-engineering...
That was just a little excursion on how "visiting capacities" could be used for the upcoming city growth re-implementation.

Quote from: jamespetts on May 20, 2020, 11:33:00 PM
May I ask why and how you think that a gamma distribution would be better?
Compared with a plain normal distribution, a gamma distribution will give us a figure that rises quickly to the maximum and then falls rather slowly, and the closer it gets to zero, the slower it falls.
That's actually what I would rather expect from a journey time distribution. Keep in mind that we don't pick the actual journey time, but a maximum journey time.

A skewed normal distribution might be an option to get something better than the current figure without coding effort.
It might be fine for our purposes, so if we don't use it yet, we should try it out; otherwise we might want to try a larger skew before implementing something new.

Without any journey time statistics, including all modes of transport, being specific to visiting trips and providing historical data, we can only place guesses anyway.
I don't think such data is available out there.

Quote from: jamespetts on May 20, 2020, 11:33:00 PM
In relation to uniqueness, it is not clear why the proposed algorithm would work better than that which I suggest
That depends on the exact meaning of "works better" here.
Functionally and in ease of implementation, both should be roughly the same.
The uniqueness parameter would, however, be much easier to use for pakset authors and will scale with the *_tolerance settings made in simuconf.
Setting fixed, absolute min/max times in dats is not a good idea imho, as maximum journey time figures are configured in simuconf, thus the relation of journey times to buildings with this dat parameter set will break when modifying the dats.

If you don't always want to apply that factor, i.e. to simulate that a specific destination might be quite important to some people whilst for other people it is no more important than other destinations, you could add your suggested chance parameter in addition to uniqueness.

Actually, your suggested solution would also allow passengers to exceed the currently well calibrated journey times. If that's an argument against my suggestion, it's an argument against yours as well.

Vladki

I did RTFM (simuconf.tab) and found that we already use a skewed normal distribution. I didn't know that. A positively skewed normal distribution is what I had in mind. I just did not find it when I was looking at all the random distributions.

Quote from: Freahk on May 21, 2020, 08:23:20 PMSetting fixed, absolute min/max times in dats is not a good idea imho as maximum journey times figures are configured in simuconf, thus the relation of journey times to buildings with this dat parameter set will break when modifying the dats.

That's a good point. Therefore I suggest using a uniqueness factor to increase or decrease the time tolerance from simuconf.tab. Uniqueness could be statically defined in the dat file: default=100%, with rare buildings having uniqueness >100% and common buildings <100%. Then just multiply the tolerances by the uniqueness factor.

Or do it as suggested before: divide by (the square root of) the number of occurrences in the current game. Or do some other calculation that could also increase the time tolerance, but I don't know where to put the threshold - how many buildings of the same type should be considered as rare...

Mariculous

Quote from: Vladki on May 21, 2020, 08:55:06 PM
Uniqueness could be either statically defined in dat file: default=100% with rare buildings having uniqueness >100%, common buildings <100%.
That's exactly the idea, yes.

Quote from: Vladki on May 21, 2020, 08:55:06 PM
Or do it as suggested before: divide by (square root of) number of occurrences in the current game.
I still kind of like that idea, but I don't see why a newly built house should attract more visitors than an old one just because there are many houses of the old type in the world, whilst only a few of the new ones exist so far.
The same goes for shops in general. Sure, the only supermarket will attract a lot of people, but a simple grocer won't attract more just because it doesn't use the same building as all grocers before did.

I still like that feature, but I guess it needs more consideration on how to handle different buildings that are actually of the same type, and entirely new buildings.
That said, I guess this would rather be a major feature than a quick fix.

jamespetts

As Vladki notes, we already use a skewed normal distribution, so it is not clear what specific algorithm change is being suggested here, what effect that change would have, or why that effect would more accurately model how real passengers travel. I posted the relevant code extract for the current algorithm above.

I also do not understand the benefit of having destination specific journey time tolerances scaling with the general journey time tolerances. The intention is for them to be entirely separate, and for passengers' actual tolerance for any given destination to be the minimum of the two. For this reason also, therefore, I do not understand the suggestion that the proposed system would allow the current maximum to be overridden: can you elaborate?

I am afraid that I am still not persuaded by the benefit of actual prevalence of a building in the game world being a measure for destination specific journey time tolerance for the reasons given previously.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Vladki

Passenger time tolerances can be easily adjusted in simuconf.tab, but per-building time tolerances are "hard-coded" in pak/dat files. So it may happen that a building that was originally designed with a higher time tolerance ends up with a lower time tolerance (if the user fiddles with simuconf.tab). Therefore I think the building should define only a % increase or decrease of what is defined in simuconf.tab. I think it would be much easier to understand for players, pak designers, etc.

Let's have an example. Most city buildings will probably not have any building specific time tolerances defined, but a big football stadium probably will. With range_visiting_tolerance=10000, the football stadium will have e.g. 15000. But what if the player decides that his simucitizens have plenty of spare time and are willing to travel 20000? Then the football stadium will have a lower time tolerance than ordinary buildings. If it were specified as 150%, it would scale nicely and still attract people from a larger distance than ordinary houses. It would be one simple and easily understandable (and implementable) option.
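
The stadium example above can be worked through in a few lines. This is a sketch only: `stadium_tolerances` is a hypothetical helper, and the figures are the ones from the example rather than values from any real pakset.

```python
STADIUM_ABSOLUTE = 15000  # absolute tolerance written into the dat file (example value)
STADIUM_PERCENT = 150     # the same intent expressed as a percentage (example value)


def stadium_tolerances(general_tolerance: int) -> tuple[int, int]:
    """Return (absolute, percentage-based) stadium tolerance for a general tolerance."""
    percentage_based = general_tolerance * STADIUM_PERCENT // 100
    return STADIUM_ABSOLUTE, percentage_based


for general in (10000, 20000):
    absolute, proportional = stadium_tolerances(general)
    print(f"general={general}: absolute={absolute}, percentage-based={proportional}")
```

With the general value at 10000 both variants give 15000; once it is raised to 20000, the absolute value stays at 15000 (now below the general figure), while the percentage-based value becomes 30000.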

Mariculous

I guess everything has been said about "destination specific journey time tolerances scaling with the general journey time tolerances".
I don't see any disadvantage over dat file "hardcoded" journey times here.
The suggested chance parameter might be interesting, but it's not restricted to either of the two suggestions.

I guess we all agree that there is a huge amount of passengers on bridgewater currently that can barely be handled with the available vehicles.
The following is not meant as a short-term "hotfix" to the current journey, but as thoughts on what is going wrong with our well balanced real-world journey figure. Obviously, it's not as realistic as what we might expect, so at least one of the related parameters must simulate reality quite badly.

So what parameters do we have that directly or indirectly affect this?
Total number of inhabitants
Inhabitant density
Inhabitant classes
Number of journeys an inhabitant attempts per day
max journey time distribution
destination selection logics
vehicle capacities
vehicle speeds

I might have missed some.

From the above I'd consider the following fixed as these are balanced to real-world data:
(Inhabitant density), Number of journeys an inhabitant attempts per day,  vehicle capacities, vehicle speeds

So the parameters left are: (Inhabitant density), Total number of inhabitants, Inhabitant classes, max journey time distribution, destination selection logics and maybe further parameters that I have missed.

- Inhabitant density of housing itself is balanced to real-world data, but the (relative) amount of buildings of different types that are actually placed in simucities might not be.
- Total number of inhabitants should not cause this excessive amount of passengers, as our capitals have fewer inhabitants than real-world capitals of that time.
- Inhabitant classes might be a factor. In particular, I am wondering if the binary "can afford it" or "cannot afford it" behavior simulates reality well enough on average. In the real world, passengers won't decide "public transport is a little faster and I can afford it"; they will consider whether it is worth spending that money just to be a little bit faster. That might be an important factor: it plays a huge role in the old days, when transportation is expensive and not much faster than walking, but only a small one once public transportation gets considerably faster and less expensive.
- max journey time distribution might be a factor. As discussed before, I'd expect far fewer passengers in the medium distance range around 3-5h, but that's simply a guess. We might need (but likely won't find) historical data about visiting journey time figures to base this on real-world data.
From my subjective feeling, there are too many journeys to destinations 2-4 hours away. More skew, a different distribution or the suggested building specific journey times can help here.
- destination selection logics is another point that results in a divergence between randomised max journey times and the actual journey time, but it won't result in too many passengers, rather the other way round due to journey_time <= max_journey_time, so we can safely assume that's not the cause of the current situation.

jamespetts

Vladki - the problems with a proportion system are:

(1) that this does not aid real world data balancing, as the proportion of overall journey time tolerances to destination specific journey time tolerances is not something that there is reason to believe has a fixed relationship in real world data; and

(2) that the purpose of the different tolerances are quite different, so one would not necessarily want to adjust one when one adjusts the other.

One might, for example, want to increase average overall journey time tolerance in order to allow passengers to visit more often certain distant and important buildings, but not make them any more likely to visit medium-distance unimportant buildings. To do this with a proportion system, one would have to adjust the overall number upwards, then go back and adjust all the individual buildings downwards. Moreover, using the proportion system, one could not just specify individual figures in the .dat files; one would have to back-calculate the time that one wanted to achieve from the proportion for each individual building, which would greatly increase the necessary work.

Freahk - in fact, there are data on people's travel time budgets: this is a well studied area in economics. See, for example, the Wikipedia article on Marchetti's Constant and this article discussing that this constant appears to have a wide range. The theory of Marchetti's Constant is the basis for the whole passenger generation system as it is currently implemented, and I looked into this extensively when implementing this. These data suggest that people (in all eras and in all places in the world) spend, very roughly, 1.1 hours per day travelling, but that this average has quite a large range (the second article suggests 74 minutes with an 80 minute range).

I thought that it might be useful to conduct some tests to see what the average per trip tolerance is with current settings. I ran a test using the saved game from the Stephenson-Seimens server, adding up all of the visiting tolerances for passengers and dividing by the number of trips over the course of running for a few minutes: this gave the number 963, the unit of which is tenths of minutes, giving 96.3 minutes as an average visiting tolerance per trip.

The tolerance settings are different on the Stephenson-Seimens server than the Bridgewater-Brunel server. Running the same exercise on the Bridgewater-Brunel server gives the number 1615, or 161.5 minutes. If we use all trips, not just visiting trips, we get 1454, or 145.4 minutes. Reducing the minimum tolerance to 2 and maximum to 9600 gives an average of ~1370.

However, it should be noted that passengers attempt to make more than one trip per day: they attempt to make four on average. So, to get journey times within Marchetti's Constant, the average journey time should be one quarter of somewhere between 74 and 74+80 (i.e. 154) minutes, that is, 18.5 to 38.5 minutes, suggesting that our average journey times are indeed too high. Of course, actual journey times will tend on average to be less than the tolerance, so the average tolerance should be higher than this range, but not necessarily by a very large amount.
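
The arithmetic above can be checked with a short sketch. The function name `per_trip_range` is made up for illustration; the inputs are the 74 minute figure, the 80 minute range and the four attempted trips per day quoted above.

```python
def per_trip_range(daily_low_min: float, daily_range_min: float, trips_per_day: float) -> tuple[float, float]:
    """Split a daily travel time budget (in minutes) evenly across the day's trips."""
    low = daily_low_min / trips_per_day
    high = (daily_low_min + daily_range_min) / trips_per_day
    return low, high


low, high = per_trip_range(74, 80, 4)
print(f"target per-trip journey time: {low} to {high} minutes")

# Average visiting tolerance measured on Bridgewater-Brunel, in tenths of minutes.
measured_tolerance_min = 1615 / 10
print(f"measured tolerance is {measured_tolerance_min / high:.1f}x the upper target")
```

The measured average tolerance of 161.5 minutes is roughly four times the upper end of the per-trip target, which is the gap the skew adjustment is meant to close.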

I then experimented with increasing the skew for visiting passengers. It is currently set at 5, which equates to a cubed recursive skew. Setting it to 6, a multiple cubed recursive skew, and increasing the maximum journey time tolerance to 12,000 minutes, I get an average of 476, or 47.6 minutes' tolerance per journey. Multiplied by 4, this gives 1,904, or 190 minutes of total journey time tolerance per potential passenger per day. (Increasing the skew to 6 whilst retaining the previous minimum/maximum gave an average of ~37 minutes per trip).

With this latter setting (a skew of 6 and no changes to minimum/maximum travel times), I have been testing the current Bridgewater-Brunel game locally in fast-forward. Here is the game as it is after about one and a half months of running with this altered configuration, including a clear calendar month (that just passed) of the new settings. Passengers transported at Prominster have fallen from 1,710 in the last clear month of skew = 5 to 624 in the first clear complete month of skew = 6 (ignoring the transitional month in between): that is circa 36% of previous passenger numbers. However, the fall in passenger numbers is uneven. The numbers transported by the Docklands Light Riverway in Prominster have gone from 3,495/mo to 1,999/mo. The Western Fairies (!?) Whittinghill and Bramingpool have gone from 871 to 321 passengers/mo. The Prominster to Pitham route has gone from 732 to 215. Oddly, I cannot measure any noticeable decline in the oceanic traffic at all; this may be because the trip takes so long that the settings have yet to have an effect on passenger numbers.

I should be grateful if those who are interested in this topic could download and test the saved game with the modified settings to see whether this appears to conform better with historical levels of passenger transport.

Vladki

What do you mean exactly by proportion system? As I had two suggestions, which both could be considered somehow proportional.
a) the simple one - have only one new dat file option that allows increasing or decreasing the travel time tolerance to this object, as a % of the travel time tolerance defined in simuconf.tab
b) the complex one - adjusting the time tolerance by the occurrence of the object in the game.

I suppose that you meant a)

Quote from: jamespetts on May 23, 2020, 10:01:33 PM
One might, for example, want to increase average overall journey time tolerance in order to allow passengers to visit more often certain distant and important buildings, but not make them any more likely to visit medium-distance unimportant buildings. To do this with a proportion system, one would have to adjust the overall number upwards, then go back and adjust all the individual buildings downwards.
To achieve this it would be better to modify the dat files of the important buildings. I would expect that unimportant buildings will not have a building specific time tolerance, and will use the defaults anyway (whether the tolerances are defined as absolute or proportional).

Quote
Moreover, using the proportion system, one could not just specify individual figures in the .dat files; one would have to back-calculate the time that one wanted to achieve from the proportion for each individual building, which would greatly increase the necessary work.
Sorry I don't understand ... "proportion for each individual building" ? what is that?

Mariculous

Well, I found data that stated the average daily journey time around the year 2000 for many countries. I cannot remember them all, but the UK was slightly above 60 minutes, Germany was around 80 minutes, and some variation was mentioned.

From such data we cannot derive a distribution. Much more precise data is required for this, in the best case a dataset of a fair number of passenger journey time records.
I found exactly one source that was a little more precise, but it was related to commuting trips only. It was not precise enough to derive a distribution from anyway.

Simply based on an average journey time and an average range of variation, the distribution could be anything, even a uniform distribution around the average time within the variation range.
If you have found any more precise data about this, please let me know so I might try to derive a distribution from that data.


In any case, your tests sound quite promising and I'll definitely have a look into it tomorrow.

jamespetts

By "proportion system", I meant any system in which the individual buildings' journey time tolerance scales in some way in proportion to the overall journey time tolerance; this, I think, is indeed (a); I have already responded to (b).

I am not sure that I understand the idea of adjusting the important rather than unimportant buildings. The most important buildings would not have any individual limitation on journey time tolerance: the passengers' general journey time tolerance would deal with this. Only buildings that one specifically wants to mark as ones which passengers are unwilling to spend as much as their general tolerance travelling to visit would have something other than the general tolerance specified.

As to the proportion for each individual building, if one wanted to designate a particular building as one which passengers should not be willing to travel (for example) more than an hour to reach, with the (a) system, one would have to work out what proportion of the general maximum an hour is and specify that proportion as the value for that building; whereas with the originally proposed system, one would just specify an hour directly, which is much easier.
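
The difference between the two approaches can be sketched as follows. The helper names and the general maximum values are hypothetical, chosen only to illustrate the back-calculation; times are assumed to be in tenths of minutes, so one hour is 600.

```python
from typing import Optional

ONE_HOUR = 600  # one hour, assuming tenths of minutes as the unit


def effective_tolerance(passenger_general: int, destination_specific: Optional[int] = None) -> int:
    """Originally proposed system: the lower of the two tolerances applies."""
    if destination_specific is None:
        return passenger_general
    return min(passenger_general, destination_specific)


def percent_for_target(target: int, general_maximum: int) -> float:
    """Proportion system (a): percentage that yields the target for a given general maximum."""
    return 100 * target / general_maximum


# Absolute system: the dat value is simply ONE_HOUR, whatever the general settings are.
print(effective_tolerance(2500, ONE_HOUR))  # capped at 600
print(effective_tolerance(400, ONE_HOUR))   # passenger's own tolerance is lower

# Proportion system: the dat value depends on the (hypothetical) general maximum.
for general_maximum in (9600, 12000):
    print(general_maximum, percent_for_target(ONE_HOUR, general_maximum))
```

Under these assumed maxima the percentage for "one hour" changes from 6.25 to 5.0 when the general maximum is raised, so every such dat value would need recalculating, whereas the absolute figure stays at 600.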

jamespetts

Freahk - one indeed cannot extrapolate a distribution from an average alone, but we can extrapolate a distribution from an average, a minimum and a maximum, and we can infer minimum and maximum journey times by looking at basic real world data about how long it actually takes to get from one place to another. There was a time when it used to take the best part of a day to get from central London to Highgate, for example; trans-atlantic voyages took weeks, and longer voyages in the days of sail could take months. People were regularly willing to endure such travel times.

For minimum travel times, think of the amount of time that one might be prepared to travel from one's office to get lunch during a 1 hour lunch break when one needs to eat the lunch in that time. This may depend on preferences, but may easily be as low as 10 minutes considering that this would make a 20 minute round trip and take a third of the available rest time just in travelling. It might even be lower than this.

Vladki

Ok, James. Maybe we both look at it from completely different points of view. I thought that the general time tolerance would be left as is, and that most buildings would not have anything specified. Only some exceptional buildings would have a special time tolerance, most often higher than the general one. E.g. a small football ground would be at the defaults, but a big stadium would have a higher time tolerance, as football fans of big teams are often willing to travel far to watch a match (football championship...). On the other hand, a local recycling and garbage collection centre would have a much lower time tolerance than the default. Therefore I thought it would be easier to say that a big football stadium has e.g. 3 times (300%) the time tolerance of a small one, or that a recycling centre has only 1/4 (25%) of the general time tolerance. I don't mind having absolute tolerances for buildings, or even both, letting pak designers choose.

jamespetts

Quote from: Vladki on May 24, 2020, 12:18:25 AM
Ok, James. Maybe we both look at it from completely different points of view. I thought that the general time tolerance would be left as is, and that most buildings would not have anything specified. Only some exceptional buildings would have a special time tolerance, most often higher than the general one. E.g. a small football ground would be at the defaults, but a big stadium would have a higher time tolerance, as football fans of big teams are often willing to travel far to watch a match (football championship...). On the other hand, a local recycling and garbage collection centre would have a much lower time tolerance than the default. Therefore I thought it would be easier to say that a big football stadium has e.g. 3 times (300%) the time tolerance of a small one, or that a recycling centre has only 1/4 (25%) of the general time tolerance. I don't mind having absolute tolerances for buildings, or even both, letting pak designers choose.

I chose a mechanism that lowers, rather than raises, the base tolerance for a reason, and that is so as not to disrupt the existing calibration of the journey time tolerance system so as to match Marchetti's Constant as described above. The existing system is calibrated to work for passengers who are potentially willing to make long distance journeys, and the issue was that they were doing so too much in times where there was little of interest to which to make such journeys, at least in most places, so the sensible solution is to reduce, rather than increase, the tolerance.

However, given the findings in relation to the average above, consideration will have to be given to:

(1) whether simply adjusting the skew is sufficient; or
(2) whether what we need to do is implement the suggested feature and adjust the skew and increase perhaps significantly the maximum journey time tolerance so as to get the correct ~1.1 hour journey time tolerance per 16 hour quasi-day average again but with passengers normally refusing to travel long distances to uninteresting destinations.

This may need some extensive testing. I should be interested in any observations from the Bridgewater-Brunel saved game above as a starting point for adjusting only skew.

Vladki

I think that getting the distribution function right is the most important thing. If the distribution is right, it should not matter too much what is the destination. (That should be already solved by visitor demand).

Mariculous

Quote from: jamespetts on May 24, 2020, 12:05:34 AM
but we can extrapolate a distribution from an average, a minimum and a maximum,
No, no, no, not at all!
We can at most determine parameters of a given distribution from that data.
Anyways, let's see how the new parameters behave in practice.


About relative time values against absolute values: imho simuconf configurable values should always scale consistently with hardcoded and hard-dat-ed values.
Fixed journey times in dats won't do this.
Do we have definite journey time figures for specific destinations?
If not, pakset authors had to place reasonable guesses on these anyways, in which case I don't see the advantage in guessing an absolute value over guessing a relative value.
In fact, both require the pakset author to have a rough understanding of what the general max journey time figure looks like.


Anyways, let's adjust the distribution properly and then see if any new feature is required, as suggested.
As mentioned in the very beginning of this thread, imho the right choice for the distribution and its parameters is the most important factor to get a realistic passenger figure.

jamespetts

Freahk - do I understand you to suggest adjusting the journey time tolerance skew on the running Bridgewater-Brunel server? Did you want to test the saved game where this had already been done first?

jamespetts

I have been running some more in depth tests to measure not just journey time tolerances on average, but actual journey times on average. On the Bridgewater-Brunel server with existing settings, the results are as follows:

Average journey time tolerance: 1500 (150 minutes)
Average journey time: 610 (61 minutes)

Multiplied by 4, this gives us 600 and 244 respectively.

Running with a skew setting of 6 in the saved game linked above, we get:

Average journey time tolerance: 432 (43 minutes)
Average journey time: 289 (29 minutes)

Multiplied by 4, this gives us 173 and 116 minutes respectively, still slightly higher than the goal of ~80 minutes actual average.

Reducing the minimum visiting journey time tolerance from 12 to 2 gives:

Average journey time tolerance: 317 (33 minutes)
Average journey time: 325 (32 minutes)

Multiplied by 4, this gives 134 and 130 minutes respectively.

With these changes, the percentage of passengers successfully reaching their destination drops from 68% to 32% in the 1757 Bridgewater-Brunel saved game. Profits of player companies are affected as follows:

Constellation Ferry:
Before: 16,596c
After: -183c

Azuma Shipping:
Before: 41,865c
After: 5,291c

East Oceanic:
Before: 70,104c
After: 30,240c

funny company:
Before: 3,099c
After: -699c

Prominster Shipping & Coaches:
Before: 52,854c
After: 8,244c

Ves Transport Corporation:
Before: 13,208c
After: 5,934c

Wystoke Ferry Company:
Before: 50,682c
After: 5,604c

I shall run some more tests, but I should in the meantime be interested in comments on these observations and results.

Edit: Further tests indicate that, with a skew of 6 and a 2 minute minimum journey time, 0.18% of trips generated have a journey time tolerance of > 10 hours. With a skew of 5 and a 12 minute minimum journey time (the current settings), 5.1% of trips have a tolerance of > 10 hours.


jamespetts

Quote from: Vladki on May 27, 2020, 12:05:11 AM
Why the multiplication by 4?

This is because Marchetti's constant is a value for daily travel time, and, on average, each unit of population in Simutrans-Extended makes approximately 4 trips per day.

For example, Prominster on the Bridgewater-Brunel map has a population of 43,136. Each game month in 1757, this town generates circa 84,034 passenger trips. 84,034 / 43,136 = 1.94 passenger trips per unit of population per game month. Each game month is 6:24 (or 6.4 hours). We do not model a day/night cycle, so we ignore inactive parts of the day and assume each day to be 16 hours long. Each day is therefore 2.5 months. 2.5 * 1.94 = 4.85 - so we should perhaps be multiplying by 4.85 rather than 4.
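
The derivation above, written out as a small sketch. The function name `trips_per_day` is invented for illustration; the inputs are the Prominster figures, the 6.4 hour game month and the assumed 16 hour day.

```python
def trips_per_day(trips_per_month: int, population: int,
                  month_hours: float = 6.4, day_hours: float = 16.0) -> float:
    """Attempted trips per unit of population per 16 hour quasi-day."""
    trips_per_capita_per_month = trips_per_month / population
    months_per_day = day_hours / month_hours
    return trips_per_capita_per_month * months_per_day


result = trips_per_day(84034, 43136)
print(f"{result:.2f} attempted trips per person per day")
```

Without rounding the intermediate ratio down to 1.94, the result comes out nearer 4.87 than 4.85; the difference is immaterial for calibration purposes.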

jamespetts

I have now carried out some further testing, this time on the Stephenson-Seimens saved game from the in-game year 2021.

Running with a skew of 6, a minimum journey time tolerance of 2 minutes and a maximum journey time tolerance of 20,000 minutes, walking, private car and player transport usage is all down to around 60% of previous levels (testing in particular in Daring so as to isolate the effect of congestion). Buildings reported much lower visiting success rates - I am still in the process of testing this by allowing the game to run long enough to see the pattern more clearly using the minimap. With default settings, there is very little patterning of visitor success in the Stephenson-Seimens map, since all buildings have a very high visitor transport success percentage, and the map is largely just overall green. It is important to get a good patterning based on (1) a town's own amenities within walking distance; and (2) the quality of transport (both public and private road) in that town for the planned town growth mechanism, which is based on local transport success rates.

As to absolute numbers, we first need to think of what we are aiming for. UK government statistics show that, in the 21st century, people make ~1,000 trips per year on average. That equates to approximately 2.7 trips per day. The total number of attempted trips per passenger per day is, as set out above, 4.85. Therefore, we would expect approximately 56% of those to be successful in a world with transport as good as it is in 21st century Britain. The Stephenson-Seimens map is mixed in that regard: many towns have no road access to other towns, but public transport is generally better than reality; the overall effect is difficult to calibrate precisely, but we should be aiming for something not too far off reality.
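
As a quick check of the figures above, here is a sketch using only the numbers quoted in this post (the variable names are arbitrary, and a 365 day year is assumed):

```python
TRIPS_PER_YEAR = 1000            # UK figure quoted above, trips per person per year
ATTEMPTED_TRIPS_PER_DAY = 4.85   # attempted trips per person per day, as derived earlier

real_trips_per_day = TRIPS_PER_YEAR / 365
expected_success_rate = real_trips_per_day / ATTEMPTED_TRIPS_PER_DAY

print(f"real-world trips per day: {real_trips_per_day:.1f}")
print(f"expected success rate: {expected_success_rate:.0%}")
```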

With the modified settings, we are getting a total passenger success rate of 66%, down from 76% with the default settings (a skew of 5, an 8 minute minimum journey time and an approximately 5,000 minute maximum journey time). Thus, it seems that increasing the skew and reducing the minimum journey time does produce more realistic passenger generation.

As to increasing the maximum journey time, this has given us a journey time tolerance average of 484 (48 minutes), with an actual journey time average of 488 (48 minutes). Of course, in computing whether this aligns with Marchetti's Constant, we need to take into account the actual number of trips made, not the ideal number of trips that passengers would like to make, so we must multiply the figure by 4.85 multiplied by 66%. This gives us a value of 153 minutes, which is somewhat in excess of the ~80 minutes that we are looking for.
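
The comparison with Marchetti's Constant can be sketched like this. `daily_travel_minutes` is a hypothetical helper; the inputs are the measured average journey time (in tenths of minutes), the 4.85 attempted trips per day and the 66% success rate quoted above.

```python
def daily_travel_minutes(avg_journey_tenths: float, attempts_per_day: float,
                         success_rate: float) -> float:
    """Average minutes actually spent travelling per person per quasi-day."""
    avg_journey_minutes = avg_journey_tenths / 10
    successful_trips_per_day = attempts_per_day * success_rate
    return avg_journey_minutes * successful_trips_per_day


total = daily_travel_minutes(488, 4.85, 0.66)
print(f"{total:.0f} minutes per day, against a target of about 80")
```

Using the unrounded 48.8 minutes gives about 156 minutes rather than the 153 obtained from a rounded 48; either way the result is close to double the target.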

In terms of the very long journeys, perhaps more relevant to the Bridgewater-Brunel server than the Stephenson-Seimens server, 0.81% of all journeys were recorded as having a tolerance of over 10 hours, up from only 0.18% with the maximum journey time unaltered. This suggests that increasing the maximum journey time is an effective way of allowing the very long journeys that people are sometimes willing to make, but it may be that more refinement of this figure is necessary.

For reference:

5,000 minutes = 83 hours
9,000 minutes = 150 hours
15,000 minutes = 250 hours
20,000 minutes = 333 hours

Mariculous

Note: this post is 3 posts behind, so does not respond to the latest 3 posts.

No, these changes are bad, I'm losing too much profit! :P
Well just joking. I'd expect a fair portion of the decreased profit to be related to services that exceed the demand a lot.

I found one of the two sources about journey times that I had mentioned again, so I'll make an excursion into the Eurostat data and return to your observations after that.
I am sorry, the linked document is in German, so I'll extract and translate the most important points for you.
https://ec.europa.eu/eurostat/documents/3433488/5298273/KS-SF-07-087-DE.PDF/0d50ff3c-a042-4c49-85e8-5333c92a7186

The paper is from 2007, the included data is from different years.

Page 1:
The average European citizen used some form of transportation for one hour per day.
A UK citizen uses some form of transportation for 59.7 minutes per day, of which 36.5 minutes are spent in a private car.
It is unclear in which year that statistic was recorded and whether the data include walking and bike journeys (see page 5).

Page 2:
The average UK citizen makes 2.9 journeys per day, which take 63.3 minutes per day and cover an average distance of 31.8 km per day, and an average of  in the UK.
That data was recorded from 1999 to 2001.

In the UK, that journey time was related to the following:
50% free time activities, 20% shopping, 20% commuting, 5% business trip, 5% related to education.

It is unclear if that data includes walking and bike journeys (see page 5).

Pages 3 and 4 are about the relation to journey distance and the way the data was recorded.

Page 5:
This page is about passenger-distance distributed across different modes of transport. It can be noted that walking and bike data do not seem to be available for the UK.
The recorded average yearly passenger-distance was the following:
private car: 678 billion
other private motor vehicles: 10 billion
Bus: 48 billion
Rail: 52 billion
Aircraft: 9.9 billion



back to your observations:
From these it sounds like it results in much more realistic journey times on average.
It is very hard to tell if the journey time figure is also fairly realistic.

I'd expect actual journey times to be roughly half of the maximum journey time on average, so I am a little confused by your observation that actual journey times are very close to the maximum tolerable time.