If we go back to the original example of two routes, one C-D and one C-D-E, then we can calculate the optimum strategy for train services (you can skip to the end if you just want a summary).

Sorry, I recalculated your figures, and they are not just wrong because they assume lab conditions, but wrong in substance. I think it's because you didn't properly take the time factor into account.

Let me show you my calculation (again, lab conditions and just looking at getting the pax from C to D and E, not back and not between D and E).

We have 1'000 pax/month in C for D and E (and the same number in the other direction), of which 800 (4/5) are for D and 200 (1/5) for E. Let's assume a convoy capacity of 200.

For easier calculation, we also assume that the distances C-D and D-E are equal (let's say 50 tiles each) and that a train covers 100 tiles/month.

So the starting point is:

We have 5 trains/month leaving C and going all the way to E. The round-trip distance for each train is 200 tiles, so we need 10 convoys for this. Over-capacity is 800 seats between D and E, so over 100 tiles that makes an over-capacity of 80'000 units ('tileseats per month').

The required tileseat capacity is 800*100+200*200=120'000. The 5 trains offer 5*200*200=200'000. The difference of course is the 80'000 again (makes 40% over-capacity).
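The baseline arithmetic can be re-checked with a few lines of Python. This is only a sketch under the lab conditions stated above; the constant names and the helper `convoys_needed` are made up for illustration, and distances are counted per round trip as in the post.

```python
# Lab-condition figures from the post (names are illustrative only).
PAX_D = 800          # pax per month from C to D
PAX_E = 200          # pax per month from C to E
SEATS = 200          # seats per convoy
ROUND_TRIP_D = 100   # tiles for C-D and back (50 tiles each way)
ROUND_TRIP_E = 200   # tiles for C-D-E and back
SPEED = 100          # tiles a train covers per month


def convoys_needed(departures_per_month, round_trip_tiles):
    """Convoys required to keep up a given monthly departure rate."""
    return departures_per_month * round_trip_tiles / SPEED


# Tileseats the passengers actually need per month.
required = PAX_D * ROUND_TRIP_D + PAX_E * ROUND_TRIP_E
# Tileseats offered when all 5 trains run through to E.
offered = 5 * SEATS * ROUND_TRIP_E
over_capacity = offered - required
baseline_convoys = convoys_needed(5, ROUND_TRIP_E)

print(required, offered, over_capacity)
print(over_capacity / offered, baseline_convoys)
```

Changing the departure count or `ROUND_TRIP_E` shows how quickly the unused tileseats grow when every train runs the full distance.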

I claim that the (technically, not gameplay-wise) best loading strategy here would be the opposite of the current one, i.e. last-stop-first-served. We could have only 1 of the 5 trains a month going through to E ('E-train') and 4 of 5 just going to D ('D-train'). Statistically, there are - at each train arrival - 160 pax for D and 40 for E. Again statistically, there will always be 200 pax for E waiting when the one E-train arrives (at least from the second cycle on), and so it will not take any passengers to D. The 160 D-pax that this train leaves behind are evenly distributed over the four D-trains (next cycle). This way, all trains are always full and over-capacity is zero. D-trains only go 100 tiles, so we need 4 convoys for them, and the E-train still goes 200 tiles, which requires 2 convoys, so we need 6 convoys instead of 10.
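To illustrate the last-stop-first-served case, here is a small simulation sketch. It assumes the idealised arrivals from the post (160 D-pax and 40 E-pax before every departure) and a fixed order of four D-trains followed by one E-train; the function name and the schedule list are hypothetical, not anything from the game.

```python
SEATS = 200
NEW_D, NEW_E = 160, 40                 # pax arriving at C before each departure
SCHEDULE = ["D", "D", "D", "D", "E"]   # one cycle = one month (assumed order)


def simulate_last_stop_first(cycles):
    """Return, per cycle, the (d_pax, e_pax) loaded by each train."""
    waiting_d = waiting_e = 0
    history = []
    for _ in range(cycles):
        loads = []
        for train in SCHEDULE:
            waiting_d += NEW_D
            waiting_e += NEW_E
            # Only the E-train can carry E-pax, and it serves them first.
            load_e = min(SEATS, waiting_e) if train == "E" else 0
            # Remaining seats go to D-pax.
            load_d = min(SEATS - load_e, waiting_d)
            waiting_d -= load_d
            waiting_e -= load_e
            loads.append((load_d, load_e))
        history.append(loads)
    return history


history = simulate_last_stop_first(3)
for month, loads in enumerate(history, start=1):
    print(month, loads)

# 4 D-departures at 100 tiles plus 1 E-departure at 200 tiles, 100 tiles/month.
convoys = 4 * 100 / 100 + 1 * 200 / 100
print("convoys:", convoys)
```

In this sketch the D-trains leave with 160 pax in the first month and are full from the second month on, while the E-train carries only E-pax throughout.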

Let's try the same with proportional loading: 1 E-train and 4 D-trains a month. Again, statistically there are 160 pax for D and 40 for E at each train departure. So the D-trains go 20% empty in the first cycle. When the E-train comes last in the first cycle, there will be 160D:200E pax waiting, so the train would load proportionally 89D:111E, leaving 71D and 89E. The pax to D will easily be picked up by the four other trains in the next cycle. But pax to E will accumulate, so the E-train will load more and more of them, but never 100%, because there will statistically always be 160 new pax to D when it departs (unless there are so many pax for E that rounding leaves zero to D on the E-train, but we don't want that many waiting). Anyway, we see that an additional half departure to E would already solve the problem. We just set up one more convoy on the E-line, so 3 instead of 2. Together with the 4 convoys for the D-trains, that makes 7 convoys. Tileseat capacity is 4*100*200+1.5*200*200=140'000, making only 20'000 overhead (14%), which is perfectly realistic (it will be around 20% under non-lab conditions).
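The proportional split and the resulting overhead can be sketched as follows. The helper `proportional_load` and its round-to-nearest rule are assumptions made for this example, not a description of how the game rounds.

```python
SEATS = 200


def proportional_load(waiting_d, waiting_e, seats=SEATS):
    """Split the seats between D-pax and E-pax in proportion to the queues."""
    total = waiting_d + waiting_e
    if total <= seats:
        return waiting_d, waiting_e          # everybody fits
    load_d = round(seats * waiting_d / total)  # rounding rule is an assumption
    return load_d, seats - load_d


# E-train arriving last in the first cycle: 160 D-pax and 200 E-pax waiting.
load_d, load_e = proportional_load(160, 200)
left_d, left_e = 160 - load_d, 200 - load_e
print(load_d, load_e, left_d, left_e)

# Capacity with 4 D-departures and 1.5 E-departures per month.
required = 800 * 100 + 200 * 200
offered = 4 * 100 * SEATS + 1.5 * 200 * SEATS
overhead = offered - required
convoys = 4 * 100 / 100 + 1.5 * 200 / 100
print(offered, overhead, round(overhead / offered, 2), convoys)
```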

Now let's try the same again with the current first-stop-first-served logic (one departure on the E-line, four on the D-line): There are always 160 new pax to D when the E-train arrives, which will be loaded first. So there will always be only 40 pax carried to E. As the E-train only serves C once a month, there will be 160 pax/month left for E, while all D-trains are 1/5 empty. There will statistically always be new pax to D, so we need more E-trains, just as with proportional loading. So let's add the half monthly service to E, just as in the proportional calculation. Then we have 5.5 departures/month, which leaves 145 D-pax for each train, which in turn leaves 55 seats for E-pax in the E-trains, but we need 200/1.5=133. To have 133 free seats, the statistical average of D-pax must be only 67, which requires 10.5 D-trains in total (800/67-1.5)! So this is evidently worse than just having all trains go through to E (it requires 13.5 convoys)!
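The first-stop-first-served requirement for 1.5 E-departures can be written out as a short calculation. This is a sketch of the reasoning above, assuming D-pax are spread evenly over all departures; `d_trains_needed` and `convoys` are hypothetical helper names.

```python
import math

PAX_D, PAX_E, SEATS = 800, 200, 200


def d_trains_needed(e_departures):
    """D-departures per month needed so each E-train has enough seats for E."""
    e_per_train = PAX_E / e_departures   # E-pax each E-train must carry
    d_per_train = SEATS - e_per_train    # D-pax each train may find waiting
    if d_per_train <= 0:
        return math.inf                  # no seats left for D-pax: unreachable
    total_departures = PAX_D / d_per_train
    return total_departures - e_departures


def convoys(d_departures, e_departures):
    """One convoy per monthly D-departure, two per monthly E-departure."""
    return d_departures * 1 + e_departures * 2


d_trains = d_trains_needed(1.5)
print(round(d_trains, 1), round(convoys(d_trains, 1.5), 1))
```

With a single E-departure the helper returns infinity, which matches the observation that the E-train can never be kept free of D-pax under this loading rule.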

Let's try the other way round and add more E-trains instead. Now we don't add half an E-line service per month, but one full E-line service, to have 2. So the required capacity for E is 200/2=100 in each E-train, leaving 100 for D. We now have 6 services/month, which means 133 new pax to D for each E-train. Too many. We need 6 D-trains (800/100-2). So it works, but 6 D-trains plus 2 E-trains/month requires 10 convoys, so nothing is saved. Maybe add another E-train? Required capacity would only be 67 to E (200/3), leaving 133 for D, making 6 D-trains necessary (800/133). But it's 12 convoys.

In other words: Under lab conditions, the best possible result of not having all trains go through to E is as bad as having all trains go through to E. Under practical conditions, the result will always be worse.

If you look at how real life works: People will try to get on whatever train goes first in their direction. Sheer probability will result (in overcrowded situations) in something like "proportional loading". Thus I am not saying you are wrong, but this proportional loading will (in the far more common situation A-B-C-D-E and A-E) more likely produce the situation that local and express lines are not treated differently, i.e. both loading mostly for E and A and starving intermediate stops.

First of all: aren't you contradicting yourself? If people try to get on whatever train comes first, why shouldn't they get on the local train in your example? I also question that this is 'far more common'. Local trains usually do not operate between major stations, but they usually pass through one. Almost all subway, tram and local train networks I know (plus city and regional bus networks) build on this principle (sometimes with additional circle lines or other connections, but at the core). If local trains operate between major stations, it's where the population is very dense. But in these areas, pax (i.e. commuters) do use the local trains, just because they're cheaper.