News:

Do you need help?
Simutrans Wiki Manual can help you to play and extend Simutrans. In 9 languages.

[network games] add robustness against pak duplicates

Started by THLeaderH, August 30, 2018, 05:47:22 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

THLeaderH

This patch makes it possible to join a network game even if there are duplicates in pak names.

It is commonly seen that a player using mac or other OS cannot join in a network game with a customized pakset, whose server is on windows. This is because some paks duplicates in their name. If name duplicates, the checksum calculation can be done in a different order, that makes joining impossible. The order depends on the native API of directory loading.

Simutrans does not allow pak name duplicates. However, it is practically difficult to ensure that there is no name duplicates in the large customized paksets. To conquer this issue, sorting pak files is a good solution because the order of checksum calculation becomes independent from OS. In this patch, file paths are sorted before loading the files. I think this patch has little impact against the loading time because the number of pak files is much smaller than the number of add-ons, which are also sorted before the calculation of checksum.

DrSuperGood

Surely a better solution would be to solve...
Quoteit is practically difficult to ensure that there is no name duplicates in the large customized paksets
If this is encountered one could fail loading and display an error message that the pakset contains duplicate records and mention which duplicates were detected.

If one wants to allow addons to overwrite records in the pakset it extends the most logical behaviour would be to save calculating checksums until after all data has been loaded. When addons are loaded any record that duplicates a record from the base pakset is replaced with the addon record in internal data structures. If the addon contains duplicate records then the above mentioned error is printed to notify there is a mistake with the addon data. One would need to keep a set of addon record keys to detect duplicates when loading the addon data. If one wants an addon to have duplicate records with one being replaced by another then a special "load order" file would be needed which would allow addon authors to explicitly define the order the files containing the duplicates get loaded (how the conflict is resolved).

Your proposed solution is masking there being a problem of duplicate records in the pakset. It might result in hard to diagnose pakset problems such as addon authors modifying a record yet noticing no in game change due to it being a duplicate that is ignored by the arbitrary alphabetical load order.

Ters

While what DrSuperGood writes is true, there is perhaps still some merit in sorting the files if the checksum depends on the order files are iterated by the operating system. In fact, I wouldn't be surprised if this is the actual problem, not duplicates.

Ranran(retired)

Quote from: THLeaderH on August 30, 2018, 05:47:22 AMit is practically difficult to ensure that there is no name duplicates in the large customized paksets.
Is that so?

When Simutrans.exe is started with the option " -log 1 -debug 3", you can find the following warning in simu.log.


Warning: vehicle_builder_t::register_desc(): Object ************* was overlaid by addon!
Warning: hausbauer_t::register_desc(): Object ************* was overlaid by addon!


I think author of pakset can be careful of duplication.
ひめしという日本人が開発者達の助言を無視して自分好みの機能をextendedに"強引に"実装し、
コードをぐちゃぐちゃにしてメンテナンスを困難にし(とりわけ道路と建物関連)、
挙句にバグを大量に埋め込み、それを知らんぷりして放置し(隠居するなどと言って)別のところに逃げ隠れて自分のフォーク(OTRP)は開発を続けている
その事実と彼の無責任さに日本人プレイヤーは目を向けるべき。らんらんはそれでやる気をなくした(´・ω・`)
他人の振り見て我が振り直せ。ひめしのようにならないために、らんらんが生み出したバグや問題は自分で修正しなくちゃね(´・ω・`)

DrSuperGood

QuoteWhile what DrSuperGood writes is true, there is perhaps still some merit in sorting the files if the checksum depends on the order files are iterated by the operating system. In fact, I wouldn't be surprised if this is the actual problem, not duplicates.
Pakset checksum is computed after all objects are loaded hence the order they are loaded does not matter.

The problem is with records overwriting each other due to being duplicates. The order the duplicate records get loaded, and hence overwritten, is currently OS dependent as it is based on the order the files are enumerated by the OS.

THLeaderH

Quote from: Ranran on August 30, 2018, 08:49:46 AM
Is that so?

When Simutrans.exe is started with the option " -log 1 -debug 3", you can find the following warning in simu.log.


Warning: vehicle_builder_t::register_desc(): Object ************* was overlaid by addon!
Warning: hausbauer_t::register_desc(): Object ************* was overlaid by addon!


I think author of pakset can be careful of duplication.

Addon duplicates cannot be checked as long as users does not know that they have to launch simutrans with -log 1 -debug 3 option. Also, one cannot know what file the duplicated addons is in. Those are not user friendly, I think.

Ters

Quote from: DrSuperGood on August 30, 2018, 11:12:46 AM
Pakset checksum is computed after all objects are loaded hence the order they are loaded does not matter.
Unless the order they are loaded dictates the order they are stored in memory, and the order they are stored in memory dictates what the checksum is. I've had cases where a dependency on unspecified ordering has worked for a long time, because circumstances just happened to cause the ordering we needed. Until one day, they didn't.

I guess add-ons overwriting something from the base pak set is a feature, since Simutrans is made to handle it and the outcome is easily predictable. Add-ons overwriting add-ons, not so much, as Simutrans has no obvious way of prioritizing add-ons that I know of. They are just lumped together in a big mess. It is just that Simutrans has no idea if the object being overwritten comes from the base pak set or is another add-on.

prissi

The order is strictly first loading from the main folder, then loading from the addon folder. If you have addons with the same name twice in that folder, the order is indeed random. But a proper pak set should no be affected by that. You patch may still fail as sons as upper and lowercase comes into play. Should a name by exactly 8 character long, it will be returned in UPPERCASE in windows (not always, but often), even if it is in lowercase on other systems. Moreover, any non ASCII character is UTF8 in MAC and Linux, but undefined on windows unless is has a japanese codepage, where it may be shift-JIS. Even worse, objects could be merged in files, so there could be even the same file with different names.

I think failing with checksum errors, if you end up with different paks is a sensible thing to do. Maybe a new flag: -force-single-addons

Ters

All fully modern systems store filenames as Unicode (unless configured oddly). Only if you use FAT as your filesystem (without using UTF-8, like Linux allows you to do) should you run into character set issues. We've rewritten the filesystem integration to use the Unicode API on Windows, remember. The uppercasing of 8.3 filenames in Windows might still be an issue, I haven't look into that.

Frank

The current release 120.3 already has a problem between Linux and Windows.

left Windows and right Linux


Simutrans Server run under Fedora 27

the pak file is way.underground power.pak from pak64.german 112.3.10

DrSuperGood

QuoteUnless the order they are loaded dictates the order they are stored in memory, and the order they are stored in memory dictates what the checksum is. I've had cases where a dependency on unspecified ordering has worked for a long time, because circumstances just happened to cause the ordering we needed. Until one day, they didn't.
The checksum is computed for each record independently. After all records are loaded a pakset checksum is then computed by ordering all records in a well defined way and then for each record feeding the record checksum into the cryptographic hasher.

The problem being reported here is that due to duplicate addon records the checksum of the duplicated record might be different depending on OS due to platform dependant load order resulting in different records being used.
QuoteShould a name by exactly 8 character long, it will be returned in UPPERCASE in windows (not always, but often), even if it is in lowercase on other systems. Moreover, any non ASCII character is UTF8 in MAC and Linux, but undefined on windows unless is has a japanese codepage, where it may be shift-JIS. Even worse, objects could be merged in files, so there could be even the same file with different names.
Is this still the case? I thought I standardized it so that the game uses the unicode windows API (UTF-16) and converted to UTF-8 for internal usage to eliminate dependency on code pages. Or was this the one part that had to be reverted due to makeobj?
QuoteAll fully modern systems store filenames as Unicode (unless configured oddly).
Not entirely true. Windows stores them as UTF-16 while other OS like Linux store arbitrary byte strings which usually are UTF-8 (but not required to be so).
QuoteWe've rewritten the filesystem integration to use the Unicode API on Windows, remember.
Except for that one part that had to be reverted due to being shared code with makeobj... Yeh I really should make a simfilesystem file and throw it all in there and get both Simutrans and Makeobj to use it. Regret not doing this from the beginning...
QuoteThe uppercasing of 8.3 filenames in Windows might still be an issue, I haven't look into that.
I am not sure if it does this with the wide character API. In any case since Windows ignores file name case one could get around this by forcing the path strings to lower case before comparison. Any pakset that is made to depend on case sensitive file paths will be broken on Windows anyway.

As I mentioned before, a good solution to duplicate records within addon data would be a "load order" file. This could be a simple text file with 1 line per relative file path. When loading addons the files declared in the load order file are loaded in the order of declaration, with duplicates replacing each other as specified by that order. All other addon files are then loaded in arbitrary order, with duplicate records being a fatal error.

The fatal error when encountering unordered duplicate records could print the file names of the pak files that the duplicates came from. This could be done by using a map that exists during the addon loading phase which maps record name to origin file. Using the error message the addon author could then quickly create a load order file to resolve the duplicate conflict in a way that is not platform dependant or arbitrary.
Quoteleft Windows and right Linux
I assume the windows build was built using GCC like the Linux build? If it is a self built MSVC build this could be caused by another error.

Could you link to the source code for that pakset? Might help with diagnosing the cause.

Ters

Quote from: DrSuperGood on August 30, 2018, 02:55:00 PM
Not entirely true. Windows stores them as UTF-16 while other OS like Linux store arbitrary byte strings which usually are UTF-8 (but not required to be so).
I noted that it was possible to configure some systems oddly, but UTF-8 seems to be the de-facto standard for Linux these days.

Quote from: DrSuperGood on August 30, 2018, 02:55:00 PMExcept for that one part that had to be reverted due to being shared code with makeobj...
As far as I remember, it was only "reverted" for makeobj. That is, makeobj contains its own copy of the original implementation (sort of). A partial revert of that stuff simply won't work.

Quote from: DrSuperGood on August 30, 2018, 02:55:00 PMAs I mentioned before, a good solution to duplicate records within addon data would be a "load order" file. This could be a simple text file with 1 line per relative file path. When loading addons the files declared in the load order file are loaded in the order of declaration, with duplicates replacing each other as specified by that order. All other addon files are then loaded in arbitrary order, with duplicate records being a fatal error.
How many objects does your average add-on contain? I got the impression that they generally only contain a single vehicle or set of vehicles (like an EMU). In that case, you've got two things with the same name because you want both things. Explicit loading order won't help then. You need load-time renaming. Or simply tell pak authors to be more explicit or creative, and not name every truck "truck" and every house "house".

On the other hand, if you want to replace the cooling truck from pak64 food-addon with a cooler (pun not intended, but intentionally left in) cooling truck from someone else, the obvious solution would be to overwrite the file itself. If intended as a replacement, it should have the same file name as well as the same internal name.

Frank

Quote from: DrSuperGood on August 30, 2018, 02:55:00 PM
....
I assume the windows build was built using GCC like the Linux build? If it is a self built MSVC build this could be caused by another error.

Could you link to the source code for that pakset? Might help with diagnosing the cause.

Build in screen is from nigtly-page.
But also the release 120.3 GDI of Sourceforge shows this behavior.

Linux is in Repo from simutrans-germany.com - compiled 64bit from Sourceforge source zip 120.3

pak64.german 112.3.10 create by: Makeobj version 55.1 for Simutrans 112.1 and higher

I have just seen that the object name exists twice.

tunnel.underground_power.pak
way.underground_power.pak

DrSuperGood

QuoteI have just seen that the object name exists twice.

tunnel.underground_power.pak
way.underground_power.pak
Yeh this falls into the bug this thread is discussing.

If this is a complete pakset (no addons) then the solution is to rename or remove one of the duplicate object records. There is no excuse for a complete pakset to contain duplicate records as that would be a pakset error.

If it is occurring in the addons then this topic should hopefully come to a solution, eventually. My recommendation being a load order file so that addon authors can explicitly define the order duplicates get loaded in.

Ters

Add-on authors should just fix their mistakes just like pak set authors. It is add-on users that may need to write load order files when add-on makers can't come up with unique names.

Leartin

Quote from: Ters on August 30, 2018, 09:02:43 PM
Add-on authors should just fix their mistakes just like pak set authors. It is add-on users that may need to write load order files when add-on makers can't come up with unique names.
Add-on authors may use the same name as pakset authors on purpose to replace an object in the pakset, rather than add a new one. This shouldn't cause an issue by itself as long as the user correctly places them in the addon folder.
One add-on author is unlikely to use the same name twice, but you would usually mix add-ons from various authors. A smart add-on creator might start to add his name in the name of each object, but nobody ever told anyone to do that, plus it seems wrong for non-attribution creations. I don't think it's the add-on creator at fault here.
Furthermore, the issue only becomes appearent for network games. I think it can be put on the game host to make sure the custom pakset they need to spread out to people to play is able to be played. Providing tools and instructions for finding duplicates would help, though.

Ters

Quote from: Leartin on August 31, 2018, 06:14:52 AMFurthermore, the issue only becomes appearent for network games.
Certainly more apparent, as it becomes a stopping error, but it is still apparent if I install two passenger car add-ons and only find one new passenger car in the game, because both add-ons use the same (likely very generic) object name. Although it is perhaps more likely to happen with non-vehicle objects?

Quote from: Leartin on August 31, 2018, 06:14:52 AMI think it can be put on the game host to make sure the custom pakset they need to spread out to people to play is able to be played.
Ideally, the paks (and simuconf.tab?) should be downloaded from the server along with the map. However, Simutrans can't do much without a pak set, and it can't switch or modify the data once loaded.

prissi

r8565 will display a more prominent warning of overlaid objects.

Dwachs

Frank's issue is neither fixed by the patch above (it would hide such issues) nor by the commit prissi mentioned.

Pak-objects with different type but same name are perfectly fine for everything. They will not be notices by that overlay detection. The problem is in the computation of the pakset checksum, here type is not considered.
Parsley, sage, rosemary, and maggikraut.

prissi

Most easy might be to include the type string into the sum the one from objversion), using inthashtable and then iterate over this and next over the name in them. But one the other hand, objects with same name but different type will also mass up the translation. Maybe it should just store the type with the checksum and then give an error, when you try to overlay it with another object with different type. That could also allow to sort the mismatched objects by type. The start would be to add the type to pak_info.cc hashtable, and extend each obj.

void bridge_reader_t::register_obj(obj_desc_t *&data)
{
bridge_desc_t *desc = static_cast<bridge_desc_t *>(data);

bridge_builder_t::register_desc(desc);

checksum_t *chk = new checksum_t();
desc->calc_checksum(chk);
- pakset_info_t::append(desc->get_name(), chk);
+ pakset_info_t::append(desc->get_name(), obj_bridge, chk);
}
...


Ters

Quote from: prissi on September 05, 2018, 03:08:06 PMBut one the other hand, objects with same name but different type will also mass up the translation.
Frank's example looks like a legitimate case for using the same object name deliberately to get the same translation. I haven't looked at the pak files in question, but it would seem like one is the tunnel definition and the other is the tunnel's dedicated way. There is no need for these to have different names, especially translated, as they function as one.

prissi

r8577 decorates the name with the first three letters of the obj type, which should avoid any issues with same name but different type.