The International Simutrans Forum

 

Author Topic: [network games] add robustness against pak duplicates  (Read 732 times)

0 Members and 1 Guest are viewing this topic.

Offline THLeaderH jp

  • Coder/patcher
  • Devotee
  • *
  • Posts: 271
  • Languages: JP,EN
[network games] add robustness against pak duplicates
« on: August 30, 2018, 05:47:22 AM »
This patch makes it possible to join a network game even if there are duplicates in pak names.

It is commonly seen that a player using mac or other OS cannot join in a network game with a customized pakset, whose server is on windows. This is because some paks duplicates in their name. If name duplicates, the checksum calculation can be done in a different order, that makes joining impossible. The order depends on the native API of directory loading.

Simutrans does not allow pak name duplicates. However, it is practically difficult to ensure that there is no name duplicates in the large customized paksets. To conquer this issue, sorting pak files is a good solution because the order of checksum calculation becomes independent from OS. In this patch, file paths are sorted before loading the files. I think this patch has little impact against the loading time because the number of pak files is much smaller than the number of add-ons, which are also sorted before the calculation of checksum.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: [network games] add robustness against pak duplicates
« Reply #1 on: August 30, 2018, 07:59:55 AM »
Surely a better solution would be to solve...
Quote
it is practically difficult to ensure that there is no name duplicates in the large customized paksets
If this is encountered one could fail loading and display an error message that the pakset contains duplicate records and mention which duplicates were detected.

If one wants to allow addons to overwrite records in the pakset it extends the most logical behaviour would be to save calculating checksums until after all data has been loaded. When addons are loaded any record that duplicates a record from the base pakset is replaced with the addon record in internal data structures. If the addon contains duplicate records then the above mentioned error is printed to notify there is a mistake with the addon data. One would need to keep a set of addon record keys to detect duplicates when loading the addon data. If one wants an addon to have duplicate records with one being replaced by another then a special "load order" file would be needed which would allow addon authors to explicitly define the order the files containing the duplicates get loaded (how the conflict is resolved).

Your proposed solution is masking there being a problem of duplicate records in the pakset. It might result in hard to diagnose pakset problems such as addon authors modifying a record yet noticing no in game change due to it being a duplicate that is ignored by the arbitrary alphabetical load order.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #2 on: August 30, 2018, 08:42:53 AM »
While what DrSuperGood writes is true, there is perhaps still some merit in sorting the files if the checksum depends on the order files are iterated by the operating system. In fact, I wouldn't be surprised if this is the actual problem, not duplicates.

Offline Ranran jp

  • *
  • Posts: 117
  • Languages: ja
Re: [network games] add robustness against pak duplicates
« Reply #3 on: August 30, 2018, 08:49:46 AM »
it is practically difficult to ensure that there is no name duplicates in the large customized paksets.
Is that so?

When Simutrans.exe is started with the option " -log 1 -debug 3", you can find the following warning in simu.log.

Code: [Select]
Warning: vehicle_builder_t::register_desc(): Object ************* was overlaid by addon!
Warning: hausbauer_t::register_desc(): Object ************* was overlaid by addon!

I think author of pakset can be careful of duplication.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: [network games] add robustness against pak duplicates
« Reply #4 on: August 30, 2018, 11:12:46 AM »
Quote
While what DrSuperGood writes is true, there is perhaps still some merit in sorting the files if the checksum depends on the order files are iterated by the operating system. In fact, I wouldn't be surprised if this is the actual problem, not duplicates.
Pakset checksum is computed after all objects are loaded hence the order they are loaded does not matter.

The problem is with records overwriting each other due to being duplicates. The order the duplicate records get loaded, and hence overwritten, is currently OS dependent as it is based on the order the files are enumerated by the OS.

Offline THLeaderH jp

  • Coder/patcher
  • Devotee
  • *
  • Posts: 271
  • Languages: JP,EN
Re: [network games] add robustness against pak duplicates
« Reply #5 on: August 30, 2018, 11:27:11 AM »
Is that so?

When Simutrans.exe is started with the option " -log 1 -debug 3", you can find the following warning in simu.log.

Code: [Select]
Warning: vehicle_builder_t::register_desc(): Object ************* was overlaid by addon!
Warning: hausbauer_t::register_desc(): Object ************* was overlaid by addon!

I think author of pakset can be careful of duplication.

Addon duplicates cannot be checked as long as users does not know that they have to launch simutrans with -log 1 -debug 3 option. Also, one cannot know what file the duplicated addons is in. Those are not user friendly, I think.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #6 on: August 30, 2018, 12:18:42 PM »
Pakset checksum is computed after all objects are loaded hence the order they are loaded does not matter.
Unless the order they are loaded dictates the order they are stored in memory, and the order they are stored in memory dictates what the checksum is. I've had cases where a dependency on unspecified ordering has worked for a long time, because circumstances just happened to cause the ordering we needed. Until one day, they didn't.

I guess add-ons overwriting something from the base pak set is a feature, since Simutrans is made to handle it and the outcome is easily predictable. Add-ons overwriting add-ons, not so much, as Simutrans has no obvious way of prioritizing add-ons that I know of. They are just lumped together in a big mess. It is just that Simutrans has no idea if the object being overwritten comes from the base pak set or is another add-on.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: [network games] add robustness against pak duplicates
« Reply #7 on: August 30, 2018, 12:27:50 PM »
The order is strictly first loading from the main folder, then loading from the addon folder. If you have addons with the same name twice in that folder, the order is indeed random. But a proper pak set should no be affected by that. You patch may still fail as sons as upper and lowercase comes into play. Should a name by exactly 8 character long, it will be returned in UPPERCASE in windows (not always, but often), even if it is in lowercase on other systems. Moreover, any non ASCII character is UTF8 in MAC and Linux, but undefined on windows unless is has a japanese codepage, where it may be shift-JIS. Even worse, objects could be merged in files, so there could be even the same file with different names.

I think failing with checksum errors, if you end up with different paks is a sensible thing to do. Maybe a new flag: -force-single-addons

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #8 on: August 30, 2018, 01:27:33 PM »
All fully modern systems store filenames as Unicode (unless configured oddly). Only if you use FAT as your filesystem (without using UTF-8, like Linux allows you to do) should you run into character set issues. We've rewritten the filesystem integration to use the Unicode API on Windows, remember. The uppercasing of 8.3 filenames in Windows might still be an issue, I haven't look into that.

Offline Frank

  • Inactive/Retired
  • *
  • Posts: 1431
  • Languages: DE
Re: [network games] add robustness against pak duplicates
« Reply #9 on: August 30, 2018, 02:43:48 PM »
The current release 120.3 already has a problem between Linux and Windows.

left Windows and right Linux


Simutrans Server run under Fedora 27

the pak file is way.underground power.pak from pak64.german 112.3.10

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: [network games] add robustness against pak duplicates
« Reply #10 on: August 30, 2018, 02:55:00 PM »
Quote
Unless the order they are loaded dictates the order they are stored in memory, and the order they are stored in memory dictates what the checksum is. I've had cases where a dependency on unspecified ordering has worked for a long time, because circumstances just happened to cause the ordering we needed. Until one day, they didn't.
The checksum is computed for each record independently. After all records are loaded a pakset checksum is then computed by ordering all records in a well defined way and then for each record feeding the record checksum into the cryptographic hasher.

The problem being reported here is that due to duplicate addon records the checksum of the duplicated record might be different depending on OS due to platform dependant load order resulting in different records being used.
Quote
Should a name by exactly 8 character long, it will be returned in UPPERCASE in windows (not always, but often), even if it is in lowercase on other systems. Moreover, any non ASCII character is UTF8 in MAC and Linux, but undefined on windows unless is has a japanese codepage, where it may be shift-JIS. Even worse, objects could be merged in files, so there could be even the same file with different names.
Is this still the case? I thought I standardized it so that the game uses the unicode windows API (UTF-16) and converted to UTF-8 for internal usage to eliminate dependency on code pages. Or was this the one part that had to be reverted due to makeobj?
Quote
All fully modern systems store filenames as Unicode (unless configured oddly).
Not entirely true. Windows stores them as UTF-16 while other OS like Linux store arbitrary byte strings which usually are UTF-8 (but not required to be so).
Quote
We've rewritten the filesystem integration to use the Unicode API on Windows, remember.
Except for that one part that had to be reverted due to being shared code with makeobj... Yeh I really should make a simfilesystem file and throw it all in there and get both Simutrans and Makeobj to use it. Regret not doing this from the beginning...
Quote
The uppercasing of 8.3 filenames in Windows might still be an issue, I haven't look into that.
I am not sure if it does this with the wide character API. In any case since Windows ignores file name case one could get around this by forcing the path strings to lower case before comparison. Any pakset that is made to depend on case sensitive file paths will be broken on Windows anyway.

As I mentioned before, a good solution to duplicate records within addon data would be a "load order" file. This could be a simple text file with 1 line per relative file path. When loading addons the files declared in the load order file are loaded in the order of declaration, with duplicates replacing each other as specified by that order. All other addon files are then loaded in arbitrary order, with duplicate records being a fatal error.

The fatal error when encountering unordered duplicate records could print the file names of the pak files that the duplicates came from. This could be done by using a map that exists during the addon loading phase which maps record name to origin file. Using the error message the addon author could then quickly create a load order file to resolve the duplicate conflict in a way that is not platform dependant or arbitrary.
Quote
left Windows and right Linux
I assume the windows build was built using GCC like the Linux build? If it is a self built MSVC build this could be caused by another error.

Could you link to the source code for that pakset? Might help with diagnosing the cause.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #11 on: August 30, 2018, 03:42:13 PM »
Not entirely true. Windows stores them as UTF-16 while other OS like Linux store arbitrary byte strings which usually are UTF-8 (but not required to be so).
I noted that it was possible to configure some systems oddly, but UTF-8 seems to be the de-facto standard for Linux these days.

Except for that one part that had to be reverted due to being shared code with makeobj...
As far as I remember, it was only "reverted" for makeobj. That is, makeobj contains its own copy of the original implementation (sort of). A partial revert of that stuff simply won't work.

As I mentioned before, a good solution to duplicate records within addon data would be a "load order" file. This could be a simple text file with 1 line per relative file path. When loading addons the files declared in the load order file are loaded in the order of declaration, with duplicates replacing each other as specified by that order. All other addon files are then loaded in arbitrary order, with duplicate records being a fatal error.
How many objects does your average add-on contain? I got the impression that they generally only contain a single vehicle or set of vehicles (like an EMU). In that case, you've got two things with the same name because you want both things. Explicit loading order won't help then. You need load-time renaming. Or simply tell pak authors to be more explicit or creative, and not name every truck "truck" and every house "house".

On the other hand, if you want to replace the cooling truck from pak64 food-addon with a cooler (pun not intended, but intentionally left in) cooling truck from someone else, the obvious solution would be to overwrite the file itself. If intended as a replacement, it should have the same file name as well as the same internal name.

Offline Frank

  • Inactive/Retired
  • *
  • Posts: 1431
  • Languages: DE
Re: [network games] add robustness against pak duplicates
« Reply #12 on: August 30, 2018, 03:47:48 PM »
....
I assume the windows build was built using GCC like the Linux build? If it is a self built MSVC build this could be caused by another error.

Could you link to the source code for that pakset? Might help with diagnosing the cause.

Build in screen is from nigtly-page.
But also the release 120.3 GDI of Sourceforge shows this behavior.

Linux is in Repo from simutrans-germany.com - compiled 64bit from Sourceforge source zip 120.3

pak64.german 112.3.10 create by: Makeobj version 55.1 for Simutrans 112.1 and higher

I have just seen that the object name exists twice.

tunnel.underground_power.pak
way.underground_power.pak

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: [network games] add robustness against pak duplicates
« Reply #13 on: August 30, 2018, 05:18:25 PM »
Quote
I have just seen that the object name exists twice.

tunnel.underground_power.pak
way.underground_power.pak
Yeh this falls into the bug this thread is discussing.

If this is a complete pakset (no addons) then the solution is to rename or remove one of the duplicate object records. There is no excuse for a complete pakset to contain duplicate records as that would be a pakset error.

If it is occurring in the addons then this topic should hopefully come to a solution, eventually. My recommendation being a load order file so that addon authors can explicitly define the order duplicates get loaded in.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #14 on: August 30, 2018, 09:02:43 PM »
Add-on authors should just fix their mistakes just like pak set authors. It is add-on users that may need to write load order files when add-on makers can't come up with unique names.

Offline Leartin

  • Devotee
  • *
  • Posts: 1079
  • PAK-DEV P192C
  • Languages: DE, EN
Re: [network games] add robustness against pak duplicates
« Reply #15 on: August 31, 2018, 06:14:52 AM »
Add-on authors should just fix their mistakes just like pak set authors. It is add-on users that may need to write load order files when add-on makers can't come up with unique names.
Add-on authors may use the same name as pakset authors on purpose to replace an object in the pakset, rather than add a new one. This shouldn't cause an issue by itself as long as the user correctly places them in the addon folder.
One add-on author is unlikely to use the same name twice, but you would usually mix add-ons from various authors. A smart add-on creator might start to add his name in the name of each object, but nobody ever told anyone to do that, plus it seems wrong for non-attribution creations. I don't think it's the add-on creator at fault here.
Furthermore, the issue only becomes appearent for network games. I think it can be put on the game host to make sure the custom pakset they need to spread out to people to play is able to be played. Providing tools and instructions for finding duplicates would help, though.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #16 on: August 31, 2018, 07:30:41 AM »
Furthermore, the issue only becomes appearent for network games.
Certainly more apparent, as it becomes a stopping error, but it is still apparent if I install two passenger car add-ons and only find one new passenger car in the game, because both add-ons use the same (likely very generic) object name. Although it is perhaps more likely to happen with non-vehicle objects?

I think it can be put on the game host to make sure the custom pakset they need to spread out to people to play is able to be played.
Ideally, the paks (and simuconf.tab?) should be downloaded from the server along with the map. However, Simutrans can't do much without a pak set, and it can't switch or modify the data once loaded.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: [network games] add robustness against pak duplicates
« Reply #17 on: September 03, 2018, 08:26:02 AM »
r8565 will display a more prominent warning of overlaid objects.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4493
  • Languages: EN, DE, AT
Re: [network games] add robustness against pak duplicates
« Reply #18 on: September 04, 2018, 07:26:03 PM »
Frank's issue is neither fixed by the patch above (it would hide such issues) nor by the commit prissi mentioned.

Pak-objects with different type but same name are perfectly fine for everything. They will not be notices by that overlay detection. The problem is in the computation of the pakset checksum, here type is not considered.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: [network games] add robustness against pak duplicates
« Reply #19 on: September 05, 2018, 03:08:06 PM »
Most easy might be to include the type string into the sum the one from objversion), using inthashtable and then iterate over this and next over the name in them. But one the other hand, objects with same name but different type will also mass up the translation. Maybe it should just store the type with the checksum and then give an error, when you try to overlay it with another object with different type. That could also allow to sort the mismatched objects by type. The start would be to add the type to pak_info.cc hashtable, and extend each obj.
Code: [Select]
void bridge_reader_t::register_obj(obj_desc_t *&data)
{
bridge_desc_t *desc = static_cast<bridge_desc_t *>(data);

bridge_builder_t::register_desc(desc);

checksum_t *chk = new checksum_t();
desc->calc_checksum(chk);
- pakset_info_t::append(desc->get_name(), chk);
+ pakset_info_t::append(desc->get_name(), obj_bridge, chk);
}
...


Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: [network games] add robustness against pak duplicates
« Reply #20 on: September 05, 2018, 03:34:31 PM »
But one the other hand, objects with same name but different type will also mass up the translation.
Frank's example looks like a legitimate case for using the same object name deliberately to get the same translation. I haven't looked at the pak files in question, but it would seem like one is the tunnel definition and the other is the tunnel's dedicated way. There is no need for these to have different names, especially translated, as they function as one.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: [network games] add robustness against pak duplicates
« Reply #21 on: September 12, 2018, 03:00:27 PM »
r8577 decorates the name with the first three letters of the obj type, which should avoid any issues with same name but different type.