News:

Simutrans Wiki Manual
The official on-line manual for Simutrans. Read and contribute.

Support for read/write streams for save files

Started by ceeac, November 04, 2020, 03:42:18 PM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

ceeac

This patch adds two pieces of code: rdwr_stream_t and classify_file. rdwr_stream_t is the base class for opaque sinks/sources of data, while the classify_file() function is used to determine file format and other information such as file version for save files. While the new code could also be used for reading and writing pak files, for now I've only changed the code in loadsave_t.

When reading a save file, the idea is to first use classify_file to determine the format of a save file; the information can then be used to create the appropriate rdwr_stream_t to read the data. When writing the save file, one can create the appropriate rdwr_stream_t directly to save the data in the requested format.

The most prominent advantage of this approach is that it is easier to add new algorithms for save file compression. Another advantage is that one can now add code to send the game state directly instead of saving it to a file when a client connects. Furthermore, classify_file can be easily extended in the future to support more file formats like png,bmp etc to make it easier to determine the format of image files for the height map loader or for other purposes.

prissi

#1
I admid your effort to introduce change. But what is the gain here, apart from making error hunting more difficult for me and other long term contributors compared to a relatively error free and quite optimised existing code? Never change a running system as an engineer says, not without a good reason. Aparently I miss the point here.

Peronally replacing true by "loadsave_t::FILE_STATUS_OK" does not improve the readability for me as it draw more attention that the funktion call in front, even more since loadsave_t:: already indicates this is a file.

So please elaborate a little on your motivation to change this.

I think you energy might yield a higher outcome in forward looking activities. Forisntance you had worked on Cmake in experimental. Since Cmake is more or less requried to make an Android version, I think progress there might be more helpful. After so many efforts to have a useful interface even for tablet users, it is a shame there is still no Android port of Simutrans apart from the old proof of concept from pelya. (Also if CMake would work with Mingw, one could switch to it as well.)

There are so many other things which are eternal todos, like directional wayobj_t (which would introduce proper defined single lange roads on any pakset without the hackish way of the directional road patch, as well as eyecandy like one side guardrails, one sided caternatry, sound walls, etc.), or things like proper MacOS support, and stuff you started like seperating image loading from heightmap.

freddyhayward

Quote from: prissi on November 04, 2020, 11:52:42 PMI admid your effort to introduce change. But what is the gain here, apart from making error hunting more difficult for me and other long term contributors compared to a relatively error free and quite optimised existing code? Never change a running system as an engineer says, not without a good reason. Aparently I miss the point here.
Answered here:
QuoteThe most prominent advantage of this approach is that it is easier to add new algorithms for save file compression. Another advantage is that one can now add code to send the game state directly instead of saving it to a file when a client connects. Furthermore, classify_file can be easily extended in the future to support more file formats like png,bmp etc to make it easier to determine the format of image files for the height map loader or for other purposes.
And even if these advantages are not appreciated by standard, they would be invaluable to extended and especially on large maps (although in this situation, that would require a new topic).

prissi

First technical comment: Probing file format for zlib may not work, as there is not real header. 0x1f 0x8b will only probe for gzip compressed archive, but zlib might rather start with 0x78 with the next byte the compression. Simutrans used all kinds of compressions over the years, and it can be set in simuconf.tab. See here the "headers": https://stackoverflow.com/questions/9050260/what-does-a-zlib-header-look-like That is why Simutrans first checks for ZSTD and BZip2, and then tries to open the file with zlib (which also read normal uncompressed files). So classify as zip is not needed, all remaining files should be read via zlib.

Sending files directly makes not much sense because the game must be saved to be reloaded anyway.

My fear is always that introducing a big change might potentially introduce new errors in parts that were more or less proven. And in the past I ended up debuging code I did not know just because of change. So with my limited time, I tend to be cautious. But when coding, I admid I am very conservative, and as such may be not the most open person to such kind of patches.

Thus I would prefer to hear what the other contributors say to it.

This is now getting a little offtopic, but actually adding zstd was not very time consuming, thank you, you speak to the person who added it. Most difficult is usually the documentation around and the libraries. Also why calling files "streams" is beyond me. My data should stay, not float away. I have never heard of a "Stream manager". Those are files for users, so I would call them files. Modren programmer may call them what they like, but somehow I fail to see a stream for that.

So I appreciate the effort, am a little cautious thinking of my workload, but I will listen to the others.

prissi

In order to listen to others, could some of the developers add his thoughs to it please.

Dwachs

In my opinion, the patch looks good. The interfaces are clean. I am in favor of including it.

There is also another streaming class in memory_rw.* (used for network code). Also the checksum computation in checksum.* is some kind of stream (write only).
Parsley, sage, rosemary, and maggikraut.

prissi

Ok, then it can be submitted, after changes that catch all the zlib and gzip headers and not only gzip. (Or just opening everything with zlib).
So zlib handling for first byte 0x78 or first two bytes 1F 8B

DrSuperGood

Quote from: ceeac on November 04, 2020, 03:42:18 PMAnother advantage is that one can now add code to send the game state directly instead of saving it to a file when a client connects.
This can also be considered a disadvantage since now there is no auto save created when the client joins. This is a problem I noticed with games like Factorio which stream the state directly instead of writing to file resulting in the only saves made being explicit auto saves. Given how little overhead writing a save out is, it should be kept a default behaviour that maybe can be turned off by advanced users who know the risks involved.

Having clients also write out a copy of the game save is useful. This gives clients auto saves which can be used to help recover servers should something go wrong with them. In my opinion this should remain the default behaviour with again an option to turn such a thing off for advanced users.

Also I am unsure if this is even possible as the server has to load the save that gets created. As such it requires a persistent copy of the saved data to load. This at very least would need to be stored in memory somewhere if not written out to a file.

prissi

I think so too, but ceeac is on a (likely very very long) project to forgo the reload.

ceeac

Quote from: DrSuperGood on November 12, 2020, 12:53:53 AMGiven how little overhead writing a save out is, it should be kept a default behaviour that maybe can be turned off by advanced users who know the risks involved.
Saving could also be done in parallel to sending the game state over network. This would be essentially free since disk speeds are usually higher than network speeds.

Quote from: prissi on November 12, 2020, 12:23:01 AMOk, then it can be submitted, after changes that catch all the zlib and gzip headers and not only gzip. (Or just opening everything with zlib).
So zlib handling for first byte 0x78 or first two bytes 1F 8B
Okay, submitted the patch including the suggested changes and some additional minor fixes in r9406.

prissi

#10
Thank you.

I wonder, why there is another folder RDWR inside the new folder IO? And should by this logic the PNG reader not also go here too?

We have the other reader and write fodlers, so while IO may be ok as its own folder the rdwr is rather confusing. So for the moment I would rather get rid of the rdwr folder until the IO folder gets too crowded.

EDIT:
Your patch does not compile even on the latest MSVC. Lazy fix is to take const char * for a path, that is easy enough.

Also, why is the conversion constructor file_info_t "explicit". That one cannot be accidently called as copy constructor, and even if, where is the problem?

ceeac

Quote from: prissi on November 17, 2020, 12:15:46 AMI wonder, why there is another folder RDWR inside the new folder IO? And should by this logic the PNG reader not also go here too?
My intention is to add a separate img/ folder and to move the code in height_map_loader.cc and dr_rdpng.cc to the new folder in my next patch. The code is logically separate so IMO the files should be kept separate. Maybe the rdwr/ folder should be renamed to stream/ to prevent the confusion?

Quote from: prissi on November 17, 2020, 12:15:46 AMAlso, why is the conversion constructor file_info_t "explicit". That one cannot be accidently called as copy constructor, and even if, where is the problem?
The explicit is there to prevent accidental implicit conversions from file_type_t like this:

file_info_t foo = file_info_t::TYPE_ZIPPED

prissi

What is bad with this?
file_info_t foo = file_info_t::TYPE_ZIPPED
This gives the same result as
file_info_t foo(file_info_t::TYPE_ZIPPED)
and any release build will have the same code for that as well.

About please do not add more folders. Add a folder if there are more than 20 files or so, otherwise folders just clutter everything and make it hard to find something.

Andarix

https://github.com/Andarix/simutrans/actions

win GDI compile

win SDL2 not compile

Quote"D:\a\simutrans\simutrans\Simutrans-SDL2.vcxproj" (default target) (1) ->
(Link target) ->
  loadsave.obj : error LNK2001: unresolved external symbol "public: __thiscall zstd_file_rdwr_stream_t::zstd_file_rdwr_stream_t(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,bool,int)" (??0zstd_file_rdwr_stream_t@@QAE@ABV?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@_NH@Z) [D:\a\simutrans\simutrans\Simutrans-SDL2.vcxproj]
  classify_file.obj : error LNK2001: unresolved external symbol "public: virtual __thiscall zstd_file_rdwr_stream_t::~zstd_file_rdwr_stream_t(void)" (??1zstd_file_rdwr_stream_t@@UAE@XZ) [D:\a\simutrans\simutrans\Simutrans-SDL2.vcxproj]
  D:\a\simutrans\simutrans\Simutrans_SDL2.exe : fatal error LNK1120: 2 unresolved externals [D:\a\simutrans\simutrans\Simutrans-SDL2.vcxproj]

    28 Warning(s)
    3 Error(s)

prissi

The MSVC does not support conditional compiling, so it should not use ZSTD. But somehow you seem to have USE?ZSTD=1 deined somewhere for the SDL2 build.

Andarix

r9411 not compile win, linux and mac

linux
Quote===> HOSTCXX io/classify_file.cc
io/classify_file.cc: In function 'file_classify_status_t classify_file(const char*, file_info_t*)':
io/classify_file.cc:54:30: warning: comparison with string literal results in unspecified behaviour [-Waddress]
  else if (!path  ||  path == "") {
                              ^
io/classify_file.cc:75:28: error: 's' was not declared in this scope
   if (!classify_file_data(&s, info)) {
                            ^
make: *** [build/default/io/classify_file.o] Error 1
common.mk:50: recipe for target 'build/default/io/classify_file.o' failed

win gdi
Quote"D:\a\simutrans\simutrans\Simutrans-GDI.vcxproj" (default target) (1) ->
(ClCompile target) ->
  D:\a\simutrans\simutrans\io\classify_file.cc(75): error C2065: 's': undeclared identifier [D:\a\simutrans\simutrans\Simutrans-GDI.vcxproj]

mac
Quote===> HOSTCXX io/classify_file.cc
io/classify_file.cc:54:27: warning: result of comparison against a string literal is unspecified (use strncmp instead) [-Wstring-compare]
        else if (!path  ||  path == "") {
                                 ^  ~~
io/classify_file.cc:75:28: error: use of undeclared identifier 's'
                if (!classify_file_data(&s, info)) {
                                         ^
1 warning and 1 error generated.

prissi

Thank you, that was stupid late night fix which compiled on my MSVC, because there is ZSTD. r9412 should fix this.