makeobj and CSV files

Started by PJMack, March 02, 2022, 11:51:56 PM

PJMack

Attached is a patch for makeobj which adds an option to output a CSV file rather than a pak file.  This is functional, but not yet complete (it cannot handle items with double quotation marks yet).  I may also expand it to allow reading directly from the CSV files as well as dat files.  As paksets are becoming larger, they are also becoming more difficult to manage, re-balance, and update for new features.  Allowing CSV files to be used lets pakset data be easily modified in spreadsheets for automated and batch processing.

This should also work with the standard version of makeobj.
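
For reference, the usual way to handle double quotation marks in CSV (the RFC 4180 convention) is to wrap the field in quotes and double any quote inside it. A minimal sketch of that, assuming a free-standing helper; quote_csv_field is a hypothetical name and is not taken from the patch or from utils/csv.h:

```cpp
#include <cassert>
#include <string>

// Quote one CSV field in the RFC 4180 style: wrap it in double quotes and
// double any quotation mark found inside it.
// NOTE: quote_csv_field is a hypothetical helper for illustration only; it
// is not a function from makeobj or from Simutrans' utils/csv.h.
std::string quote_csv_field(const std::string& value)
{
    std::string out = "\"";
    for (char c : value) {
        if (c == '"') {
            out += '"';   // escape an embedded quote by doubling it
        }
        out += c;
    }
    out += '"';
    return out;
}
```

A value such as churns (x5) comes out unchanged apart from the surrounding quotes, while a value containing a quotation mark gets that mark doubled so a spreadsheet can read it back.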

ceeac

There is already a CSV_t class in utils/csv.h/cc, which also provides functions to read and write CSV files.

wlindley

What would the usage of either of those options look like?  My guess is that the CSV file created would be filled with a bunch of things that look like:

i8, 3
i16, 27


with no meaning attached with anything, the same way Simutrans produces XML files now.  I'd love to be mistaken.  I'd love for Simutrans to produce XML saves that could be usably parsed by other tools.
I did write a .dat file parser awhile back that I'm building a web front end for.  Perl programmer since 1994 in effect:
https://metacpan.org/pod/Games::Simutrans::Pakset
...which has the great advantage of being able to refer to things by name. 

In any case, eventually abolishing the inscrutable binary *.dat and *.sve files in favor of something easily handled by external tools is a longtime goal of mine and I hope to see more about your progress.

PJMack

Quote from: ceeac on March 03, 2022, 06:14:32 AM
There is already a CSV_t class in utils/csv.h/cc, which also provides functions to read and write CSV files.

Thank you, I had overlooked these.  The CSV_t class appears to be only for IO and does not allow random access or appending columns.  I will look closer into the IO code to see what can be used. 


Quote from: wlindley on March 03, 2022, 11:11:12 AM
In any case, eventually abolishing the inscrutable binary *.dat and *.sve files in favor of something easily handled by external tools is a longtime goal of mine and I hope to see more about your progress.

I believe you may be mixing up *.pak files with *.dat files.  The text-based *.dat files are the raw sources for the pakset, formatted as a "key=value" list of properties for each pakset object.  The CSV output is a spreadsheet where the top row holds all the keys, and each subsequent row holds the values for one individual object. 

This is the output for goods on pak128.britain:
catg,obj,name,value,metric,mapcolor,speed_bonus,value[0],value[1],value[2],value[3],value[4],number_of_classes,class_revenue_percent[0],class_revenue_percent[1],class_revenue_percent[2],class_revenue_percent[3],class_revenue_percent[4],to_distance[0],to_distance[1],to_distance[2],to_distance[3],to_distance[4],weight_per_unit
"0","good","Passagiere","50",,"79","18","55","50","45","40","35","5","60","100","133","150","200","16","500","2500","5000","0","70"
"0","good","Post","57","bundles","31","15","7","6","5","2","1","2","100","250",,,,"16","32","500","1000","0","1"
"0","good","livestock","56","head","115","2","67","56","52",,,,,,,,,"16","32","0",,,"120"
"0","good","None","0",,,,,,,,,,,,,,,,,,,,"1"
"0","good","Autos","284","cars","131","15","324","284","225","198",,,,,,,,"32","48","80","0",,"1800"
"0","good","FreshFish","133","paletten","149","15","110","93","70","50",,,,,,,,"32","48","80","0",,"1000"
"2","good","Kohle","45","tonnen","208","0","58","45","23","20","0",,,,,,,"32","48","80","240",,"1000"
"2","good","Eisenerz","45","tonnen","130","0","58","45","23","20","0",,,,,,,"32","48","80","240",,"1000"
"2","good","woodchip","53","tonnen","68","0","70","53","40","28","0",,,,,,,"32","48","80","240",,"1000"
"2","good","Stone","53","tonnen","12","0","70","53","40","28","0",,,,,,,"32","48","80","240",,"1000"
"2","good","clay","53","tonnen","93","0","70","53","40","28","0",,,,,,,"32","48","80","240",,"1000"
"2","good","Cement","75","tonnen","14","0","90","75","60","35",,,,,,,,"32","48","80","0",,"1000"
"2","good","grain","75","tonnen","29","1","90","75","60","35",,,,,,,,"32","48","80","0",,"1000"
"1","good","Bucher","133","paletten",,"10","155","133","100","90",,,,,,,,"32","48","80","0",,"810"
"1","good","fruit","68","paletten","133","10","80","68","51","37",,,,,,,,"32","48","80","0",,"730"
"1","good","vegetables","65","paletten","136","15","77","65","49","35",,,,,,,,"32","48","80","0",,"700"
"1","good","newspaper","105","paletten","223","10","122","105","79","71",,,,,,,,"32","48","80","0",,"790"
"1","good","hardware","69","paletten","213","3","83","69","53","38",,,,,,,,"32","48","80","0",,"750"
"1","good","Moebel","53","paletten","152","3","62","53","40","36",,,,,,,,"32","48","80","0",,"400"
"1","good","textile","57","paletten","18","3","67","57","43","39",,,,,,,,"32","48","80","0",,"430"
"1","good","flour","63","paletten","175","3","76","63",,,,,,,,,,"32","48","80","0",,"840"
"1","good","beer","74","barrels","28","3","88","74","56","40",,,,,,,,"32","48","80","0",,"800"
"1","good","cider","73","barrels","159","3","86","73","55","39",,,,,,,,"32","48","80","0",,"780"
"1","good","pharmaceuticals","91","paletten","53","3","104","91","73","64",,,,,,,,"32","48","80","0",,"580"
"1","good","Plastik","38","paletten","5","3","45","38","30","18",,,,,,,,"32","48","80","0",,"500"
"1","good","Papier","74","paletten","15","3","88","74","56","40",,,,,,,,"32","48","80","0",,"800"
"1","good","wool","29","sack",,"3","34","29","23","13",,,,,,,,"32","48","80","0",,"380"
"1","good","china","52","paletten","7","3","60","52","41","34",,,,,,,,"32","48","80","0",,"450"
"1","good","bricks","43","paletten","38","3","57","43","33","23",,,,,,,,"32","48","80","0",,"820"
"1","good","canned_food","129","paletten","10","3","148","129","103","90",,,,,,,,"32","48","80","0",,"820"
"6","good","Stahl","75","tonnen","9","2","90","75","60","35",,,,,,,,"32","48","80","0",,"1000"
"6","good","WroughtIron","53","tonnen","90","0","70","53","40","28","0",,,,,,,"32","48","80","240",,"1000"
"6","good","Bretter","93","tonnen","94","2","110","93","70","50",,,,,,,,"32","48","80","0",,"1000"
"3","good","Oel","83","kilolitres","218","0","99","83","63","45",,,,,,,,"32","48","80","0",,"858"
"3","good","Gasoline","70","kilolitres","71","0","83","70","53","38",,,,,,,,"32","48","80","0",,"740"
"3","good","FuelOil","93","kilolitres","101","0","110","93","70","50",,,,,,,,"32","48","80","0",,"1000"
"3","good","Chemicals","66","kilolitres",,"0","77","66","53","31",,,,,,,,"32","48","80","0",,"860"
"4","good","meat","47","paletten","135","15","54","47","37","31",,,,,,,,"32","48","80","0",,"410"
"4","good","fish","48","paletten","23","15","45","38","29","21",,,,,,,,"32","48","80","0",,"410"
"4","good","milk","32","churns (x5)","215","15","51","43","32","23","0",,,,,,,"32","48","80","160","0","385"

This could be opened with (or pasted into) a spreadsheet where it can be further processed.

The reason I used makeobj itself is to maintain compatibility.  Makeobj reads and parses the files into a tabfileobj_t data structure (actually a wrapper containing a stringhashtable_tpl and special accessors), which is then passed on to the appropriate code to convert further into the binary pak file.  I simply take several tabfileobj_t outputs from the parser and convert them into a CSV file rather than a pak file.  This has the advantage that there is only one parser function, and the keys are interpreted in exactly the same way.  My plan is to also implement code to take the CSV file and convert it into multiple tabfileobj_t inputs to pass to the appropriate code to convert further into the binary pak file.
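
The conversion described above can be sketched roughly as follows. This is not the patch's code: object_t is a plain stand-in for tabfileobj_t (whose real interface is not reproduced here), objects_to_csv is a hypothetical name, and the first-seen column order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// One parsed object: its key=value pairs in file order. This is a stand-in
// for makeobj's tabfileobj_t, NOT its real interface.
using object_t = std::vector<std::pair<std::string, std::string>>;

// Build CSV text: a header row listing every key seen (first-seen order is
// an assumption), then one row per object. A key that an object lacks
// becomes an empty field. Embedded quotation marks are not escaped here.
std::string objects_to_csv(const std::vector<object_t>& objects)
{
    std::vector<std::string> keys;
    std::map<std::string, std::size_t> column;   // key -> column index
    for (const object_t& obj : objects) {
        for (const auto& kv : obj) {
            if (column.find(kv.first) == column.end()) {
                column[kv.first] = keys.size();
                keys.push_back(kv.first);
            }
        }
    }

    std::ostringstream out;
    for (std::size_t i = 0; i < keys.size(); ++i) {
        out << (i ? "," : "") << keys[i];
    }
    out << '\n';

    for (const object_t& obj : objects) {
        std::vector<std::string> row(keys.size());
        std::vector<bool> present(keys.size(), false);
        for (const auto& kv : obj) {
            row[column[kv.first]] = kv.second;
            present[column[kv.first]] = true;
        }
        for (std::size_t i = 0; i < row.size(); ++i) {
            out << (i ? "," : "");
            if (present[i]) {
                out << '"' << row[i] << '"';   // quote only fields that exist
            }
        }
        out << '\n';
    }
    return out.str();
}
```

Because every object passes through the same parser first, the header is simply the union of the keys the parser produced, which is why sparse objects such as "None" end up with mostly empty fields.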

wlindley

Very nice indeed!

Surely there is a free-software C++ library you can use to output CSV (or JSON) without having to write code to handle all the edge cases of those formats?  CSV is certainly widespread, but JSON seems a better match for what's in a .dat file, and would also enable talking directly to some JS running in a web browser.
In any case, with an upgraded makeobj, it would be wonderful to be able to import graphics and objects from, for example, Open Game Art (I had tried some years ago to assemble some new "worlds" from that), and to use Simutrans "worlds" in other game engines.

PJMack

I was able to use Simutrans' existing CSV IO functions.  With the attached patch, makeobj-extended is able to read CSV files as well as .dat files.  I do want to do some more testing before an actual PR.


Quote from: wlindley on March 04, 2022, 11:42:26 AM
CSV is certainly widespread, but JSON seems a better match for what's in a .dat file, and would also enable talking directly to some JS running in a web browser.
In any case, with an upgraded makeobj, it would be wonderful to be able to import graphics and objects from, for example, Open Game Art (I had tried some years ago to assemble some new "worlds" from that), and to use Simutrans "worlds" in other game engines.

We appear to have different use cases in mind.  One example of a use case is to quickly make a table of all stations, then use spreadsheet formulas to make the costs a function of the capacity.  Tables in spreadsheets are used extensively in cost balancing for Pak128.Britain, and this will allow the pakset to be created from the spreadsheets in a more direct manner rather than manually transferring the data back and forth from the dat files.  A JSON file would not fit the bill for this use case; however, you could also add JSON export capabilities using my changes as a template.  Keep in mind that some paksets, even open source ones, may not be licensed to allow redistribution of assets outside of the package.  I am not a lawyer and am not qualified to give advice in that regard.

prissi

The idea with dat files was to make pak creation very easy. Indeed, compared to OpenTTD, there have been many more Simutrans objects made by users, although OpenTTD is far more widespread, and recently it has also become easier to make stuff there.

Therefore, I do not think that dat files will ever go away. For CI built paks, indeed, CSV may be an alternative (although CSV by Excel is not always compatible with Simutrans, as in Germany it is semicolon separated, and English Excel chokes on it).

PJMack

Quote from: prissi on March 05, 2022, 11:32:53 AM
Therefore, I do not think that dat files will ever go away. For CI built paks, indeed, CSV may be an alternative (although CSV by Excel is not always compatible with Simutrans, as in Germany it is semicolon separated, and English Excel chokes on it).

This is not intended to replace the .dat files, but to provide an alternative for larger paksets.  My main target use is the re-balancing project in Pak128.Britain, which already has several LibreOffice spreadsheets for balance.  LibreOffice does allow selecting the separator for import and export. 


I did hit a snag in that the sparse_tpl in simutrans only allows matrices with up to 32768 non-zero elements (non-blank strings in this case).  I would need to fix this before releasing the final PR.

PJMack

I fixed the sparse_tpl and pushed a PR.

Here is a full summary of the usage changes for makeobj: The "tocsv" flag (used in the same way as the "pak" flag) converts dat files into a single CSV file containing a header of all the keys, with each row containing the values for one object.  The text parser for "tocsv" is the same as for "pak", so all comments are ignored, as are invalid lines.  When using the "pak" flag, makeobj will traverse .csv files as well as .dat files.  The CSV files need a header for the keys, and one line for each object containing the values.  Keys and values may be quoted in the CSV file; however, no trailing space may occur after the quote.  Fields are separated by commas.  Values that are blank are treated as if the entire key=value pair is missing from that entry, and the default value is used for that key.  Columns with a blank key are also ignored. 
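
The reading rules above (blank values count as missing, blank-key columns are ignored) can be sketched as below. This is an illustration only: row_to_properties and properties_to_dat are hypothetical names, and the real patch feeds tabfileobj_t objects rather than producing dat text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using row_t = std::vector<std::string>;
using properties_t = std::vector<std::pair<std::string, std::string>>;

// Pair the header keys with one row's values. A blank value is skipped, so
// the key is simply absent and the default applies; a column whose header
// key is blank is skipped as well.
// NOTE: hypothetical helper names, not functions from the makeobj patch.
properties_t row_to_properties(const row_t& header, const row_t& row)
{
    properties_t props;
    for (std::size_t i = 0; i < header.size() && i < row.size(); ++i) {
        if (header[i].empty() || row[i].empty()) {
            continue;
        }
        props.emplace_back(header[i], row[i]);
    }
    return props;
}

// Render the surviving pairs as dat-style "key=value" lines, to show what
// the row is equivalent to.
std::string properties_to_dat(const properties_t& props)
{
    std::string text;
    for (const auto& kv : props) {
        text += kv.first + "=" + kv.second + "\n";
    }
    return text;
}
```

Skipping blanks rather than writing "key=" is what makes an empty spreadsheet cell mean "use the default" instead of an explicit empty value.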

One caveat: if a dat file being converted has an error where the value is left blank (e.g. "key="), it is converted into a blank in the CSV file, meaning the default value.  The existing code sometimes handled "key=" as a value of zero for that same error.  This would only create a minor problem in an erroneous edge case, not with valid dat files.

jamespetts

Thank you for this and apologies for the delay: now incorporated.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Ranran(Hibernating)

As reported in another thread, this patch causes makeobj to spam a bunch of meaningless error messages.
You can see that after reverting this commit, the flood of errors is no longer printed.
(´・ω・`) Simutrans update history (in Japanese) (updates currently suspended)
bit.ly/3AuKHHP

PJMack

I have no idea what could be going on, as the patch only moves the relevant code into functions.  The only thing I can suggest is to try running with valgrind to check for memory errors, but I thought I did that before submitting.

Ranran(Hibernating)

Quote
as the patch only moves the relevant code into functions.

I don't think so. I believe it also contains other added changes.
I'm not familiar with makeobj's (file-creating) code, but I don't think it's that hard to pinpoint where this is happening from the added changes, although finding a solution may not be easy.
As I pointed out based on Matthew's post, it seems to load the dat twice and throw a bunch of errors on the second load. This gave me a hint.
Commenting out lines 83 to 101 of root_writer.cc, in the code added by the patch, eliminates the error spam, and makeobj works as before:
        /*
        find.search(arg, "csv");
        FOR(searchfolder_t, const& i, find) {
            CSV_file_t infile;

            if (infile.load_file(i)) {
                tabfileobj_t obj;

                writer_init(i,arg);


                infile.reset_current_obj();
                while(infile.get_object(obj)) {
                    writer_write(separate,filename,outfp,node,obj);
                }
            }
            else {
                dbg->warning( "Write pak", "Cannot read %s", i);
            }
        }*/

This allows the dat files to successfully generate a pak file without a ton of errors, but it may break paking from CSV. I haven't tested that because I don't know how to pak from CSV. At least, I don't think this code is involved in anything other than generating a pak from CSV.
Anyway, you can see that this part of the code is what spams the error messages. I think the problem area has been narrowed down. See Matthew's post for the kind of errors that are spammed.

Ranran(Hibernating)

I stumbled upon one possibility and checked it again.
I found that similar code may be present from line 163 onward.
So I suspect you tried to move the code but didn't remove the copy that should have been discarded.
In that case, just removing the code I commented out in the previous post might be the correct solution.
Can you confirm this?

jamespetts

I have reverted the CSV code for now - I should be grateful if anyone could confirm whether this fixes the problem.

Ranran(Hibernating)

Quote from: jamespetts on November 04, 2022, 09:28:25 PM
I should be grateful if anyone could confirm whether this fixes the problem.

The makeobj distributed in http://bridgewater-brunel.me.uk/downloads/nightly/windows/ seems to have been modified to require libgcc_s_seh-1.dll.
But I don't think such a library is necessary as the latest standard and self-build extended versions can work without it.


EDIT:
I don't think this change is still necessary even after stopping the ability to read csv, but that revert doesn't include the csv.h change.
The revert seems incomplete. There may be related commits other than the merge.


Quote
I found similar code may be present on line 163-.
So I suspect you tried to move it but didn't remove the code that should have been discarded.

By the way, I was hoping that removing the duplicated code would solve the problem while maintaining the ability to read csv, but judging by the results of this reversion, that alone doesn't seem to solve it.


EDIT2:
Since the commit number of extended.exe distributed has not changed from #50e6805, the change may not have been reflected.

jamespetts

Investigation shows that the Linux version of makeobj-extended is building and that the reason that this was not appearing on the download page was because of an error in my scripts, which I have now fixed. The latest version of makeobj-extended should now be available on the server download page.