The International Simutrans Forum

 

Author Topic: [Project Canceld] MakeObj Patch: Comments on all rows in dat  (Read 13606 times)

0 Members and 1 Guest are viewing this topic.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
[Project Canceld] MakeObj Patch: Comments on all rows in dat
« on: April 13, 2013, 01:24:16 AM »
Hi,

This is my first contribution. This patch makes the .dat parsing a little bit more stable by removing all trailing non readable characters ( character 0x00 up to and including 0x20).

This patch also allows comments at the end of any line in the .dat file.

Example (ignore the crappy google translations):

Obj=menu
name=GeneralTools
Image[0]=> menu_buttons.0.1   # Query tool
Image[1]=> menu_buttons.0.2   # Demolition
Image[2]=> menu_buttons.0.13  # Land Lift
Image[3]=> menu_buttons.1.13  # Land Lower
Image[4]=> menu_buttons.1.12  # Land restore
Image[5]=> menu_buttons.0.6   # Marker (plate)
Image[6]=> -                  # Block delete reservation
Image[7]=> menu_buttons.1.1   # Transformer station
Image[8]=> menu_buttons.1.2   # Found new city
Image[9]=> menu_buttons.1.3   # Plant trees
Image[10]=> menu_buttons.1.4  # Schedule maintenance
Image[11]=> menu_buttons.1.5  # Construction map editor
Image[12]=> menu_buttons.1.6  # Industry link
Image[13]=> menu_buttons.0.3  # Plant forest
Image[14]=> menu_buttons.0.4  # Replace stop
Image[15]=> menu_buttons.0.5  # Make stop public
Image[16]=> menu_buttons.0.8  # Add city car
Image[17]=> menu_buttons.5.7  # Buy house
--


I'm not sure what to do with this patch? Who/when decides when to include this in the trunk? Should I check it in somewhere?
« Last Edit: May 31, 2013, 01:19:22 AM by Max-Max »

Offline IgorEliezer br

  • Devotee
  • Administrator
  • *
  • Posts: 4085
  • Cake recipes are cool... REALLY!
    • Igor Eliezer Architect and Urban Planner/Arquiteto e Urbanista
  • Languages: PT, EN, AutoLISP, Python
I'm not sure what to do with this patch? Who/when decides when to include this in the trunk? Should I check it in somewhere?
A member of the devteam will check your patch and, if okay, it will be committed in the SVN trunk.

BTW, nice addition. :)

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
I tried to make some .dat files and had a hard time due to its sensitivity and the lack of comments =)

Okay, so all patches should be posted here and just wait for a developer to check it?
Are they moved from this forum when they have been checked?


Offline IgorEliezer br

  • Devotee
  • Administrator
  • *
  • Posts: 4085
  • Cake recipes are cool... REALLY!
    • Igor Eliezer Architect and Urban Planner/Arquiteto e Urbanista
  • Languages: PT, EN, AutoLISP, Python
Okay, so all patches should be posted here and just wait for a developer to check it?
Are they moved from this forum when they have been checked?
Yes.
Once checked, they are moved to one of the sub-forums here: http://forum.simutrans.com/index.php?board=33.0

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1324
Inline comments looks quite nice.

One thing, IMHO author attributions in the code makes things messy in the future when others make changes. Any objection to removing your name from the comments? Note: patches from others have the author shown in the svn commit message.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4587
  • Languages: EN, DE, AT
Can one still  bind the '#' key to a tool? In pak64, '#' is used to toggle the grid.

Offline Fabio

  • Devotee
  • Administrator
  • *
  • Posts: 2898
  • The Pak128 Guy
    • Visit me on Facebook
  • Languages: EN, IT, RO, FR
Maybe add an escape character?

Or require ## instead of # for inline comments?
« Last Edit: April 13, 2013, 10:27:50 AM by Fabio »

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
@Dwachs
I was not aware of the # bind situation and this patch will not work as it is now (it will count the first occurrence as the remark character).

Can you please post an example I can test with?



@TurfIt
I had seen comments from others in the code so I added mine too. But I have no problems by removing it.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5520
  • Languages: EN, NO
Maybe add an escape character?

Or require ## instead of # for inline comments?
I'd go for double comment symbols over an escape character, though it makes end-of-line comments different from those on their own lines, which must still support single comment symbol for backward compatibility.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4587
  • Languages: EN, DE, AT
@Dwachs
I was not aware of the # bind situation and this patch will not work as it is now (it will count the first occurrence as the remark character).

Can you please post an example I can test with?
The file that contains the key bindings (pak*/config/menuconf.tab) is also read by this class. Just test with pak64, press '#' and see, what happens (or not).

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
I'm almost sure it will not work. I will have a look at the problem. I'm not comfortable with a mix of two almost alike comment sequences.

I'll see what I can do...

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Question:

What is the difference between #16 and # in these two lines?

simple_tool[0]=0,#16
simple_tool[12]=12,#


Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
Personally, I would refrain from this, as this tools also reads pak set definitions, where this character might be needed in copyright strings and the like.

Furthermore, you can add these comments even now for dat files line in pak which are reading images (if you do not use a dot in the comment).

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
I have already implemented C++ comment style, both /* block */ and //inline. I also tossed in ## as inline as well.

Right now I'm testing and improving the stability a bit. I will send a new patch as soon I'm done testing...

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
As mentioned, this is not needed for makeobj. You can use it right now, if you refrain form writing somthing like "3.14" in the comments. Otherwise, modify besch_/writer/image_write.cc please.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
There are many places where whitespaces are removed. I made a trim_whitespace() function, removing all characters, to the left and right, up to and including space.

For me it feels like this should be in some sort of helper module, or string tools. Should I add it to the class  tabfile_t (where I use it) or as a global function somewhere else?

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Here is the new patch that implements C++ style comments for both makeobj and Simutrans' config files.
I left the old comment style (# and spc) intact. I didn't add the ## style because it might mess with future plans...

I know this might not be so useful for those who already are well familiar with the .pak and config files, but for a green little man/woman this is very useful to bring order and keep notes of things. It is up to everyone for them self if they want to use it or not.

This was my test case:

Code: [Select]
// in line comment
// in line comment
/**
 * This is a single
 * comment block
 *
 */

# Old style
 Ancient style

Obj = menu                      // inline comment
name =  GeneralTools
Image[0] => menu_buttons.0.1   /* inline comment block */

Image[1]=> /*inline comment block 2*/ menu_buttons.0.2
/*inline comment block 3*/Image[2]=> menu_buttons.0.13

Image[3]=> /* mixed style 1*/ menu_buttons.1.13  // Land Lower
/* mixed style 2 */ Image[4]=> menu_buttons.1.12  // Land restore

Image[5]=>    menu_buttons.0.6   // Marker (plate)
Image[6]=> -                  // Block delete reservation
Image[7]=> menu_buttons.1.1   /* Transformer station,
very special test case */ Image[8]=> menu_buttons.1.2   // Found new city
/* extreme */ Ima/*block*/   ge[9]=/* commenting */ >menu_buttons.1.3   // Plant trees
Image[10]=> menu_buttons.1.4  // Schedule maintenance
Image[11]=> menu_buttons.1.5  // Construction map editor
Image[12]=> menu_buttons.1.6  // Industry link
Image[13]=> menu_buttons.0.3  // Plant forest
Image[14]=> menu_buttons.0.4  // Replace stop
Image[15]=> menu_buttons.0.5  // Make stop public
Image[16]=> menu_buttons.0.8  // Add city car
Image[17]=> menu_buttons.5.7  // Buy house
--
--

It can be one or more whitespace around a key or value, these are trimmed down, except for leading on keys (this is still a comment).
Whitespace within a value is not removed.

As I understood, both simutrans and makeobj are using
tabfile_t::read_line()
to read these files. The comments are resolved and removed at this level. This makes them invisible to the rest of the applications.

Is there another way to read these files, other through
tabfile_t::read_line()
function?

PS. If the devteam thinks that ## should be included, just let me know...

Enjoy...

EDIT:
This is the wrong patch file, see next post...
« Last Edit: April 14, 2013, 03:27:12 AM by Max-Max »

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
It seems like the uploaded patch file is cached somewhere. I will try to upload in this post again...

EDIT:
Yes this is the one =)

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 18684
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
This is a rather splendid idea! Excellent contribution.

Offline IgorEliezer br

  • Devotee
  • Administrator
  • *
  • Posts: 4085
  • Cake recipes are cool... REALLY!
    • Igor Eliezer Architect and Urban Planner/Arquiteto e Urbanista
  • Languages: PT, EN, AutoLISP, Python
This is a rather splendid idea!
Yes, it makes easier for people who got started with addon making to learn from the DAT files.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4587
  • Languages: EN, DE, AT
Looks good :)

Two comments:
(a) in tabfile.h you have some stray 'tabfile_t:: ' within class  declaration
(b) our coding style is not that much consistent, but we do enclose all blocks by curly brackets

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
(a) in tabfile.h you have some stray 'tabfile_t:: ' within class  declaration
(b) our coding style is not that much consistent, but we do enclose all blocks by curly brackets

I read the coding standard doc, but I must have missed a spot... I'll have a look when I get home tonight and post a new patch...

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
This is hopefully the final patch.

I only fixed the coding format and documentation according to the coding style document.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
I am not very keen on allowing comments in from of commands.

Other than this I am not sure what to gain from another kind of commetns. (Especially keeping in mind that almost all user of dat files are rathermore graphic biased than programmer, multiline or indentation c comments look rather dangerous, in terms of mismatching or comments starting within comments). Also the comment string and some other server sections of simuconf.tab may contain already such strings.

Moreover, multiline comments work already fine, they are ignored. ;)

Ok, after these inital sceptics: I am still not against further comment styles. But I would suggest to put all those directly into the parsing of the file name in the image_writer. That would take them out of simuconf.tab and some other parsing.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
I'm by far not the person to argue about the code, after poking around in it for a few days, but I don't really understand where in image_writer you refer to. Can you give me the function name so I know where to look?

I have single stepped through simutrans and in simmain.cc function simu_main() a tabfile_t instance is created to read the simuconf.tab file.

Code: [Select]
tabfile_t simuconf;
char path_to_simuconf[24];
// was  config/simuconf.tab
sprintf(path_to_simuconf, "config%csimuconf.tab", path_sep[0]);
if(simuconf.open(path_to_simuconf)) {
{
tabfileobj_t contents;
simuconf.read(contents);
// use different save directories
multiuser = !(contents.get_int("singleuser_install", !multiuser)==1  ||  !multiuser);
found_simuconf = true;
}
simuconf.close();
}

Furter down simuconfig.tab is access through this object all along. In umgebung_t::default_einstellungen.parse_simuconf the simuconf.tab is again read through the tabfile_t where all comments are removed before each line is returned.

tabfile_t::read() is calling tabfile_t::read_line() and it is in there all comments are removed.

For newbies the abilities to put a remark on the same line is invaluable when it comes to learning the system. Putting a remark over each statement in a .dat file makes it very hard to read.

EDIT:
image_writer_t::write_obj() is called after the .dat or .tab file has been read. All comments has already been removed at this point. Was this your concern?
« Last Edit: April 14, 2013, 11:41:38 PM by Max-Max »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
My concern is that putting comment removal in tab file handling will remove too much, as those routines are used in many situations, even for reading description of webservers and the like as well as UTF8 and some other encodings.

If you remove the comments in the string the image writer uses (i.e. only there the comments are filtered out), then these issues will not arise, i.e. you will not affect what people can write into copyright strings, web addresses and the like.

(To be overly pedantic: so far image files starting or using spaces as well a # were completely legal filenames ... ;) )

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Thank you for the explanation, I do see the problem now...

Okay, back to the drawing board again...
« Last Edit: April 15, 2013, 05:34:07 PM by Max-Max »

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Allright, I have made some more digging and need some advice.

If I move the comment removal into the image_writer we still have a problem in
root_writer_t::write()
.

Because the
tabfile_t::read()
function is storing all keys and values, the values are stored together with any following comments. Then later on when
root_writer_t::write()
retrieves the name key it may have a comment appended to its value generating an invalid filename.

To remove comments, it has to be done both in
root_writer_t::write()
and
image_writer_t::write_obj()
. Since we also work with keys/values we can't keep track of block comments either. Even if we drop block commenting (C style) there will be 3 places (including today's comment style) where comments are removed.

I still think the best way would be to remove them in
tabfile_t::read_line()
if we can come up with a good inline sequence not interfering with something else, like http:// addresses or common copyright information.

As Fabio suggested, we could use for example use " ##" but with a leading space to make it more distinct.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
At this stage you will almost certainly break something, if you want to change the comment sign rules. Especially, sicne the was no rule, everything was a comment until explicitely matching something useful. (Which is probably the most non-programmer friendly way on comments.) Just remember, that simutrans was made for non-programmers. Also makeobj is made for non-programmer.

Back on topic: What is the problem with a comment removal in line 208 in image_writer.cc? The image is read there and then only written. After that the filename is no longer used at all.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
I intend to keep the old comment style as well, but I fell that the addition of placing a comment behind a key/value will make the documentation of .Dat files much easier.

Since I don't know how the full code works it looks like a bad practice to scatter the comment removal all over the place (3 places)... But if these are the only 3 places, well it isn't beautiful but it works ;)

I will drop the block comment style /* */ since we can't track multiline comments in the hash table.

What do you suggest as inline comment? " //" or " ##" (one leading space) or something else?

Offline VS

  • Senior Plumber (Devotee)
  • Devotee
  • *
  • Posts: 4855
  • Vladimír Slávik
    • VS's Simutrans site
  • Languages: CS,EN
sicne the was no rule, everything was a comment
This.

Anyway! Inline comments would (will?) be a good thing to have, I think, based on my disability to decipher my own comments :D

Given that comments were always informally done with #, ## feels right (to me).

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
@Prissi,

It is not sufficient to do the comment removal on line 205 in image_writer.cc. I found more places where key values are retrieved.

root_writer.cc line 79
name = name + obj.get("obj") + "." + obj.get("name") + ".pak";

obj_writer.cc line 30,31
string type = obj.get("obj");
string name = obj.get("name");


obj_writer_t::write_head 45,46
const char* name = obj.get("name");
const char* msg = obj.get("copyright");


I can think of 4 solutions.

1) Remove comments every where a key value is retrieved (many different places and a potentially risk of missing a place).

2) Iterate through the hash table and remove all comments after all key values has been inserted (root_writer.cc after line 75).

3) Since we use " ##" as inline comment, would it be safe to do it as I originaly did, in the tabfile_t::read_line() ?

4) It is ugly but we can do solution 3 and check the key name to exclude keys that may interfere with the inline comment sequence.
What do you suggest?
« Last Edit: May 02, 2013, 05:46:13 PM by Max-Max »

Offline neroden

  • Devotees (Inactive)
  • *
  • Posts: 831
  • Nathanael Nerode
Warning: the rules for what is respected by makeobj are NOT the same as the rules for what is respected by simutranslator.

It would be nice if it were possible to modify the simutranslator code to process the files according to the same rules.  Does anyone have access to that code?

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Are you referring to the language tab files in the text folder (en.tab for example)?

This will not interfere with them as long you don't use the in-line comment sequence (and only if solution 3 or 4) is implemented.

Offline VS

  • Senior Plumber (Devotee)
  • Devotee
  • *
  • Posts: 4855
  • Vladimír Slávik
    • VS's Simutrans site
  • Languages: CS,EN
We have a site / online app for translation, which gets fed the very same .dat files...

http://simutranslator.simutrans-germany.com/

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
A comment in the copyright string is not very likely. Any letter combination is valid copryright strings (including \n) as well as for names. Especially for this line comments this does not make sense. (There are even objects with spaces in their names). [## is a valid filename too.]

I would also refrain from comments after obj. Usually the comment of what an object comes can be just plain text lines above the actual object (as it is now in most dats I saw [if there was a comment at all]).

As said, simutranslator is fed with the same datfiles. Since simutranslator will ignore images it cannot retrieve, comments after images are ok. But for all other entries I would abstain from comments. Moreover, apart from simutranslator (where dat files are uploaded too) there are several parsing tools which would break too. The all can cope wil random text at the beginning because they ignore nonmatching lines, but otherwise will be confused by C style block comments and trailing line comments.

If a pakset author really wants to use those, maybe just call a preprocessor before parsing by makeobj is the simpler solution.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5520
  • Languages: EN, NO
I find that comments on the end of lines make files more readable than files with comments and statements on alternate lines. Does to ability to write ## in copyright strings really outweigh this? Those who use # in file names are heading for trouble either way, as that character has a special meaning in URLs.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
@Prissi

A little bit earlier (reply #34) I gave 4 solutions and asked for your opinion, you must have missed that one :)
« Last Edit: May 03, 2013, 12:37:18 PM by Max-Max »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
My standing is that we are heading for trouble with this, and some pak sets like pak128 or pak192 and other which use sripting systems would struggle on this. Same maybe as tile cutter. Giving the times to implement that on simutranslators and the phython tools, and so on I am very reluctant to add this.

Commenting stuff at the end of the line is a very programmer thicking. You would not enter a box with your name
Pristovsek ## Maybe Dr. in front?
into an official form. Rather
Name=Pristovsek
So why you would expect any artist doing such weird stuff then? And if you want to comment offsets, no problem as you could do this already today (just avoid the .)

Sure it is easy to add a remove_trailing_comments() function, and call it for whatever object you want. I could see the use for images, and maybe for some other entires (where comments were possible also today due to ignoring superflous input.)

But I highly doubt a comment feasture will be used at all. I am the only pakset developer who replied on this thread. Without further input, I somewhat have to assume this is a solution without a problem.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Isn't it more confusing if you can put a ## comment after an image[] but not after a name or obj?
If ## is implemented it should be the same for the whole "script".

Also, just because ## is implemented doesn't mean that everyone HAS to use it right away everywhere on every single line.
I also said I will drop the block commenting /* */, no point since # can be used instead as you stated in the beginning.

I think we have agreed so far to use the sequence (space)## for trailing comments. I gave 4 alternatives with pros and cons, asking which of them would you (prissi or other developers) prefer?

Offline kierongreen

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 2269
I've used end of line # comments for 2 years now and it worked just fine, I didn't even think to check whether it was allowed first...
Code: [Select]
Obj=ground
Name=LightTexture
Image[0][0]=images/texture-lightmap.0.14 # flat ns|ew
Image[1][0]=images/texture-lightmap.1.6 # sw1
Image[2][0]=images/texture-lightmap.0.6 # sw2
Image[3][0]=images/texture-lightmap.1.7 # se1
...
Of course it worked in this case because image.y.x is looking for x and y being numbers so anything afterwards is ignored.

Offline VS

  • Senior Plumber (Devotee)
  • Devotee
  • *
  • Posts: 4855
  • Vladimír Slávik
    • VS's Simutrans site
  • Languages: CS,EN
IMO external tools can adjust really easily...

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5520
  • Languages: EN, NO
IMO external tools can adjust really easily...
But I'd prefer it if their authors came forth and said that they were in on the change, and not only had the ability, but also the time to do the change. It's not uncommon for people here to suddenly take some time off for weeks or months. It took a while to gather all subwebsite owners for the server move. This is not only a code change, but a change of an (informal?) standard, so all involved parties should be heard first, or at least have had a reasonable chance to be heard. Is all even aware of this topic?

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9511
  • Languages: De,EN,JP
Ok, there was aparently a misunderstanding. I would prefer such comments for individual lines: Like for images, maybe also for all numerical fields get_int() but not automatically for standard entries.

Offline Max-Max

  • Coder/patcher
  • *
  • Posts: 670
    • MK Development
  • Languages: SV,EN
Okay, after the discussion on projects that just dies :P I will step forward and post an good example of what "to do" :)

I have abandoned this concept because I don't see the point to have remarks only to work on some lines and not on some others. I would like to see a consistent system that would work on all lines.

So, I have put this project back into the drawer to collect dust...