News:

Simutrans.com Portal
Our Simutrans site. You can find everything about Simutrans from here.

[Project Canceld] MakeObj Patch: Comments on all rows in dat

Started by Max-Max, April 13, 2013, 01:24:16 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

Max-Max

Hi,

This is my first contribution. This patch makes the .dat parsing a little bit more stable by removing all trailing non readable characters ( character 0x00 up to and including 0x20).

This patch also allows comments at the end of any line in the .dat file.

Example (ignore the crappy google translations):

Obj=menu
name=GeneralTools
Image[0]=> menu_buttons.0.1   # Query tool
Image[1]=> menu_buttons.0.2   # Demolition
Image[2]=> menu_buttons.0.13  # Land Lift
Image[3]=> menu_buttons.1.13  # Land Lower
Image[4]=> menu_buttons.1.12  # Land restore
Image[5]=> menu_buttons.0.6   # Marker (plate)
Image[6]=> -                  # Block delete reservation
Image[7]=> menu_buttons.1.1   # Transformer station
Image[8]=> menu_buttons.1.2   # Found new city
Image[9]=> menu_buttons.1.3   # Plant trees
Image[10]=> menu_buttons.1.4  # Schedule maintenance
Image[11]=> menu_buttons.1.5  # Construction map editor
Image[12]=> menu_buttons.1.6  # Industry link
Image[13]=> menu_buttons.0.3  # Plant forest
Image[14]=> menu_buttons.0.4  # Replace stop
Image[15]=> menu_buttons.0.5  # Make stop public
Image[16]=> menu_buttons.0.8  # Add city car
Image[17]=> menu_buttons.5.7  # Buy house
--


I'm not sure what to do with this patch? Who/when decides when to include this in the trunk? Should I check it in somewhere?
- My code doesn't have bugs. It develops random features...

IgorEliezer

Quote from: Max-Max on April 13, 2013, 01:24:16 AMI'm not sure what to do with this patch? Who/when decides when to include this in the trunk? Should I check it in somewhere?
A member of the devteam will check your patch and, if okay, it will be committed in the SVN trunk.

BTW, nice addition. :)

Max-Max

I tried to make some .dat files and had a hard time due to its sensitivity and the lack of comments =)

Okay, so all patches should be posted here and just wait for a developer to check it?
Are they moved from this forum when they have been checked?

- My code doesn't have bugs. It develops random features...

IgorEliezer

Quote from: Max-Max on April 13, 2013, 02:31:08 AM
Okay, so all patches should be posted here and just wait for a developer to check it?
Are they moved from this forum when they have been checked?
Yes.
Once checked, they are moved to one of the sub-forums here: http://forum.simutrans.com/index.php?board=33.0

TurfIt

Inline comments looks quite nice.

One thing, IMHO author attributions in the code makes things messy in the future when others make changes. Any objection to removing your name from the comments? Note: patches from others have the author shown in the svn commit message.

Dwachs

Can one still  bind the '#' key to a tool? In pak64, '#' is used to toggle the grid.
Parsley, sage, rosemary, and maggikraut.

Fabio

Maybe add an escape character?

Or require ## instead of # for inline comments?

Max-Max

@Dwachs
I was not aware of the # bind situation and this patch will not work as it is now (it will count the first occurrence as the remark character).

Can you please post an example I can test with?



@TurfIt
I had seen comments from others in the code so I added mine too. But I have no problems by removing it.
- My code doesn't have bugs. It develops random features...

Ters

Quote from: Fabio on April 13, 2013, 10:20:33 AM
Maybe add an escape character?

Or require ## instead of # for inline comments?
I'd go for double comment symbols over an escape character, though it makes end-of-line comments different from those on their own lines, which must still support single comment symbol for backward compatibility.

Dwachs

Quote from: Max-Max on April 13, 2013, 11:47:03 AM
@Dwachs
I was not aware of the # bind situation and this patch will not work as it is now (it will count the first occurrence as the remark character).

Can you please post an example I can test with?
The file that contains the key bindings (pak*/config/menuconf.tab) is also read by this class. Just test with pak64, press '#' and see, what happens (or not).
Parsley, sage, rosemary, and maggikraut.

Max-Max

I'm almost sure it will not work. I will have a look at the problem. I'm not comfortable with a mix of two almost alike comment sequences.

I'll see what I can do...
- My code doesn't have bugs. It develops random features...

Max-Max

Question:

What is the difference between #16 and # in these two lines?

simple_tool[0]=0,#16
simple_tool[12]=12,#

- My code doesn't have bugs. It develops random features...

prissi

Personally, I would refrain from this, as this tools also reads pak set definitions, where this character might be needed in copyright strings and the like.

Furthermore, you can add these comments even now for dat files line in pak which are reading images (if you do not use a dot in the comment).

Max-Max

I have already implemented C++ comment style, both /* block */ and //inline. I also tossed in ## as inline as well.

Right now I'm testing and improving the stability a bit. I will send a new patch as soon I'm done testing...
- My code doesn't have bugs. It develops random features...

prissi

As mentioned, this is not needed for makeobj. You can use it right now, if you refrain form writing somthing like "3.14" in the comments. Otherwise, modify besch_/writer/image_write.cc please.

Max-Max

There are many places where whitespaces are removed. I made a trim_whitespace() function, removing all characters, to the left and right, up to and including space.

For me it feels like this should be in some sort of helper module, or string tools. Should I add it to the class  tabfile_t (where I use it) or as a global function somewhere else?
- My code doesn't have bugs. It develops random features...

Max-Max

Here is the new patch that implements C++ style comments for both makeobj and Simutrans' config files.
I left the old comment style (# and spc) intact. I didn't add the ## style because it might mess with future plans...

I know this might not be so useful for those who already are well familiar with the .pak and config files, but for a green little man/woman this is very useful to bring order and keep notes of things. It is up to everyone for them self if they want to use it or not.

This was my test case:

// in line comment
// in line comment
/**
* This is a single
* comment block
*
*/

# Old style
Ancient style

Obj = menu                      // inline comment
name =  GeneralTools
Image[0] => menu_buttons.0.1   /* inline comment block */

Image[1]=> /*inline comment block 2*/ menu_buttons.0.2
/*inline comment block 3*/Image[2]=> menu_buttons.0.13

Image[3]=> /* mixed style 1*/ menu_buttons.1.13  // Land Lower
/* mixed style 2 */ Image[4]=> menu_buttons.1.12  // Land restore

Image[5]=>    menu_buttons.0.6   // Marker (plate)
Image[6]=> -                  // Block delete reservation
Image[7]=> menu_buttons.1.1   /* Transformer station,
very special test case */ Image[8]=> menu_buttons.1.2   // Found new city
/* extreme */ Ima/*block*/   ge[9]=/* commenting */ >menu_buttons.1.3   // Plant trees
Image[10]=> menu_buttons.1.4  // Schedule maintenance
Image[11]=> menu_buttons.1.5  // Construction map editor
Image[12]=> menu_buttons.1.6  // Industry link
Image[13]=> menu_buttons.0.3  // Plant forest
Image[14]=> menu_buttons.0.4  // Replace stop
Image[15]=> menu_buttons.0.5  // Make stop public
Image[16]=> menu_buttons.0.8  // Add city car
Image[17]=> menu_buttons.5.7  // Buy house
--
--


It can be one or more whitespace around a key or value, these are trimmed down, except for leading on keys (this is still a comment).
Whitespace within a value is not removed.

As I understood, both simutrans and makeobj are using
tabfile_t::read_line()
to read these files. The comments are resolved and removed at this level. This makes them invisible to the rest of the applications.

Is there another way to read these files, other through
tabfile_t::read_line()
function?

PS. If the devteam thinks that ## should be included, just let me know...

Enjoy...

EDIT:
This is the wrong patch file, see next post...
- My code doesn't have bugs. It develops random features...

Max-Max

It seems like the uploaded patch file is cached somewhere. I will try to upload in this post again...

EDIT:
Yes this is the one =)
- My code doesn't have bugs. It develops random features...

jamespetts

Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

IgorEliezer

Quote from: jamespetts on April 14, 2013, 06:28:12 PM
This is a rather splendid idea!
Yes, it makes easier for people who got started with addon making to learn from the DAT files.

Dwachs

Looks good :)

Two comments:
(a) in tabfile.h you have some stray 'tabfile_t:: ' within class  declaration
(b) our coding style is not that much consistent, but we do enclose all blocks by curly brackets
Parsley, sage, rosemary, and maggikraut.

Max-Max

Quote from: Dwachs on April 14, 2013, 06:50:14 PM
(a) in tabfile.h you have some stray 'tabfile_t:: ' within class  declaration
(b) our coding style is not that much consistent, but we do enclose all blocks by curly brackets

I read the coding standard doc, but I must have missed a spot... I'll have a look when I get home tonight and post a new patch...
- My code doesn't have bugs. It develops random features...

Max-Max

This is hopefully the final patch.

I only fixed the coding format and documentation according to the coding style document.
- My code doesn't have bugs. It develops random features...

prissi

I am not very keen on allowing comments in from of commands.

Other than this I am not sure what to gain from another kind of commetns. (Especially keeping in mind that almost all user of dat files are rathermore graphic biased than programmer, multiline or indentation c comments look rather dangerous, in terms of mismatching or comments starting within comments). Also the comment string and some other server sections of simuconf.tab may contain already such strings.

Moreover, multiline comments work already fine, they are ignored. ;)

Ok, after these inital sceptics: I am still not against further comment styles. But I would suggest to put all those directly into the parsing of the file name in the image_writer. That would take them out of simuconf.tab and some other parsing.

Max-Max

I'm by far not the person to argue about the code, after poking around in it for a few days, but I don't really understand where in image_writer you refer to. Can you give me the function name so I know where to look?

I have single stepped through simutrans and in simmain.cc function simu_main() a tabfile_t instance is created to read the simuconf.tab file.

tabfile_t simuconf;
char path_to_simuconf[24];
// was  config/simuconf.tab
sprintf(path_to_simuconf, "config%csimuconf.tab", path_sep[0]);
if(simuconf.open(path_to_simuconf)) {
{
tabfileobj_t contents;
simuconf.read(contents);
// use different save directories
multiuser = !(contents.get_int("singleuser_install", !multiuser)==1  ||  !multiuser);
found_simuconf = true;
}
simuconf.close();
}


Furter down simuconfig.tab is access through this object all along. In umgebung_t::default_einstellungen.parse_simuconf the simuconf.tab is again read through the tabfile_t where all comments are removed before each line is returned.

tabfile_t::read() is calling tabfile_t::read_line() and it is in there all comments are removed.

For newbies the abilities to put a remark on the same line is invaluable when it comes to learning the system. Putting a remark over each statement in a .dat file makes it very hard to read.

EDIT:
image_writer_t::write_obj() is called after the .dat or .tab file has been read. All comments has already been removed at this point. Was this your concern?
- My code doesn't have bugs. It develops random features...

prissi

My concern is that putting comment removal in tab file handling will remove too much, as those routines are used in many situations, even for reading description of webservers and the like as well as UTF8 and some other encodings.

If you remove the comments in the string the image writer uses (i.e. only there the comments are filtered out), then these issues will not arise, i.e. you will not affect what people can write into copyright strings, web addresses and the like.

(To be overly pedantic: so far image files starting or using spaces as well a # were completely legal filenames ... ;) )

Max-Max

Thank you for the explanation, I do see the problem now...

Okay, back to the drawing board again...
- My code doesn't have bugs. It develops random features...

Max-Max

Allright, I have made some more digging and need some advice.

If I move the comment removal into the image_writer we still have a problem in
root_writer_t::write()
.

Because the
tabfile_t::read()
function is storing all keys and values, the values are stored together with any following comments. Then later on when
root_writer_t::write()
retrieves the name key it may have a comment appended to its value generating an invalid filename.

To remove comments, it has to be done both in
root_writer_t::write()
and
image_writer_t::write_obj()
. Since we also work with keys/values we can't keep track of block comments either. Even if we drop block commenting (C style) there will be 3 places (including today's comment style) where comments are removed.

I still think the best way would be to remove them in
tabfile_t::read_line()
if we can come up with a good inline sequence not interfering with something else, like http:// addresses or common copyright information.

As Fabio suggested, we could use for example use " ##" but with a leading space to make it more distinct.
- My code doesn't have bugs. It develops random features...

prissi

At this stage you will almost certainly break something, if you want to change the comment sign rules. Especially, sicne the was no rule, everything was a comment until explicitely matching something useful. (Which is probably the most non-programmer friendly way on comments.) Just remember, that simutrans was made for non-programmers. Also makeobj is made for non-programmer.

Back on topic: What is the problem with a comment removal in line 208 in image_writer.cc? The image is read there and then only written. After that the filename is no longer used at all.

Max-Max

I intend to keep the old comment style as well, but I fell that the addition of placing a comment behind a key/value will make the documentation of .Dat files much easier.

Since I don't know how the full code works it looks like a bad practice to scatter the comment removal all over the place (3 places)... But if these are the only 3 places, well it isn't beautiful but it works ;)

I will drop the block comment style /* */ since we can't track multiline comments in the hash table.

What do you suggest as inline comment? " //" or " ##" (one leading space) or something else?
- My code doesn't have bugs. It develops random features...

VS

Quote from: prissi on May 01, 2013, 08:27:24 PM
sicne the was no rule, everything was a comment
This.

Anyway! Inline comments would (will?) be a good thing to have, I think, based on my disability to decipher my own comments :D

Given that comments were always informally done with #, ## feels right (to me).

My projects... Tools for messing with Simutrans graphics. Graphic archive - templates and some other stuff for painters. Development logs for most recent information on what is going on. And of course pak128!

Max-Max

@Prissi,

It is not sufficient to do the comment removal on line 205 in image_writer.cc. I found more places where key values are retrieved.

root_writer.cc line 79
name = name + obj.get("obj") + "." + obj.get("name") + ".pak";

obj_writer.cc line 30,31
string type = obj.get("obj");
string name = obj.get("name");


obj_writer_t::write_head 45,46
const char* name = obj.get("name");
const char* msg = obj.get("copyright");


I can think of 4 solutions.

1) Remove comments every where a key value is retrieved (many different places and a potentially risk of missing a place).

2) Iterate through the hash table and remove all comments after all key values has been inserted (root_writer.cc after line 75).

3) Since we use " ##" as inline comment, would it be safe to do it as I originaly did, in the tabfile_t::read_line() ?

4) It is ugly but we can do solution 3 and check the key name to exclude keys that may interfere with the inline comment sequence.
What do you suggest?
- My code doesn't have bugs. It develops random features...

neroden

Warning: the rules for what is respected by makeobj are NOT the same as the rules for what is respected by simutranslator.

It would be nice if it were possible to modify the simutranslator code to process the files according to the same rules.  Does anyone have access to that code?

Max-Max

Are you referring to the language tab files in the text folder (en.tab for example)?

This will not interfere with them as long you don't use the in-line comment sequence (and only if solution 3 or 4) is implemented.
- My code doesn't have bugs. It develops random features...

VS


My projects... Tools for messing with Simutrans graphics. Graphic archive - templates and some other stuff for painters. Development logs for most recent information on what is going on. And of course pak128!