News:

SimuTranslator
Make Simutrans speak your language.

Coding style

Started by An_dz, June 02, 2018, 06:34:27 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

An_dz

I was checking out our coding style file and I would like to propose some changes to improve.

1. The first is to remove the final notice about MSVC 6.00, I don't think anyone still uses that so we should choose if we want all code to follow the first or the second style.

2. We should also remove all and every copyright/author notices in the files. Let's remove those This file is part of the Simutrans project headers and those @author stuff. The first is not necessary, legally all files inside the repo follow the license file by default and are heavily outdated (And we should add the license.txt to the root of trunk). The second is useless because that's why we use version control and it would require lots of lines to add all the names.

3. We should add info about the case and the format, like MACROS should be uppercase and everything else lower. Everything uses underlines and not camel case (get_data vs GetData).

4. We should also recommend passing objects as const-reference instead of copying them.

5. We should change some styles. Currently the styles asks for no space between statement and parenthesis and two spaces at start and end of parameters inside parenthesis. I propose the more common space on start and no spaces inside, optionally if parenthesis would collide, but then space must be added both in start and end, and space between closing parenthesis and opening brace. Examples:
// Current
if(  test  &&  is_test  ) {
    ...
}
// Proposed
if (test && is_test) {
    ...
}
if ( (test == 2) && is_test ) { // parenthesis would collide, space added at start and end
    ...
}
draw_it( scr_coord(x, y + 4 * (p + w)) );

Other info we should add is that functions should have starting brace at next line, while at the same line with statements. We should also require space around operators to improve readability, no space between function name and parenthesis. And we should remove that functions return type must be in a separate line.

i=5+size; // wrong, unreadable
i = 5 + size; // correct

if (test<3) // wrong, unreadable and looks like function statements
{
if (test < 3) { // correct

int get_it (int x) { // wrong, looks like base statements
int get_it(int x) // correct
{


6. All functions, classes and files must have a doxygen style comments. Doxygen allows automatic creation of documentation, it's also very readable and understandable. We lack documentation too much, I know that it's boring and time consuming but it will help everybody understand what's going on. Below is a complete look including all of the options we would use:

/** < Must start with two asterisks to let doxygen know it's documentation and not just code comment.
* @brief One liner of what the function does
*
* A long description explaining the function more in-depth.
* It can contain multiple lines, makes automatic links to other
* functions and has *markdown* support.
*
* @note It can have multiples notes
*
* @info It can also have extra information
*
* @warning It can contain a warning for sensible info
*
* @param x Explain parameter X
* @param y Explain parameter Y
*
* @return Explain what the function returns
* @return You can add more possible returns, like:
* @return -1 if an error happens
*/

/**
* @file filename.h
* Describes what the cc/h file do.
*/

ceeac

Quote from: An_dz on June 02, 2018, 06:34:27 AM5. We should change some styles.

I suggest using tools like clang-format or uncrustify to automate the process. Doing the change by hand would be much too tedious imo.

Ters

Quote from: An_dz on June 02, 2018, 06:34:27 AM
2. We should also remove all and every copyright/author notices in the files. Let's remove those This file is part of the Simutrans project headers and those @author stuff. The first is not necessary, legally all files inside the repo follow the license file by default and are heavily outdated (And we should add the license.txt to the root of trunk). The second is useless because that's why we use version control and it would require lots of lines to add all the names.
I do not think all contributors are mentioned in the SVN log, in part because they contributed before SVN was used or because they did not have commit rights and someone else committed for them. At least prissi mentioned who wrote the patches he commits on behalf of others, but you have to know that that's what the thing in parenthesis is.

@author is however useless (once all those mentioned by them are mentioned elsewhere), as code gets rewritten, sometimes completely, without the author being updated. Or people keep adding themselves as @author to a function or file until the @author list is longer than the function or file itself. We've stopped using at work, although we do have perfect history of contributions (or at least did until some projects started using Git).

Quote from: An_dz on June 02, 2018, 06:34:27 AM
4. We should also recommend passing objects as reference instead of copying them.
I prefer const reference and non-const pointer. That makes it possible to spot from the call site whether the arguments may have changed when the function returns. Such functions are bad from a mathematical standpoint, but performance considerations may dictate it. (C and C++ also doesn't allow multiple return values, except by using wrapper types. Newer C++ has tuples that can be used as the wrapper type, but I find them awfully verbose in use, compared to Python.)

Compilers are supposed to implement pass-by-const-reference as pass-by-value if that is more optimal, such as for koord. They can't do so for non-const reference, as the caller can not know if the called modifies the value.

prissi

About references, it depends. Things like handle are ment to be lightweight and adding references to them just increase work.

About the formatting: Personally I like if(  sss  ) { very much, and it was the style of which most Simutrans was written until 2004. But even then people contributed in their own style.

I tried reformatting automated formatter tools (including uncrustify) at least twice, before making it open source and years later after many contributions in other than the original style. But I could not generate something that was nice. EIther there were spaces too much or too little, and the delibrately added spaces in and after statements tend to be gone too. In the end, the code was not better readable, which should be the main purpose.

Also such reformatting break any lingering patches and makes applying the patches to experimetnal even more difficult.

About the magic D_MARGIN_... macros versus gui_theme_t..magin_... well, I think I like the macros. Any decent IDE can point me to the definition, and having six times gui_theme_t::xxx in a single line will not help increase readabilitz either. Having D_XXX+xpos makes immediately clear which is constant (UPPERCASE) and which is variable.

But I agree that the coding style is outdated. However, when I tried to update it the last time, we could not agree one a style ...

An_dz

Quote from: Ters on June 02, 2018, 08:33:24 AM
I prefer const reference [...]
Compilers are supposed to implement pass-by-const-reference [...]
Yes, const-reference, I forgot to specify.

Quote from: prissi on June 02, 2018, 01:01:24 PM
About the formatting: Personally I like if(  sss  ) { very much, and it was the style of which most Simutrans was written until 2004. But even then people contributed in their own style.
No problem, the double spacing is ok, I just want to know if we should keep no space between statement and parenthesis or add it. And if we should apply this rule to function definitions and calls.

TurfIt

A previous discussion: Simutrans coding style. There's another one hiding around here somewhere too...

And a second vote for retaining the double spaces   if(  blah  ) {    good.     if (blah){   horrid.  And double spaces around logical operators, and singles around arithmetics. Give it 20 years and see what your eyes say is more readable.

References - bah. Only when necessary. Otherwise plain good old pointer is more readable given the explicit syntax (non const references are evil).

At some point the language standard version should be settled on too. Should also be discussions on that around.

ACarlotti

Quote from: TurfIt on June 02, 2018, 04:23:30 PMA previous discussion: Simutrans coding style.
The topic or board you are looking for appears to be either missing or off limits to you.

QuoteThere's another one hiding around here somewhere too...
I do remember reading another one recently, but I forget where it was.

Personally I think I prefer "if (blah){" or "if (blah {", but I'd happily go with anything sensible for consistency reasons if I knew what consistent meant. And I think there's a lot of "if (blah){" in Extended so in practice it's often that.

An_dz

Quote from: TurfIt on June 02, 2018, 04:23:30 PM
And a second vote for retaining the double spaces   if(  blah  ) {    good.     if (blah){   horrid.  And double spaces around logical operators, and singles around arithmetics. Give it 20 years and see what your eyes say is more readable.
I don't have a preference, the only thing I actually like is a space around the parenthesis: if (  blah  ) { good, if(  blah  ){ bad.

Quote from: TurfIt on June 02, 2018, 04:23:30 PM
References - bah. Only when necessary. Otherwise plain good old pointer is more readable given the explicit syntax (non const references are evil).
Edited that it's const-references. Non const are truly evil.

Quote from: TurfIt on June 02, 2018, 04:23:30 PM
At some point the language standard version should be settled on too. Should also be discussions on that around.
American vs British? I'm used to British, James is British, most devs are Europeans so are used to British.

Ters

Quote from: An_dz on June 02, 2018, 05:10:20 PM
American vs British? I'm used to British, James is British, most devs are Europeans so are used to British.
The English I learned in school was probably British, but I've probably learned most of my English writing from computer software made in the USA, and from the Internet, where I have no idea what kind of English it is. So I write something that could be called Common English (or Bastard English).

Dwachs

We should delete the codingstyle.txt file.

The coding style is defined by the existing code. Of course, the style is not consistent at all. I do not care about this. I also do not like when commits contain noise from changing in white space because of coding style.

I agree that all this @author stuff can be removed instantly.
Parsley, sage, rosemary, and maggikraut.

ACarlotti

Quote from: Dwachs on June 02, 2018, 06:22:48 PMWe should delete the codingstyle.txt file.
There may still be some value in seeing how things were intended to be done in the past. Maybe it would be better to add a note at the beginning along the lines of 'the rest of the file is historical; the most important thing is to try to be clear and locally consistent, and don't make unnecessary changes.'

prissi

Another vote to delete coding style. Everyone contributes for a long time in their own style. Only, if code is unreadable, one should interfere. We are free software, so let's embrace freedom.

An_dz

So, here's what we can agree on:

1. removing the author/copyright notices
2. no coding style, whatever you prefer

This is what I ask for now:

3. adding our license on the root of trunk (so people can find it easier)
4. add doxygen documentation on functions/classes/files (we can discuss how to do it later)

DrSuperGood

QuoteOtherwise plain good old pointer is more readable given the explicit syntax (non const references are evil).
Why are non-const references evil? The main reason to use a reference over a pointer is that references cannot (assuming sane usage) be null. For a function that mutates the inputs this can be useful because it implicitly forces the arguments to be non-null as opposed to pointers which might be null and so either need to be explicitly tested for null or the function will likely cause a null dereference crash. I understand there are cases where one might want to pass null as an argument, in such case pointers are the only way to go.
QuoteAt some point the language standard version should be settled on too. Should also be discussions on that around.
Currently it appears to be US English? There are currently hundreds of US English "color" used in variable names and comments which is not the UK English word "colour".

I would suggest deleting all function comments from above the implementation definitions. Currently some function comments are mirrored both in the header files above the function declaration as well as in the implementation files above the function definition. If the function comment has to be changed for any reason, eg due to an API change with the function, this means that it currently has to be changed in 2 locations rather than just the header. Any decent IDE tool should be pulling the comments from above the function declaration anyway.

One also should clearly specify which version of C++ is being used as well as what the target platforms are. It is very easy to accidently use a modern C++ language feature only to find that it breaks building on some platforms.

Ters

Quote from: DrSuperGood on June 03, 2018, 06:15:02 AM
Why are non-const references evil?

Because I can not tell if the following code is safe:

if (i > 0) {
  foo(i);
  array[i] = x;
}


foo could potentially be

void foo(int& i) {
  i = -1;
}


Hopefully, such a method would have a name indicating a operation that would obviously involve changing i, but when scanning through the code looking for what may trigger the access violation caused by attempting to read array[-1], I would be looking for syntactical patterns. whatever(i) would parse as a read operation and simply skipped. whatever(&i) would trigger a deeper investigation.

ACarlotti

Quote from: DrSuperGood on June 03, 2018, 06:15:02 AMAny decent IDE tool should be pulling the comments from above the function declaration anyway.
I'm currently using vim/grep/gcc/gdb/git to work on this, so I generally look first in the .cc for the implementation (which is usually a more reliable indicator of what's going on), and might miss coments only in the .h files. But that doesn't mean you have to adapt entirely for my workflow - just any comments acting as a warning against breaking things should perhaps appear with the implemention too.

Quote from: Ters on June 03, 2018, 09:29:39 AMwhatever(i) would parse as a read operation and simply skipped. whatever(&i) would trigger a deeper investigation.
I (at least partly) agree with Ters. When i is a simple data type, then the ability for f(i) to modify i is something I would overlook unless it was clearly indicated. (For complicated objects I would be more likely to expect pass-by-reference rather than pass-by-value, so it's less of an issue to me there). This is almost certainly intuition I've developed while using Python; before 4 months ago I had never used C++ and had only used basic C features.

Dwachs

Quote from: An_dz on June 03, 2018, 12:34:43 AM
So, here's what we can agree on:

1. removing the author/copyright notices
2. no coding style, whatever you prefer

This is what I ask for now:

3. adding our license on the root of trunk (so people can find it easier)
4. add doxygen documentation on functions/classes/files (we can discuss how to do it later)
yes to all of them
Parsley, sage, rosemary, and maggikraut.

Ters

Fortunately, C++ isn't as bad as Fortran when it comes to pass-by-reference. Fortran allows foo(1) to change the value of 1.

DrSuperGood

QuoteBecause I can not tell if the following code is safe:
One should not need to tell if the following code is safe because one should not be writing such code to begin with...

  • If i does not need to be modified it should be made constant. In such case a compile time error is thrown because a non-constant reference to a constant int is being attempted.
  • When using a function like foo it is clear that the argument i must be a return value since it is a non-constant reference. In such case one should be allocating unique local storage and passing a reference to that rather than trying to reuse some local which value is needed later.
  • If foo is meant to both read from i and then write to i that potentially points towards a flawed/nonsense API.
QuoteI'm currently using vim/grep/gcc/gdb/git to work on this, so I generally look first in the .cc for the implementation (which is usually a more reliable indicator of what's going on), and might miss coments only in the .h files. But that doesn't mean you have to adapt entirely for my workflow - just any comments acting as a warning against breaking things should perhaps appear with the implemention too.
I am referring to function API comments, and not comment lines inside functions. Currently many such function API comments, the sort that An_dz proposes be moved to doxygen, are located both in the header by their declaration and in the implementation by their definition. This seems like pointless duplication to me as changes would have to be made to both for consistency which is error prone and unlikely to always happen.

The reason I suggest keeping them only in the header file by the function declaration is because it seems the most natural place for API comments as one can quickly browse and understand the function declarations. Equally well though one could adopt the Java approach and throw them in by the function definitions so that there is no ambiguity during implementation but one then has to rely on pre-built API documents to understand the whole API as trying to browse through thousands of lines of implementation is slow. In either case there should only be a single copy of the function API comment to modify.

Ters

Quote from: DrSuperGood on June 03, 2018, 10:12:30 AM
If i does not need to be modified it should be made constant. In such case a compile time error is thrown because a non-constant reference to a constant int is being attempted.
That is the point. References should be const.

Quote from: DrSuperGood on June 03, 2018, 10:12:30 AM
When using a function like foo it is clear that the argument i must be a return value since it is a non-constant reference. In such case one should be allocating unique local storage and passing a reference to that rather than trying to reuse some local which value is needed later.
In a perfect world. In reality, a developer notices the function used somewhere, but doesn't check the declaration, much less the definition. That may cause the developer to unwittingly drop the copying during a minor rewrite or when using the function elsewhere. Such considerations must be as impossible to ignore as possible. The ultimate solution is to make everything immutable, although other considerations may cause that to be unfeasible.

Quote from: DrSuperGood on June 03, 2018, 10:12:30 AM
If foo is meant to both read from i and then write to i that potentially points towards a flawed/nonsense API.
That is unrelated to the use of references over pointers. I've seen such and API, and being a C API, it used pointers. It didn't make it better.

Quote from: DrSuperGood on June 03, 2018, 10:12:30 AM
Equally well though one could adopt the Java approach and throw them in by the function definitions
I think the Java approach is to put the documentation by the declaration, not the definition. It is just that for plain classes, they are one and the same. When there is an interface and an implementation, the documentation is at the interface. Of course, one reason for this is that the interface is a contract established before the implementation might even have been started on.

DrSuperGood

QuoteI think the Java approach is to put the documentation by the declaration, not the definition.
Java has no concept of separate declarations... When one declares a method in Java one defines it at the same time. That definition might be abstract (implemented in child class) but still it is defined with that respect. As such there is no alternative but to place API comments along with the definitions.
QuoteWhen there is an interface and an implementation, the documentation is at the interface. Of course, one reason for this is that the interface is a contract established before the implementation might even have been started on.
Children are free to overwrite the API documentation of an abstract method inherited from either an abstract class or interface. The standard Java API does this quite a lot to elaborate specific implementation variances of the parent interface/class. In any case there is still 1 copy of the API comments, as opposed to currently in the Simutrans source where the same comments are duplicated multiple times between headers and implementation files.

Ters

Quote from: DrSuperGood on June 03, 2018, 11:27:38 AM
Java has no concept of separate declarations...
Exactly, which is why there isn't so much of an alternate Java approach.

Quote from: DrSuperGood on June 03, 2018, 11:27:38 AMChildren are free to overwrite the API documentation of an abstract method inherited from either an abstract class or interface.
Just like Simutrans is free to put documentation in both header and source files. The point is was trying to convey was that it isn't common in Java to not document the interface, but rather put all documentation on the implementation. There is a difference between interface-classes and implementation-classes in Java and header and source in C/C++, since interface-classes and implementation-classes is also something that exists in C++ in parallel to the header-source concept, and you can put implementation in header files, but within the scope of this discussion, there are some similarites.

isidoro

Convenience can't be the only reason to discard non-const references.  You can't tell for sure if a variable is modified looking just at the place where a function is called.  So what?  Look at the definition.  It happens for objects in Java and there isn't such a big problem, is it?  Just don't assume what you can't assume.

Regarding UK vs. US English, do you think it is such a big problem so as to break patches, porting code, etc.?

Does the "u" or its absence in colo(u)r such a big issue?  Are we talking about different languages with big variations in words that make understanding impossible or just different flavo(u)rs of the same language?  Please be more flexible and sensible.

An_dz

There's no reason to change US <-> UK English. Just use whatever you wish, but at least keep it consistent, I won't create gui_colour_element because all the others don't have the u.

TurfIt

Deleting codingstyle.txt is fine; Not like we'd ever agree on anything anyway.  :P

By deciding on a language standard, I actually meant C++ version, not UK vs US English. Either is fine there, except maybe that 'connexion' thing in Sim-EX. That's just "wrong"!

An_dz

#25
I think we should aim at the following:

Simutrans must compile at least on:

- The oldest supported Ubuntu
- The oldest supported Fedora
- The latest Arch Linux
- The latest MSYS2/MinGW-w64
- The oldest supported Visual Studio
- The latest Haiku (x64)

And Simutrans must run on Windows XP SP3 and up (When XP drops below 1% of market share we can consider going to Windows 7)

As such the current limitations are:

- It must compile on GCC 4.8.2 because of Ubuntu 14.04 LTS
- It must compile on Visual Studio 2010 SP1 (if we consider extended support)
- It must compile on Visual Studio 2013 U5 (if we consider mainstream support)
- It must also compile on GCC 8 (Arch, Fedora 28), 7 (Fedora 27, MinGW-w64, Ubuntu 17.10, 18.04, 18.10) and 5 (Ubuntu 16.04)
- It must run on Windows XP SP3 and up

C++ 11 is fully supported on GCC 4.8, on Visual Studio it's only fully supported on VS 2015. VS 2010 has lambda, rvalue references, auto, decltype, static_assert, and nullptr, some are partial support. VS 2013 has more support for C++11.

Supported C++ versions and features on GCC can be found here, standard library support is here. For Visual C++ you can find both here, but it mostly only shows stuff from VS 2015 and up. VS 2013 support can be seen here, a little more info about STL support can be found here. VS 2010 info, and more info about VS 2013/2012, can be found here.

Ters

Quote from: isidoro on June 03, 2018, 11:00:21 PM
It happens for objects in Java and there isn't such a big problem, is it?

At least some think it is. There is much encouragement to make all objects immutable. It is part of the functional programming drive. Concurrency issue complicates having mutable objects further, but still. Unfortunately, legacy code interacts poorly with immutable objects. And pretty much the entire standard library is made up of mutable objects.

I'd say about half of the bugs in our applications at work is due to objects passed by reference getting mutated, and failure to revert those changes when the method fails for one reason or the other. Sure, one can make it work be fixing the faulty error handling, but people just have a hard time getting that right.

prissi

About C++ standard, why not try to keep it as low as possible. It will be anyway a moving target, and the excersize of updatiing things like the nightlz bulding system (using GCC four, because running on debian stable) should set the standard. Also Haiku still uses GCC 4.xx

Ters

Could you be more specific? GCC 4.x spans from 2005 to 2016. I think it picked up some new standards along the way. Then they went from 5 to 8 in the last three years.

An_dz

Quote from: prissi on June 04, 2018, 03:12:59 PM
About C++ standard, why not try to keep it as low as possible.
The lowest possible is C++98.

Quote from: prissi on June 04, 2018, 03:12:59 PM
things like the nightlz bulding system (using GCC four, because running on debian stable) should set the standard.
That's why I wrote that chunk of stuff up here with a good explanation of what can be set. My proposition allows anyone to check by themselves the current support, no need to say GCC4 and keep that for 20 years until someone asks "Let's upgrade to GCC 6", this way the day Ubuntu 14.04 ends support everybody can jump to GCC 6 without asking.

Quote from: prissi on June 04, 2018, 03:12:59 PM
Also Haiku still uses GCC 4.xx
No, it's already on version 7.3.0. (x64)

FreeBSD has all gcc options from 4.7 up to 8.xx. The stable options are 4.7.4, 4.8.5, 4.9.4, 5.5.0, 6.4.0, 7.3.0.

The main limitation, as I wrote above, are Ubuntu 14.04 and Visual Studio.

Ters

I wonder if Windows 7 will drop below Windows XP before Windows XP drops below 1%. It has happened to both Windows Vista before it and Windows 8 and 8.1 after it. Although for Vista, there was little reason not to move on to 7, and 8 was an oddball stepping stone towards 10. (Some see 8 as superior to 10, but if they also see it as superior to 7, then I can not understand their way of thinking.)

Gentoo Linux by the way is pushing GCC 6.4 last I checked, which completes the line-up of GCC versions. They warned about some incompatibilities between 5 and 6, which required a complete recompile of the system (both of them). That made me hesitate about upgrading, but I went for it. Maybe not all are ready for that.

An_dz

#31
Another point I would like to make is that we should not be constrained to old standards because of old releases. As long as it does not break compiling on the old version we should accept it. If a new standard just causes warnings on older compilers we should not prevent adding such new features.

Quote from: Ters on June 04, 2018, 05:49:19 PM
(Some see 8 as superior to 10, but if they also see it as superior to 7, then I can not understand their way of thinking.)
Windows 8 is faster than 7 and 10 and the only difference between 7 and 8 is the lack of a start menu and 'aero' transparency.

Quote from: Ters on June 04, 2018, 05:49:19 PM
I wonder if Windows 7 will drop below Windows XP before Windows XP drops below 1%.
No way, Windows 7 is too high, and the fall rate of both are very close, XP is just slightly slower. Also there's a lot of software that doesn't run on XP anymore, the "LTS" version of Firefox has just dropped support for XP/Vista, which basically means that all browsers are dead on XP. Most games also don't run on XP, even fairly old ones. I think XP will drop below 1% by middle 2019. Windows 8.1 drop rate is much slower by comparison but since its code base is modern it won't be unsupported any time soon.

Edit:
Stats from May from our site and forum: (not excluding bots I believe)


OSWindows XPWindows VistaWindows 7Windows 8Windows 8.1Windows 10MacLinux
Portal1%0%25,8%0,5%7,1%35,1%8,3%12%
Forum1,7%0,6%23,6%0,3%5,1%37,25%5,5%16%

Ters

Quote from: An_dz on June 04, 2018, 08:25:17 PMMost games also don't run on XP, even fairly old ones.
Most of the games I play struggle to run at anything newer. But it is not likely to be games that is keeping XP alive, or which is driving those still using it to upgrade. Windows XP was sort of the OS when everyone got online. As long as they can still using Internet Explorer to do online banking, and Outlook Express to read e-mail, they are fine. And maybe play Solitaire once in a while. Some might be happy that they no longer have to spend most of the time the computer is turned on installing updates. Then there are all the businesses tied to XP because their expensive, tailor made applications don't run on anything newer, or because no one can be bother to try. They are not likely relevant to us as targets, but they will affect the statistics available to us.

DrSuperGood

QuoteWindows 8 is faster than 7 and 10 and the only difference between 7 and 8 is the lack of a start menu and 'aero' transparency.
The only slowdown I noticed with Windows 10 over Windows 7 was the result of Windows 10 using more memory for the OS than Windows 7 did. After getting more RAM such slowdown disappeared entirely.

Can we all agree that the old windows sockets driven network implementation with IPv4 only support can die already? That would remove a considerable amount of code from the network implementation increasing maintainability.

prissi

AMIGA_OS requires IPV4_ONLY, so that part has to stay for now. But you can delete the two winsock ifdefs inside, if it makes you happy.