The International Simutrans Forum

 

Author Topic: Modern C++?  (Read 5932 times)

0 Members and 1 Guest are viewing this topic.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Modern C++?
« on: February 03, 2018, 12:39:25 AM »
I put this post in "patches and projects", although I am not sure that it is a perfect fit for this board as there is no patch and it is not a project as such - but I am not sure what other subforum would best suit this topic. Although I write from the perspective of developing the Extended branch, at least most of this will be equally applicable to Standard, which is why I have not posted this on the Extended subforum.

Simutrans is a rather old codebase. Much of the coding style in Simutrans (both Standard and Extended) dates from the 1990s when Simutrans was first written. Indeed, the project was started before even the C++98 standard was finalised, and finished only shortly afterwards.

Consequently, Simutrans has many characteristics of a codebase of its era, including copious use of raw pointers, char* or char[] strings, Simutrans-specific container classes and smart pointers (and minimal use of the STL), and little or no use of more recent language features such as inline initialisation, lambdas, constructor inheritance, the shared/unique/weak pointers and many others. Simutrans-Extended does use a C++11 feature, thread local storage, but that was a recent innovation necessary to make certain multi-threading aspects work effectively.

My own learning of C++ has been through the Simutrans codebase and what I was able to learn online when I first started working on Simutrans from the end of 2008, and I had largely missed learning about the new features introduced in C++11 and later and am only recently coming to realise that they mark a significant change in the way in which the language works, such that pre-C++11 coding styles and techniques are considered outdated. It is interesting to note that "modern C++" (as it seems that C++ using language features from C++11 onwards is now called) is enjoying a significant resurgence of interest after the importance of C++ as a language had appeared to wane in the 2000s, as the new language and STL features replicate many of the advantages of more modern languages whilst retaining the speed, cross-platform and backwards compatibility of C++.

To what extent is it worthwhile adopting some of these new language features in the Simutrans codebase (both Standard and Extended)? I do not imagine that adopting the new language features just for the sake of being up to date is likely to be worth the large effort involved in re-factoring myriad lines of code, but is there any merit in (1) selectively refactoring certain critical parts of the code; (2) adopting some or all conventions of modern C++ when code is refactored for other reasons; and (3) coding new features using some or all parts of the modern C++ style instead of the traditional C++ style?

I know that there has been some discussion of related issues in the past and the argument was made at the time that Simutrans needed to be able to compile on some very old platforms that had not even adopted compilers compatible with the C++98 standard. I do not know whether that is of non-trivial significance to any users of Simutrans-Standard now. It is definitely not relevant to Extended, which only aims to target Windows, Linux and Mac (and which since 2016 has required C++11 due to the aforementioned local storage), but I do not want to Extended codebase to diverge from the Standard codebase more than is necessary for Extended's different feature-set, so it is worthwhile adopting a joint approach to this issue where possible.

Does anyone have any idea whether the latest STL implementation for collection classes might be better or worse (for performance) than the Simutrans versions? I know that the Simutrans hashtable can be slow when there are many entries in it; but I do not know whether the STL version is any faster. Are the vectors better or worse? I notice that Simutrans has a weighted vector which is quite important that I do not believe that the STL has (although I could be wrong).

Simutrans even has its own min() and max() templates, which have caused real problems in the past: using them with 64-bit integers silently loses precision in ways that are not deterministic between different platforms, and I spent weeks tracking down network desyncs that were caused by this some time ago, eventually replacing the Simutrans min() and max() with std::min() and std::max() whenever 64-bit variables were used. Would it not be better to remove the Simutrans min() and max() entirely and use the STL versions instead to avoid this sort of problem in the future?

We will not get rid of anything but a fraction of Simutrans's raw pointers any time soon, but does anyone know how the homegrown quickstones compare to the new weak_ptr (together with shared_ptr and unique_ptr) in terms of performance? The latter are certainly more flexible in implementation, save that they cannot return a handle ID that can be used for purposes other than memory management as is currently done in many places with convoihandle_t and halthandle_t objects in particular. Is it worthwhile replacing NULL with nullptr? That could be done wholesale relatively quickly with a search/replace.

I wonder also whether using traditional C++ might over time increasingly deter newer developers from working on the project - but short of refactoring the codebase entirely, which seems impractical, I am not sure how one would handle that.

I should be very interested in thoughts from new and seasoned Simutrans developers alike on this topic.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #1 on: February 03, 2018, 01:28:49 AM »
The reason Simutrans cannot use more advanced C++ revisions is because it has to support being built with compilers that existed long before many of the revisions were made.

Using the recent File Systems API of C++ would have solved all this unicode file path nonsense dead easilly, but instead one has to hack our own file system API. Like wise a lot of typing problems could be solved using feautres such as "auto" for local variables.

Seeing how Simutrans still must support only IPv4 and running off a single thread I do not expect this to change any time soon.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #2 on: February 03, 2018, 02:20:03 AM »
Do we have data on how many (if any) people are actually still using Simutrans on these old platforms in 2018?

Incidentally, I am in the process of experimenting with modern C++ in Extended by replacing the outdated, Extended specific ITERATE macro with the C++11 ranged for (currently in the vehicle-management branch for testing). In Extended, of course, this should not be an issue, since we need C++11 compatibility in any event for the thread_local keyword.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #3 on: February 03, 2018, 03:00:30 AM »
As far as I am aware it is one of the Windows GCC ports that is the problem.

If C++ version is made, one should consider also performing the following...
  • Remove legacy Windows IPv4 code path. The Berkley socket implementation works on any reasonable version of Windows and supports both IPv4 and IPv6.
  • Force dependency on pThread. One cannot be multi-thread ignorant in this day and age where processors have more and more cores.
  • Migrate to using C++ FileSystem API if C++ version choosen supports it. Would solve a lot of platform specific file system problems.
  • Only support SDL2. Would clean up the 5 different front ends currently used down to 2.
  • Look to officially support 64bit builds. Although they currently work I am pretty sure there are optimizations to be had which could bring 64bit build performance in average use cases to parity or better than a x86 build.
  • Try to remove excessive Macro usage. A lot of the changes with recent C++ revisions were adding better template support so that one seldom has to use macros. Some of the loop macros are so complex that it breaks MSVC intellisense so it does not even recognize the holding function definition exists.


The use of standard collection classes to replace the special Simutrans ones is something that would need to be profiled. The Simutrans ones were written such to be more efficient in certain aspects.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #4 on: February 03, 2018, 12:02:20 PM »
Those are all good suggestions. Is there any reason for Simutrans to be supporting outdated versions of Windows GCC ports? Msys can cope without difficulties with C++11, and supports all the versions of Windows that Microsoft still supports. Is there any reason for Simutrans to support older versions of Windows than Microsoft supports? Anyone with such very old hardware would probably be better off installing Linux in any event.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #5 on: February 03, 2018, 04:44:40 PM »
Windows 7 (supported until 2020) is apparently not supported by MSVC 2017, only be previous versions (MSVC 2015). So this would be probably one limit.

The simutrans specific stuff is usually quite fast. For instnace the freelist, which was about 20-30 times faster than malloc/free (and had less overhead). I would suspect similar for the quickstone_tpl.

On most other things it is probably not so much different.

Also, according to this https://stackoverflow.com/questions/30829364/open-utf8-encoded-filename-in-c-windows teh filename issue is not fixed by C++ file-IO.

The question is, if potentially breaking code is worth the gain (of potentially finding errors or making the code base more appealing). Or if the time is not rather spend on something else.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #6 on: February 03, 2018, 05:28:57 PM »
The VS 2015 limitation might be a reason not to use C++17 features, but should not limit earlier features from C++11 and C++14. Indeed, I am using Visual Studio 2015 and can use thread local storage, the auto keyword, ranged for, etc.

For the collection classes, we might want to profile.

As to your final question, I suspect that the correct answer differs depending on the part of the code in question. Also, we need to consider whether to use new language features in new features or refactoring for other reasons.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #7 on: February 03, 2018, 05:40:03 PM »
Using the recent File Systems API of C++ would have solved all this unicode file path nonsense dead easilly, but instead one has to hack our own file system API.

I'm not sure how that helps unless the endpoints are in on the game. Does libz, bzip2 and SDL support the new C++ API? Last I checked, they only had a C API. And that is only if the C++ API has been implemented correctly in the first place, and does not rely on the char C API being UTF-8.

  • Remove legacy Windows IPv4 code path. The Berkley socket implementation works on any reasonable version of Windows and supports both IPv4 and IPv6.
What do you mean? I have not seen anything but Berkley sockets on Windows, in the form of Winsocks, and Winsocks supports both IPv4 and IPv6 through the same API (like Berkley sockets should).


  • Force dependency on pThread. One cannot be multi-thread ignorant in this day and age where processors have more and more cores.
  • Only support SDL2. Would clean up the 5 different front ends currently used down to 2.
I really prefer having everything statically linked, as that mean I only have the executable and the settings.xml file to worry about when testing different versions, and only the former when upgrading. I don't have to get umpteen other DLLs onto the path. Multithreading the renderer is a lot of trouble that gives me nothing in return. If going modern is the goal, then moving rendering away from the CPU should be the goal, but as far as I have been able to figure out, that pretty much requires rewriting karte_t from scratch, including lots of changes in everything interacting with it. The dirty rectangle system has to operate on the world, not on the already rendered image.


  • Look to officially support 64bit builds. Although they currently work I am pretty sure there are optimizations to be had which could bring 64bit build performance in average use cases to parity or better than a x86 build.
I don't think Simutrans can equally support both 32-bit and 64-bit without a lot of macro hacks. Even then, there is the issue that cache lines are the same size on both 32-bit and 64-bit x86. Increased pointer sizes might mean that you get bad cache line alignment, which in turn causes bad performance. However, I saw no indication of that back when I tried tried getting 64-bit builds to work at all. Maybe I didn't have big enough maps at the time, but I haven't seen people worrying about cache lines in a looong while, except here.


  • Try to remove excessive Macro usage. A lot of the changes with recent C++ revisions were adding better template support so that one seldom has to use macros. Some of the loop macros are so complex that it breaks MSVC intellisense so it does not even recognize the holding function definition exists.
If code can't be properly inlined from a template by a compiler from the last 15 years, if probably can't be written with a macro without going insane. It would probably take some insane levels of nesting.



The simutrans specific stuff is usually quite fast. For instnace the freelist, which was about 20-30 times faster than malloc/free (and had less overhead). I would suspect similar for the quickstone_tpl.


The C++ library is designed for plugable allocators, so we can use its container classes, but use freelist for the actual memory allocations. I'm don't think quickstone_tpl has a direct replacement in the C++ standard. One could perhaps solve the original problem with shared and weak pointers, but I don't know what their overhead is.


Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #8 on: February 04, 2018, 01:31:11 AM »
Quote
Windows 7 (supported until 2020) is apparently not supported by MSVC 2017
But... https://docs.microsoft.com/en-us/visualstudio/productinfo/vs2017-system-requirements-vs states that Windows 7 is supported. And...
https://docs.microsoft.com/en-us/visualstudio/productinfo/vs2017-compatibility-vs
States that building for Windows 7 is supported.
Quote
Also, according to this https://stackoverflow.com/questions/30829364/open-utf8-encoded-filename-in-c-windows teh filename issue is not fixed by C++ file-IO.
That seems to be using boost, which would be the file system API before it was added as core to C++.
Quote
I don't think Simutrans can equally support both 32-bit and 64-bit without a lot of macro hacks. Even then, there is the issue that cache lines are the same size on both 32-bit and 64-bit x86. Increased pointer sizes might mean that you get bad cache line alignment, which in turn causes bad performance. However, I saw no indication of that back when I tried tried getting 64-bit builds to work at all. Maybe I didn't have big enough maps at the time, but I haven't seen people worrying about cache lines in a looong while, except here.
A lot of Simutrans relies of 64bit stuff to work however. So in the end it is all a balancing act to try and assure the negative performance from larger pointers is countered by the higher performance from more registers and 64bit instructions.
Quote
If code can't be properly inlined from a template by a compiler from the last 15 years, if probably can't be written with a macro without going insane. It would probably take some insane levels of nesting.
Do not under estimate modern compilers, expect maybe MSVC...

Even if they cannot support it perfectly efficiently now, they might tomorrow.
Quote
What do you mean? I have not seen anything but Berkley sockets on Windows, in the form of Winsocks, and Winsocks supports both IPv4 and IPv6 through the same API (like Berkley sockets should).
This monster says otherwise...
Code: [Select]
#ifdef USE_IP4_ONLY
// Network load. Address format e.g.: "128.0.0.1:13353"
char address[1024];
static char err_str[256];
uint16 port = 13353;
const char *cp2 = strrchr(cp,':');
if(cp2!=NULL) {
port=atoi(cp2+1);
// Copy the address part
tstrncpy(address,cp,cp2-cp>sizeof(address)-1?sizeof(address)-1:cp2-cp+1);
cp = address;
}

struct sockaddr_in server_name;
memset(&server_name,0,sizeof(server_name));
server_name.sin_family=AF_INET;
#if USE_WINSOCK
bool ok = true;
server_name.sin_addr.s_addr = inet_addr(cp); // for windows we must first try to resolve the number
if((int)server_name.sin_addr.s_addr==-1) {// Bad address
ok = false;
struct hostent *theHost;
theHost = gethostbyname( cp ); // ... before resolving a name ...
if(theHost) {
server_name.sin_addr = *(struct in_addr *)theHost->h_addr_list[0];
ok = true;
}
}
if(!ok) {
#else
/* inet_anon does not work on BeOS; but since gethostbyname() can
* do this job on all other systems too, we use it only:
* instead of if(inet_aton(cp,&server_name.sin_addr)==0) { // Bad address
*/
struct hostent *theHost;
theHost = gethostbyname( cp );
if(theHost) {
server_name.sin_addr = *(struct in_addr *)theHost->h_addr_list[0];
}
else {// Bad address
#endif
sprintf( err_str, "Bad address %s", cp );
RET_ERR_STR;
}
server_name.sin_port=htons(port);

SOCKET my_client_socket = socket(AF_INET,SOCK_STREAM,0);
if(my_client_socket==INVALID_SOCKET) {
err = "Cannot create socket";
return INVALID_SOCKET;
}

if (connect(my_client_socket, (struct sockaddr*)&server_name, sizeof(server_name)) == -1) {
sprintf(err_str, "Could not connect to %s", cp);
RET_ERR_STR;
}

#else

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #9 on: February 04, 2018, 09:38:09 AM »
A lot of Simutrans relies of 64bit stuff to work however.

How can it? Which parts?

Do not under estimate modern compilers, expect maybe MSVC...

Did you think I did? I was aiming for the opposite.

This monster says otherwise...
Code: [Select]
#ifdef USE_IP4_ONLY
   // Network load. Address format e.g.: "128.0.0.1:13353"
   char address[1024];
   static char err_str[256];
   uint16 port = 13353;
   const char *cp2 = strrchr(cp,':');
   if(cp2!=NULL) {
      port=atoi(cp2+1);
      // Copy the address part
      tstrncpy(address,cp,cp2-cp>sizeof(address)-1?sizeof(address)-1:cp2-cp+1);
      cp = address;
   }

   struct sockaddr_in server_name;
   memset(&server_name,0,sizeof(server_name));
   server_name.sin_family=AF_INET;
#if USE_WINSOCK
   bool ok = true;
   server_name.sin_addr.s_addr = inet_addr(cp);   // for windows we must first try to resolve the number
   if((int)server_name.sin_addr.s_addr==-1) {// Bad address
      ok = false;
      struct hostent *theHost;
      theHost = gethostbyname( cp );   // ... before resolving a name ...
      if(theHost) {
         server_name.sin_addr = *(struct in_addr *)theHost->h_addr_list[0];
         ok = true;
      }
   }
   if(!ok) {
#else
   /* inet_anon does not work on BeOS; but since gethostbyname() can
    * do this job on all other systems too, we use it only:
    * instead of if(inet_aton(cp,&server_name.sin_addr)==0) { // Bad address
    */
   struct hostent *theHost;
   theHost = gethostbyname( cp );
   if(theHost) {
      server_name.sin_addr = *(struct in_addr *)theHost->h_addr_list[0];
   }
   else {// Bad address
#endif
      sprintf( err_str, "Bad address %s", cp );
      RET_ERR_STR;
   }
   server_name.sin_port=htons(port);

   SOCKET my_client_socket = socket(AF_INET,SOCK_STREAM,0);
   if(my_client_socket==INVALID_SOCKET) {
      err = "Cannot create socket";
      return INVALID_SOCKET;
   }

   if (connect(my_client_socket, (struct sockaddr*)&server_name, sizeof(server_name)) == -1) {
      sprintf(err_str, "Could not connect to %s", cp);
      RET_ERR_STR;
   }

#else

The Windows specific code seems unrelated to IPv4 vs. IPv6, although it only exists in the USE_IP4_ONLY case. Perhaps it has to do with incorrect/incomplete Berkley implementation in some ancient Windowses. USE_IP4_ONLY seems a bit of a misnomer in this case. It has more to do with legacy systems not supporting getaddrinfo. Maybe getaddrinfo was only introduced to Berkley sockets when IPv6 came along. I can't find any clear information about that. (It apparently came to Windows with IPv6 support.) Since getaddrinfo has been supported on Windows since XP (in Unicode since SP2), it can only be BeOS/Haiku, and a possible rule to still support Windows 2000, that requires the USE_IP4_ONLY code. I think it is five years or more since we raised minimum system requirements to Windows 2000. We still had vocal users using Windows 2000 at the time, but is that system still in use for Simutrans? I have no idea about BeOS/Haiku.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #10 on: February 04, 2018, 12:36:00 PM »
The Haiku user numbers are single digit (15 month accumulated), and AmigaOS slightly more (in then tens).

Also the code was from a time, when winsock2 was not default. Anyway, if it decreases performance and maintanbility, then lets take it out.

About putting time and effort into changing existing code to new C++ feature: I would only do this if there is much to gain.

The freelist overloads new, so I do not think hooking up later in the caller chain can ever give any additional gain. Also it is already very much proper C++. Not every new feature is more perfomant, on the contrary, many new concepts are complex and requires more logic for compiler to properly optimise. Most performant would be pure C code without class overhead ... the C++ is rather for maintainabilty (or how this is spelled) and confort, at least this was my impression.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4493
  • Languages: EN, DE, AT
Re: Modern C++?
« Reply #11 on: February 04, 2018, 01:28:14 PM »
Thanks for bringing this up. However, I fear the discussion will not lead to anything.

What new C++ features would help us to improve the code base?

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #12 on: February 04, 2018, 01:37:11 PM »
Thanks for bringing this up. However, I fear the discussion will not lead to anything.

What new C++ features would help us to improve the code base?

That needs careful thought. What is worth doing for new features is a different question to what is worth the effort of re-factoring existing features that work well, and that is different again to what is worth the effort for re-factoring code that is found not to work well.

In the first category, the auto keyword and ranged for make things much easier for the coder. It is possible that more reliable code can be produced in many cases by using the new smart pointer types instead of raw pointers (but I suspect that it is not likely to be worthwhile replacing the existing quickstones with them).

I can see that move semantics might be quite useful in one or two places (e.g. when copying/moving schedules in some instances, perhaps).

Given the problems with the built-in min/max functions, I should recommend removing those and replacing them with std::min() and std::max().

As to the collection classes, the Simutrans specific ones seem reliable, but if the STL ones perform non-trivially better, it would probably be worth using those instead (but only profiling will determine this).

An open question is the extent to which having modern coding alongside traditional coding for an extended period of time will make the code less readable.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #13 on: February 04, 2018, 06:32:22 PM »
Most performant would be pure C code without class overhead ... the C++ is rather for maintainabilty (or how this is spelled) and confort, at least this was my impression.

Not quite. Non-polymorphic C++ classes should have the same performance as a plain C struct. If you want polymorphism, then I think you might be able to do it in C with less indirect jumps than the typical C++ implementation, but that requires more memory and more instructions when allocating new objects (by inlining the vtable in each object). Of course, anything that can be done in C can be done in C++.

An std::vector should be as fast as a plain dynamically allocated array, as long as you stick to the unchecked accessors. C++ might make it easier to do things inefficient, though, as some details might be hidden. I have no idea how many allocations take place in a = b + c + d when all four variables are std::strings.

What new C++ features would help us to improve the code base?

The auto keyword can save some typing with complicated template types, in particular iterators. Simutrans uses some macro magic to hide these types, but those macros are something new developers have to figure out what does, rather than recognizing a standard C++ construct. Using it for primitives can however introduce bugs as it is no longer clear whether an integer variable is signed or how big it is.

There are now standard implementation of hash maps and single-linked lists in C++, with an interface C++ developers should be familiar with. It has to be seen how they perform, though, and if they fit what Simutrans does. There might be ways of doing a single-linked list faster than the C++ standard, by making a less encapsulated implementation where the data structure being stored is an active participant, rather than ignorant about being stored in such a list.

Using templates for things like min and max might be more typesafe, but that has nothing to do with modern C++. The necessary template functionality has been there since day one, I think.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #14 on: February 04, 2018, 06:56:49 PM »
Using templates for things like min and max might be more typesafe, but that has nothing to do with modern C++. The necessary template functionality has been there since day one, I think.

If one uses std::min() and std::max(), surely there is no need to use templates?

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #15 on: February 04, 2018, 08:55:10 PM »
If one uses std::min() and std::max(), surely there is no need to use templates?

They are templates.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #16 on: February 04, 2018, 09:52:23 PM »
Quote
How can it? Which parts?
Since floating points cannot be used, there is a lot of fixed point hackery going on. To prevent overflows these calculations are done using 64bit primitives. Only x86-64 added instructions to perform operations on 64bit primitives. In x86 the primitives have to be manipulated with a sequence of instructions (slower). Additionally the extra registers of x86-64 would speed up the calculations by holding intermediate results instead of such values having to be written out to stack.

Additionally a lot of values are stored in 64bit types for the extended number range. Examples include city growth, currency, sum totals for monthly averages, etc.

The factory code is full of 64bit types being operated on. I also beleive the convoy movement logic is a similar case. Power also uses a lot of 64bit types.

Quote
The Windows specific code seems unrelated to IPv4 vs. IPv6, although it only exists in the USE_IP4_ONLY case. Perhaps it has to do with incorrect/incomplete Berkley implementation in some ancient Windowses. USE_IP4_ONLY seems a bit of a misnomer in this case. It has more to do with legacy systems not supporting getaddrinfo. Maybe getaddrinfo was only introduced to Berkley sockets when IPv6 came along. I can't find any clear information about that. (It apparently came to Windows with IPv6 support.) Since getaddrinfo has been supported on Windows since XP (in Unicode since SP2), it can only be BeOS/Haiku, and a possible rule to still support Windows 2000, that requires the USE_IP4_ONLY code. I think it is five years or more since we raised minimum system requirements to Windows 2000. We still had vocal users using Windows 2000 at the time, but is that system still in use for Simutrans? I have no idea about BeOS/Haiku.
Seeing how more and more ISPs only offer IPv6 addresses to their customers should one even support the option of IPv4 only? As far as I am aware that code is as good as unmaintainable. Even if it has bugs chances are no one uses it to report them.

I thought minimum system requirements were Windows XP.

Quote
What new C++ features would help us to improve the code base?
The "auto" keyword for local variables. Would solve a lot of type warnings, especially with loops.

The filesystem library might be the holy grail solution to the Unicode file name nonsense. It might also just be another name for the same nonsense. Would have to be tested.

There are also several useful localization related features that were added. For example printing time or numbers in a locale friendly way without having to reinvent the wheel and ending up making it an oval or square.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #17 on: February 04, 2018, 10:19:04 PM »
They are templates.

Interesting - they are not the normal templates requiring <>s, so they must have some very interesting way of deducing the type automatically. Either way, they work properly with 64-bit integers whereas the built-in Simutrans ones do not.

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1309
Re: Modern C++?
« Reply #18 on: February 04, 2018, 10:43:44 PM »
Actually they do require <>s as the 'interesting' auto stuff fails. Attached a patch I started the second time Simutrans own sint32 only min max burned me, but still needs verification....
It's not only 64 bits types that silent conversions to sint32 fail - max value uint32s don't work either!

IMHO nothing should ever be 'auto'. Automatic things always fail in strange ways - make it explicit and one always knows what's going on.



Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #19 on: February 05, 2018, 06:38:53 AM »
Quote
IMHO nothing should ever be 'auto'. Automatic things always fail in strange ways - make it explicit and one always knows what's going on.
This is only needed with the case where mathematics/logic is important. This is because one has to explicitly state what is going on. For example the fixed point hackery used by industry, power and convoy movement logic.

If one is manipulating an object the type is pretty much implicit so all not using auto is doing is repeating what the person already knows.
Code: [Select]
SomeType* obj = new SomeType(); // It was assigned new SomeType so it is obviously SomeType*...
auto obj = new SomeType(); // obj is clearly SomeType* because the assignment explains it all.
If one were to change the type of object used only 1 change would be needed with auto, as opposed to 2 where both object and type has to change. Type safety still applies, the compiler will throw a compile time error or warning if one tries to use the variable as a different type of argument.

There is also an explicit nullptr value, as opposed to a macro or 0.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #20 on: February 05, 2018, 06:52:24 AM »
Since floating points cannot be used, there is a lot of fixed point hackery going on. To prevent overflows these calculations are done using 64bit primitives. Only x86-64 added instructions to perform operations on 64bit primitives. In x86 the primitives have to be manipulated with a sequence of instructions (slower). Additionally the extra registers of x86-64 would speed up the calculations by holding intermediate results instead of such values having to be written out to stack.

Additionally a lot of values are stored in 64bit types for the extended number range. Examples include city growth, currency, sum totals for monthly averages, etc.

The factory code is full of 64bit types being operated on. I also beleive the convoy movement logic is a similar case. Power also uses a lot of 64bit types.

You don't have to build a 64-bit application to use the 64-bit registers and instructions. All it takes is raising the minimum targeted CPU to one that supports them. Whether it then makes any sense not to go full 64-bit is a different question.

The "auto" keyword for local variables. Would solve a lot of type warnings, especially with loops.

My point is that it might simply propagate the error rather than raise a warning, because the loop variable was of the right explicit type, but the input was wrong. With auto, it would simply default to the wrong type from the inputs.

Interesting - they are not the normal templates requiring <>s, so they must have some very interesting way of deducing the type automatically.

They use <> in the declarations (and definitions). Only when using templated types to you normally need to use <> when using them. For function calls, the templated parameter types can usually be deduced from the arguments (and perhaps what type you want to store the returned value in).

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #21 on: February 05, 2018, 07:14:44 AM »
Quote
You don't have to build a 64-bit application to use the 64-bit registers and instructions. All it takes is raising the minimum targeted CPU to one that supports them. Whether it then makes any sense not to go full 64-bit is a different question.
As far as I am aware one does. The 64bit instructions can only be executed if the processor is running in 64bit mode which it will only do so when running a 64bit application. When running 32bit applications it runs in 32bit mode which does not support those instructions and registers. This is how backwards compatability was achieved, via some use of special registers/addresses to change CPU mode.

Exception are the vector instructions. However these are intended to manipulate multiple 8, 16, 32, or 64 bit values simultaniously and not for individual calculations. For simple 64bit operations on a single value they are likely considerably slower than using a single x86-64 instruction. At the very least they have 2 instruction overhead to use as one has to explicitly state they are in use and no longer in use so that OS context switching knows to dump their values.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #22 on: February 05, 2018, 07:00:36 PM »
My memory must have been playing tricks on me. I was certain I had seen the compiler spit out 64-bit instructions in a 32-bit compile. 32-bit registers are also available in 16-bit mode. But I can't reproduce it, and the manual seems indeed very explicit that it can't be done. (Were they really out of possible prefixes?)

Offline TurfIt

  • Dev Team, Coder/patcher
  • Devotee
  • *
  • Posts: 1309
Re: Modern C++?
« Reply #23 on: February 05, 2018, 08:38:43 PM »
Thinking of the x32 ABI perhaps? Such would be ideal for Simutrans, and any program that can't effectively use more the 4GB. All the goodness of x86_64 registers/instructions but still 32 bit pointers for compactness. Shame not widely supported...

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #24 on: February 05, 2018, 08:51:50 PM »
Well, that reminds me of what I was thinking of, and it is possible it was on Linux I saw what I thought I saw. On the other hand, this seems to be 64-bit applications that has an agreement with the OS to only use 4GB of address space. That is a strange quirk I probably would have remembered even better.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #25 on: February 05, 2018, 10:27:05 PM »
Quote
(Were they really out of possible prefixes?)
Using prefixes for it is counter productive. The entire point of x86-64 was to try and adopt a RISC like design. By using suffixes and also supporting all of x86 at the same time it would become just another CISC extension adding to the complexity of the system.

Only recently have people considered going back to CISC for some features as it can be faster thanks to pipeline optimizations and micro code. For example a CISC memory copy instruction with repeat suffix can be generally the most efficient way to perform bulk memory copies on modern processors as micro code will automatically convert it to the most optimized memory copy routine. On the other hand the numerous general purpose registers of RISC design is here to stay, and one of the big performance advantages of using x86-64 over x86.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #26 on: February 06, 2018, 07:20:10 AM »
Using prefixes for it is counter productive. The entire point of x86-64 was to try and adopt a RISC like design.

One would think so, but according to the Intel instruction reference I have, x86-64 uses prefixes to access the new general purpose registers. I get mixed signals whether a prefix is needed to access 64-bit general purpose registers at all.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #27 on: February 07, 2018, 05:48:02 AM »
While this slowly gets into 32 bit versus 64 bit. Simutrans need to supply 32 bit builds for windows for the next five years or so, since at least until last year low end hardware was sold with Windows in 32 bit mode on atoms. SImutrans currently runs fine on those.

About auto keyword: Are there really so many place where it would save typing? Again, considering the effeort of changing versus the time better spend on other stuff ...

IP4 was the only platform supported by the Amiga builds, Not sure, if there is a user still around.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #28 on: February 07, 2018, 07:05:51 AM »
About auto keyword: Are there really so many place where it would save typing? Again, considering the effeort of changing versus the time better spend on other stuff ...

I don't think we should make an effort to actively change things just because we can. It would be only for newly written or rewritten code. Places were it really would save on typing would be:

Code: [Select]
for (auto iter = vehicles.begin(); iter != vehicles.end(); ++iter) {
  ...
}
vs
Code: [Select]
for (std::vector<road_vehicle_t>::iterator iter = vehicles.begin(); iter != vehicles.end(); ++iter) {
  ...
}

and, although I'm not sure Simutrans does this

Code: [Select]
auto result = map.insert(std::make_pair(1,"Hello"));
vs
Code: [Select]
std::pair<std::map<int,std::string>::iterator,bool> result = map.insert(std::make_pair(1,"Hello"));

One can make the worst cases shorter with using namespace std;, but still.

But I would like to have the type given explicitly in
Code: [Select]
for (auto i = some_var; i < get_something(); ++i) {
  ...
}
rather than having to look up what type some_var is and get_something returns and then figure out what the compiler should deduce auto to be. If i is explicitly typed, and this seems add odds with some_var and/or get_something(), then at least GCC 5+ issues warnings.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #29 on: February 07, 2018, 11:52:55 AM »
Indeed - auto is not really something worth going back and changing retrospectively, but can be useful for new code. Also,

Code: [Select]
for (auto iter = vehicles.begin(); iter != vehicles.end(); ++iter) {
  ...
}

can actually become

Code: [Select]
for (auto iter : vehicles) {
  ...
}

in C++11, and I have confirmed that it works without difficulties using the Simutrans collection classes rather than the STD/STL collection classes.

One thing that might be worthwhile going and changing retrospectively (because it is extremely easy with search and replace) is NULL to nullptr. If we are able to use C++11 language constructs such as auto and ranged for generally, then it seems to me that this makes sense. Likewise, the built-in Simutrans min() and max() templates are problematic as discussed, and these can very easily be changed to std::min() and std::max() with search and replace.

It may be best to think of this in two distinct stages:

(1) does the Simutrans codebase need to retain compatibility with compilers that are incapable of dealing with C++11 code; and

(2) if not, what new C++11 language features are worth deploying in any given case of new/existing code?

If we are answering question (1) in the negative, then things such as nullptr > NULL and std::min/max() over Simutrans min/max() are very straightforward. Other things, such as STD/STL collection classes, STD smart pointers, the auto keyword, would need to be considered on a case by case basis whenever code is refactored. We may want to consider whether to refactor the FOR macros which are non-standard and difficult to read for those not familiar with them to the ranged for construct, which is much easier to read (the auto keyword is not compulsory). That refactoring should not be too difficult in principle.

If, conversely, we are answering the first question in the affermative, then all of this is moot in any event, although that would be unfortunate, I think.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #30 on: February 07, 2018, 02:29:23 PM »
The FOR macros are working and are just another way of writing the auto ranged for. They have the advantage of telling the type explicitely, so I like them more than the ranged for. Also I am not sure how the ranged for reacts to a change of end() during iteration, which the macro allows. Mixing iterators is not good (as we would have then three different styles in the code). So yet another style, hmm.

If the min/max from std:: behave the same as the existing templates (in terms of unsigned versus signed), yes why not. Will also remove some min max macro trouble from windows includes.

About C11++ compilers, GCC5 is supported by Haiku, GCC 6.x by AmigaOS, so there should be no problems even with those exotic platforms.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #31 on: February 07, 2018, 06:33:58 PM »
The FOR macros are working and are just another way of writing the auto ranged for. They have the advantage of telling the type explicitely, so I like them more than the ranged for. Also I am not sure how the ranged for reacts to a change of end() during iteration, which the macro allows. Mixing iterators is not good (as we would have then three different styles in the code). So yet another style, hmm.

I think that is the problem. For new Simutrans developers, the FOR macro is another Simutrans specific thing they have to learn what does and how to use, whereas plain C++ for loops are either known from before or useful knowledge to have beyond Simutrans. I seem to take such oddities and quirks with ease, but I notice that not all developers do so.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #32 on: February 07, 2018, 09:22:45 PM »
Splendid, it is useful to have a good idea as to the progression of this matter.

May I ask whether a view has been taken on nullptr/NULL?

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #33 on: February 07, 2018, 10:29:28 PM »
Quote
May I ask whether a view has been taken on nullptr/NULL?
nullptr is better as it is a strongly typed constant where as NULL is a macro to a numeric constant. Macros cause abstraction of values because from compiler error output it can be hard to find the problem with a macro.

Like wise using the appropriate types for pointer difference operations is recommended over size_t, or worse int.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #34 on: February 08, 2018, 12:17:30 AM »
nullptr is indeed the best choice, I would agree with that too. Again, needs to be all replaced, and not mixed.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #35 on: February 08, 2018, 12:22:04 AM »
nullptr is indeed the best choice, I would agree with that too. Again, needs to be all replaced, and not mixed.

This should be quite straightforward. Would you like me to upload a patch for Standard with a full search/replace for this?

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #36 on: February 08, 2018, 06:46:39 AM »
I like NULL, because it screams out to me, both by being in caps and being highlighted (and is shorter to type).

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4493
  • Languages: EN, DE, AT
Re: Modern C++?
« Reply #37 on: February 08, 2018, 07:16:29 AM »
I like NULL, because it screams out to me, both by being in caps and being highlighted (and is shorter to type).
same for me.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #38 on: February 08, 2018, 12:51:29 PM »
We could just add to simconst.h §define NULL nullptr ...

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #39 on: February 08, 2018, 06:14:36 PM »
I think NULL is already defined in some standard header. It is not good practice to redefine stuff from there.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #40 on: February 08, 2018, 09:11:55 PM »
Is it not better to follow the standard and use nullptr, even if it may seem less easy at first?

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #41 on: October 29, 2018, 11:08:10 PM »
Sorry for resurrecting a 9 months old thread. I would like some information about the idea of modernizing the codebase of simutrans to C++11. I noticed that some people would be happy to have better zero-overhead abstraction in simutrans, but others looks like pretty against it.

To make things easier, I make some direct questions:
  • Will it be accepted the introduction of C++11 code? I understand that there are reasons about sticking to C++98 that are independent to code quality, maintainability and portability, and in these cases I cannot (and I don't want) to convince anyone about using C++11
  • How the development should be handled? I generally would open a git PR, but I don't know if with simutrans this is the normal procedure. Probably a C++98->C++11 'port' would create many small patches instead of some bigger ones, in order to ease the code review from other devs, IMHO source control is the best approach, but if you prefer using path files (using the forum) it is not a problem.
  • Would you accept "just the patch", or you would prefer/require the writing of a unit test before the C++98->C++11 transition of something, then the patch itself? That is really useful to grant the correct behaviour of the new code and it will improve simutrans in general. On the other hand, this will slow down the porting, and obviously unit tests in general require maintenance.
  • I have seen discussions about the "performance cost of C++ abstractions". If this is a concern, do you eventually want benchmarks and asm comparisons of the parts that are being ported to the new standard? I think that the generated code speaks much louder than speculations.

Sorry if my words feel a bit harsh, I really don't want to be offensive or aggressive. I am just pragmatic. I would really love to improve simutrans, and I really thing that modernizing the codebase is one important step that will be necessary to improve the game.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #42 on: October 30, 2018, 02:57:41 AM »
Quote
I would really love to improve simutrans, and I really thing that modernizing the codebase is one important step that will be necessary to improve the game.
Modernizing the code base would make maintenance easier, but chances are the game will not be improved directly by it.

In the end there is not too much to be gained by moving to C++11 which is why the change has not really happened. Sure one occasionally runs into a more modern C++ language feature that has to be patched away, but generally C++98 does everything one wants to do.

That said there are some more modern C++ features that could be useful. For example the standard file system support should make Unicode file name support less platform specific, bit that needs C++17. There are also options for multi-threading other than PThread so one could finally do away with optional multi threading support by making it mandatory.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #43 on: October 30, 2018, 05:32:53 AM »
Btw. Simutrans even predates C++98 be some years, although it used GCC features in the beginning, since only GCC (or its DOS port) could handle the large memory of it.

However, changing things just for the sake of change is usually not a good idea. If it fixes a problem, fine. But there are some templates and list objects, which occur in every tile. Having a little overhead is fine, but having 100 Million of extra bytes is not. Same for memory handling. A normal alloc is relatively slow and carriers overhead (between 4 and 255 bytes). Again, if you need many in pre-determined size, then own memory management makes sense. So you will have always some own construct pointing out, and these are often in the most critical part of the code.

Using std::vector versus the simutrans template does not affect function very much, and could be easily done. But both are more or less bug free, so personally I do not care which to use. I am rather worried about how and exchange would clash with larger patches and things like Experimental versus standard and so on.

But we are a very small group of contributors. SO if C++11 would enable more contributors, and it is on non-critical part of the code. The nightly build server uses a stable Debian, which uses GCC 4.9.x so this should also work with C++11 (unless some garbage collection is wanted).

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #44 on: October 30, 2018, 06:17:45 AM »
There pretty much isn't a thing as rewriting the code into C++11. With later versions, replacing deprecated, or even completely removed, features is more of a thing in general, but can not remember Simutrans using such things in the first place.

Replacing Pthread with C++ threading would eliminate the Pthread dependency when using Microsoft's compilers, but last I checked, C++ threading support varies between Mingw64 builds. (For some odd reason, my cross-compiler uses an incomplete implementation based on native Win32, while the Mingw64 I have for Windows has a full Pthread-based implementation. It can't statically link Pthread, though, but that is a problem regardless of whether Simutrans uses the threading part of the standard C++ library or not.)

Furthermore, there have been indications that Simutrans is popular among people who do not have the most up-to-date computers. What might amount to insignificant overhead on a new computer may not be so on a 10 year old computer.

What might make more of an impact in terms of maintainability, at least in the eyes of new developers, is removing non-standard things for which there exists equally performant standard things, such as many of the container templates. It is important to remember that the standard library contain generic solutions to common problems. There are situations where they are not suited, but that requires profiling. Unfortunately, profiling on different systems and/or compiled with different compilers, including different versions, may yield different answers.

And to really bring the game to the next level, in particular to be able to leverage hardware accelerations, or do multiplayer in a less hackish way, one needs, in my opinion, to completely shatter the core of the game and rebuild it. It is not a matter of switching programming language version, but switching out the architecture and some design principles. Unfortunately, that is not something I have the time or energy for beside work.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #45 on: October 30, 2018, 07:52:09 AM »
If one really want to invest time, then getting Simutrans on Android is probably the best use of that ... THe GUI is changing now into a fully scalable version with thumb support, so it will be better suited for it today than simutrans was in the past.

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #46 on: October 30, 2018, 08:55:46 AM »
Ok, maybe I needed to be clearer on my point: I said that C++11 will make the game better does not mean that it will make it faster. Using "modern" capabilities allows to create better and safer abstraction, much more safer and compiler friendly than using "old approaches".

A really silly example: simutrans is full of manual allocations. And if you compile using a leak sanitizer you will notice that just opening and closing the game creates leaks. I totally know that leaks are not UB, but on the other hand it is not a really good practice to leak memory around. The part related to the password manager does not use memset_s (which is C11/C++11) not volatile to clear data, creating a possible information leak.

What I am saying is that a code hard to mantain is hard to evolve. The removal of the pointers (using lvalue reference, rvalue reference and std::unique_ptr, for instance) is a great improvement by itself. And I am not saying that this is true because I am saying that, but based on the experience of many many C++ experienced developers (half of the seminars from CppCon are about good practices and code modernization).

Said that, I ask for some more specific replies. I will not be offended if you say "no, no C++11 here". But I would like to understand if I can contribute in simutrans modernizing the code during my spare time.

Offline Dwachs

  • DevTeam, Coder/patcher
  • Administrator
  • *
  • Posts: 4493
  • Languages: EN, DE, AT
Re: Modern C++?
« Reply #47 on: October 30, 2018, 10:23:54 AM »

The part related to the password manager does not use memset_s (which is C11/C++11) not volatile to clear data, creating a possible information leak.
Could you provide a patch to address this?

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #48 on: October 30, 2018, 12:48:11 PM »
Could you provide a patch to address this?
Yes, it can be done using std::fill using static_cast<char volatile *> for both begin and end arguments. However, my point is this one of the many cases in which the usage of C++11 could make things simpler, more idiomatic and easier to maintain.

If you tell me that we can go with C++11, the patch is just replacing memset with memset_s. Depending on what is decided about the modernization discussion, the patch will be different. In any case I will be happy to send it even if the decision is to stick to C++98. I think that I have got a couple of other minor issues, but this is another story  ;)

P.S.: there was a typo:
not volatile
was meant to be "nor volatile"

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #49 on: October 30, 2018, 02:22:27 PM »
Replacing pointer by overlaid types will be a big problem if you (like in Simutrans) allocate 100 million of objects. How would you handle an objectlist without pointers?

volatile has nothing to do with security. It just means the compiler should load something from the memory and not the cache, but the compiler is free to ignore this. Also, for a non multithreading program like simutrans (the display and few map routines aside), volatile is useless in dialogues and gui handling.

Also std::fill versus memset versus memset_s? All have the same functionality. Also memset_s and std:fill relies on passing the size (end), so I fail to see the improvement versus memset: If you feed the wrong size to memset, then you can do the same to memset_s ... (Also, while the Debian compiler of the nightlies is C++11 compliant, the libstdc++ is not fully that. One really has to check what is supported.)

memset( a, lengthof(a), 0 ) versus std::fill( static_cast<char volatile *>a, static_cast<char volatile *>a+lengthof(a), 0 ) is just typing more letters to me. But maybe I misunderstood this, because I am not a computer scientist and my self taught IT education is 20 year out of date.

Opening and closing a game does not create leaks, at least we ocassionally check with valgrind. Loading a pakset creates leaks, due to the way the pak system is implemented, but that cannot be helped without a major redesign. And it is not an issue, since paks are only loaded once.

Could you be more specific, where the issue with the password_frame.cc is (or which file are you referring to?) Simutrans does not store passwords on the client side, so I am not sure, if there is a vulnerability. Of course, if you send enough garbage to a Simutrans server or client, you can crash it, because the core engine was never built for abuse (because it was a single user core). But this is totally independent from C++11. However, since Simutrans is cross platform and less than 20 clients and servers are online at any time, and servers are usually hand compiled with different compiler versions and architectures, I doubt you can run a sucessful exploit (versus just crashing the program).

Having said that, security improvements are welcome. This had never been foremost, since the networkcode came very late to Simutrans.

Finally, a personal comment: I am supporting Simutrans for about 14 years now. I had seen in that time some trends come and go. Granted, a few new stuff (like automatic for loops and working templates from C++98) are quite welcome. But just coughing "::" all over the place to replace working code is a job, I happily delegate to other. Depending on your IDE, too much C++ can make debugging harder, since a lot of stuff is hidden when using virtual object and automagic conversion and the like. (The pak loader is a good example of code almost undebugable with MSVC ... )

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #50 on: October 30, 2018, 03:30:48 PM »
Replacing pointer by overlaid types will be a big problem if you (like in Simutrans) allocate 100 million of objects. How would you handle an objectlist without pointers?
Once you have abstractions, you will never need to handle raw pointers. It is obvious that at the end of the abstraction layers a pointer to something is handled, but there is a huge gap between doing manual memory management and using safe abstractions.

volatile has nothing to do with security
I did not say what I was pointing at because it was pretty superfluous in this discussion, but let's discuss it. Line 16 of network/pwd_hash.h, the method clear() use the MEMZERO macro to set hash to zero bytes. MEMZERO is indirectly defined as a call to memset, which can be optimized away by the compiler if no reads of the memory are performed after the set operation. memset_s avoids this behaviour, and prior to C++11/C11 it is possible to cast a pointer to a type to a pointer of a volatile type in order to unallow the optimization.

It just means the compiler should load something from the memory and not the cache, but the compiler is free to ignore this
No, volatile means that the compiler cannot assume that the value is constant between two subsequent accesses, and cannot ignore it (otherwise it would be impossible to use memory mapped devices)

for a non multithreading program like simutrans
Multithreading is completely unrelated to volatile. http://isvolatileusefulwiththreads.in/C++/ volatile access to memory does not mean atomic access.

Also std::fill versus memset versus memset_s? All have the same functionality
memset and memset_s can only handle raw memory, std::fill can be used with any iterable object, without the needs of contiguous memory. And at the same time it is optimized to use memset in case of trivially copy assignable contiguous data. In the specific case of pwd_hash it is the same as using memset, until you change the type of the hash variable.

Simutrans does not store passwords on the client side, so I am not sure, if there is a vulnerability
Maybe not. I end up to that piece of code by chance, I noticed the clean method, the fact that a a password hash was stored, I just made the assumption that that memory wanted to be zeroed, without any chance of let the memset being optimized away.

while the Debian compiler of the nightlies is C++11 compliant, the libstdc++ is not fully that
https://packages.debian.org/stretch/libstdc++6 The stable release of Debian fully support C++11: libstdc++ is 6.3, and AFAIK it also fully supports C++14. But maybe I may be wrong, can you provide a link that shows that GCC 6.3 + libstdc++ 6.3 are not able to compile valid C++11 code?

Opening and closing a game does not create leaks
Yes, it does.
Code: [Select]
Direct leak of 39000 byte(s) in 975 object(s) allocated from:
    #0 0x7f0884e57d29 in operator new(unsigned long) /build/gcc/src/gcc/libsanitizer/asan/asan_new_delete.cc:90
    #1 0x55e8eba51d06 in obj_desc_t::operator new(unsigned long) descriptor/reader/../obj_desc.h:29
    #2 0x55e8eba50d1e in image_t::copy_image(image_t const&) descriptor/image.cc:54
    #3 0x55e8eba515a2 in image_t::copy_rotate(short) const descriptor/image.cc:96
    #4 0x55e8eba53f9e in create_alpha_tile descriptor/ground_desc.cc:104
    #5 0x55e8eba58be8 in ground_desc_t::init_ground_textures(karte_t*) descriptor/ground_desc.cc:814
    #6 0x55e8ec31ee1b in karte_t::karte_t() /home/user/src/simutrans/simworld.cc:2053
    #7 0x55e8ec281d34 in simu_main(int, char**) /home/user/src/simutrans/simmain.cc:1238
    #8 0x55e8ec2ae462 in sysmain(int, char**) /home/user/src/simutrans/simsys.cc:1036
    #9 0x55e8ec3c777e in main /home/user/src/simutrans/simsys_s2.cc:786
    #10 0x7f08842f5222 in __libc_start_main (/usr/lib/libc.so.6+0x24222)

Direct leak of 9364 byte(s) in 157 object(s) allocated from:
    #0 0x7f0884e56019 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/asan/asan_malloc_linux.cc:86
    #1 0x55e8ec284b04 in xmalloc(unsigned long) /home/user/src/simutrans/simmem.cc:156
    #2 0x55e8ebb3310c in recode dataobj/translator.cc:138
    #3 0x55e8ebb35c1d in load_language_file_body dataobj/translator.cc:335
    #4 0x55e8ebb3618e in translator::load_language_file(_IO_FILE*) dataobj/translator.cc:379
    #5 0x55e8ebb373f1 in translator::load(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) dataobj/translator.cc:446
    #6 0x55e8ec27ffb8 in simu_main(int, char**) /home/user/src/simutrans/simmain.cc:1030
    #7 0x55e8ec2ae462 in sysmain(int, char**) /home/user/src/simutrans/simsys.cc:1036
    #8 0x55e8ec3c777e in main /home/user/src/simutrans/simsys_s2.cc:786
    #9 0x7f08842f5222 in __libc_start_main (/usr/lib/libc.so.6+0x24222)

Direct leak of 672 byte(s) in 28 object(s) allocated from:
    #0 0x7f0884e57d29 in operator new(unsigned long) /build/gcc/src/gcc/libsanitizer/asan/asan_new_delete.cc:90
    #1 0x55e8ec1abde8 in stadt_t::cityrules_init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/user/src/simutrans/simcity.cc:393
    #2 0x55e8ec280208 in simu_main(int, char**) /home/user/src/simutrans/simmain.cc:1060
    #3 0x55e8ec2ae462 in sysmain(int, char**) /home/user/src/simutrans/simsys.cc:1036
    #4 0x55e8ec3c777e in main /home/user/src/simutrans/simsys_s2.cc:786
    #5 0x7f08842f5222 in __libc_start_main (/usr/lib/libc.so.6+0x24222)

Direct leak of 474 byte(s) in 47 object(s) allocated from:
    #0 0x7f0884d9ef01 in __interceptor_strdup /build/gcc/src/gcc/libsanitizer/asan/asan_interceptors.cc:405
    #1 0x55e8ec289bd6 in tool_t::read_menu(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/user/src/simutrans/simmenu.cc:721
    #2 0x55e8ec280d89 in simu_main(int, char**) /home/user/src/simutrans/simmain.cc:1104
    #3 0x55e8ec2ae462 in sysmain(int, char**) /home/user/src/simutrans/simsys.cc:1036
    #4 0x55e8ec3c777e in main /home/user/src/simutrans/simsys_s2.cc:786
    #5 0x7f08842f5222 in __libc_start_main (/usr/lib/libc.so.6+0x24222)
These are some of them. And as I already said, the fact that leaking is not UB (or it only happens once) does not mean it is a good behaviour.

too much C++ can make debugging harder,
I totally don't agree with this: you will never need to step into stdlib, and using standard functions correctly makes code safer and most of the time faster. And, honestly, saying that a C++ project could have too much C++ looks like a weak argument.

Again, I would like to avoid this kind of discussion. I just asked if the main developers consider acceptable moving from C++98 to C++11 as a requirement. I don't want to force anyone to do anything, the reason I am asking is because it does not make any sense that I create a patch to have idiomatic C++11 code and it is rejected because it is C++11.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #51 on: October 30, 2018, 03:54:44 PM »
C++ has gotten lots of fancy features since the turn of the millennium, but the fact that they "always" want to implement it partially as templates in the standard library rather than directly as part of the language syntax has caused a lot of long-winded constructs. auto helps somewhat, but not always.

memset_s doesn't really have much of an edge over memset in my opinion. In most cases, size and count is the same and/or size is not passed to the place memset is called in the first place. std::fill has the advantage over both in that you don't need to convert element count to byte count yourself. The only disadvantage is that iterator arithmetic, needed to calculate end, isn't the most understandable thing to a non-experienced C++ programmer. std::fill_n is better at that.

I don't think passwords being stored on the client side is much of an issue. If something is able to suck that information out of memory, it would also be able to intercept the keystrokes, which is probably easier. Grabbing passwords from the server's memory would also be more tempting, as one can get more passwords that way.

Fast, maintainable, secure code is still a fantasy silver bullet, except for the most trivial of cases, although they are by no means mutually exclusive.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #52 on: October 31, 2018, 12:39:09 AM »
The pointers issue is an interesting one. Raw pointers are used extremely heavily in Simutrans. Assuming that there is no performance impact at all of switching to newer abstractions (is this assumption correct? It would be interesting to see it tested), there would be a very large amount of work required to go back and change all the existing code, especially when an awful lot of code is something like:

Code: [Select]
if(pointer)
{
   pointer->do_a_thing(variable);
}
else
{
   do_something_else();
}

and quite a lot more code uses arrays and pointer arithmetic.

What I had originally suggested in starting this thread was using the new semantics for new additions to the code and retrospectively changing limited parts of the code that are easy to change (e.g. NULL to nullptr), although even the latter was not accepted in the end.

However, we have at least established that C++11 language features can safely be used in Simutrans if necessary and do not violate Simutrans coding standards.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #53 on: October 31, 2018, 12:53:20 AM »
Quote
Yes, it can be done using std::fill using static_cast<char volatile *> for both begin and end arguments. However, my point is this one of the many cases in which the usage of C++11 could make things simpler, more idiomatic and easier to maintain.
Casting a non-volatile type to volatile is ignored due to it being complete nonsense. Especially if it is a pointer...

Volatile is a storage modifier. It is needed when a variable is declared. Typecasting non-volatile to volatile will do absolutely nothing as the compiler already knows the variable is non-volatile. Compilers will warn you this if you set the appropriate warning level.
Quote
The part related to the password manager does not use memset_s (which is C11/C++11) not volatile to clear data, creating a possible information leak.
I am not even sure what you are referring to. All passwords are stored on the server, with the server receiving the password entered by a client to authenticate. If the server is compromised then all passwords are compromised anyway as the password file can be accessed directly. In the case of clients the biggest thing that can happen is that if their system is compromised their password is revealed to the hacker. That said who in their right mind would hack Simutrans passwords...

Technically a lot more is required than simply secure setting to zero. For the password to be safe spectre mitigation is also required but good luck with that... This is because on some processors another application can guess the contents of memory it does not have access to by measuring the time taken for a security exception to be raised representing memory state and even content state due to speculative execution.

Also memset_s is already a secure memory set function as it makes the guarantee that after it is returned the memory will have been set. Compilers cannot optimize it away like they can with normal memset. No need to cast anything to volatile, if it ever made sense to do so to begin with.
Quote
Grabbing passwords from the server's memory would also be more tempting, as one can get more passwords that way.
Or they could just look at the completely unsecure password hash file that accompanies the save. No need to even look at the running servers memory if the server is that compromised...
Quote
These are some of them. And as I already said, the fact that leaking is not UB (or it only happens once) does not mean it is a good behaviour.
Opening and closing Simutrans does not create memory leaks. This is because the OS discards all virtual memory pages exclusively owned by an application on application closure anyway. Sure, logically there might seem to be a leak but in reality no memory is leaked by the application as the application no longer exists after exiting and all its exclusive memory pages are discarded anyway. This might not always been the case using languages like C and C++ due to some platforms not having a concept of application or virtual memory, but as far as I am aware Simutrans does not support any such platform.

The OS will not automatically free any handles owned by an application on application closure so those can be considered potential leaks. Like wise resources within specific APIs, eg Vulkan handles, are not automatically freed on application closure.

If you are referring to reloading maps then yeh there are possibly a few minor ones. That said the major ones I discovered in Experimental (now fixed) had nothing to do with code shared with Standard.
« Last Edit: October 31, 2018, 01:41:02 AM by DrSuperGood »

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #54 on: October 31, 2018, 05:07:09 AM »
memset cannot be optimised away in said function, because acting on its own class data (unless by a broken optimiser). Because most routines that modify their class data do not read them back. If you refer to this compiler bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=8537 than this is that: A compiler bug, and usually we try to ignore those. (Same for the workaround in that thread with volatile; and thanks, I know volatile must not be used for threading, since I had done my share of embedded programming, where you use it for hardware access.) Anyway, DrSupergood gave a good answer, why even if this bug would be present (which is not) it would not make a vulnerability.

I want to see an abstract pointer that does not imply a lot of overhead. std::unique_ptr may either allocate on the stack (but then we would run out of stack quickly) or is via the std memory allocation implementation which is very slow (about 100x slower for objects like trees or tiles which we have plenty) and has a certain overhead (std allocation tend to allocate always multiples of 16 or even 256). Also allocation 1000 Millions or bjects can cause some implementations to run out of memory too (albeit that is nowadays a thing of the past). Allocationg a 1024*1024 map with standard allocator is taking almost minutes versus 3-5 seconds using our tailored routines.

Another example are the lists: In the stl these are double linked. This implies quite an overhead (extra 8 bytes, and twice the effort on unliking), which we do not need.

Going back to C++11. There are neat things like the for loop; but a change of the end iterator during the loop is undefined. We do this on more than one case. So we cannot use simple C++11 code there, even though it would look nicer than the "FOR()" macros.

Don't get me wrong, I use C++11 feature for normal "quick and dirty" programming, and I am not fundamentally opposed against. However, a lot of code in Simutrans has been optimised and reduced to its minimum, because it allows people to play larger maps with more things. It is a little like assembler versus high level languages. You can be usually faster in assembler, but you only do it when needed. So we have tailored structures, which are faster and more compact than more general ones. We have also some normal use of them, because it was more consistent.

One could of course change the non-critical parts with more C++11, and that is slowly done (as shown by the slowly replacing of some other older string types by std::string in Simutrans). (NB: std::string is also "nice" if you trace through it, which lots of contructors going before you finally see the intermediate result you would like to see).

About the leaks you see there, I also go with DrSuperGood. I am programming stuff since 1986, and any OS worth its name, releases all memory at the end of execution unless explicitely allocated as non-releaseable. The only leaks I am concerned are from (re-)loading a game, which happens frequently on a server. Since the standard server can run for two months without a big memory increase, I think there are no big leaks right now.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #55 on: October 31, 2018, 07:21:35 AM »
Quote
(Same for the workaround in that thread with volatile; and thanks, I know volatile must not be used for threading, since I had done my share of embedded programming, where you use it for hardware access.)
If volatile can be used for multi threading is entirely down to the platform as the standards clearly state it cannot as it does not fit in with the multi threaded memory model. One either needs to use specific atomic types, or use locks and other synchronisation structures to enforce consistency and ordering when dealing with multiple threads.

The main and pretty much only use for volatile is accessing memory mapped peripherals and control registers in low level programming. This is encountered in driver creation as well as embedded controller programming and often the appropriate compiler will make guarantees as to how volatile works, especially when it comes to bitfield manipulation as some platforms have dedicated instructions to do that. Embedded processors might have only a single core and hence there always is a well defined and ordered memory model without the need for synchronization structures or atomic types.

The main advantage to be gained from newer C++ standards would be with C++ native Unicode support and File System support (C++17). This could potentially make the IO code more platform independent and remove many of the reinvented wheel like pieces of code that have been written to process Unicode and file paths.
« Last Edit: October 31, 2018, 07:40:52 AM by DrSuperGood »

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #56 on: October 31, 2018, 07:26:54 AM »
The OS will not automatically free any handles owned by an application on application closure so those can be considered potential leaks.
On Windows at least, handles to OS resources will be closed on process exit. (Which is why terminating processes is quite safe, while terminating threads is not.) I suspect that same is true on any operating system having process isolation, because it is the only way to protect the rest of the system from crashing or otherwise misbehaving programs in the long run. Handles to shared resources from elsewhere is a different matter, although resources on the same machine should preferably handle this just like the OS, while external resources should have a timeout.

You can be usually faster in assembler, but you only do it when needed.
I'm not at all convinced this is true anymore, unless you are targeting very specific processors or even entire machines. C, and by extension the C++, as languages are by design a very thin abstraction layer over assembly, and compilers have become very clever at optimizing. It is however still the case that you can sometimes sacrifice development speed for increased runtime speed by throwing away abstraction layers. One doesn't even have to switch language. (In that regard, I recently wrote some insanely complicated C++ template code to auto-generate some boilerplate raw byte stream manipulation that I until then had written explicitly each time. The compiler did realize exactly which machine instructions I wanted it to end up with, although I will probably have a hard time understanding the template code myself now. Some quirks in C++ might have caused some duplication of code in the binary output, though. Due to the nature of the project, I haven't bothered looking more into that.)

(NB: std::string is also "nice" if you trace through it, which lots of contructors going before you finally see the intermediate result you would like to see).
std::vector would be a better example, I think. There is little reason to delve into std::string, but an std::vector can contain elements with custom constructors, assignment operators and move operators that needs looking into. However, even more of an issue is the compiler error you get when messing up some things. Your error usually propagates into some internal implementation types before finally failing, giving you half a page of error messages with little connection to the C++ standard. Nothing C++11 about that though, it has been that way since way before that.

Raw pointers are used extremely heavily in Simutrans. Assuming that there is no performance impact at all of switching to newer abstractions
If there is none at all, there is a question as to what the abstraction provides in the first place. std::unique_ptr is perhaps the only thing I can think of. References is a nice way signify un-owned references, but they can not be null and they can not be changed once set, which in turn makes them uneasy member variables. (They shine as parameter types in a functional programming approach.) std::shared_ptr has overhead both in speed (probably insignificant) and size (for Simutrans, likely quite significant in many places).

and quite a lot more code uses arrays and pointer arithmetic
Those things are still fundamental to modern C++, although pointer arithmetic has been generalized to iterator arithmetic. (I'm not sure what arrays have been generalized into, but std::vector belongs to it. Probably std::map as well.)

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #57 on: October 31, 2018, 09:19:41 AM »
However, we have at least established that C++11 language features can safely be used in Simutrans if necessary and do not violate Simutrans coding standards.

Ok good. I still have some questions:
  • As I already asked before, what do you think it could be a nice way to handle the development? I think that a diff-and-patch method would be heavyweight in this case, and I think that it would hard for reviewing. As I said, my approach would involve using git and PRs, but I think that it is not doable in the current context, isn't it? What are your suggestions?
  • My idea is to write a unit test for a function I want to change, because I don't want to break any code during the process. Do you have any preferences about the handling of tests? I recently watched a CppCon talk of Phil Nash talking about Catch/2, a very nice testing framework: it is one single header file and it is incredibly easy to setup and use compared to other frameworks. I generally don't like adding a dependency just for testing, but in this case the solution is clean and there is no need to reinvent the wheel.
  • About coding standards: there are many patterns in C++11 that does not exist in C++03/98, rvalue references, variadic templates, noexcept and constexpr for instance. All the features that are from the C++11 standard cannot be obviously covered in the C++03 coding standards. Can I the c++ core guidelines (https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines) to cover these cases?

I see that most people here think that abstractions have costs. My idea is that I cannot assert that some code has zero cost abstraction or the opposite without proofs. My question is: how can I give you these proofs? I have two major concerns:
  • To have a comparison, I need two versions of the same code. Even if, for instance, I create a commit with both C++03 and C++11, do whatever is needed for the comparison and then I make another commit without the old code, the comparison can only be made on a single commit. And this is obviously not a good solution. However, I don't have any better idea in mind. What do you think?
  • There are two main ways to compare the effects of two different versions of the same code: microbenchmarks and assembly comparison. The former is only useful to avoid significative regressions (I don't expect major performance gains for most of the time), the latter is a bit hard to just show. One possible approach could be the following: I create the new version of a function, I took both the old and the new ones, I put them on compiler explorer and on quickbench (obviously creating a proper benchmark test), then I create permalinks and add them in the commit description/patch notes/whatever. What do you think?

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #58 on: October 31, 2018, 10:55:53 AM »
I should note that I am a developer of the Extended branch, but not a developer of the Standard trunk, so I cannot comment authoritatively on how best, if at all, to go about seeking to about modernising the code (and it is better to minimise the non-functional code differences between Standard and Extended, so it would not be sensible to modernise the code shared with Standard on the Extended branch except by merging in any modernisations in the equivalent parts of the Standard code).

As to testing different versions of the code for performance loss or gain, you could use profiling: the community edition of recent versions of Visual Studio, which is used for Simutrans development, has in-built profiling, and one might alternatively use profiling builds in GCC. Of course, only one type of change should be tested at once.

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #59 on: October 31, 2018, 11:13:51 AM »
Sorry I did not mention about profiling. I tend to use TDD, and the only way of doing it correctly is checking perfect code coverage through profiling. Using it for performance comparison is pretty nice too (even if I generally prefer perf), but my main concerns about showing the data are almost the same unfortunately.

I completely agree on your points about the modernization: I would like to make changes to the standard branch that can be directly merged in the extended branch, in order to maximize the results minimizing the efforts.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #60 on: October 31, 2018, 12:42:04 PM »
If one really want to modernise Simutrans, a complete rewrite is probably easier than starting with the current code. There are many different styles in here, from the more recent C++ like approach to put everything in a separate two line function to the veteran C approach to do everything in one function and all variations in between. If you aim for a code, as you probably imagine, then this means more or less touching everything. In the end, I think, such a major change needs to be actively supported by somebody who know the code well.

About testing units. We have had these, but since Simutrans is so diverse (15+ pak sets) automated testing is probably only possible for the most simple cases. Most of the easy bugs have been caught in the last 20 years ... Honestly, I have no formal education in informatics, and when I finally was thrown at Simutrans, I had to learn a lot very fast. (No prior C++ experience, only C, Pascal, Modula2 ... ) By the way, Simutrans does not use exceptions, because they clash with pthreads. And they have a lot of overhead (we profiled a few years back).

I have also to admit, that personally after every such big patches in the past I had to find out things again, which I know before. So the older the coder, the more conservative probably. The privilege of youth is to ignore me therefore ...

We do not use git because it did not exist when simutrans had already in an svn. (Albeit we lost the first one, as one can see from the current archive.) Moreover, svn was easier for beginners (on a command line). Gitless is of course something trying to mend that. Finally, the linear revision makes it easier to know if things have been fixed or not. If something is fixed in r8601, then it is fixed also in revisions larger 8601. Since we rely on bug reporting by normal players, this is helpful.

Having said that, personally, as an everyday programmer (not for money, but for the tools I need to work in semiconductors), I would rather wish for a basic portable GUI standard library as part of C++ instead some abstract features which makes the straight-forward craft of programming into an intellectual abstract exercise. Because most of the time, when I program something, I spend on the GUI (despite being very fluent in Win32 and frameworks), and that is not even portable. So for me C++2x should include a portable minimalistic GUI standard libary, instead garbage collectors. That would save a lot of time. Yes, and for loops where the end can be modified without being "undefined". For a language, having undefined is really bad, because such code will be used by someone for sure (unless a big compiler warning).

Sorry for drifting quite offtopic.

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #61 on: October 31, 2018, 01:33:40 PM »
You made many good points, I will try to reply to them one by one.

a complete rewrite is probably easier than starting with the current code
Cannot say that you are right, but I think that is true as well. Unfortunately, this would create a complete new project, splitting resources instead of merging them together. The idea of modern refactorization was related to the idea of making changes easy to understand from someone who already knows the code.

uch a major change needs to be actively supported by somebody who know the code well
This is the reason I would like to follow a sort of leaves-to-branches modernization, in order to deep into the project step-by-step and, at the same time, perform real improvements.

About testing units. We have had these
That's good to know! I am sorry I totally missed that.

By the way, Simutrans does not use exceptions, because they clash with pthreads.
I am quite happy about this. I am able to write containers with strong exception guarantees, but I hate doing that, it is a huge pain. Moreover, it is not just a matter of pthreads: throwing an exception in a std::thread is a perfect footgun, and writing perfect-forwarding noexcept declarations is still a lot of boilerplate.

I had to find out things again
I would really like to avoid that. What I would like to obtain is something like "oh, this function used to take a char *, now it takes this st::string_view... fine". (Just to make an example of abstracting away things)

We do not use git because it did not exist when simutrans had already in an svn.
Yes, I know that  :D I just don't know how you generally handle contributions in which there are many patches instead of a single, big one.

I would rather wish for a basic portable GUI standard library as part of C++ instead some abstract features which makes the straight-forward craft of programming into an intellectual abstract exercise
The fact is that creating a brand new GUI requires a lot continuous development. On the other hand, performing modernization is much more step-by-step. I can decide to modernize one function, writing a unit test and sending a patch for that, everything (maybe) after dinner. It is not possible for me to do that for bigger features. I know that probably the total time is much bigger for the modernization, but I can manage the time in a better way  ;)

So for me C++2x should include a portable minimalistic GUI standard libary
There is a paper that, unfortunately, have been rejected (for now). There are lots of things that would be nice in C++ (i.e. heterogeneous computing), but remember that there are already a lot of useful things, but if we stick to the old standards we are leaving unused lots of efforts made by the community  :)

Yes, and for loops where the end can be modified without being "undefined". For a language, having undefined is really bad, because such code will be used by someone for sure (unless a big compiler warning).
I have to disagree with this. Undefined behaviour must exists, otherwise you cannot do what you want. Ok, it really sound strange, but follow me with the following examples. You can dangle pointers, but deferencing them is UB. If you define the behaviour of deferencing dangling pointer, it means that the compiler must put additional code to check whether that pointer is valid, and add other code to unwind the stack or whatever. Another example is overflowing signed integers, which is UB. You can easily check that performing a multiplication between ints is much more efficient that doing that between unsigned ints. And this is only because overflowing ints is UB and overflowing unsigned ints is perfectly defined behaviour.
Said that, give a look to Rust when you have some spare time, I think that you will find a wonderful language.

Offline jamespetts gb

  • Simutrans-Extended project coordinator
  • Devotee
  • *
  • Posts: 17764
  • Cake baker
    • Bridgewater-Brunel
  • Languages: EN
Re: Modern C++?
« Reply #62 on: October 31, 2018, 01:42:32 PM »
There is, I think, certainly a worthwhile intermediate point between, on the one hand, doing nothing to the code, and, on the other, rewriting it from scratch (the latter of which is not realistic given current coding resources). That intermediate position is modernising in the existing code those specific things that are relatively easy to modernise without major rewrites (e.g. NULL to nullptr), in addition to writing any new code or parts of existing code that is having a major rewrite for other reasons (e.g. to fix a bug or add a feature) using more modern features and practices.

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #63 on: October 31, 2018, 02:54:36 PM »
I can live with undefined behaviour in C, since this is like undefined behaviour is assembler (it is still a very thin layer, especially with some nice asembler like MC68k ) But C++ should not allow such things, i.e. offer a save_int with a defined behaviour on overflow and so on. Especially fundamental parts like a for loop should be deterministic in any situation, either the iteration is already determined at start or it is flexible. Or such features are useless, because who know what a function does added later in a different module, and one has to use own code.

Dereferencing unassigned pointers is also something, which is ok for C but should be defined for C++. Even MSVC assigns all pointer in debug mode to an undefined address, and having a defined behaviour (even as compile switch or pragma) would be very helpful for a high level language. Undefined behaviour is extremely dangerous, because it tends to work for some time until everybody forgot it was there. (Like volatile worked most of the time with threads, but we have been there.)

Anyway, these are just my rambling on where I would like to modernise C++. But it seems there are rather tons of stuff added to libraries than making the code more deterministic by design.

About testing, that was fully removed around 2005, when more than one pak became the norm. It was just loading a standard game and perform a few standard tasks. These things are done by anyone who uses a nightly, and hence never ever caught a bug. (And thus this was not maintained at all when I took over, so it was removed.)

Simutrans has also never strongly enforced a coding style, and automatic refactoring tends to make the code always more unreadable in certain places, where human deviated from the norm for a reason. I tried this several times, but the result was never very nice, and also would break existing patches. (However, apart from GUI patch, we lost most of them anyway recently).

But I think we agreed that careful modernisation is a good thing. So maybe rather start the coding then ;)

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #64 on: October 31, 2018, 06:09:27 PM »
I think C++ was designed to add no overhead compared to C unless you explicitly invoked it. Otherwise, it would not be as attractive in a time where CPU cycles and bytes of RAM counted more than today. That is why pointers are just as dangerous. It is also made to interface directly with legacy C code, which could explain why it has pointers at all. As a safer alternative, C++ has references as a safer alternative to pointers, although they are not perfect replacements as already explained.

I'm not sure why safe_int isn't part of the standard, but it certainly is made so that it is possible to create such a thing that integrates nicely with the rest of the code and can be used just like an int.

It is also a good question why accessing unassigned variables at all is legal, although it is probably due to C compatibility. I don't know how hard it is for compilers to detect such things, or remember when they even started giving warnings. Many aspects of C, such as headers, are to simplify compilers.

C++ exceptions seem almost fundamentally broken, or at least only half-done. Their main problem is that they can't pass through code not compiled according to the same C++ ABI, such as pure C code and modules compiled with other C++ compilers.

Offline dodomorandi

  • *
  • Posts: 18
  • Languages: IT, EN
Re: Modern C++?
« Reply #65 on: October 31, 2018, 10:21:49 PM »
C++ exceptions seem almost fundamentally broken, or at least only half-done. Their main problem is that they can't pass through code not compiled according to the same C++ ABI, such as pure C code and modules compiled with other C++ compilers.
C++ ABI is (unfortunately) unspecified by the standard. It is not a matter of exceptions, it is the whole thing :( In x86_64 the Itanium ABI is a de-facto standard on Linux, but MSVC uses its own ABI...

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #66 on: November 01, 2018, 04:37:00 PM »
C++ ABI is (unfortunately) unspecified by the standard. It is not a matter of exceptions, it is the whole thing :( In x86_64 the Itanium ABI is a de-facto standard on Linux, but MSVC uses its own ABI...
Yes, but code flow in general can be routed through the platform's "C" ABI. When it can't, at least directly, you (mostly) notice at compile time or link time. Exceptions just explode at runtime, perhaps silently if you're unlucky.

Offline DrSuperGood

  • Dev Team
  • Devotee
  • *
  • Posts: 2524
  • Languages: EN
Re: Modern C++?
« Reply #67 on: November 01, 2018, 07:24:20 PM »
Many of the newer standard C++ features were actually designed to reduce overhead. For example the loosening of constant expressions allows the potential for more compile time optimizations.

If one profiles Simutrans most of the time spent by the game running will be spent on a few areas of the code. You could increase the running time of the rest of the code by 10 or even 100 times and it will make unnoticeable difference to performance. All one must do is make sure that any changes made do not degrade the performance where it matters.

Offline TrainMith

  • *
  • Posts: 58
Re: Modern C++?
« Reply #68 on: November 26, 2018, 07:14:25 AM »
Please do not suggest putting "using namespace std;", even as a quick fix.  Such a statement is only used for extremely quick feature R&D for a small program, or for beginners in the first few programs during their C++ language learning.  AVOID IT.
The proper way would be to insert "using std::whatever_it_is;" which would allow that name and only that name to be brought in from the standard. 

Worth mentioning is that "volatile" was introduced as the glue between such things as interrupt handlers and programs, and before threading was ever really introduced.  As you might well be aware, the processor would switch to the handler, change the contents of a memory location, and return back to the program which would need the updated value.  Hence, a compiler should _not_ be ignoring the volatile keyword, but instead forced to retrieve the current value of the memory location from actual memory.  Unless of course the current standard wording is tolerating the possibility of the updated value being retrievable from cache with a separate copy on write to the actual memory. 

It's quite interesting that while there is a complaint about using standard allocation, ie. new/delete, in some parts of the code, that at no time was an customized allocator mentioned. Granted, originally C++98 allocators were not allowed to have state, but that was rectified in C++11.  Furthermore, C++17 now has other allocators under the std::pmr namespace, with some controversy still.  My hope is that allocators get ironed out totally by C++2000.
Even with all of this, custom allocators can be made and used to help alleviate some of the issues we might have with new/delete in those particular sections of code in question, with more desirable allocation schemes.

It was mentioned about the simutrans customized singly-linked list, which it might be worth pointing out that C++11 introduced the std::forward_list for just such reasons of memory overhead.  Also introduced was std::array.  The hashing containers likewise previously mentioned are std::unordered_map and unordered_multimap.  Don't forget the std::set, multiset, unordered_set, and unordered_multiset containers.  I personally am reveling in the ability to have a std::vector<std::unique_ptr<My_Type>> that allows polymorphism and safer memory guarantees.  Extracting an object of My_Type does require a std::move, for those expecting easy copying. 

I'll have to dig up a few CppCon presentations on Youtube, several of which are showing how terse the resulting machine code can be and accompanying improved performance. 
prissi, I think you are right that the standard is for typical situations, and that simutrans sometimes has simutrans-specific situations that can't use standard constructs.  However, we won't be totally sure of what benefits could be gained or lost without trying.  It seems to me that we are tossing around a lot of code in this posting, which might be better to just toss it in some sections already.  I do think you are correct that the core, underlaying portion of karte_t or such should be possibly overhauled.  It might make maintenance a lot easier sooner than we expect.

Two things bother me about the current simutrans code. 
The first is the profuse copying of forward class declarations.  Placing most of these within small header files would seem to be more beneficial.  The increased number of header files might initially seem like a nuisance, but the code maintenance would be increased as well as seeing the actual overall code structure quicker. 
The second is quite a few class inheritances seem to be higher than need be.  The base convoy class really shouldn't know about the overtaking class.  Granted, this would require each child class of convoy to separately inherit from overtaking, but this would seem to make better sense, especially since an aircraft wouldn't really know about overtaking.  In fact, aircraft might benefit from some other strategy of changing routing, totally unrelated to how the cars would overtake, or perhaps how trains might do so in a home-brewed compile. 

Offline prissi

  • Developer
  • Administrator
  • *
  • Posts: 9309
  • Languages: De,EN,JP
Re: Modern C++?
« Reply #69 on: November 26, 2018, 03:23:56 PM »
I think you are touching very interesting points. However, the main thing is that Simutrans was mostly made before C++98, before even std:: was convceived. At that time as much inheritance as possible was also considered clever style. Hence the strict separation of class implementation and definition (with the exception of oneliners).

Things were added over time, and fixed as needed. std came, and grew, and the style coding changed (maybe even faster than the contributors ... ) So we have a relatively bug free Simutrans, and hence maintenance of Simutrans is not very high. Of course with outdated coding style.

Of course one could change Simutrans into a way to comply with the current style of C++ implementation. The motivation for that is easily maintainable code. However, what about stable code, which almost needs no maintenance any more? If changing the code style will bring in many new developers and new features, that will be a different thing. But my feeling is that Simutrans (as well as OpenTTD btw) has been past their peak and now sit comfortable in their niche with some other commercial counterparts.

Not only C++ standard have changed, the world too. Simutrans is ancient, it was top notch when there were 486 CPUs and 4 MB DOS main memory still nothing special, and 1024x768 has high resolution graphics. Multitasking was something only Lunix could do ... This is like putting a steam engine from a railroad on the street by adding rubber tires to it. It can drive and over time the speed limits and steering got upgraded as well. But it is still a steam engine made for tracks; so this is how Simutrans fits to modern massive threaded GPU driven environments. You can drive it, but ...

Realistically Simutrans has reached its limits, without major restructuring, as the 3D patch branch showed. Nowadays one would transfer all the drawing to a GPU and would use more threads, which both would require quite a different basic design. If done well, one could get something very close to Machinsky https://store.steampowered.com/app/598960/Mashinky/

So should effort on changing coding style to do the same thing with different compiler input not rather invested more wisely in doing something not yet possible? The future might be rewriting the core from scratch to optimise for GPU and threads and reuse parts from current Simutrans (like the routing) and the right graphics. Then all the "modern" concepts can be used to their fullest potential.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #70 on: November 26, 2018, 07:18:35 PM »
Please do not suggest putting "using namespace std;", even as a quick fix.  Such a statement is only used for extremely quick feature R&D for a small program, or for beginners in the first few programs during their C++ language learning.  AVOID IT.
The proper way would be to insert "using std::whatever_it_is;" which would allow that name and only that name to be brought in from the standard.
I totally agree. It defeats the point of namespaces. (One could perhaps argue that the standard C++ library should have been divided into more namespaces, which appears to be what other languages have done, but that is beyond our control.)

It's quite interesting that while there is a complaint about using standard allocation, ie. new/delete, in some parts of the code, that at no time was an customized allocator mentioned.
I think it has been mentioned in earlier discussions that were very focused on Simutrans' custom collections. Possibly even by me. I don't know more about them than that they exist, though.

I personally am reveling in the ability to have a std::vector<std::unique_ptr<My_Type>> that allows polymorphism and safer memory guarantees.
It is certainly better than std::vector<My_Type*> (as long as this collection owns the pointers and the pointers have a limited life span), but both have the drawback that you have to de-reference two pointers to get that actual data, which may be significant in performance critical situations.

Granted, this would require each child class of convoy to separately inherit from overtaking, but this would seem to make better sense, especially since an aircraft wouldn't really know about overtaking.
Composition might be even better, although I'm not sure if C++'s multiple inheritance makes that less of an issue than the Java I'm used to.

Simutrans is ancient, it was top notch when there were 486 CPUs and 4 MB DOS main memory still nothing special, and 1024x768 has high resolution graphics.
I don't think I've even played Transport Tycoon on such a poorly equipped computer! But then I've gotten the impression that the Simutrans crowd lags somewhat behind other gamers.



All in all, I think the biggest problem Simutrans has is the combination of game logic and display logic in karte_t. It makes it pretty much impossible to add GPU rendering as an alternative back-end, as the information need for that is quite different from the current algorithm that always draws everything to the backbuffer and only then figures out, in display coordinates, where the changes are. And I suspect it is a drawback for multiplayer as well.


I don't see a way to change that gradually, though.


Direct2D has an API that looks more suited for what Simutrans does than OpenGL (or Direct3D for that matter), but I haven't gotten around to checking if it has some tricks to avoid the serious performance problems the OpenGL attempts have had. It is also Windows only, which makes it significantly less interesting for a cross-platform game.

Offline TrainMith

  • *
  • Posts: 58
Re: Modern C++?
« Reply #71 on: December 05, 2018, 05:13:48 PM »
(One could perhaps argue that the standard C++ library should have been divided into more namespaces, which appears to be what other languages have done, but that is beyond our control.)
Yes, but that is why the "using" construct was made; only bring a name into the current namespace if necessary, and as a preventative of multiple full specificiers (ie. std::vector this, std::vector that). 

Composition might be even better, although I'm not sure if C++'s multiple inheritance makes that less of an issue than the Java I'm used to.
As with any good tool there is a wrong way, and also the wrong time, to use the proper tool.  Although I've stayed away from Java, the analog to Java's interface is the pure virtual C++ base class, which has the slight downside of a vtable indirection, but the benefit in being able to use it with multiple inheritance to present different facets for different uses of the derived class with less entanglement.

I'm constantly vexed by simutran's use of forwarding declarations within files that merely use the class, whenever I try to use vim's [-Ctrl-i or [-I code keyword search to find where exactly the class is truly forward declared within an interface header and then its accompanying source file.  I'm sure that it might have helped originally coding each module, but now it is wasting coder's time to find these and  about 60% of the compiler's time trying to match these forwarding declarations.  I'm sure this is also a big dissuasion for some new coders.
I don't know more about them than that they exist, though.
I've yet to make one myself, mostly because C++98's allocator was poorly designed and documented.  Yet they have been improved, and what better time to learn?
It is certainly better than std::vector (as long as this collection owns the pointers and the pointers have a limited life span), but both have the drawback that you have to de-reference two pointers to get that actual data, which may be significant in performance critical situations.
Your fears may be unfounded:  C++Now 2018: Matt Godbolt “What Else Has My Compiler Done For Me Lately?” https://www.youtube.com/watch?v=nAbCKa0FzjQCppCon 2017: Matt Godbolt “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid” https://www.youtube.com/watch?v=bSkpMdDe4g4Also CppCon 2016: Jason Turner “Rich Code for Tiny Computers: A Simple Commodore 64 Game in C++17” https://www.youtube.com/watch?v=zBkNBP00wJEOne profound observation was that compilers might be better at creating machine code and, when set to higher optimization levels, don't always expect the machine code to exactly follow the source code while also providing better instructions.

Interesting point to make:  if the dirty bit is being used to mark a graphic to be redrawn, it is being done for two reasons: once for the object needing to move or be placed/removed, and otherwise for redrawing itself because something in front of it moved.  Thus, I'd suggest having the dirty bit marked in two places; first for the object actually being moved/placed/removed (ie, OpenGL happy, perhaps also for marking a Display List, VBO, or VAO as needing to be remade), and the other for the graphics in back to redisplay (ie, not needing work if using OpenGL).

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #72 on: December 05, 2018, 06:36:41 PM »
Yes, but that is why the "using" construct was made; only bring a name into the current namespace if necessary, and as a preventative of multiple full specificiers (ie. std::vector this, std::vector that). 
"using" is not the solution. It is equally the problem. What I'm talking about is that "using" can either import a single symbol, or the entire standard C++ library. Other languages, like Java and Python, have divided their standard library into multiple parts so that you can import for instance everything related to file access or GUI with a single line. In C++, the fact that you also need to use #include allows you to filter down somewhat, but since included files often include other files, you get more than you implicitly ask for.

On the other hand, there is debate on whether importing symbols in bulk is a good idea in Java, even though the bulks are far smaller than in C++.

Your fears may be unfounded
Compilers can't work around language rules and physical realities. Although I haven't read the C++ standard itself, it is apparently not freely available, it is my impression from other sources that C++ dictates that std::vector<int*>, and by extension std::vector<std::unique_ptr<int>>, basically boils down to int**. Getting hold of the actual int value requires two memory accesses. std::vector<int> boils down to int*, where only one memory access is needed to get the int value. std::vector<std::unique_ptr<int>> is of course a strange thing to have, but the same holds true for other types than int.

There are some potential benefits of having a vector of some kind of reference to something, rather than just a vector of something. What ultimately is best depends on the size of that something, the access patterns, and whether execution speed is more important than development speed.

Offline TrainMith

  • *
  • Posts: 58
Re: Modern C++?
« Reply #73 on: December 07, 2018, 02:09:55 PM »
Apparently the forum has a problem with code blocks, and decided to delete all of my commentary.  My apologies for the original seemingly empty post. Upon further forum testing, code blocks have problems with line returns and some other anomaly.  I also had to add an extra line return after every each code line. Anyways, back to my commentary!

it is my impression from other sources that C++ dictates that std::vector, and by extension std::vector
Intuitively I had thought the same, although was more optimistic due to both the gcc and clang compilers, which optimize the following code back into an int array access:
Code: [Select]
#include <vector>#include <memory>
int check_it(const std::vector<std::unique_ptr<int>> &vupi)
{
int result(0);
for(size_t i(0); i < vupi.size(); ++i)
result += *vupi[i];
return result;
}
MS VS2017 apparently isn't as clever, and does the expected double indirection.  This apparently is a specification for int, maybe even for any built-in, type, since the following code results in the double indirection we both were expecting:
Code: [Select]
#include <vector>
#include <memory>
class A
{
    public:
        virtual int get() const = 0;
        virtual ~A() {}
};
class B : public A
{
        int b;
    public:
        B(int bb) : b(bb) {}
        int get() { return b; }
};
class C : public A
{
        int c;
    public:
        C(int cc) : c(cc) {}
        int get() { return c++; }
};
int check_it(const std::vector<std::unique_ptr<A>> &vupi)
{
    int result(0);
    for(size_t i(0); i < vupi.size(); ++i)
        result += vupi[i]->get();
    return result;
}
Tested this at cpp.godbolt.com by the methods mentioned within the videos I linked earlier.
« Last Edit: December 07, 2018, 02:53:47 PM by TrainMith »

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #74 on: December 07, 2018, 05:12:02 PM »
I don't see how std::vector<std::unique_ptr<int>> can be turned into int[] without breaking certain things. In particular:
Code: [Select]
int *a = vec[1].get();
vec.erase(vec.begin());
int *b = vec[0].get();
assert(a == b);
Only if the compiler can be certain that it sees all usages of vec can if do such a substitution.

Offline isidoro

  • Devotee
  • *
  • Posts: 1124
Re: Modern C++?
« Reply #75 on: December 08, 2018, 12:45:27 AM »
Not a good example.  Instead of moving the content of the int[], the compiler can make the vec start now at int[1]...
 

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #76 on: December 08, 2018, 09:01:53 AM »
Not a good example.  Instead of moving the content of the int[], the compiler can make the vec start now at int[1]...
That is std::deque, isn't it? But the point is that insert and remove can cause the elements of the vector to be moved to a different memory address. What if I did an insert at index 0 with a million new elements instead of one remove? Are you suggesting that the compiler can just allocate a million elements ahead of the start just in case? Remember that the compiler will not see all operations being done on the vector, as the vector is accessed from multiple compilation units. You'd need some serious link-time optimizations to improve this.

Offline isidoro

  • Devotee
  • *
  • Posts: 1124
Re: Modern C++?
« Reply #77 on: December 08, 2018, 08:51:08 PM »
I only said that yours wasn't a good example, not that there wasn't the problem you mention.   ;) The best thing can be to look at the assembly code in compilers that do such an optimization, if there are...

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #78 on: December 08, 2018, 11:22:00 PM »
The best thing to do is pick good algorithms and data structures, using the collective experience built up over decades, then measure the performance. Trying to read the optimized assembly output from a compiler will drive most people, even most developers, half crazy. One can try small examples, but there are so many things that come into play, that making examples covering them all is very hard. GCC has some intermediate form that can be dumped for analysis, but it is very verbose, so it is also not really usable on actual production code.

Offline TrainMith

  • *
  • Posts: 58
Re: Modern C++?
« Reply #79 on: December 09, 2018, 09:01:00 AM »
I don't see how std::vector<std::unique_ptr<int>> can be turned into int[] without breaking certain things.
Ter, it didn't make int[], rather it was int*[].  The compiler optimized for a single pointer dereference from the vector since there was an iteration over all the elements, and then did a per element dereference from the unique_ptr.  Your example wouldn't cause an assertion error, btw, because of C++11 move semantics introduced into the standard containers.  Pointing into the vector would still cause a problem due to elements being relocated.
Ter, even the creator of Compiler Explorer mentioned that it is best for analyzing small portions of code, and to see how they relate using different compilers.  It even highlights between the original code and the assembly produced, and it has buttons for reducing the verbosity (even name mangling).  I'm thinking you'd like it a lot more than you think.

So should effort on changing coding style to do the same thing with different compiler input not rather invested more wisely in doing something not yet possible? The future might be rewriting the core from scratch to optimise for GPU and threads and reuse parts from current Simutrans (like the routing) and the right graphics. Then all the "modern" concepts can be used to their fullest potential.
I think the biggest problem Simutrans has is the combination of game logic and display logic in karte_t. It makes it pretty much impossible to add GPU rendering as an alternative back-end
I'm in agreement with Ter here; karte_t is needing to be reworked.  Perhaps game_entity_t would be a good idea for the per-game aspect, keeping karte_t for the karte it is supposed to be.    Also we could have a lobby_t, allowing us to select the game_entity_t for play.

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #80 on: December 09, 2018, 09:53:31 AM »
Ter, it didn't make int[], rather it was int*[].  The compiler optimized for a single pointer dereference from the vector since there was an iteration over all the elements, and then did a per element dereference from the unique_ptr.
You were the one who wrote that the compiler turned it into an int array, when I claimed that it could not do better than an array of pointers to integers, which in turn had to be accessed via a pointer (in all but the most trivial of cases).

Offline isidoro

  • Devotee
  • *
  • Posts: 1124
Re: Modern C++?
« Reply #81 on: December 09, 2018, 11:53:58 PM »
The best thing to do is pick good algorithms and data structures, using the collective experience built up over decades, then measure the performance. Trying to read the optimized assembly output from a compiler will drive most people, even most developers, half crazy. One can try small examples, but there are so many things that come into play, that making examples covering them all is very hard. GCC has some intermediate form that can be dumped for analysis, but it is very verbose, so it is also not really usable on actual production code.

The best thing to do to solve a mystery about how some thing or other is done by a compiler is to look at the assembly generated by it.  It's not that difficult.  Give it a try.   ;)

Offline Ters

  • Coder/patcher
  • Devotee
  • *
  • Posts: 5368
  • Languages: EN, NO
Re: Modern C++?
« Reply #82 on: December 10, 2018, 06:53:43 AM »
The best thing to do to solve a mystery about how some thing or other is done by a compiler
I'm not here to solve mysteries, except for why Simutrans is as it is.

Give it a try.
I don't have to try. I've read more than enough x86 assembly, and some Java equivalent, to risk my sanity (if I had any to begin with).

Offline TrainMith

  • *
  • Posts: 58
Re: Modern C++?
« Reply #83 on: December 10, 2018, 06:12:31 PM »
You were the one who wrote that the compiler turned it into an int array, when I claimed that it could not do better than an array of pointers to integers, which in turn had to be accessed via a pointer (in all but the most trivial of cases).
Ter, forgive me for misreading your post to which I was originally replying, as that we must have been thinking of the same array of pointers to integers.  I had only been inferring that the vector might turn into an optimized array access (especially in a for-loop), but did not intend that the contained pointer would be simplified. 

As for other aspects of modern C++, the usage of constexpr for both constants and functions returning compile-time constants would help some with compile and runtime times.

Offline isidoro

  • Devotee
  • *
  • Posts: 1124
Re: Modern C++?
« Reply #84 on: Yesterday at 12:15:33 AM »
I'm not here to solve mysteries, except for why Simutrans is as it is.

Good for you!