Re: While coding C++ how do you usually test your expressions and functions?

Started by sdog, November 25, 2016, 06:24:59 PM


sdog

I think some clarification with regard to the const matter would be good. It seems that intended and perceived meaning do not converge yet.

Initially we were talking specifically about constant references. For example, let N be the set of all valid values for int, let x and y be in N, and let int a = x; const int &b = a;. I understand this as a reference where the value of a is mutable, but this value can only be read through b; i.e. a = y; is possible while b = y; is not, for all y in N \ {x}.
Now, I understand that the const attribute in the context of a & reference is there to inform the compiler that the latter kind of operation, e.g. b += y;, is forbidden. However, in the binary there is no difference between a pointer to the address where the value of a is stored, a const reference, a non-const reference, or a itself. Is this correct so far?
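To make the situation concrete, here is a minimal sketch of it. The function name and the values are made up for illustration, and note that a reference has to be bound to something when it is declared.

```cpp
#include <cassert>

// Returns what is read through a const reference after the
// referenced variable has been changed via its own name.
int read_through_const_ref(int x, int y)
{
    int a = x;
    const int &b = a;   // b must be bound when declared; it aliases a

    a = y;              // allowed: a itself is not const
    // b = y;           // would not compile: b is a read-only view
    // b += y;          // likewise rejected by the compiler

    assert(&a == &b);   // same object, same address
    return b;           // the change made through a is visible through b
}
```

Calling read_through_const_ref(1, 7) should give 7: the const only restricts what may be done through b, not what happens to a.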


I should like to discuss this quote in that context, I number sentences for later reference:
Quote
(1) I think it is more of an instruction to the compiler to forbid you from changing the value. (2)(const also predates const&, which affects how the latter works.) (3) However, the programmer can pull the rug out from underneath the compiler by casting away the const-ness and changing the value nonetheless. (4) Even without casting away the const-ness, the value can change, so I'm not sure how many assumptions the compiler can make.
[...]
(5) I do not know of any cases where pass-by-value is turned into pass-by-const-reference, I just can't rule it out.
(1) appears to be consistent with what I think. However, (5) is slightly contradictory, why would the compiler, that creates the machine code bother with instructions to itself? That gets me to the second point of (5) when the compiler can establish that there are%

Ters

For C/C++, I tend to just write the code and test if the entire program works as it should. Sometimes I have written smaller proof-of-concept programs. But I have never written more than perhaps ten lines of C/C++ code professionally. I have written some C# code professionally, but that was just test clients and proof-of-concepts clients for our web services.

When doing Java professionally, we use xUnit-style testing. I have used cppunit for some C++-testing, which I found somewhat more cumbersome than doing the same with Java, but certainly doable, at least if the code isn't too tightly coupled. These kinds of tests are however not just to test that the code you write works, but also that it keeps working, even as the program gets rewritten year after year, known as regression testing. The tests are written as part of the code base, and can be set up to run every time the application is built. Some insist on writing the tests first, then write the implementation until the tests pass. I do that sometimes, but not always, probably not even most of the time.
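As a rough sketch of what such a regression test looks like, assuming a made-up function clamp_speed and using plain assert in place of the assertion macros a framework like cppunit would provide:

```cpp
#include <cassert>

// Hypothetical function under test: limits a speed to [0, max_speed].
int clamp_speed(int speed, int max_speed)
{
    if (speed < 0) return 0;
    if (speed > max_speed) return max_speed;
    return speed;
}

// One check per behaviour; rerun on every build so that a later
// rewrite of clamp_speed cannot silently break any of them.
void test_clamp_speed()
{
    assert(clamp_speed(50, 120) == 50);    // in range: unchanged
    assert(clamp_speed(-5, 120) == 0);     // below range: raised to 0
    assert(clamp_speed(200, 120) == 120);  // above range: capped
}
```

A real xUnit framework adds test discovery and reporting on top, but the idea of small checks that live in the code base is the same.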

I also sometimes throw together some small proof-of-concept programs in Java as well. While they must be compiled to run, it is much easier than with C, C++ and C# since Java programs are not linked into a single image when built.

But even in the code I write professionally, there are parts that only get tested by actually using the program.

DrSuperGood

I generally do not test the expressions themselves, as I have enough experience that most expressions I write work as intended. I find testing the end behaviour far better, especially since a lot of what I have done so far is maintenance work so changes should be immediate.

Ideally one wants to test each module to some extent. However, Simutrans is not written in a very modular way: a lot of functions are hard to test, or are coupled so tightly that one cannot really test them.

prissi

When Simutrans was way smaller there were some tests, of which only the default map still exists. But I think they were already useless before I took over.

And I wonder how one would test something that acts on an object as complex as the world_t in Simutrans. The chance of the test routine being buggy seems as high as that of the code being buggy. Same for a scientific simulation: you can verify that for a single wavefunction the eigenvalues are correct; but whether the coupling between them is "correct physics" and implemented correctly can only be seen by running the simulation.

Moreover, C (like Fortran) predates procedure testing by nearly decades (I think). Such tests were only introduced into practice in the last twenty years.

Anyway, as an interpreter there is Ch: https://www.softintegration.com/download/ which is based on this http://www.drdobbs.com/cpp/building-your-own-c-interpreter/184408184 from 1989 ...

DrSuperGood

Quote
And I wonder how one would test something that acts on an object as complex as the world_t in Simutrans.
One cannot, as it is not very modular. Ideally one would want to test individual parts.

For example a factory. One should be able to simulate a factory working without a need for any visuals or a world. One should be able to simulate the factory receiving goods, simulate the factory ticking and simulate output being pulled from it etc.
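A minimal sketch of what that could look like. The Factory class below is hypothetical and heavily simplified; it is not the actual Simutrans factory code, and the 2:1 production ratio is an assumption made up for the example.

```cpp
#include <cassert>

// Hypothetical factory: turns 2 units of input into 1 unit of
// output per tick. Needs neither a world nor any visuals.
class Factory {
public:
    void receive(int amount) { input_ += amount; }
    void tick()
    {
        if (input_ >= 2) { input_ -= 2; output_ += 1; }
    }
    // Removes up to `wanted` units of output and returns how many.
    int pull(int wanted)
    {
        int given = wanted < output_ ? wanted : output_;
        output_ -= given;
        return given;
    }
    int input() const { return input_; }
    int output() const { return output_; }
private:
    int input_ = 0;
    int output_ = 0;
};

// Simulates a delivery, three ticks and a pickup.
void test_factory()
{
    Factory f;
    f.receive(5);
    f.tick(); f.tick(); f.tick();   // the third tick lacks input
    assert(f.input() == 1);
    assert(f.output() == 2);
    assert(f.pull(5) == 2);         // only 2 units are available
    assert(f.output() == 0);
}
```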

Tests for world_t would be placing objects in a well-defined way and performing actions on them to check if they mutated properly, I guess.

sdog

Ters:
Quote
For C/C++, I tend to just write the code and test if the entire program works as it should. Sometimes I have written smaller proof-of-concept programs. But I have never written more than perhaps ten lines of C/C++ code professionally.

DrSuperGood:
Quote
I generally do not test the expressions itself as I have had enough experience that most expressions I write work as intended.
Also with the complicated syntax, say anonymous functions in C++11, or things you are not actively used to? (Something that, to me as an outsider, seems to be abundant for every given user of C++.) I suppose it is rarer, and you might just compile a quick test? It indicates though that my assumption was wrong, and that most are so sure of the language that they can check expressions and functions without having to resort to tests.

Quote
Same for a scientific simulation: you can verify that for a single wavefunction the eigenvalues are correct; but if the coupling between them is assumed "correct physics" and implemented correctly, that can only be seen by running the simulation.
Thanks for that example. That was indeed 60% of my work: looking at output data and reasoning whether it is correct or not. This has been done for nearly the same code for nearly 20 years by about 5 people on average. However, it helps a lot when one can be certain that single subroutines or smaller expressions are correct. Since many things go wrong when putting the stuff together (passing arrays to functions in f77 -- pain!) it helps just a bit.

Prissi:
Ch looks quite good, cheers for the link.

Quote
Moreover C (like Fortran) predates procedure testing nearly by decades (I think). Those are rather an introduction into practice in the last twenty years.
The Linux kernel alone would require 30t of punch cards...
Stuff was somewhat more terse back then.


I forgot something very useful before, checking the type of the output of expressions. Here's a simple example.



****************** CLING ******************
* Type C++ code and press enter to run it *
*             Type .q to exit             *
*******************************************
[cling]$ #include <cmaths>
input_line_3:1:10: fatal error: 'cmaths'
      file not found
#include <cmaths>
         ^
[cling]$ #include <cmath>
[cling]$ pow(5, 2)
(double) 25.0000
[cling]$ int a = 5
(int) 5
[cling]$ pow(a, 2)
(double) 25.0000
[cling]$ double b = 5
(double) 5.00000
[cling]$ pow(b, 2)
(double) 25.0000
[cling]$ 5.0
(double) 5.00000
[cling]$


This is also of much less importance with C, since function declarations already define the return type.

In Haskell I often use ghci to get the type of a function and paste it into my programme instead of writing it out right away. Example:

Prelude> let f = \x -> x
Prelude> :t f
f :: t -> t
Prelude> let g = \(x,y) -> x**y
Prelude> :t g
g :: Floating a => (a, a) -> a


ps.: Oh dear. "C does not have a built-in operator for exponentiation, because it is not a primitive operation for most CPUs. Thus, it's implemented as a library function." What have I gotten into.

edit: another test

[cling]$ #import <cmath>
[cling]$ long double a = 5
(long double) 5L

[cling]$ pow (a, 2)
(double) 25.0000

[cling]$ long double b = 2
(long double) 2L
[cling]$ pow(a, b)
(double) 25.0000

[cling]$ a*b
(long double) 10L
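
Outside an interpreter, the same kind of type check can be sketched at compile time with decltype and static_assert. This assumes C++11 and the std::pow overloads from <cmath>. Note that std::pow with a long double argument is declared to return long double, which differs from the (double) printed above; my guess, and it is only a guess, is that the unqualified pow in that session resolved to the C function double pow(double, double).

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// Compile-time checks of the result type of std::pow (C++11 rules).
static_assert(std::is_same<decltype(std::pow(5, 2)), double>::value,
              "int, int -> double");
static_assert(std::is_same<decltype(std::pow(5.0, 2)), double>::value,
              "double, int -> double");
static_assert(std::is_same<decltype(std::pow(5.0L, 2)), long double>::value,
              "long double, int -> long double");
static_assert(std::is_same<decltype(std::pow(5.0L, 2.0L)), long double>::value,
              "long double, long double -> long double");
static_assert(std::is_same<decltype(5.0L * 2.0L), long double>::value,
              "long double * long double -> long double");

// Runtime counterpart, so that there is something to call as well.
long double five_squared() { return std::pow(5.0L, 2); }

void check_five_squared()
{
    assert(std::fabs(five_squared() - 25.0L) < 1e-12L);
}
```

If any of the assumed types were wrong, the file would simply fail to compile, which is about as quick a test as one can get.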


Ters

Quote from: prissi on November 25, 2016, 09:44:22 PM
And I wonder how one would test something that acts on an object as complex as the world_t in Simutrans.

You simply don't have such complex objects. On the other hand, I don't think world_t is that complex. The biggest problem is that almost everything else is now tightly coupled to a single world_t. For testing, you'd want to pass the object being tested a mocked world_t that behaves just like it should for whatever scenario is being tested, and which possibly also tracks what interactions it receives, so that you can test that the object being tested calls world_t in exactly the expected way. It is harder to mock if interactions between objects are not purely through interfaces, or at least virtual functions.
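
A minimal sketch of that idea, assuming C++11. All names here (WorldInterface, MockWorld, Shop and their functions) are hypothetical and invented for the example; none of them exist in Simutrans.

```cpp
#include <cassert>
#include <vector>

// Hypothetical interface covering only what the tested object needs.
class WorldInterface {
public:
    virtual ~WorldInterface() {}
    virtual int current_month() const = 0;
    virtual void book_income(int amount) = 0;
};

// Mock: returns canned values and records every interaction.
class MockWorld : public WorldInterface {
public:
    int month = 0;
    std::vector<int> booked;
    int current_month() const override { return month; }
    void book_income(int amount) override { booked.push_back(amount); }
};

// Hypothetical object under test; it only sees the interface.
class Shop {
public:
    explicit Shop(WorldInterface &world) : world_(world) {}
    // Books 100 per step, doubled in December (month 11).
    void step()
    {
        world_.book_income(world_.current_month() == 11 ? 200 : 100);
    }
private:
    WorldInterface &world_;
};

void test_shop_books_double_in_december()
{
    MockWorld world;
    Shop shop(world);
    world.month = 3;  shop.step();
    world.month = 11; shop.step();
    assert(world.booked.size() == 2);   // exactly two calls were made
    assert(world.booked[0] == 100);
    assert(world.booked[1] == 200);
}
```

The test never needs a real world: the mock supplies the scenario and afterwards shows exactly which calls were received.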

Quote from: sdog on November 25, 2016, 10:28:39 PM
Also with the complicated syntax, say anonymous functions in C++11, or things you are not actively used to?

Even when experimenting with the new C++ features, I wrote a lot more than just simple expressions to get the hang of them. And most of the troubles with them were getting them to compile. Once I got them to compile, they behaved like they should. The most troublesome errors C and C++ can give you at runtime are those where you've entered the mysterious realm of undefined behavior. These things have a nasty habit of working fine at first, only to start blowing up once in a blue moon some time later.

prissi

About c++ return types: Those can easily depend on calling types, if memory serves me right.

char * pow( char *, double ); and double pow( double, double ); are both legal, but pow ("", 0 ) would return a char * and pow (1,2) a double. Compilers at least warn when you do pow (int, int ) and pow( double, double )

sdog

Quote from: prissi on November 25, 2016, 11:19:28 PM
About c++ return types: Those can easily depend on calling types, if memory serves me right.

char * pow( char *, double ); and double pow( double, double ); are both legal, but pow ("", 0 ) would return a char * and pow (1,2) a double. Compilers at least warn when you do pow (int, int ) and pow( double, double )

I'm afraid I couldn't follow you with that.
Let: double d; int i; and char c;
The expression c * pow (c, d); does return a double [which is somehow strange (hackish?), this really ought to fail to compile].

Why is it unusual that pow (d, d); is legal? I should expect that a^b is correct for all a, b in R.

Why compiler warnings for a^b for all a, b in N ? Aren't those exactly the correct uses? Perhaps only a^b for all a in R and for all b in N being more common?



[cling]$ double d; int i; char c
[cling]$ #include <cmath>

[cling]$ pow(c, d)
(double) 1.00000

[cling]$ pow(d, d)
(double) 1.00000

[cling]$ pow(i, i)
(double) 1.00000

[cling]$ pow("", 0)
input_line_15:2:2: error: no matching
      function for call to 'pow'


by the way, a funny one:

[cling]$ pow(0,-1)
(double) inf

Lovely: double precision infinity!


[cling]$ pow(0,0)
(double) 1.00000

Ouch.

Double ouch, this time for being thick, asking maths stackexchange:
http://math.stackexchange.com/questions/11150/zero-to-the-zero-power-is-00-1

jamespetts

A good way of testing small fragments of code, I find, is to use the visual debugger in Visual Studio, sometimes with special test variables the value of which can be observed at various times. The variables can be removed when the code has been tested.
Download Simutrans-Extended.

Want to help with development? See here for things to do for coding, and here for information on how to make graphics/objects.

Follow Simutrans-Extended on Facebook.

Ters

Quote from: sdog on November 25, 2016, 11:51:16 PM
I'm afraid I couldn't follow you with that.
Let: double d; int i; and char c;
The expression c * pow (c, d); does return a double [which is somehow strange (hackish?), this really ought to fail to compile].

That makes perfect sense by the low level rules C follows. The only obfuscating part is the name of the data type char. It is not inherently a text data type, just another integer data type with a range of typically [-128, 127]. And integers can automatically be converted to floating point types. So your example makes just as much sense as 4 * pow(4, 3.1415);. In real life applications, it would perhaps be more usual that the exponent is the integer and the base a floating point number. And since exponents usually are smaller than 100, being able to store it in an 8-bit data type might have been important back when available memory was counted in kilobytes.
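
A small sketch of this, assuming C++11 and std::pow from <cmath>; the function name is made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <type_traits>

// char is an integer type, so it takes part in arithmetic like int.
double char_times_pow(char c, double d)
{
    // c is converted to double for std::pow, and char * double
    // is a double as well.
    static_assert(std::is_same<decltype(c * std::pow(c, d)), double>::value,
                  "char * pow(char, double) is a double");
    return c * std::pow(c, d);
}

void check_char_times_pow()
{
    // 4 * 4^2 = 64, with the 4 stored in a char
    assert(std::fabs(char_times_pow(4, 2.0) - 64.0) < 1e-9);
}
```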

Quote from: sdog on November 25, 2016, 11:51:16 PM
Why is it unusual that pow (d, d); is legal?

Where did you get that idea from? There is nothing particular about that in C or in math in general. I can't think of any real life application for such a formula, but math doesn't seem to care for such things.

Quote from: sdog on November 25, 2016, 11:51:16 PM
Lovely: double precision infinity!

Double precision positive infinity! One of those funny values. At least it is equal to itself and comparable to numbers and to the other infinity. NaN is neither: it is not equal to anything, not even itself. By not comparable, I mean that the comparison yields false (only != yields true), not that you get an error. (C also has positive and negative zero due to the way floating point is implemented. All of this is according to IEEE 754, which is not unique to C. In fact, I don't know of any floating point implementation that doesn't, for the most part, follow that standard.)
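
These comparison rules can be checked with a few assertions. This is a sketch that assumes the platform's double follows IEEE 754 (the static_assert verifies that) and that the code is not compiled with options such as -ffast-math.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Collects the special-value comparisons in one place.
void check_ieee754_special_values()
{
    static_assert(std::numeric_limits<double>::is_iec559,
                  "this sketch assumes IEEE 754 arithmetic");
    const double inf = std::numeric_limits<double>::infinity();
    const double not_a_number = std::numeric_limits<double>::quiet_NaN();

    assert(std::pow(0.0, -1.0) == inf);        // 0^-1 gives +infinity
    assert(std::pow(0.0, 0.0) == 1.0);         // pow defines 0^0 as 1
    assert(inf == inf);                        // infinity equals itself
    assert(inf > 1e308 && -inf < inf);         // and it is ordered
    assert(!(not_a_number == not_a_number));   // NaN is unequal to itself
    assert(!(not_a_number < 1.0) && !(not_a_number > 1.0));
    assert(0.0 == -0.0);                       // both zeros compare equal
    assert(std::signbit(-0.0));                // but the sign is kept
}
```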

prissi

About return types:

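// needs <cmath> for exp() and log(), and <cstdio> for snprintf();
// in real code use another name or a namespace, as pow clashes with the library
double pow( double a, double b )
{
  return exp( b * log(a) );  // a^b = e^(b ln a), valid for a > 0
}

const char *pow( const char *s, double b )
{
  static char ss[1024];  // static, so the pointer stays valid after return
  snprintf( ss, sizeof ss, "%s^%f", s, b );
  return ss;
}

in which pow( 1, 2 ) returns a double and pow( "s", 8 ) returns a string. (pow( 0, 0 ) would be ambiguous with these two overloads, as a literal 0 also converts to a null pointer.)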

sdog

Thanks, you gave me good pointers!

It wasn't clear to me what you wrote before. Cheers for the clarification, Prissi. Now it is only not clear to me what it means; that will take a bit until I've caught up.

Thinking about this. Do you (pl) think the approach to teach C++ while avoiding C is reasonable? The text book I chose (Koenig and Moo, Accelerated C++) made a point of not teaching C and not teaching pointers, arguing that that would lead to bad practice in OO C++. But when standard lib definitions of functions cannot be understood by the student, that seems to be a somewhat dangerous approach.

Combuijs

Quote from: sdog on November 26, 2016, 10:36:40 PM

Thinking about this. Do you (pl) think the approach to teach C++ while avoiding C is reasonable? The text book I chose (Koenig and Moo, Accelerated C++) made a point of not teaching C and not teaching pointers, arguing that that would lead to bad practice in OO C++. But when standard lib definitions of functions cannot be understood by the student, that seems to be a somewhat dangerous approach.

For writing new code or learning C++ that approach is not unreasonable, I feel, although you can't leave out that much C. For understanding existing code written by someone else, you should have knowledge about all kinds of C-specifics and especially pointers, as C and C++ are often mixed up. And while for example C# does not have pointers, it is still very useful to know how they behave. It makes it much easier for example to understand the difference between a shallow and a deep copy.
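
To illustrate the difference, here is a minimal sketch with raw pointers. The Box type and the function names are hypothetical, and real code would rather use a class with a proper copy constructor or a smart pointer.

```cpp
#include <cassert>

// Hypothetical type holding one heap-allocated int.
struct Box {
    int *value;
};

// Shallow copy: only the pointer is copied, both boxes share one int.
Box shallow_copy(const Box &src) { return Box{src.value}; }

// Deep copy: a new int is allocated and the pointed-to value copied.
Box deep_copy(const Box &src) { return Box{new int(*src.value)}; }

void demo_copies()
{
    Box original{new int(1)};
    Box shallow = shallow_copy(original);
    Box deep = deep_copy(original);

    *original.value = 2;
    assert(*shallow.value == 2);   // shares storage with original
    assert(*deep.value == 1);      // has its own storage

    delete deep.value;
    delete original.value;         // also frees what shallow points to
}
```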
Bob Marley: No woman, no cry

Programmer: No user, no bugs



prissi

C++ is just like a preprocessor to C (indeed the very first C++ went that route) and internally makes heavy use of pointers. The operation c->member (when c is a pointer to a class) is extremely common; passing class variables as references is less common, and pointers are more so (at least in Simutrans). In principle, even C code could access C++ members (at least when using the same compiler, given similar structs, and taking care of the virtual function pointer in classes).

Moreover, even C++ new is a pointer operation. Given that the reason for C++ is usually speed (and second that only C libraries are available for a certain device/function), the misuse of pointers can strongly impair performance or make programs very unstable. If you want to avoid pointers, then do not use C++, use Java or some script languages.

Ters

What Java calls references is much more like what C++ calls pointers than what C++ calls references. Java references can be pointed elsewhere and they can be null; C++ references cannot (unless you do evil things). The only difference is that Java references never point to uninitialized memory and are never otherwise led astray. You also neither can nor need to call delete on them to free memory, but that doesn't mean you don't have to think about freeing memory, because it is still very much possible to leak memory.
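
On the C++ side, the difference can be sketched as follows (C++11 for nullptr; the function name is just for illustration):

```cpp
#include <cassert>

void demo_pointer_vs_reference()
{
    int a = 1;
    int b = 2;

    int *p = &a;      // a pointer can be pointed elsewhere ...
    p = &b;
    assert(*p == 2);
    p = nullptr;      // ... and it can be null, like a Java reference
    assert(p == nullptr);

    int &r = a;       // a reference is bound once, here to a
    r = b;            // does not rebind r; copies b's value into a
    assert(a == 2);
    assert(&r == &a); // r still refers to a
}
```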

And how you're supposed to do anything useful in C++ without using, or even knowing about, pointers is beyond me. Sure, you might use wrapper classes such as std::vector and the various smart pointers to such a degree that you might perhaps avoid using them directly, but good luck debugging if you don't know how they work. And pointers are part of every API I have ever seen, so interacting with anything without using pointers will likely be difficult. I have seen a library which used references rather than pointers, but it treated the references like pointers anyway, making the entire thing just more confusing.

sdog

Understanding C is of course vastly more useful.

I've been very reluctant to accept the things Koenig and Moo found the most important. I wasn't so certain about it, as I thought it was bias, caused by my strong dislike (or lack of understanding) of OO. Perhaps its scope is as it is because the book predates the wide adoption of Java (2001). I think they try to get the students first to use and understand abstractions, as they fear they would just hack about in C otherwise. (That the first type the book introduces is strings says a lot too; if I were to do string stuff today, why choose C++?)

I now think it is not a good idea for me to invest time into C. Learning these things for a few weeks will get me to a point where I could write nice 'hello worlds' or a sorting algorithm of limited use. Without learning C, which requires a very good understanding of microprocessors, I cannot claim to know it. It seems much more sensible to focus on fruits that are easier to pick, such as CUDA or MPI, to open up posts that require HPC skills. (All the sweet Fortran jobs need that of course.)

Ters

Quote from: sdog on November 27, 2016, 12:40:41 AM
Caused by my strong dislike (or lack of understanding) of OO.

Object-oriented programming is so 1990s anyway. Trendy programmers do functional programming these days. However, now that functional programming is supported by C++ and Java, I guess it won't be long until functional programming is passe.

sdog

Quote from: Ters on November 27, 2016, 08:44:42 AM
Object-oriented programming is so 1990s anyway. Trendy programmers do functional programming these days. However, now that functional programming is supported by C++ and Java, I guess it won't be long until functional programming is passe.

I fear I'm not connected very much to the hip CS people. Functional is appealing to me as it is much easier to grasp from a conceptual point of view. Take this example of a power series for the exponential function: e^x = ∑_n x^n 1/n!

Prelude> (sum . take 18 . (\x ->  [ x^n / fact n | n <- [0..] ] )) 1
2.7182818284590455
Prelude> exp 1
2.718281828459045

That's Haskell (with a bit of syntactic sugar). The 'take 18' corresponds to truncating the otherwise infinite sum ∑ after 18 terms (n = 0 to 17). The function fact is the factorial, defined elsewhere. Apart from readability, the other advantage, when there is a strict type system, is that one can ensure correct code (in tidbits). I don't even need to have a prover. I made some mistakes with the type of n and did not produce valid code.
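
For comparison, the same truncated sum written imperatively in C++. This is only a sketch with a made-up function name; instead of calling a factorial function it updates each term from the previous one.

```cpp
#include <cassert>
#include <cmath>

// Partial sum of the exponential series: sum over n < terms of x^n / n!.
double exp_series(double x, int terms)
{
    double sum = 0.0;
    double term = 1.0;               // x^0 / 0!
    for (int n = 0; n < terms; ++n) {
        sum += term;
        term *= x / (n + 1);         // x^n/n! becomes x^(n+1)/(n+1)!
    }
    return sum;
}

void check_exp_series()
{
    // 18 terms agree with the library exp to well below 1e-12 at x = 1
    assert(std::fabs(exp_series(1.0, 18) - std::exp(1.0)) < 1e-12);
}
```

Here the loop counter and the running term are mutable state, which is exactly what the Haskell version gets by without.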

What I cannot accept in OO is that what my programme does depends on the internal state of objects, which is also entirely hidden from me. There might be methods to [...]. Given the complete nightmare that side effects are, I just don't get the advantage of doing this, outside fringe areas like I/O or GUI. Another problem I see with OO is that one then has to test these objects in all possible states for all extremes of acceptable input. That might easily lead to a combinatorial explosion.

Not that this cannot be done in imperative (or functional) programming as well. I am so afraid of this exactly because I know 1000-line oblique Fortran subroutines that juggle data from global variables and may change stuff in any line.


Quote
However, now that functional programming is supported by C++ and Java, I guess it won't be long until functional programming is passe.
Well, OO was also a fad, and the hip crowd complained about imperative programming back then. And of course they weren't entirely wrong. It seems more to me that today there's much more of a realisation of where to use OO and where it is a hindrance. Likewise for the strengths of imperative and functional approaches. Today most devs are multi-language and multi-paradigm. Those hordes of hardly trained VB and Java corporate code-slaves are apparently slowly disappearing (or are they simply outsourced to India?).

Ters

The reason for hiding the state in the objects is to avoid having lots of dependencies on the state scattered about the code. And I don't see how you avoid multiple possible states by not using OO. The states are part of the real-life problem domain.

prissi

Again CUDA is a big subset of C with a few extensions, since it matches very well the low level code. If you want to write efficient code (and that is the point of CUDA, isn't it?) using a perl/python etc. wrapper will not give you peak performance. And there is also C++ for CUDA, so you can use complex number and exponent operator (I think those are even part of the latter implementation).

sdog

Ters,
I read my previous mail and have to apologise to you. I got carried away a bit with showing what I find so useful about functional, and then showing what my bias is against OOP. I did not mean to argue in favour of one and against the other. Yet reading my text again, it is very argumentative. This is out of place since I came to ask for advice.

If you enjoy a discussion on this topic however, I should love to reply to your last reply on state -- as a discussion in its own right, decoupled from my request for advice.

Ters

You clearly haven't seen many programming discussions. I wouldn't be surprised if it was programmers that invented the flame war.

I should perhaps have noted more clearly, although it is hinted at, that the "functional programming" that is hip now (at least in my field), is built on top of OO. It looks nothing like that Haskell example.

sdog

Thanks to all for the advice given in this thread so far!


@prissi: I meant use of CUDA in Fortran. But that was just a first thought, on a second thought, there's not much benefit from that. You are absolutely right in that regard.



I've decided now I'll go with "Koenig and Moo, Accelerated C++", focusing on the OO aspects of C++ and on the standard library, then going to simple C after that. Leaving understanding C to the future. One reason is simply to put something on my CV.

I've already had one negative reply on a job application. Without even an interview. It's a bit shameful to mention that publicly, in particular as none of my peers seemed to have failed that miserably. Apart from some mistakes in my cover letter and CV I think that lacking C++ knowledge is a reason. Notably, the same company posted a job three weeks later that would fit very much what I requested in my 'initiative application'.

Now that I fired off a real barrage of applications (four) and the job market is slow before the holidays I've got time for learning. My plan is to work for three weeks learning C++, one on C. Get it to a point where I can claim proficiency in a CV without lying, and try again with them. (They have by far the most intriguing projects, develop for linux, no Java(!), no .net(!!), they offer lunch, have showers, started as a university spin-off. On top of that they are in a city, in one of the German states with working school system, Nazis are rare, there's decent public and cycling infrastructure. Oh and did I mention, interesting projects.)

jamespetts

Very best wishes with your job hunt and your C++ learning. Have you started a Github account yet? Apparently, that's a good way of boosting a coding CV.

sdog

For the sake of oversharing, I post a status update. Read on only if you don't mind a waste of time.

Not much progress so far. I've been fiddling for an umpteen amount of time to get a half-decent syntax completion engine for C running. Assuming it were better not to start by typing, and getting into good habits from the outset.

Coming from Fortran on one side, and garbage collected and immutable state languages on the other side, I have incredible trouble wrapping my head around memory management in C languages. It is effortless in the former and no concern of the programmer in the latter. All the more I am afraid that if I am not diligent now, I might at some point cause a memory leak or other unforgivable mistakes.

I thought as a starting point I better get to understand what the heap is, how it is structured etc. It occurred to me only after a while, that it might not have anything to do with the CS concept 'heap (data structure)' at all.

http://stackoverflow.com/questions/1699057/why-are-two-different-concepts-both-called-heap

Having only a passing acquaintance with graph theory and data structures, I had to catch up on that a bit first. *face-to-desk* *grin*


Well, compiling of some ncurses stuff is done, so I may continue fiddling with the syntax completer (YouCompleteMe) again. Since the checker is multi-language, and the defaults in my configuration script are to have everything turned on, it is quite some work.
For example, there are stupendously many dependencies for C#. I can see now why there is so much fondness for the JRE.
(I also want to avoid having to install mono and JS stuff just to build it. Can't those people developing stuff on Mac do anything without having half a dozen obscure frameworks that need to be installed and cleaned up for every bloody build job? Oh, it also requires, for unknown reasons, the rather disreputable Boost library.)


ps.: I noticed that my posts here might appear aggressive, whiny, or complaining. That is a partial misconception, I hope. Firstly, I cannot deny that I am German, hence the complaining. For another, I am somewhat frustration tolerant (Fortran 77 wink, wink) and need to build up a bit of frustration to actually get somewhere. If there are no tough nuts to crack, it cannot keep my interest.

jamespetts

The interesting thing about learning C++ (and presumably also C) is that it teaches one rather more about how computers work than does learning a higher level language.

After learning the difference between the heap (i.e. system memory) and the stack (i.e. processor cache memory), the next step is learning about memory management of arrays (including when to use the delete[] command) and pointer arithmetic.

HarrierST

Quote from: sdog on December 14, 2016, 09:49:28 PM
Notably, the same company posted a job three weeks later that would fit very much what I requested in my 'initiative application'.

Send in another application. The advert could be from a different part of the company; departments do not share info. Neither do human resources/personnel, or whatever they are called now.

HarrierST

Quote from: jamespetts on December 17, 2016, 11:26:26 PM
The interesting thing about learning C++ (and presumably also C) is that it teaches one rather more about how computers work than does learning a higher level language.

After learning the difference between the heap (i.e. system memory) and the stack (i.e. processor cache memory), the next step is learning about memory management of arrays (including when to use the delete[] command) and pointer arithmetic.

I never learned how to program C++ effectively, but did read up about it - by then I had moved out of programming.

But your last paragraph - brought back old memories. I started as a machine code programmer in 1970, moved on to Assembly then PL1/Cobol.  In those days to get the best performance you had to know how the machine and its language worked.

Because machines were less powerful, two programmers working on the same problem could get very different results.

Today - the attitude seems to be, as quick as possible to code, not as efficient as possible. Hey the speed of the computer will hide any poor coding.

DrSuperGood

Quote
After learning the difference between the heap (i.e. system memory) and the stack (i.e. processor cache memory),
Heap is an area of memory where dynamic memory allocation and other memory usage occurs. Stack is a memory structure that is unique to each processor thread.

Neither has anything to do with system memory or processor cache memory directly. Both can be backed by system memory and both can take advantage of processor cache memory for better memory I/O performance.

In some processor architectures the heap is physically separated from the read only processor code. In some processor architectures the stack is a unique register based data structure inside the processor (not memory backed).
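
From the C++ side, the distinction shows up as automatic versus dynamic storage. A minimal sketch, with made-up function names, assuming a typical implementation where local variables live on the thread's stack and new[] allocates from the heap:

```cpp
#include <cassert>
#include <vector>

// Sums 0..n-1 using a manually managed array (assumes n >= 0).
int sum_first_n(int n)
{
    int total = 0;                  // automatic storage: typically on
                                    // the stack, gone on return
    int *values = new int[n];       // dynamic storage: from the heap
    for (int i = 0; i < n; ++i) values[i] = i;
    for (int i = 0; i < n; ++i) total += values[i];
    delete[] values;                // arrays from new[] need delete[]
    return total;
}

// The same with std::vector, which owns its heap memory.
int sum_first_n_vector(int n)
{
    std::vector<int> values(n);
    for (int i = 0; i < n; ++i) values[i] = i;
    int total = 0;
    for (int v : values) total += v;
    return total;                   // the vector frees its memory here
}

void check_sums()
{
    assert(sum_first_n(5) == 10);
    assert(sum_first_n_vector(5) == 10);
}
```

Forgetting the delete[] in the first function would leak the array on every call; the vector version cannot leak in that way.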

Quote
Today - the attitude seems to be, as quick as possible to code, not as efficient as possible. Hey the speed of the computer will hide any poor coding.
Compilers optimize code a lot better now. It is far better to write something that is clean and easy to read than something low level and efficient, as chances are the compiler will take the resulting code there by itself anyway.

What is still important is to attack problems efficiently. Using an O(log n) algorithm instead of an O(n) one makes a huge difference, more than all the micro-optimizing one can ever do to a piece of code.
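
As an illustration, here is a sketch that counts the work done by a linear search and by a binary search on the same sorted data. The function names and the counters are made up for the example; in real code one would use std::lower_bound.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Linear search: up to n comparisons. Returns the index or -1.
int find_linear(const std::vector<int> &sorted, int key, int &comparisons)
{
    comparisons = 0;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        ++comparisons;
        if (sorted[i] == key) return static_cast<int>(i);
    }
    return -1;
}

// Binary search on sorted data: about log2(n) halving steps.
int find_binary(const std::vector<int> &sorted, int key, int &steps)
{
    steps = 0;
    std::size_t lo = 0, hi = sorted.size();
    while (lo < hi) {
        ++steps;
        std::size_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key) lo = mid + 1; else hi = mid;
    }
    return (lo < sorted.size() && sorted[lo] == key)
               ? static_cast<int>(lo) : -1;
}

void check_search_costs()
{
    std::vector<int> data(1024);
    for (int i = 0; i < 1024; ++i) data[i] = i;
    int comparisons = 0, steps = 0;
    assert(find_linear(data, 1023, comparisons) == 1023);
    assert(find_binary(data, 1023, steps) == 1023);
    assert(comparisons == 1024);    // worst case: every element visited
    assert(steps <= 11);            // roughly log2(1024) = 10 halvings
}
```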

Ters

Quote from: jamespetts on December 17, 2016, 11:26:26 PM
The interesting thing about learning C++ (and presumably also C) is that it teaches one rather more about how computers work than does learning a higher level language.

I've read that one of the main principles in C is that it should be obvious to the programmer how the C code maps to machine instructions. So while other languages might be based more directly on mathematical concepts, C is simply an abstraction over machine code. C++ is another layer above that. The basic keywords, except exceptions, still map to assembly in the same predictable way they do in C. New features like member functions, inheritance and virtual functions also have a simple and relatively well known translation to machine code. Exceptions are more complex, with at least two rather different implementations. Templates and operator overloading can obfuscate the mapping between C++ code and machine code. Each successive C++ standard seems to increase the distance between C++ and the machine code, probably indicating that C++ developers focus more on higher level concepts than how their code maps to machine code (although it still gives them the ability to write low-level code by sticking to the "pure" C stuff). Optimizing compilers also obfuscate the mapping between C/C++ and machine code, but the developer must still ask for such optimizations as far as I know (IDEs might ask for you by default for non-debug builds, though).
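
As a conceptual sketch of the member function case (not what any particular compiler literally emits, and the names are invented): a member function behaves much like an ordinary function that receives the object's address as a hidden first argument.

```cpp
#include <cassert>

struct Counter {
    int n;
    void add(int k) { n += k; }     // member function, implicit `this`
};

// Roughly what the member function amounts to: an ordinary function
// taking the object's address as an explicit first argument.
void Counter_add(Counter *self, int k) { self->n += k; }

void demo_member_vs_free()
{
    Counter a{0};
    Counter b{0};
    a.add(3);               // the compiler passes &a as `this`
    Counter_add(&b, 3);     // the same thing, spelled out by hand
    assert(a.n == b.n);
}
```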

Quote from: sdog on December 17, 2016, 10:57:32 PM
Coming from Fortran on one side, and garbage collected and immutable state languages on the other side, I have incredible trouble wrapping my head around memory management in C languages. It is effortless in the former and no concern of the programmer in the latter. All the more I am afraid that, if I am not diligent now, I might at some point cause a memory leak or make other unforgivable mistakes.

To think that memory management is not a concern in garbage collected languages is a big mistake. Just because there is no keyword or function to free memory, doesn't mean that you don't have to tell the runtime that you are done with a piece of memory. While doing so will often be so trivial that you don't think about it, there are as far as I know cases in all GC-ed languages where you have to be more explicit. Immutable state languages might have less of this than mutable state languages, though. I only have experience with the latter, but I can't imagine the other being completely fool-proof.

sdog

@Ters
Quote
I've read that one of the main principles in C is that it should be obvious to the programmer how the C code maps to machine instructions. So while other languages might be based more directly on mathematical concepts, C is simply an abstraction over machine code. C++ is another layer above that. ...

That's quite interesting.

I've also unearthed the old Dijkstra quote I mentioned before; it goes in the same direction. It says something along the lines of: to use a level n of abstraction, one has to understand level n-1 and have more than a vague idea of n-2. He also said that learning C is not about learning programming but about learning how machines work in detail.

[He also said something along the lines that Fortran ought to be forbidden, as it spoils young minds :-( (but that was in the days when people didn't mind using conditional, numbered goto statements.)]


Quote
To think that memory management is not a concern in garbage collected languages is a big mistake. Just because there is no keyword or function to free memory, doesn't mean that you don't have to tell the runtime that you are done with a piece of memory. ...
Memory management is much more difficult for functional languages: since state isn't changed, new memory is allocated, for example, at every recursion step. However, it is also much easier for the runtime/compiler to determine when it can be freed.

In most GC languages it was relatively effortless, and most of the work seems to be done by following good coding practices. However, I've not used them in so much depth that I had to bother very much.


Lastly Fortran: memory management is rather easy and straightforward there. Firstly, in most cases one statically assigns fixed-size arrays. It is more important that the memory footprint is predictable and fits the amount one has available for each instance of the programme or each thread. Typically memory is a much less scarce resource than CPU time. In new Fortran one can allocate arrays dynamically, and also deallocate them directly. That can be very useful in the most memory-intensive problems. It is also much easier than in C, since one does it at a much more abstract level. One simply allocates an array of the required size, and the compiler takes care of getting a, preferably contiguous, range of addresses from free memory. So no questions of stack or the ominous heap.

No pointers either. If one knows a few rules, one can simply read and write multi-dimensional arrays sequentially by iterating through them. Subroutine calls transfer not a memory address but the value of the data. In old Fortran one would declare an array of fitting, fixed size. In modern Fortran the compiler can infer that itself for static-size arrays. (I never checked for dynamically sized arrays, as they are a rare fringe case.) At the close of the subroutine the memory is freed automatically again. In consequence, and much unlike C/C++, if one is not careless, memory management happens without having to think about it. Typically a mistake that would cause it to fail would have had more catastrophic consequences earlier.
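
For comparison, modern C++ can get fairly close to the "freed automatically at the close of the subroutine" behaviour if one sticks to owning types such as `std::vector` and `std::unique_ptr` instead of raw `new`/`delete`. The following is only a sketch of that idea; the `Tracked` type and `live_inside_scope` function are hypothetical and exist purely to make the automatic release observable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical type that counts how many of its objects are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Allocates n objects in a vector plus one more behind a unique_ptr, all on
// the heap, and reports how many are alive while the function is running.
int live_inside_scope(std::size_t n) {
    std::vector<Tracked> work(n);                  // owns n heap objects
    std::unique_ptr<Tracked> single(new Tracked);  // owns one more
    assert(Tracked::live == static_cast<int>(n) + 1);
    return Tracked::live;
}   // `single` and `work` are destroyed here; no explicit delete or free
```

After `live_inside_scope` returns, `Tracked::live` is back to zero, without the caller or the function body ever releasing anything by hand.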

Since there are no global variables, if one wants to write to a data structure directly, it has to be included in a common block. These blocks have to be consistent in every subroutine. Thus they are dreaded, target of some hackish solutions from the olden days, a source of frequent obscure errors, and a great hindrance to modular code. I dislike them with a passion. There are two consequences. In order to reduce memory footprint there is a lot of 'impure' meddling with stuff in common blocks happening in subroutines. Such that often a routine gets only some unimportant stuff passed as arguments, while all relevant input comes and output goes by directly writing to common block variables.

In new Fortran one may pass a reference to a data structure. There are also user-defined, more complicated data structures, somewhat similar to structs.

@DrSuperGood, HarrierST
Quote
What is still important is to attack problems efficiently. Using a O(log2(n)) instead of O(n) makes a huge difference, more than all the micro optimizing one can ever do to a piece of code..
Well, that's the much easier part, and the stuff people, at least proper CS students, love to learn. After all, you're in a branch of applied mathematics. I remember looking at the stuff my wife learned for tests: theoretical informatics, graph theory, number theory. That was a real joy. The other stuff was so dreary: UML, data warehousing, etc.

I might just parrot others: to me it seems micro-optimisations tend to lead to code that is not very future-proof, is error-prone, and is typically unreadable. Computing time is in most applications much cheaper than developer time. Good code is read much more often than written. There are good reasons for high levels of abstraction.

It is arguably also easier to find staff. Anyone with a bachelor's degree in something only remotely technical understands the mathematical formalism. Few people understand how computational machines work (me included). It is a difficult and specialised topic. (I have to admit I've recently applied for a job as FPGA developer; I need a better narrative here, in case I get a chance for an interview...)

Quote
Send in another application. The advert could be from a different part of the company, departments do not share info. Also human resources/personnel etc. what ever they are called now do not either.
Thanks a lot for the advice. I shall do it. Unfortunately the position was posted by the same HR employee (the company only has three) and is for the same dev team. Nonetheless, I'm going to ask them again. Firstly, it is difficult for them to get people; they tend to have those ads open for months. Secondly, with some work on C and a better CV, my chances might be better. And thirdly, even for a small chance it's worth it. No other job offer made me as eager as theirs. Not even a technical job at a heavy-ion accelerator (they are applying the type of collisions I've researched).


@James, Harrier, Ters, DSG
Thanks a lot for the feedback. I have many new questions in my mind, prompted by this discussion. It was quite fruitful.

Ters

Um, Fortran is always/was originally pass by reference, not pass by value. That is the way one could/can redefine 0 to be 1 (same for any other pair of values).
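
In C++ terms, the difference looks roughly like the sketch below (function names invented for the illustration). With pass by reference the callee writes straight into the caller's storage, which is the mechanism behind the old Fortran trick of overwriting the storage of a literal constant. C++ closes that particular door at compile time, since a non-const reference cannot bind to a literal.

```cpp
#include <cassert>

// Pass by value: the callee only changes its own copy.
int set_to_one_by_value(int x) {
    x = 1;
    return x;
}

// Pass by reference: the callee writes to the caller's variable.
void set_to_one_by_reference(int& x) {
    x = 1;
}

// Small demonstration of the difference as seen from the caller.
void reference_demo() {
    int zero = 0;
    set_to_one_by_value(zero);
    assert(zero == 0);               // caller's variable untouched
    set_to_one_by_reference(zero);
    assert(zero == 1);               // caller's "zero" is now 1
    // set_to_one_by_reference(0);   // does not compile: int& cannot bind to a literal
}
```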

sdog

Quote from: Ters on December 18, 2016, 06:20:49 PM
Um, Fortran is always/was originally pass by reference, not pass by value. That is the way one could/can redefine 0 to be 1 (same for any other pair of values).

Fortran 95, possibly earlier Fortran, does indeed allow passing by value. I think one can find that a lot in C wrappers around Fortran functions, eg, in linear algebra libraries like BLAS.

The trouble with (old) Fortran when passing by reference is that one needs to replicate the data structure. For instance, if you pass an NxM matrix the routine needs a MxN matrix explicitly declared (dummy variable). In new Fortran there are also allocatable dummy arrays that infer size from the passed variable.

In other words, Fortran abstracts this into a variable in the routine call and corresponding dummy variables in the subroutine itself. It is left to the compiler how to associate the two. That usually means that the dummy variable is associated with the same memory address. But, as mentioned before, that can go wrong. Or it can be (ab)used: for example, multi-dimensional matrices passed in routine calls are often treated as one-dimensional arrays in the dummy variables. As one assumes a contiguous range in some abstract memory (somehow mapped to physical memory) and knows the way arrays are laid out, this is possible. The only thing we can take for granted is that the dummy variable is associated with the beginning of the memory in which the data is provided. Depending on the 'intent' (in/out/inout) there is a mapping from the dummy variable to the variable.
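
The "matrix seen as a one-dimensional array" trick can be sketched in C++ as follows, assuming Fortran's column-major layout, where element (i, j) of an array with `rows` rows sits at offset i + j*rows (counting from zero here, unlike Fortran's default of one). `Matrix` and `sum_flat` are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2-D view over one contiguous block, stored column by column
// the way Fortran lays out its arrays.
struct Matrix {
    std::size_t rows, cols;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    // Zero-based element access: walk down a column first, then across.
    double& at(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i + j * rows];
    }
};

// A routine that only knows about a flat run of `count` doubles, much like a
// one-dimensional dummy argument receiving a multi-dimensional actual one.
double sum_flat(const double* first, std::size_t count) {
    double total = 0.0;
    for (std::size_t k = 0; k < count; ++k) total += first[k];
    return total;
}
```

Passing `m.data.data()` and `m.data.size()` to `sum_flat` works only because the storage is one contiguous range and the callee is handed its beginning, which is the same assumption the Fortran idiom relies on.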

Combuijs

Quote
[He also said something along the lines that Fortran ought to be forbidden, as it spoils young minds :-( (but that was in the days when people didn't mind using conditional, numbered goto statements.)]

You refer to the famous "go to statement considered harmful" article, I suppose?
Bob Marley: No woman, no cry

Programmer: No user, no bugs