3

I am a fairly new C programmer, and I had always assumed that variable declaration works as follows: when you declare a variable with int x;, you tell the compiler to set aside memory for that variable, and that memory is then initialised when you later write something like x = 3;. I also assumed the compiler might move that declaration somewhere more efficient when compiling.

However, I recently read that this is not what happens. So what does happen, and why? Does something concrete happen behind the scenes, or are declarations effectively just messages to the compiler with no analogue in the eventual binary that it produces? And how does all this apply to function declarations?

10
  • 4
    The first paragraph still seems true for C today. "I recently read that this is not what happens": where? What does it say? Because it's wrong... Commented Jul 20, 2018 at 8:53
  • 1
    This question is too broad, and the best way to know what the compiler does is to read the generated assembly.
    – Tyker
    Commented Jul 20, 2018 at 8:54
  • 4
    There is no answer that is applicable to any language, as different languages take entirely different approaches. In particular, compiled vs interpreted, and dynamically- vs statically-typed languages differ drastically in this. And even within these broad categories, there can be vast differences. Commented Jul 20, 2018 at 8:54
  • 1
    Even for a language like C, there may be an "analogue in the binary" for declarations. The standard does not impose any constraints. But yes, what you said is true for most compilers. There are no instructions that actually create memory for a variable. A declaration is just a "message" to the compiler that a variable x exists in the program and that it may store it wherever it sees fit. The first actual instructions you will see are those for the first use of that variable. Again, this is true for most compilers. Commented Jul 20, 2018 at 8:58
  • 3
    I think what you're after is the as-if rule. The C compiler will generate code that behaves as if there was that variable, and memory set aside and so on. It might not be what is actually happening but you shouldn't need to care. Commented Jul 20, 2018 at 9:07

2 Answers

8

Both statements are true, at different “levels” in the C standard.

The C standard is written largely in terms of how a C implementation behaves on an imaginary abstract machine. In this model, when a variable is defined (not just declared), memory is reserved for it.
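To make the "defined, not just declared" distinction concrete, here is a minimal sketch. The names counter and next_value are made up for illustration and do not come from the question. It also covers function declarations: a prototype only announces a name and a type, while the definition supplies the body.

```c
/* Declarations: they announce a name and a type, nothing more. */
extern int counter;       /* "an int named counter is defined somewhere" */
int next_value(void);     /* "a function with this signature is defined somewhere" */

/* Definitions: in the abstract model, this is where storage is
 * reserved for counter and where the body of next_value is supplied. */
int counter = 0;

int next_value(void)
{
    counter += 1;
    return counter;
}
```

Called twice in a row, next_value() returns 1 and then 2. Without the two definitions, the declarations alone would compile, but any use of the names would fail at link time.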

However, the C standard says that an actual implementation only needs to produce results as if it followed the abstract model. The standard says that only certain parts of the abstract model must be observable. Most notably, the output of the program is observable.

Because of this rule, a compiler may change the internal parts of a program in any way it wants, as long as the output and other observable behavior remain the same. So, when the compiler sees that you use some variable x in a particular way and that it can get the same result another way without using memory for x, the compiler is allowed to change the program so that there is no actual memory used for x.
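As a rough illustration, consider the two functions below. Both names are hypothetical, and the second function is only one possible result an optimizer could arrive at, not something any particular compiler is guaranteed to produce. For inputs where nothing overflows, the two are indistinguishable from the outside, so a compiler is free to emit code resembling the second when given the first.

```c
/* Written the obvious way: in the abstract model, total, i and x
 * each have memory set aside for them. */
int sum_to(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        int x = i;        /* x "exists" on every iteration */
        total += x;
    }
    return total;
}

/* One form the compiler might reduce it to: no loop, and no
 * memory needed for total, i or x. */
int sum_to_closed_form(int n)
{
    return n > 0 ? n * (n + 1) / 2 : 0;
}
```

Both return 10 for an argument of 4 and 0 for an argument of 0. To see what your own compiler actually does with sum_to, read the generated assembly (for example with the -S option of gcc or clang) at different optimization levels.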

1

I think your first paragraph is fine, and as true as it ever was.

I like to draw pictures like this, with little labeled boxes showing the memory that has been set aside for various variables:

char c = 'A';
int i = 123;
int *ip = &i;

    +---+
 c: | A |
    +---+

    +---------+
 i: |   123   |
    +---------+
         ^
         |
    +----|----+
ip: |    *    |
    +---------+

And then what I think about is making sure the contents of each box are appropriate: the right type, no overflow. For each pointer, I think about whether that little arrow points somewhere valid.
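Here is a small sketch of following one of those arrows. The function name write_through_arrow is made up for this example.

```c
/* Hypothetical helper: follows the arrow from ip to i and writes through it. */
int write_through_arrow(void)
{
    char c = 'A';
    int i = 123;
    int *ip = &i;   /* the arrow: ip holds the address of i */

    *ip = 456;      /* changes the contents of the box labeled i */
    (void)c;        /* c is in the picture but not needed here */
    return i;       /* the box labeled i now holds 456 */
}
```

The arrow is valid here because i is still alive whenever ip is dereferenced. Returning ip itself from the function would leave the arrow pointing at a box that no longer exists.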

If variables are local, they're typically stored in a stack frame. If variables are global, they're typically stored in the data segment. But, you're right, they might get rearranged, so you can't count on one coming just before, or just after, another. (Nor, in a sane or portable program, would you want to, of course.)
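The standard itself does not mention stacks or data segments. It only distinguishes storage durations, which is the difference that shows up in program behavior. A minimal sketch, with made-up names global_hits and count_call:

```c
int global_hits = 0;          /* static storage duration: lives for the whole run */

int count_call(void)
{
    int local = 1;            /* automatic storage duration: a fresh object per call */
    local += 1;               /* always 2, regardless of earlier calls */
    global_hits += local;     /* accumulates across calls */
    return global_hits;
}
```

The first call returns 2 and the second returns 4, because global_hits keeps its value between calls while local starts over each time. Where either object actually lives, or whether local gets any memory at all, is up to the compiler.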

1
  • 1
    I believe the point of the question is that in reality, one, two, or all three of these variables may not exist and never existed either. This model is what a C compiler is supposed to follow, arrows and all – and the only way of observing if it does is by running a program and verifying the outcome. For all we know, in reality the compiler translates your C into Assyrian, uploads it through a pyramid to the stars, and presents you with the return data. As long as the result follows your schematic, you should not care.
    – Jongware
    Commented Jul 20, 2018 at 14:45
