10

Some hyper-modern C compilers will infer that if a program will invoke Undefined Behavior when given certain inputs, such inputs will never be received. Consequently, any code which would be irrelevant unless such inputs are received may be eliminated.

As a simple example, given:

#include <stdint.h>

void foo(uint32_t);

uint32_t rotateleft(uint32_t value, uint32_t amount)
{
  return (value << amount) | (value >> (32-amount));
}

uint32_t blah(uint32_t x, uint32_t y)
{
  if (y != 0) foo(y);
  return rotateleft(x,y);
}

a compiler may infer that because evaluation of value >> (32-amount) will yield Undefined Behavior when amount is zero, function blah will never be called with y equal to zero; the call to foo can thus be made unconditional.
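To make the inference concrete, here is a rough sketch (not actual compiler output; blah_as_optimized is just an illustrative name) of what blah effectively becomes under that reasoning:

#include <stdint.h>

void foo(uint32_t);
uint32_t rotateleft(uint32_t value, uint32_t amount);

/* Sketch of the transformation: having concluded that y can never be
   zero here (the shift in rotateleft would otherwise invoke Undefined
   Behavior), the compiler drops the y != 0 test and calls foo
   unconditionally. */
uint32_t blah_as_optimized(uint32_t x, uint32_t y)
{
  foo(y);
  return rotateleft(x, y);
}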

From what I can tell, this philosophy seems to have caught hold sometime around 2010. The earliest evidence I've seen of its roots goes back to 2009, and it's been enshrined in the C11 standard which explicitly states that if Undefined Behavior occurs at any point in a program's execution, the behavior of the entire program retroactively becomes undefined.

Was the notion that compilers should attempt to use Undefined Behavior to justify reverse-causal optimizations (i.e. the Undefined Behavior in the rotateleft function should cause the compiler to assume that blah must have been called with a non-zero y, whether or not anything would ever cause y to hold a non-zero value) seriously advocated prior to 2009? When was such a thing first seriously proposed as an optimization technique?

[Addendum]

Some compilers have, even in the 20th Century, included options to enable certain kinds of inferences about loops and the values computed therein. For example, given

int i; int total=0;
for (i=n; i>0; i--)
{
  doSomething();
  total += i*1000;
}

a compiler, even without the optional inferences, might rewrite it as:

int i; int total=0; int x1000;
for (i=n, x1000=n*1000; i>0; i--, x1000-=1000)
{
  doSomething();
  total += x1000;
}

since the behavior of that code would precisely match the original, even if the compiler specified that int values always wrap in mod-65536 two's-complement fashion. The additional-inference option would let the compiler recognize that since i and x1000 should cross zero at the same time, the former variable can be eliminated:

int total=0; int x1000;
for (x1000=n*1000; x1000 > 0; x1000-=1000)
{
  doSomething();
  total += x1000;
}

On a system where int values wrapped mod 65536, an attempt to run either of the first two loops with n equal to 33 would result in doSomething() being invoked 33 times. The last loop, by contrast, wouldn't invoke doSomething() at all, even though the first invocation of doSomething() would have preceded any arithmetic overflow. Such a behavior might be considered "non-causal", but the effects are reasonably well constrained, and there are many cases where the behavior would be demonstrably harmless (in cases where a function is required to yield some value when given any input, but the value may be arbitrary if the input is invalid, having the loop finish faster when given an invalid value of n would actually be beneficial). Further, compiler documentation tended to be apologetic about the fact that the option would change the behavior of any programs at all, even programs that engaged in UB.
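For concreteness, a tiny sketch of the arithmetic, using int16_t to model the hypothetical 16-bit int (the narrowing conversion is implementation-defined rather than guaranteed to wrap, though it does wrap on the usual two's-complement implementations):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
  /* With n equal to 33, the third loop's initializer computes
     33 * 1000 == 33000, which does not fit in a 16-bit int and wraps
     to 33000 - 65536 == -32536, so the x1000 > 0 test fails at once. */
  int16_t x1000 = (int16_t)(33 * 1000);
  printf("x1000 = %d\n", x1000);                            /* -32536 */
  printf("loop body runs: %s\n", x1000 > 0 ? "yes" : "no"); /* no     */
  return 0;
}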

I'm interested in when compiler writers' attitudes shifted away from the idea that platforms should, when practical, document some usable behavioral constraints even in cases not mandated by the Standard, toward the idea that any construct relying on behaviors not mandated by the Standard should be branded illegitimate, even if on most existing compilers it would work as well as, or better than, any strictly-compliant code meeting the same requirements (often allowing optimizations which would not be possible in strictly-compliant code).

  • Comments are not for extended discussion; this conversation has been moved to chat.
    – maple_shaft
    Commented Aug 4, 2015 at 1:13
  • 1
    @rwong: I don't see how that even violates the laws of time, much less causality. If illegitimate static casts terminated program execution (hardly an obscure execution model), and if the build model were such that no additional classes could be defined after linking, then no input to the program could cause shape->Is2D() to be invoked on an object that was not derived from Shape2D. There is a huge difference between optimizing out code which would only be relevant if a critical Undefined Behavior has already happened versus code which would only be relevant in cases where...
    – supercat
    Commented Aug 12, 2015 at 15:16
    ...there are no downstream execution paths that don't invoke some form of Undefined Behavior. While the definition of "critical undefined behavior" given in (IIRC) Annex L is a bit vague, an implementation might reasonably implement static casting and virtual dispatch such that the indicated code would jump to an arbitrary address (it's critical UB). For it to always jump to Shape2D::Is2D is actually better than the program deserves.
    – supercat
    Commented Aug 12, 2015 at 15:27
  • 1
    @rwong: As for the duplicate, the intention of my question was to ask when in human history the normative interpretation of Undefined Behavior with regard to e.g. overflow shifted from "If a platform's natural behavior would allow an application to satisfy its requirements without error checking, having the standard mandate any contrary behavior may increase the complexity of the source code, executable code, and compiler code necessary to have a program meet requirements, so the Standard should instead let the platform do what it does best" to "Compilers should assume that programmers...
    – supercat
    Commented Aug 16, 2015 at 20:21
  • 1
    ...will always include code to prevent overflows in any cases where they don't want to launch nuclear missiles." Historically, given the specification "write a function which, given two arguments of type 'int', returns the product if it's representable as 'int', or else either terminates the program or returns an arbitrary number", int prod(int x, int y) {return x*y;} would have sufficed. Complying with "don't launch nukes" in strictly-compliant fashion, however, would require code that's harder to read and would almost certainly run much slower on many platforms.
    – supercat
    Commented Aug 16, 2015 at 20:28

5 Answers

6

Undefined behavior is used in situations where it is not feasible for the spec to specify the behavior, and it has always been written to allow absolutely any behavior possible.

The extremely loose rules for UB are helpful when you think about what a spec-conforming compiler must go through. You may have sufficient compiling horsepower to emit an error when you do some bad UB in one case, but add a few layers of recursion and now the best you can do is a warning. The spec has no concept of "warnings," so if the spec had given a behavior, it would have to be "an error."

The reason we see more and more side effects of this is the push for optimization. Writing a spec-conforming optimizer is hard. Writing a spec-conforming optimizer which also happens to do a remarkably good job guessing what you intended when you went outside the spec is brutal. It is much easier on the compilers if they get to assume UB means UB.

This is especially true for gcc, which tries to support many instruction sets with the same compiler. It is far easier to let UB yield UB behaviors than it is to try to grapple with all the ways every single piece of UB code could go wrong on every platform and factor that into the early phases of the optimizer.

  • 1
    The reason C got its reputation for speed is that programmers who knew that even in some cases where the Standard imposed no requirements the behavior of their platform would satisfy their requirements could make use of such behavior. If a platform guarantees that x-y > z will arbitrarily yield 0 or 1 when x-y is not representable as "int", such a platform will have more optimization opportunities than a platform which requires the expression to be written as either UINT_MAX/2+1+x+y > UINT_MAX/2+1+z or (long long)x+y > z.
    – supercat
    Commented Aug 2, 2015 at 19:27
  • 1
    Prior to roughly 2009, a typical interpretation of many forms of UB would be e.g. "Even if executing a shift-right-by-N instruction on the target CPU could stall the bus for several minutes with interrupt disabled when N is -1 [I've heard of such a CPU], a compiler may pass any N to the shift-right-by-N instruction without validation, or mask N with 31, or arbitrarily choose among those options". In cases where any of the above behaviors would meet requirements, programmers didn't need to waste time (theirs or the machine's) preventing them from occurring. What prompted the change?
    – supercat
    Commented Aug 2, 2015 at 19:40
  • @supercat I would say that, even in 2015, a "typical" interpretation of UB would be similarly benign. The problem is not the typical case of UB, it's the unusual cases. I'd say 95-98% of UB I have accidentally written "worked as intended." It's the remaining 2-5% where you can't understand why the UB got so out of hand until you have spent 3 years digging deep in the bowels of gcc's optimizer to realize that it actually was just as outlandish of a case as they thought... it just looked easy at first.
    – Cort Ammon
    Commented Aug 2, 2015 at 23:47
  • An excellent example I ran into was a One Definition Rule (ODR) violation I wrote by accident, naming two classes the same thing, and getting severe stack bashing when I called one function and it ran the other. It seems reasonable to say "why didn't you just use the nearest class name, or at least warn me," until you understand how the linker resolves naming issues. There simply was no "good" behavior option available unless they put massive amounts of extra code in the compiler to catch it.
    – Cort Ammon
    Commented Aug 2, 2015 at 23:50
  • Linker technology is generally horrible, but there are substantial chicken-and-egg barriers to improving it. Otherwise, what I think is needed is for C to standardize a set of portability/optimization typedefs and macros such that existing compilers which support the behaviors implied thereby could support the new standard merely by adding a header file, and code wishing to use the new features could be used with older compilers that supported the behaviors merely by including a "compatibility" header file, but compilers would then be able to achieve much better optimizations...
    – supercat
    Commented Aug 3, 2015 at 1:34
4

"Undefined behaviour might cause the compiler to rewrite code" has happened for a long time, in loop optimisations.

Take a loop (a and b are pointers to double, for example)

for (i = 0; i < n; ++i) a [i] = b [i];

We increment an int, we copy an array element, we compare with a limit. An optimising compiler first removes the indexing:

double* tmp1 = a;
double* tmp2 = b;
for (i = 0; i < n; ++i) *tmp1++ = *tmp2++;

We remove the case n <= 0:

i = 0;
if (n > 0) {
    double* tmp1 = a;
    double* tmp2 = b;
    for (; i < n; ++i) *tmp1++ = *tmp2++;
}

Now we eliminate the variable i:

i = 0;
if (n > 0) {
    double* tmp1 = a;
    double* tmp2 = b;
    double* limit = tmp1 + n;
    for (; tmp1 != limit; tmp1++, tmp2++) *tmp1 = *tmp2;
    i = n;
}

Now if n = 2^29 on a 32 bit system or 2^61 on a 64 bit system, the computation of tmp1 + n wraps around the address space on typical implementations, so we will have tmp1 == limit and the loop body is never executed. Now replace the assignment with something that takes so long that the original code would never actually reach its inevitable crash, and the compiler has changed the behaviour of the code.
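As a sketch of the 32-bit arithmetic being relied on here (using unsigned arithmetic, which wraps by definition, to model the pointer computation, and assuming sizeof(double) == 8):

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* 2^29 elements of 8 bytes each is 2^32 bytes, i.e. 0 modulo a
       32-bit address space, so limit = tmp1 + n lands back on tmp1 and
       the loop body never executes. */
    uint32_t n = UINT32_C(1) << 29;
    uint32_t offset = (uint32_t)((uint64_t)n * 8u);
    printf("offset mod 2^32 = %" PRIu32 "\n", offset);  /* prints 0 */
    return 0;
}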

  • If neither pointer is volatile, I wouldn't consider that a rewrite of the code. Compilers have for a long time been allowed to resequence writes to non-volatile pointers, so behavior in the case where n is so large that pointers would wrap would be equivalent to having an out-of-bounds store clobber a temporary storage location holding i before anything else happens. If a or b were volatile, the platform documented that volatile accesses generate physical load/store operations in the sequence requested, and it the platform defines any means via which such requests...
    – supercat
    Commented Aug 2, 2015 at 21:06
  • ...could deterministically affect program execution, then I would argue that a compiler should refrain from the indicated optimizations (though I'll readily acknowledge that some might not have been programmed to prevent such optimizations unless i was also made volatile). That would be a pretty rare behavioral corner case, though. If a and b are non-volatile, I'd suggest that there would be no plausible intended meaning for what the code should do if n is so large as to overwrite all of memory. By contrast, many other forms of UB have plausible intended meanings.
    – supercat
    Commented Aug 2, 2015 at 21:12
  • Upon further consideration, I can think of situations where assumptions about the laws of integer arithmetic could cause loops to terminate prematurely before any UB occurs and without the a loop doing any illegitimate address computations. In FORTRAN, there's a clear distinction between loop control variables and other variables, but C lacks that distinction, which is unfortunate since requiring programmers to always prevent overflow on loop control variables would be far less of an impediment to efficient coding than having to prevent overflow anywhere.
    – supercat
    Commented Aug 2, 2015 at 21:56
  • I wonder how language rules could best be written so that code which would be happy with a range of behaviors in case of overflow could say that, without impeding genuinely useful optimizations? There are many cases where code must produce strictly-defined output given valid input, but loosely-defined output when given invalid input. E.g. suppose a programmer who would be inclined to write `if (x-y>z) do_something();` doesn't care whether do_something executes in case of overflow, provided the overflow has no other effect. Is there any way to rewrite the above which will not...
    – supercat
    Commented Aug 2, 2015 at 22:19
  • ...generate needlessly-inefficient code on at least some platforms (compared with what those platforms could generate if given a free choice of whether or not to invoke do_something)? Even if loop optimizations were forbidden from yielding behavior inconsistent with a loose overflow model, programmers could write code in such a way as to allow compilers to generate optimal code. Is there any way to work around inefficiencies compelled by an "avoid overflow at all costs" model?
    – supercat
    Commented Aug 2, 2015 at 22:23
3

It has always been the case in C and C++ that as a result of undefined behaviour, anything can happen. Therefore it has also always been the case that a compiler can make the assumption that your code doesn't invoke undefined behaviour: either there is no undefined behaviour in your code, in which case the assumption was correct, or there is undefined behaviour in your code, in which case whatever happens as a result of the incorrect assumption is covered by "anything can happen".

If you look at the "restrict" feature in C, the whole point of the feature is that the compiler can assume there is no undefined behaviour, so there we reached the point where the compiler not only may but actually should assume there is no undefined behaviour.
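As an illustration, a minimal sketch of the promise restrict makes (the function itself is just an example, not taken from this answer):

/* The restrict qualifiers promise the compiler that dst and src never
   alias. A call that breaks the promise is undefined behaviour, so the
   compiler may reorder or vectorize the copies as though the violation
   never happens. */
void copy4(double *restrict dst, const double *restrict src)
{
    dst[0] = src[0];
    dst[1] = src[1];
    dst[2] = src[2];
    dst[3] = src[3];
}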

In the example that you give, the assembler instructions usually used on x86 based computers to implement left or right shift will shift by 0 bits if the shift count is 32 for 32 bit code or 64 for 64 bit code. This will in most practical cases lead to undesirable results (and results that are not the same as on ARM or PowerPC, for example), so the compiler is quite justified to assume that this kind of undefined behaviour doesn't happen. You could change your code to

uint32_t rotateleft(uint32_t value, uint32_t amount)
{
   return amount == 0 ? value : (value << amount) | (value >> (32-amount));
}

and suggest to the gcc or Clang developers that on most processors the "amount == 0" check should be removed by the compiler, because the assembler code generated for the shift expression already produces the same result as value when amount == 0.
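A related branch-free formulation, sketched here for illustration rather than taken from this answer, masks the shift amount in the way the comments below describe, so that no shift count ever reaches 32:

#include <stdint.h>

/* With the masks, both shift counts stay in the range 0..31; for
   amount == 0 both terms reduce to value, so the result is still
   correct. Compilers commonly recognize such a pattern and can emit a
   single rotate instruction. */
static uint32_t rotateleft_masked(uint32_t value, uint32_t amount)
{
    amount &= 31;
    return (value << amount) | (value >> ((32 - amount) & 31));
}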

  • 1
    I can think of very few platforms where an implementation of x>>y [for unsigned x] that would work when variable y held any value from 0 to 31, and did something other than yield 0 or x>>(y & 31) for other values, could be as efficient as one which did something else; I know of no platform where guaranteeing that no action other than one of the above would occur would add significant cost. The idea that programmers should use some more complicated formulation in code that would never have to be run on obscure machines would have been viewed as absurd.
    – supercat
    Commented Aug 2, 2015 at 19:07
  • 3
    My question is when did things shift from "x>>32 might arbitrarily yield x or 0, or might trap on some obscure platforms" to "x>>32 might cause the compiler to rewrite the meaning of other code"? The earliest evidence I can find is from 2009, but I'm curious if earlier evidence exists.
    – supercat
    Commented Aug 2, 2015 at 19:09
  • @Deduplicator: My replacement works perfectly fine for values that make sense, that is 0 <= amount < 32. And whether your (value >> 32 - amount) is correct or not, I don't care, but leaving the parentheses out is as bad as a bug.
    – gnasher729
    Commented Aug 2, 2015 at 19:53
  • Didn't see the restriction to 0<=amount && amount<32. Whether bigger/smaller values make sense? I thought whether they do is part of the question. And not using parentheses in the face of bit-ops is probably a bad idea, sure, but certainly not a bug. Commented Aug 2, 2015 at 21:19
  • @Deduplicator this is because some CPUs natively implement the shift operator as (y mod 32) for 32-bit x and (y mod 64) for 64-bit x. Note that it is relatively easy to emit code that will achieve uniform behavior across all CPU architectures - by masking the shift amount. This usually takes one extra instruction. But alas ...
    – rwong
    Commented Aug 6, 2015 at 2:05
0

"and it's been enshrined in the C11 standard which explicitly states that if Undefined Behavior occurs at any point in a program's execution, the behavior of the entire program retroactively becomes undefined." This is not correct. The ISO C standard never said anything like this, and such an interpretation is not consistent with the wording in the standard. There is also a new note in C23 that explicitly makes clear that this is not the case.

  • Which statements are you saying are correct or incorrect, and what is changing?
    – supercat
    Commented Jul 1 at 15:14
-1

This is because there is a bug in your code:

uint32_t blah(uint32_t x, uint32_t y)
{
    if (y != 0) 
    {
        foo(y);
        return x; ////// missing return here //////
    }
    return rotateleft(x, y);
}

In other words, the compiler only jumps the causality barrier if it sees that, given certain inputs, you are invoking undefined behavior beyond doubt.

By returning right before the invocation of undefined behavior, you tell the compiler that you are consciously preventing that undefined behavior from executing, and the compiler acknowledges that.

In other words, when you have a compiler that tries to enforce the specification in a very strict way, you have to implement every possible argument validation in your code. Furthermore, these validations must occur before the invocation of said undefined behavior.

Wait! And there is more!

Now, with compilers doing these super-crazy yet super-logical things, it is imperative that you tell the compiler when a function isn't supposed to return. Thus, the noreturn keyword on the foo() function now becomes mandatory.
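For reference, a minimal sketch of how such a declaration could be spelled in C11 (assuming foo really is meant never to return):

#include <stdint.h>
#include <stdnoreturn.h>

noreturn void foo(uint32_t);  /* promises the compiler that foo never returns */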

  • 4
    Sorry, but that's not the question. Commented Aug 1, 2015 at 23:34
  • @Deduplicator Had that not been the question, this question should have been closed for being opinion-based (open debate). Effectively it is second-guessing the motivations of the C and C++ committees and that of the compiler vendor/writers.
    – rwong
    Commented Aug 1, 2015 at 23:37
  • 1
    It's a question about historic interpretation, use and rationale of UB in the C language, not restricted to the standard and the committee. Second-guessing the committees? Not so. Commented Aug 1, 2015 at 23:42
  • 1
    ...recognize that the latter should only take one instruction. I don't know if the hyper-modernists are desperately trying to make C competitive in fields where it's been replaced, but they're undermining its utility in the only field where it's king, and are from what I can tell making useful optimizations harder rather than easier.
    – supercat
    Commented Aug 1, 2015 at 23:55
  • 1
    @ago: Actually, on a system with 24 bit int the first would succeed and the second would fail. In embedded systems, anything goes. Or a lot goes; I can remember one where int = 32 bit, and long = 40 bit which is the only system I ever saw where long is a lot less efficient than int32_t.
    – gnasher729
    Commented Aug 2, 2015 at 9:32
