
For the binary operators AND and OR we have both a bitwise and a logical variant:

& bitwise AND
| bitwise OR

&& logical AND
|| logical OR

NOT (a unary operator) behaves differently though. There is ~ for bitwise and ! for logical.

I recognize that NOT is a unary operation, unlike AND and OR, but I cannot think of a reason why the designers deviated here from the principle that single is bitwise and double is logical, and went for a different character instead. I suppose ~~ could be misread as a double bitwise operation that would always return the operand value, but that does not seem like a real problem to me.
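
For example, on a typical two's-complement machine:

    #include <stdio.h>

    int main(void) {
        int x = 42;

        printf("%d\n", ~x);   /* bitwise NOT flips every bit: prints -43           */
        printf("%d\n", !x);   /* logical NOT: 42 is truthy, so this prints 0       */
        printf("%d\n", ~~x);  /* double bitwise NOT flips the bits twice: 42 again */
        printf("%d\n", !!x);  /* double logical NOT normalizes to 0 or 1: prints 1 */

        return 0;
    }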

Is there a reason I am missing?

  • Because if !! meant logical not, how would I turn 42 into 1? :) Commented Sep 30, 2019 at 6:15
  • Would ~~ then not have been more consistent for logical NOT, if you follow the pattern that the logical operator is a doubling of the bitwise operator? Commented Sep 30, 2019 at 6:24
  • First, if it was for consistency it would have been ~ and ~~. The doubling of and and or is associated with the short circuit, and the logical not doesn’t have a short circuit.
    – Christophe
    Commented Sep 30, 2019 at 6:55
  • I suspect the underlying design reason is visual clarity and distinction, in the typical use cases. The binary (that is, two-operand) operators are infix (and tend to be separated by spaces), whereas the unary operators are prefix (and tend not to be spaced).
    – Steve
    Commented Sep 30, 2019 at 7:01
  • As some comments have already alluded to, !!foo is a not-uncommon (not not common?) idiom. It normalizes a zero-or-nonzero argument to 0 or 1. Commented Oct 1, 2019 at 1:02

3 Answers

Answer 1 (113 votes)

Strangely, the history of C-style programming languages doesn’t start with C.

Dennis Ritchie explains the challenges of C’s birth well in his article The Development of the C Language.

Reading it, it becomes obvious that C inherited part of its language design, and especially its operators, from its predecessor BCPL. The section “Neonatal C” of the aforementioned article explains how BCPL’s & and | were enriched with two new operators, && and ||. The reasons were:

  • a different precedence was needed due to their use in combination with ==
  • different evaluation logic: left-to-right evaluation with short-circuiting (i.e., when a is false in a && b, b is not evaluated); both points are shown in the sketch below.
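
A small sketch of both reasons in today's C (the awkward precedence of & relative to == survives to this day, and && short-circuits):

    #include <stdio.h>

    int main(void) {
        int flags = 4;
        int i = 0, k = 10;

        /* Precedence: == binds more tightly than &, so this parses as
           flags & (4 == 0), i.e. flags & 0, which is always false. */
        if (flags & 4 == 0)
            printf("never printed\n");

        /* && sits below == in the precedence table, so the intended
           reading needs no extra parentheses. */
        if (flags != 0 && flags % 4 == 0)
            printf("flags is a nonzero multiple of 4\n");

        /* Short-circuit: when i is 0 the right operand is never evaluated,
           so there is no division by zero. */
        if (i != 0 && k / i > 2)
            printf("skipped entirely when i == 0\n");

        return 0;
    }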

Interestingly, this doubling does not create any ambiguity for the reader: a && b will not be misinterpreted as a & (&b). From a parsing point of view, there is no ambiguity either: &b could make sense if b were an lvalue, but it would be a pointer whereas the bitwise & would require an integer operand, so the logical AND would be the only reasonable choice.

BCPL already used ~ for bitwise negation. So from a consistency point of view, it could have been doubled to ~~ to give it a logical meaning. Unfortunately this would have been extremely ambiguous, since ~ is a unary operator: ~~b could just as well mean ~(~b). This is why another symbol had to be chosen for the missing negation.
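
A quick sketch of the contrast in today's C:

    #include <stdio.h>

    int main(void) {
        int a = 1, b = 2;

        /* Doubling & is harmless: the alternative reading, a & (&b), would apply
           bitwise & to an int and an int*, which does not compile, so a && b is
           the only sensible interpretation. */
        printf("%d\n", a && b);     /* prints 1 */

        /* Doubling ~ is another story: ~(~b) is already a perfectly valid
           expression that simply gives b back, so a ~~ meaning "logical NOT"
           would collide with an existing meaning. */
        printf("%d\n", ~(~b));      /* prints 2 */

        return 0;
    }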

  • The parser is unable to disambiguate the two situations, therefore the language designers must do so. Commented Sep 30, 2019 at 14:03
  • @Steve: Indeed, there are many similar problems already in C and C-like languages. When the parser sees (t)+1, is that an addition of (t) and 1 or is it a cast of +1 to type t? C++ design had to solve the problem of how to lex templates containing >> correctly. And so on. Commented Sep 30, 2019 at 18:25
  • @user2357112 I think the point is that it's okay to have the tokenizer blindly take && as a single && token and not as two & tokens, because the a & (&b) interpretation isn't a reasonable thing to write, so a human would never have meant that and been surprised by the compiler treating it as a && b. Whereas both !(!a) and !!a are possible things for a human to mean, so it's a bad idea for the compiler to resolve the ambiguity with an arbitrary tokenization-level rule.
    – Ben
    Commented Oct 1, 2019 at 5:12
  • !! is not only possible/reasonable to write, but the canonical "convert to boolean" idiom. Commented Oct 1, 2019 at 13:50
  • I think dan04 is referring to the ambiguity of --a vs -(-a), both of which are valid syntactically but have different semantics.
    – Ruslan
    Commented Oct 1, 2019 at 21:08
Answer 2 (50 votes)

I cannot think of a reason why the designers chose to deviate from the principle that single is bitwise and double is logical here,

That's not the principle in the first place; once you realize that, it makes more sense.

The better way to think of & vs && is not binary and Boolean. The better way is to think of them as eager and lazy. The & operator executes the left and right side and then computes the result. The && operator executes the left side, and then executes the right side only if necessary to compute the result.
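
A minimal sketch of the difference; the right_side() helper is made up purely so its side effect shows whether the operand ran:

    #include <stdio.h>

    /* A right-hand operand with a visible side effect, so we can tell
       whether it was evaluated. */
    int right_side(void) {
        puts("right side evaluated");
        return 1;
    }

    int main(void) {
        puts("eager &:");
        if (0 & right_side())    /* both operands are evaluated; the message appears */
            puts("taken");

        puts("lazy &&:");
        if (0 && right_side())   /* the left operand is 0, so right_side() never runs */
            puts("taken");

        return 0;
    }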

Moreover, instead of thinking about "binary" and "Boolean", think about what is really happening. The "binary" version is just doing the Boolean operation on an array of Booleans that has been packed into a word.

So let's put it together. Does it make any sense to do a lazy operation on an array of Booleans? No, because there is no "left side" to check first. There are 32 "left sides" to check first. So we restrict the lazy operations to a single Boolean, and that's where your intuition that one of them is "binary" and one is "Boolean" comes from, but that is a consequence of the design, not the design itself!
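
In those terms, a single bitwise & is really many Boolean ANDs happening in parallel; a rough sketch:

    #include <stdio.h>

    int main(void) {
        /* Two "arrays of Booleans" packed into machine words, one flag per bit. */
        unsigned a = 0xCu;   /* binary 1100 */
        unsigned b = 0xAu;   /* binary 1010 */

        /* A single & performs one AND per bit position, all at once:
           1100 & 1010 == 1000. There is no single "left side" to test first,
           so a lazy variant of this operation would not make sense. */
        printf("0x%X\n", a & b);   /* prints 0x8 */

        return 0;
    }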

And when you think of it that way, it becomes clear why there is no !! and no ^^. Neither of those operators has the property that you can skip analyzing one of the operands; there is no "lazy" not or xor.

Other languages make this more clear; some languages use "and" to mean "eager and" but "and also" to mean "lazy and", for instance. And other languages also make it clearer that & and && are not "binary" and "Boolean"; in C#, for instance, both versions can take Booleans as operands.

  • Thank you. This is the real eye opener for me. Too bad I cannot accept two answers. Commented Sep 30, 2019 at 18:49
  • I don't think this is a good way to think of & and &&. While eagerness is one of the differences between & and &&, & behaves completely differently from an eager version of &&, particularly in languages where && supports types other than a dedicated boolean type. Commented Oct 1, 2019 at 1:51
  • For example, in C and C++, 1 & 2 has a completely different result from 1 && 2. Commented Oct 1, 2019 at 2:37
  • @ZizyArcher: As I noted in the comment above, the decision to omit a bool type in C has knock-on effects. We need both ! and ~ because one means "treat an int as a single Boolean" and one means "treat an int as a packed array of Booleans". If you have separate bool and int types then you can have just one operator, which in my opinion would have been the better design, but we're almost 50 years late on that one. C# preserves this design for familiarity. Commented Oct 1, 2019 at 13:12
  • @Steve: If the answer seems absurd then I have made a poorly expressed argument somewhere, and we ought not to rely on an argument from authority. Can you say more about what seems absurd about it? Commented Oct 1, 2019 at 13:13
Answer 3 (23 votes)

TL;DR

C inherited the ! and ~ operators from another language. Both && and || were added years later by a different person.

Long Answer

Historically, C developed out of the early language B, which was based on BCPL, which was based on CPL, which was based on Algol.

Algol, the great-granddaddy of C++, Java and C#, defined true and false in a way that came to feel intuitive to programmers: “truth values which, regarded as a binary number (true corresponding to 1 and false to 0), is the same as the intrinsic integral value”. However, one disadvantage of this is that logical and bitwise not cannot be the same operation: on any modern computer, ~0 equals -1 rather than 1, and ~1 equals -2 rather than 0. (Even on a sixty-year-old mainframe where ~0 represents -0 or INT_MIN, ~0 != 1 on every CPU ever made; the C standard has required this for many years, and most of C’s daughter languages don’t even bother to support sign-and-magnitude or one’s-complement at all.)
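
Concretely, on any two's-complement machine:

    #include <stdio.h>

    int main(void) {
        printf("~0 == %d\n", ~0);   /* -1: flipping every bit of 0 gives all ones */
        printf("~1 == %d\n", ~1);   /* -2 */
        printf("!0 == %d\n", !0);   /*  1 */
        printf("!1 == %d\n", !1);   /*  0 */
        return 0;
    }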

Algol worked around this by having different modes and interpreting operators differently in boolean and integral mode. That is, a bitwise operation was one on integer types, and a logical operation was one on boolean types.

BCPL had a separate boolean type, but a single not operator, for both bitwise and logical not. The way this early forerunner of C made that work was:

The Rvalue of true is a bit pattern entirely composed of ones; the Rvalue of false is zero.

Note that true = ~ false

(You’ll observe that the term rvalue has evolved to mean something completely different in C-family languages. We would today call that “the object representation” in C.)

This definition would allow logical and bitwise not to use the same machine-language instruction. If C had gone that route, header files the world over would say #define TRUE -1.
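
A sketch of what that route would have looked like; the BCPL_TRUE and BCPL_FALSE macros are hypothetical, just to mimic BCPL's convention in C:

    #include <stdio.h>

    /* Hypothetical BCPL-flavoured definitions: truth is "all bits set". */
    #define BCPL_TRUE  (-1)
    #define BCPL_FALSE 0

    int main(void) {
        /* With this representation, one machine instruction serves both roles:
           bitwise NOT of all-zeros is all-ones, and vice versa. */
        printf("%d\n", ~BCPL_FALSE == BCPL_TRUE);   /* prints 1 */
        printf("%d\n", ~BCPL_TRUE  == BCPL_FALSE);  /* prints 1 */
        return 0;
    }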

But the B programming language was weakly-typed, and had no boolean or even floating-point types. Everything was the equivalent of int in its successor, C. This made it a good idea for the language to define what happened when a program used a value other than true or false as a logical value. It first defined a truthy expression as “not equal to zero.” This was efficient on the minicomputers on which it ran, which had a CPU zero flag.

There was, at the time, an alternative: the same CPUs also had a negative flag, and BCPL’s truth value was -1, so B might have instead defined all negative numbers as truthy and all non-negative numbers as falsy. (There is one remnant of this approach: UNIX, developed by the same people at the same time, defines all error codes as negative integers. Many of its system calls return one of several different negative values on failure.) So be thankful: it could have been worse!

But defining TRUE as 1 and FALSE as 0 in B meant that the identity true = ~ false no longer held, and it had dropped the strong typing that allowed Algol to disambiguate between bitwise and logical expressions. That required a new logical-not operator, and the designers picked !, possibly because not-equal-to was already !=, which looks sort of like a vertical bar through an equal sign. They didn’t follow the same convention as && or || because neither one yet existed.

Arguably, they should have: the & operator in B is broken as designed. In B and in C, 1 & 2 == FALSE even though 1 and 2 are both truthy values, and there is no intuitive way to express the logical operation in B. That was one mistake C tried to partly rectify by adding && and ||, but the main concern at the time was to finally get short-circuiting to work, and make programs run faster. The proof of this is that there is no ^^: 1 ^ 2 is a truthy value even though both its operands are truthy, but it cannot benefit from short-circuiting.
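
The mismatch is easy to see in today's C:

    #include <stdio.h>

    int main(void) {
        /* Both operands are truthy, yet their bit patterns (01 and 10) share
           no set bits, so the bitwise AND comes out false. */
        printf("1 & 2  == %d\n", 1 & 2);    /* 0 */
        printf("1 && 2 == %d\n", 1 && 2);   /* 1 */

        /* The same mismatch exists for exclusive or: a logical XOR of two truthy
           values ought to be false, but the bitwise result is truthy. Unlike AND
           and OR, though, XOR cannot short-circuit, so no ^^ was ever added. */
        printf("1 ^ 2  == %d\n", 1 ^ 2);    /* 3, i.e. truthy */

        return 0;
    }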

  • +1. I think this is a pretty good guided tour around the evolution of these operators.
    – Steve
    Commented Oct 1, 2019 at 19:37
  • BTW, sign/magnitude and one's complement machines also need separate bitwise vs. logical negation, even if the input is already booleanized. ~0 (all bits set) is one's complement negative zero (or a trap representation). Sign/magnitude ~0 is a negative number with maximum magnitude. Commented Oct 3, 2019 at 10:51
  • @PeterCordes You’re absolutely right. I was just focusing on two’s-complement machines because they’re a lot more important. Maybe it’s worth a footnote.
    – Davislor
    Commented Oct 3, 2019 at 13:05
  • I think my comment is sufficient, but yeah, maybe a parenthetical (doesn't work for 1's complement or sign/magnitude either) would be a good edit. Commented Oct 3, 2019 at 13:15
  • According to K&R's explanations, the main purpose of short-circuiting is not to make code faster, but to allow conditions that avoid hard failures (e.g. if (i!=0 && k/i>x) will never lead to a division by 0, which would not be guaranteed without short circuits; same for p==NULL || *p=='\0'). Also, C before the 1978 K&R was not yet C but a development version of it, wasn't it? Finally, I do not get from your explanation why the designers chose ! and not ~~ (which was in fact the question we had to answer). You just say they picked the first. But why?
    – Christophe
    Commented Feb 28, 2020 at 0:15
