17
$\begingroup$

C/C++ use = for assignment and == for equality comparison. For example, a snippet in C++ can look like this:

int nuclear_code; 
nuclear_code = 1111; // assign nuclear_code  with the value 1111

if(nuclear_code == 1234) 
/* compare nuclear_code with the value 1234
** This will return false because 1111 != 1234
*/
{
    // This will never be executed.
    std::cout << "Nuclear fired!"; 
}

But this can lead to unwanted bugs when programmers accidentally make a typo. Consider this example:

int nuclear_code = 1111; 
if(nuclear_code = 1234) 
/* Oops, instead of ==, this is = now!
** This will assign nuclear_code with the value 1234
** and it will return the lvalue referring to the left-hand operand, which is 1234
** This is now true.
*/
{
    // This code will be run.
    std::cout << "Nuclear fired!"; 
}

This is a (very) common typo in C/C++, partly because the two operators are so similar.

So what are the alternatives?

$\endgroup$
6
  • 4
    $\begingroup$ Whatever syntax you choose for these, you can prevent this exact mistake by not having assignment expressions at all. nuclear_code = 1234; can be a statement without nuclear_code = 1234 needing to be allowed as an expression; Python did this before eventually relenting and adding the assignment operator :=. Another issue that works against C in this regard is the coercion to boolean; in Java, if(nuclear_code = 1234) would be a type error because the condition must be a boolean, not an int. $\endgroup$
    – kaya3
    Commented May 21, 2023 at 10:27
  • $\begingroup$ @kaya3 while that is true, the C/C++ way of doing it is a separate question in itself. I think that this still needs to be asked. $\endgroup$ Commented May 21, 2023 at 10:48
  • $\begingroup$ Yes, nothing wrong with the question. Just pointing out that there are more approaches to addressing this issue than just the choice of syntax for the two operators ─ and indeed, some languages use the same syntax = for both things because the context determines whether it's a name binding or an equality check. $\endgroup$
    – kaya3
    Commented May 21, 2023 at 12:02
  • $\begingroup$ so are you asking particularly about design strategies to prevent these kinds of mistakes? Because I don't see that represented in your title, and if you are (asking about that), it should be (represented in your title) $\endgroup$
    – starball
    Commented May 21, 2023 at 19:03
  • 3
    $\begingroup$ Not syntax, but in Swift assignment expressions are of type void, so this error can't happen either. $\endgroup$
    – chrysante
    Commented May 21, 2023 at 21:36

17 Answers 17

17
$\begingroup$

Just throw Unicode symbols at it (the APL way)

APL uses ← for assignment and =/≡ for equality (= is term-wise, ≡ is for the whole array). If you can afford to just throw Unicode at the problem, sometimes it's a fun solution :) It really depends on how the language is meant to be written, though.

$\endgroup$
8
  • 1
    $\begingroup$ Unicode is not the only character set with those symbols. Some calculators' BASIC dialects use → as assignment too, from their limited character set. $\endgroup$
    – Longinus
    Commented May 22, 2023 at 1:52
  • 1
    $\begingroup$ Some dialects of Smalltalk use _ for assignment, because early versions of ASCII had an arrow at that codepoint. $\endgroup$
    – Bbrk24
    Commented May 22, 2023 at 4:16
  • $\begingroup$ @Longinus sure, but nowadays SBCS's are basically just Unicode mappings. I'd assume modern implementations of those BASIC's just read the (potentially sbcs-encoded) file and then display it as Unicode. $\endgroup$
    – RubenVerg
    Commented May 22, 2023 at 5:03
  • $\begingroup$ I can't really tell if this answer is serious. If it is, how do you expect users to type these symbols? $\endgroup$
    – chrysante
    Commented May 27, 2023 at 20:29
  • 1
    $\begingroup$ @chrysante It is completely serious. In Dyalog APL, you'd hit your APL key (backtick by default) and then [. $\endgroup$
    – RubenVerg
    Commented May 27, 2023 at 21:27
13
$\begingroup$

Distinguish Assignment and Equality by Context

Some languages use = for both assignment and equality. The way the language differentiates the two use cases is by constraining the grammar so it's obvious which context is meant.

For instance, in BASIC, there is no such thing as an "expression statement", so if you write something like x=1 on a line by itself, the only possible interpretation is that you are assigning 1 to x. The alternative interpretation of "check if x is equal to 1 and discard the result" isn't an allowed interpretation in the language. This is in contrast to a language like C where both interpretations are legal (and thus, the compiler needs an additional symbol to disambiguate the two interpretations). Some languages also use a keyword like let or set to begin an assignment statement which can also help distinguish the two.

The advantage of having only one operator for both operations is that it simplifies the language. It also means that the compiler will never misinterpret your program because of a careless error. On the other hand, it is less expressive than languages that separate out the two operations. For instance, in C++, you can assign to variables within a condition or loop. In a language like BASIC, you cannot do this which may lead to more verbose code in some cases.
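
To make the idea concrete, here is a minimal sketch of a toy BASIC-like line interpreter in C++. The grammar, the function names and the Env type are all made up for illustration; the only point is that the parser decides what = means from the position it appears in, so one token can serve both purposes.

```cpp
#include <map>
#include <sstream>
#include <string>

// Toy grammar (hypothetical), tokens separated by spaces:
//   NAME = NUMBER
//   IF NAME = NUMBER THEN NAME = NUMBER
using Env = std::map<std::string, int>;

// Statement position: "=" can only mean assignment.
bool run_assignment(std::istringstream& in, Env& env) {
    std::string name, eq;
    int value;
    if (!(in >> name >> eq >> value) || eq != "=") return false;
    env[name] = value;
    return true;
}

// Condition position (right after IF): "=" can only mean comparison.
bool eval_condition(std::istringstream& in, const Env& env, bool& result) {
    std::string name, eq;
    int value;
    if (!(in >> name >> eq >> value) || eq != "=") return false;
    auto it = env.find(name);
    int current = (it != env.end()) ? it->second : 0;  // unset variables read as 0
    result = (current == value);
    return true;
}

// Returns false on a syntax error, true otherwise.
bool run_line(const std::string& line, Env& env) {
    if (line.rfind("IF ", 0) == 0) {  // the line starts with the IF keyword
        std::istringstream in(line.substr(3));
        bool cond = false;
        std::string then;
        if (!eval_condition(in, env, cond) || !(in >> then) || then != "THEN") return false;
        return cond ? run_assignment(in, env) : true;
    }
    std::istringstream in(line);
    return run_assignment(in, env);
}
```

With this grammar, the line IF X = 5 THEN Y = 9 can never overwrite X: the first = is parsed by eval_condition and the second by run_assignment.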

$\endgroup$
12
$\begingroup$

Even though the OP specifically asks for different syntax to prevent this kind of error, I would make the case that a better approach is a semantic solution:

The reason why this error can occur in C++ (and in C for that matter) is that an assignment expression evaluates to a reference to the assigned variable (int& in this case) and that int is implicitly convertible to bool (*). By changing either of these properties we can prevent this error.


If we forbid implicit narrowing conversions or just the implicit int -> bool conversion we are already fine. This however prevents the following common C idiom:

int status = my_api_call(/* ... */);
if (status) { /* Handle error */ }

But I would argue that this alternative code is easier to understand and just as simple to write:

int status = my_api_call(/* ... */);
if (status != 0) { /* Handle error */ }

If we evaluate assignment expressions not to a reference to the assigned value but to void we also solve the problem, but this time we prevent this idiom:

a = b = c; // Assign the value of c to a and b

However this is rarely used and confusing if not seen before, and the alternative

a = c;
b = c;

is not significantly more verbose (and much clearer), so I would argue that losing this idiom is also not much of a loss if not an improvement.


So in conclusion I argue that while either of the approaches solves the problem, both of them are reasonable in their own right and can prevent other possible code smells.

Also now you are free to choose whatever syntax you like without the restriction of having to solve this problem.

(*) On top of that, in clauses of if statements, also explicit conversions to bool are considered in C++. This rule exists so class authors can make conversions to bool explicit, which prevents implicit conversions from the custom type to integral types (via class X -> bool -> int) while still allowing the class to be used as a condition in if statements, but in our case it makes matters worse.
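
Both semantic changes can be approximated in today's C++ with a wrapper type. The sketch below assumes a hypothetical class StrictInt (not part of any library): its assignment operator returns void and it has no conversion to bool, so the typo from the question no longer compiles.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical wrapper, for illustration only.
class StrictInt {
public:
    explicit StrictInt(int v) : value_(v) {}

    // Assignment yields void, so it cannot be used as a condition or chained.
    void operator=(int v) { value_ = v; }

    // Comparisons yield bool, which is what an if condition expects.
    friend bool operator==(const StrictInt& lhs, int rhs) { return lhs.value_ == rhs; }
    friend bool operator!=(const StrictInt& lhs, int rhs) { return lhs.value_ != rhs; }

    int get() const { return value_; }

private:
    int value_;
};

// The assignment expression has type void ...
static_assert(std::is_void_v<decltype(std::declval<StrictInt&>() = 1234)>);
// ... and the wrapper cannot be converted to bool, even explicitly.
static_assert(!std::is_constructible_v<bool, StrictInt>);

bool launch_authorized(StrictInt code) {
    // if (code = 1234) { ... }   // would not compile: the condition has type void
    return code == 1234;
}
```

The cost is the one described above: a = b = c no longer works for this type, and if (code) has to be written as if (code != 0).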

$\endgroup$
2
  • 1
    $\begingroup$ An alternative is only forbidding raw assignment in a boolean context. Extra-parentheses are cheap enough. Also, if a, b and/or c are long and/or expressions with side-effects, the rewrite will be quite cumbersome and not actually that trivial. $\endgroup$ Commented Jul 3, 2023 at 15:44
  • $\begingroup$ @Deduplicator If a, b or c are complex expressions, you can store them to temporaries. That's what the compiler does under the hood if you write a nested assignment expression anyhow. I agree that forbidding raw assignment in if-conditions is also a solution, but that does not prevent the error in other boolean expressions. $\endgroup$
    – chrysante
    Commented Jul 3, 2023 at 16:38
11
$\begingroup$

The Pascal way

i.e. using := for assignment and = for equality check.

The example in the question can be written in Pascal as:

program Hello;
var nuclear_code : integer;
begin
  nuclear_code := 1111;
  if(nuclear_code = 1234) then writeln('Nuclear fired!');
end.

This makes it much harder to mistype one for the other.

$\endgroup$
3
  • 2
    $\begingroup$ Just as a note about this, := was chosen because it looked a bit like the left-arrow symbol in mathematics, ⟸, but wouldn't be confused for "less than or equal to". Prolog uses :- for a similar reason. $\endgroup$
    – Pseudonym
    Commented May 22, 2023 at 1:02
  • 5
    $\begingroup$ @Pseudonym I believe the := symbol exists in mathematics as well, and has the meaning of introducing the definition for a new variable. The double arrows, on the other hand, I've never seen used for anything besides implication. Might be a case of poorly internationalized syntax, though) $\endgroup$
    – abel1502
    Commented May 27, 2023 at 16:26
  • $\begingroup$ That's what I am doing in my programming language, AEC. $\endgroup$ Commented Jul 10, 2023 at 18:47
10
$\begingroup$

The GNU solution to the if(x = y) {} problem is to require double parenthesisation where only if((x = y)) {} is unambiguous.
That said, to mean assignment, the =, :=, <- and ← operators are common across many languages descending from C, Pascal and BASIC.
Vale uses a set keyword to mean reassignment.

In languages where the concept of "(re)assignment" does not exist, like most Functional and Logical ones, it is not uncommon to see = be used both for bindings and as the equality binary operator, along with ==.
For instance, in a language lacking builtin mutation support, this code is unambiguous:

main =
    let x = 42 in
    putStrLn (if x = 42 then "h" else "Ü")

Some English-oriented languages, like CoffeeScript, also use an is keyword.

For completeness, !=, /=, <>, and isnt are also common as inequality operators, all used in various languages.

$\endgroup$
0
7
$\begingroup$

Common Lisp

Most of the time when you need a variable, you'll use let.

(let ((x 10))
    ;; Now I can use x in this scope, and it has a value of 10
    ...)

There's also setf.

(setf *x* 10)

Equality checking looks like one of the following (depending on what exactly you want—there are more equality operators).

(= x 10)
(eql x 10)
(equalp x 10)

It's really hard to accidentally mix the two up.

$\endgroup$
0
7
$\begingroup$

Use <-.

Use <- for assignment and = for equality check.

Like this:

a <- 5
b <- 5

if a = b:
    . . .

PS: like APL, but without Unicode.

$\endgroup$
3
  • 4
    $\begingroup$ Or go full R and also allow -> for right assignment, e.g. f() -> x $\endgroup$ Commented Jul 2, 2023 at 12:11
  • 1
    $\begingroup$ But is if a <- 5 an assignment, or a lesser-than comparison between a and -5? I think this particular notation just moves the problem, not prevent it, unless you require spaces around binary operators. $\endgroup$
    – G. Sliepen
    Commented Jul 12, 2023 at 14:53
  • $\begingroup$ @G.Sliepen Tokenisation should take care of that, the same way ++x in C is a prefix incrementation, and neither two unary plus operations, nor one nor two additions. (not saying it's ideal for reading code in all langs, but it works) $\endgroup$
    – Longinus
    Commented Nov 9, 2023 at 16:03
6
$\begingroup$

Use :, like JSON

Assignment and comparison have different purposes, return values, and allowed syntactic locations. Therefore, they should be represented with bigger differences than a single extra character.

The use of : for "assignments" is already fairly common and should be familiar to users:

  • JSON/object notation: uses colon to bind a value to a key.
  • English: like this list, where a colon associates a value or comment with an entry.

This gives you a one-character solution for assignment, leaves = free for equality, and should be immediately clear to readers.

$\endgroup$
1
  • $\begingroup$ : is used for assignment in K and Q. $\endgroup$
    – Adám
    Commented Oct 11, 2023 at 21:51
4
$\begingroup$

It partly depends on the semantics of variables and the semantics of equality.

Some languages distinguish between intensional and extensional equality. One example of this might be a set of integers represented as a binary search tree. Two sets are intensionally equal if the trees have the same structure, but extensionally equal if the numbers in the sets are the same.
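
As a rough illustration of the binary search tree example, here is a sketch in C++. The Node type and all function names are hypothetical; same_structure plays the role of intensional equality and same_elements the role of extensional equality.

```cpp
#include <memory>
#include <utility>
#include <vector>

// A set of integers stored as an unbalanced binary search tree.
struct Node {
    int key;
    std::unique_ptr<Node> left, right;
    explicit Node(int k) : key(k) {}
};

std::unique_ptr<Node> insert(std::unique_ptr<Node> root, int key) {
    if (!root) return std::make_unique<Node>(key);
    if (key < root->key) root->left = insert(std::move(root->left), key);
    else if (key > root->key) root->right = insert(std::move(root->right), key);
    return root;  // duplicates are ignored
}

// Build a tree by inserting the keys in the given order.
std::unique_ptr<Node> build(const std::vector<int>& keys) {
    std::unique_ptr<Node> root;
    for (int k : keys) root = insert(std::move(root), k);
    return root;
}

// Intensional: same keys arranged in the same tree shape.
bool same_structure(const Node* a, const Node* b) {
    if (!a || !b) return a == b;
    return a->key == b->key
        && same_structure(a->left.get(), b->left.get())
        && same_structure(a->right.get(), b->right.get());
}

// In-order traversal appends the keys in ascending order.
void collect(const Node* n, std::vector<int>& out) {
    if (!n) return;
    collect(n->left.get(), out);
    out.push_back(n->key);
    collect(n->right.get(), out);
}

// Extensional: same elements, whatever the shape.
bool same_elements(const Node* a, const Node* b) {
    std::vector<int> xs, ys;
    collect(a, xs);
    collect(b, ys);
    return xs == ys;
}
```

Inserting 2, 1, 3 and inserting 1, 2, 3 produce trees that hold the same set but have different shapes, so the two functions disagree on them.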

Yet other languages have different notions of equality testing that depends on whether or not they will coerce types. Does 3 == 3.0 in a dynamically typed language? Sometimes you want that and sometimes you don't.

Other languages have object identity, which is different from equality.

This is a lot of symbols for equality testing that you might need to invent! Common Lisp has four of them.

Logic languages have only one symbol =, and it means unification. Just looking at simple values like numbers for the moment, if you type X = Y, then it can mean several things:

  • If X is a free variable but Y has a value already bound to it, this assigns that value to X.
  • Same thing if Y is free but X is bound; this is left-assignment.
  • If X and Y are both bound, then this is an equality test.
  • If X and Y are both free, then this aliases the two variables together. A subsequent binding of X will automagically also bind Y.

In general, X and Y could be data structures with some bound and some unbound variables inside it, in which case the full unification algorithm is run.
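
The four cases for simple values can be sketched in C++ as follows. This is only an illustration with made-up names (Bindings, fresh, bound, unify, lookup); it handles integer values only, not compound terms, and it reports failure with false where a real logic language would backtrack.

```cpp
#include <optional>
#include <vector>

class Bindings {
public:
    // Create a free (unbound) variable and return its id.
    int fresh() {
        int id = static_cast<int>(parent_.size());
        parent_.push_back(id);
        value_.push_back(std::nullopt);
        return id;
    }

    // Create a variable that is already bound to v.
    int bound(int v) {
        int id = fresh();
        value_[id] = v;
        return id;
    }

    // X = Y. Returns false only when both sides are bound to different values.
    bool unify(int x, int y) {
        int rx = find(x), ry = find(y);
        if (rx == ry) return true;
        if (value_[rx] && value_[ry]) {
            return *value_[rx] == *value_[ry];  // both bound: equality test
        }
        if (value_[rx]) {
            parent_[ry] = rx;  // Y free, X bound: Y takes X's value
        } else {
            parent_[rx] = ry;  // X free: takes Y's value, or is aliased if Y is free too
        }
        return true;
    }

    // Current value of x, or std::nullopt while it is still free.
    std::optional<int> lookup(int x) { return value_[find(x)]; }

private:
    // Follow alias links to the representative variable.
    int find(int x) {
        while (parent_[x] != x) x = parent_[x];
        return x;
    }

    std::vector<int> parent_;
    std::vector<std::optional<int>> value_;
};
```

After unify(u, v) on two free variables, a later unify(u, bound(9)) makes lookup(v) return 9 as well, which is the aliasing behaviour from the last bullet.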

$\endgroup$
4
$\begingroup$

In Assembly (mov) or more basic languages there is a keyword for assignment, such as:

set X to 10

I am personally not a fan of this because I prefer punctuation as operators, but this is an option that resolves ambiguity.

Or in Assembly:

mov rdi, rax
$\endgroup$
4
$\begingroup$

Different Spellings

Other common variations for assignment are := and (so far unmentioned) <-. JavaScript is notorious for having many different comparison operators.

Vale has the interesting variation that x = 42 is only a variable declaration, and assignment must be written set x = 42. I find, like the author, that I also write many more declarations than assignments.

Different Semantics

A functional language, using static single assignments, or one using linear types, allows assignment only within bindings, which can be marked with let. Alternatively, since newer languages tend to discourage mutability and assignments, you could mark assignments with set, or do both. Since is is no longer than == and be or to no longer than := or <-, this could be valid syntax:

let there be light
if there is light then
    set heavens to waters.above
    set seas to waters.below
$\endgroup$
2
$\begingroup$

Modern programming languages like Go use "="/":=" for assignment and "==" for comparison. Go gives a compile-time error if you mistype "=" instead of "==" in a condition. Another suggestion would be to use constants when comparing, but that wouldn't solve your problem with mistyping in C++.

$\endgroup$
3
  • $\begingroup$ Welcome to PLDI. How does your answer differ from justANewbie's answer? $\endgroup$
    – Isaiah
    Commented May 21, 2023 at 16:03
  • $\begingroup$ @Isaiah Pascal uses = for equality comparison, and Go uses == for equality comparison. The point is that Go allows = instead of := for assignment when it's not in an expression. This is also the same as Python ─ = is assignment but not an expression, := is an assignment expression, == is an equality comparison. $\endgroup$
    – kaya3
    Commented May 21, 2023 at 17:53
  • 4
    $\begingroup$ "Modern programming languages like F# use = for both bindings and equality." It's a bold move to make a blanket statement like "modern PLs do x" when they disagree with each other on how to do x. $\endgroup$
    – Longinus
    Commented May 22, 2023 at 1:56
2
$\begingroup$

Coming relatively out of left field is Swift’s pattern-matching operator ~=. It’s rarely written in source and usually appears only as the result of desugaring a switch or if case statement, but you could use it instead of == for regular equality comparison if you wanted to.

$\endgroup$
2
$\begingroup$

The original Smalltalk uses ← for assignment and = for equality. It also uses ↑ for return.

The original Smalltalk developed at Xerox used its own workstation developed at Xerox with its own CPU developed at Xerox and its own input devices (keyboard and mouse) developed at Xerox. Smalltalk was its own Operating System as well.

All this to say that it also had its own character set and character encoding.

When Smalltalk was first ported to other systems using the ASCII character encoding, it turned out that ASCII had the character ^ at the same code point where Smalltalk had ↑, so all Smalltalk was rendered with ^ for return on machines that used ASCII. It still looks like an upwards arrow, so it still kind of works.

Assignment was not so lucky, though: where Smalltalk had ←, ASCII has _, so all assignments in Smalltalk were rendered as _ in ASCII. This was somewhat unsatisfactory, so it was decided to change the language specification to use := for assignment. However, some Smalltalks still accept _ even today.

$\endgroup$
1
  • $\begingroup$ I mentioned this in a comment on another answer, but this is much more thorough (as is expected from the nature of comments). $\endgroup$
    – Bbrk24
    Commented Jul 6, 2023 at 17:05
2
$\begingroup$

Use ,= or ;=

We could already understand ,= as the +=-like version of , in C-style operators. And ;= looks less confusing, if you already merge , with ;.

$\endgroup$
2
$\begingroup$

Most compilers have the decency to give you a warning if you use the wrong operator, or if the compiler isn't sure. For example, with a good C compiler I expect that

if (nuclear_code = 1234) ...

gives a warning. And if I really want to store 1234 and then examine whether it is zero, these compilers may allow

if ((nuclear_code = 1234))

which tells the compiler "be quiet, I know what I'm doing".
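
As a concrete sketch: GCC and Clang report the single-parenthesis form under -Wparentheses (which -Wall turns on) and treat the doubled parentheses as a deliberate assignment. The function name below is made up for illustration.

```cpp
// Hypothetical helper: stores new_code and reports whether it is nonzero.
bool store_and_check(int& nuclear_code, int new_code) {
    // if (nuclear_code = new_code)    // GCC/Clang: warning under -Wparentheses
    if ((nuclear_code = new_code)) {   // extra parentheses mark the assignment as intentional
        return true;
    }
    return false;
}
```

Building with -Werror=parentheses turns the warning into a hard error for teams that never want the single-parenthesis form.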

Newer languages don't allow integers to be used as boolean values. So the first example might not compile if the = operator doesn't yield a result, or it yields an integer result which cannot be used in an "if" statement. That's assuming that

if (condition = false)

would be very rare.

You could use different pairs of tokens. := and =, or = and .eq. like FORTRAN did, or a leftarrow for assignment. The warning, and possible changed semantics, is the simplest solution. In the end, if the code you type is not the code you wanted, that's your problem.

$\endgroup$
1
$\begingroup$

You can raise an error

For example this code in Python is correct:

a = 11

if a == 22:
    print("Something")

But this:

a = 11

if a = 22:
    print("Something")

raises an error (no confusing bug like in C++):

  File "somefile.py", line 3
    if a = 22:
       ^^^^^^
SyntaxError: invalid syntax. Maybe you meant '==' or ':=' instead of '='?
$\endgroup$
