28

I have an engine that does some mathematical and logical operations by taking formulas, operands and operators from a file. All the operations are executed in an eval scope and the final result is saved in another file.

These files are often transferred over a network, so I am trying to shrink them by stripping all the spaces around operators. As far as I know there are no strict rules about this, but I stumbled upon the following behavior:

$x = 1;
$result = $x++-++$x; // works
$result = $x+++++$x; // fails
$result = $x++ + ++$x; // works again
  1. Why is PHP confused by the "+++++" syntax, but accepts "++-++"? How is "plus" better than "minus"?

  2. Is there a list anywhere of operators that are sensitive about spaces?

11
  • 9
    don't know the answer but I hope you're being really careful about those input files that you eval() because they would otherwise present a huge vulnerability Commented Aug 13, 2017 at 0:10
  • 4
    Be really, really careful with eval. And this seems like a messy way to go about it, but it's an interesting question nonetheless.
    – Qirel
    Commented Aug 13, 2017 at 0:12
  • 5
    @zerkms Actually, there is a PHP Language Spec, but it exists largely to enshrine the behaviour of the original implementation so alternatives like HHVM have something to benchmark against. github.com/php/php-langspec
    – IMSoP
    Commented Aug 13, 2017 at 1:38
  • 5
    Coming from C, I cringe when I see something like x++-++x, which is textbook undefined behavior. Does PHP nail down its meaning? Commented Aug 13, 2017 at 12:46
  • 2
    “I am trying to minimize them by stripping all the spaces before and after operations” Have you considered using something like GZIP? Commented Aug 13, 2017 at 19:48

3 Answers

35

The PHP parser reads the first ++ and then finds another ++ before it reaches the last + sign. The resulting expression ($x++)++ makes no sense, because the increment operator must be applied to a variable (and not to an integer, which is the result of the first $x++).

Operator precedence is documented here:
http://php.net/manual/en/language.operators.precedence.php

$x+++++$x;
^ PHP parser starts here and finds $x++
    ^ here there is a new ++, which is matched before the next + char
      ^ here is the last +, which the PHP parser will find last.

When the two ++ operators are separated by a minus sign, the code is read as $x++ - ++$x, which the PHP parser can understand.

This is also the reason why $x++ + ++$x works.
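The three variants from the question can be checked side by side with a small sketch. `tryEval` is a hypothetical helper introduced only for this illustration, and the sketch assumes PHP 7 or later, where `eval()` throws a `ParseError` on invalid syntax instead of raising a fatal error.

```php
<?php
// Hypothetical helper: evaluates a snippet and reports a parse failure
// instead of aborting. Assumes PHP 7+, where eval() throws ParseError.
function tryEval(string $code)
{
    try {
        return eval($code);
    } catch (ParseError $e) {
        return 'parse error';
    }
}

var_dump(tryEval('$x = 1; return $x++-++$x;'));   // int(-2)
var_dump(tryEval('$x = 1; return $x+++++$x;'));   // string(11) "parse error"
var_dump(tryEval('$x = 1; return $x++ + ++$x;')); // int(4)
```

Catching `ParseError` this way may also be useful in an engine that evaluates formulas from files, since a malformed formula then fails in isolation.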

2
  • 1
    Yes, it does. The result will be same as $x++ + ++$x
    – Dekel
    Commented Aug 13, 2017 at 13:18
  • 3
    By the way, the name of this behaviour is the "maximal munch" rule. Commented Aug 13, 2017 at 15:34
12

The answer is that the parser looks for longer tokens before looking for shorter ones. Therefore +++++ becomes ++ ++ +, which makes no sense to the interpreter.

PHP is one of the many languages that borrow their expression grammar from C, so this note may be of interest to you. In the C11 draft, section 6.4 clause 6 gives an example:

The program fragment x+++++y is parsed as x ++ ++ + y, which violates a constraint on increment operators, even though the parse x ++ + ++ y might yield a correct expression.
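For the whitespace-stripping goal in the question, one possible approach is to let PHP's own tokenizer decide which spaces are safe to drop. The sketch below is only an illustration: `lexTexts` and `stripSafeWhitespace` are hypothetical names, the check looks at neighbouring token pairs only, and it has not been validated against strings, heredocs or comments.

```php
<?php
// Hypothetical helper: returns the text of each token in a snippet,
// without the "<?php " open tag that is prepended for the tokenizer.
function lexTexts(string $code): array
{
    $texts = [];
    foreach (array_slice(token_get_all('<?php ' . $code), 1) as $tok) {
        $texts[] = is_array($tok) ? $tok[1] : $tok;
    }
    return $texts;
}

// Hypothetical helper: drops whitespace between two tokens only when the
// lexer still sees the same two tokens after they are glued together.
function stripSafeWhitespace(string $expr): string
{
    $out = '';
    $prev = null;
    $pendingSpace = false;
    foreach (lexTexts($expr) as $text) {
        if (trim($text) === '') { // whitespace token
            $pendingSpace = true;
            continue;
        }
        if ($pendingSpace && $prev !== null
            && lexTexts($prev . $text) !== [$prev, $text]) {
            $out .= ' '; // gluing would change the tokens, keep one space
        }
        $out .= $text;
        $prev = $text;
        $pendingSpace = false;
    }
    return $out;
}

echo stripSafeWhitespace('$x++ + ++$x'), PHP_EOL; // $x+++ ++$x
echo stripSafeWhitespace('$x++ - ++$x'), PHP_EOL; // $x++-++$x
```

In the first case one space survives, because gluing `+` and `++` would be read as `++` followed by `+`.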

10

A little more supplementary info. When a PHP script is "lexed", i.e. when it is scanned, the tokens comprising the script are examined. A pair of characters like "++" signifies an increment token, as follows:

<ST_IN_SCRIPTING>"++" {
    RETURN_TOKEN(T_INC);
}

This "rule" is in the file Zend/zend_language_scanner.l, which is part of the PHP source distribution. The lexer can only tell pre- and post-increments apart from a plain "+" if something such as whitespace separates the characters, so that each "+" ends up in the intended token.
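The lexer's view of a snippet can be inspected from PHP itself with `token_get_all()` and `token_name()`. The following is a minimal sketch; `tokenNames` is a hypothetical helper name, and the "<?php " prefix is added only because the tokenizer expects an open tag.

```php
<?php
// Hypothetical helper: lists the token names the lexer produces for a snippet.
// Single-character tokens such as "+" are returned by PHP as plain strings.
function tokenNames(string $code): array
{
    $names = [];
    foreach (array_slice(token_get_all('<?php ' . $code), 1) as $tok) {
        $names[] = is_array($tok) ? token_name($tok[0]) : $tok;
    }
    return $names;
}

echo implode(' ', tokenNames('$x+++++$x;')), PHP_EOL;
// T_VARIABLE T_INC T_INC + T_VARIABLE ;
echo implode(' ', tokenNames('$x++-++$x;')), PHP_EOL;
// T_VARIABLE T_INC - T_INC T_VARIABLE ;
```

Lexing succeeds in both cases; it is the later parsing step that rejects the first token sequence.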

Note that writing code like the following is inadvisable:

  <?php

  $x=0;
  echo $x++ + ++$x;

even though it will be lexed correctly. The objection to this coding style is that it can be less than apparent to human readers what is really occurring. The order of evaluation is not what it may seem, i.e. a variable being post-incremented and then being added to itself with its pre-incremented value.

Per the opcodes, the pre- and post-incrementations occur before addition takes place. Also, note that post-incrementation returns a variable's value and then increments it. So, initially, $x is assigned a value of 0. Then, it is post-incremented so that a temporary variable returns with a value of zero. Then, $x gets incremented to acquire a value of one. Next, $x is pre-incremented, and so $x moves from a value of one to two and its temporary variable evaluates as two. Lastly, the two temporary variables are added, so that 0 + 2 == 2; see here as well as here.
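The same sequence can be spelled out with explicit temporaries. This is only a rewrite of the steps described above for illustration; the names `$left`, `$right` and `$sum` are not part of the original example.

```php
<?php
$x = 0;
$left  = $x++;           // $left receives 0, then $x becomes 1
$right = ++$x;           // $x becomes 2, then $right receives 2
$sum   = $left + $right; // the addition happens last

var_dump($sum); // int(2)
var_dump($x);   // int(2)
```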

Also, an excellent read here.

Incidentally, in this case PHP conforms with its forebear, the C programming language; see here.
