PHP 7 is probably ‘the easiest upgrade yet’. Once the old PHP 5.x code has been checked to lint cleanly with PHP 7, the next challenge is to read the list of ‘backward incompatible changes’ and see which of them apply to the legacy code.
The list is not long, and a quick text search will lead us to many of the relevant places in the code. For example, usort() has a new sorting behavior for equal values, so targeting usort() and uksort() is a good idea.
On the other hand, there is also room for a lot of false positives. For example, the changes to foreach() depend on whether the loop is by-value or by-reference, and they affect either the loop body or the code right after it. This is not trivial to check with a text search.
Note: I won’t detail what has changed; the focus is on what can be done to pinpoint the affected spots in the code and actually find them.
Indirect variable
Some complex variable accesses have a new interpretation, for example $$foo['bar']['baz'] or $foo->$bar['baz']. Multi-dimensional arrays are OK, and so is property or method chaining. But expressions that mix these syntaxes with variable parts, such as variable variables or variable properties, need curly braces to keep their PHP 5 meaning. Targeting $$ and ${$ is a good start.
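To make the two readings concrete, here is a minimal sketch that spells each of them out with curly braces, so that it behaves the same on PHP 5 and PHP 7. The variable names and values are hypothetical and only serve to make the difference observable.

```php
<?php
// Hypothetical data, only here to make the two readings observable.
$foo = 'bar';
$bar = ['baz' => 'PHP 7 reading'];

$map = ['baz' => 'target'];
$target = 'PHP 5 reading';

// What PHP 7 understands by $$foo['baz']:
// resolve the variable variable first, then index the result.
$leftToRight = ${$foo}['baz'];

// What PHP 5 understood by $$map['baz']:
// index the array first, then use the element as a variable name.
$rightToLeft = ${$map['baz']};
```

Once the braces are written explicitly, the expression no longer depends on the PHP version, which is the goal of the review.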
Global requires simple variables
The global keyword now requires simple variables, like $x. Anything more complex must be reviewed and removed, for example: global ${$a->b}, $$p, $$q[3], $$o->b;
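A plain text search for $$ also hits strings and comments. As a sketch of how to cut down those false positives, the following uses PHP's tokenizer and reports the lines where a bare $ token appears in code, which covers both $$x and ${...}. The function name findDollarPrefixedExpressions is made up for this example, and the result is a list of candidates to review, not a diagnosis.

```php
<?php
// Returns the line numbers where a bare '$' token appears in real code
// (variable variables such as $$x or ${$x}), ignoring strings and comments.
function findDollarPrefixedExpressions($source)
{
    $lines = [];
    $line = 1;
    foreach (token_get_all($source) as $token) {
        if (is_array($token)) {
            // Array tokens carry their starting line; add the newlines they contain
            // so that $line is the line on which the next token starts.
            $line = $token[2] + substr_count($token[1], "\n");
            continue;
        }
        // Single-character tokens are plain strings and have no line number.
        if ($token === '$') {
            $lines[] = $line;
        }
    }
    return array_values(array_unique($lines));
}
```

The same loop can be narrowed to global statements by only recording a $ token seen after a T_GLOBAL token and before the next semicolon.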
Parentheses around variables or function calls
Parentheses may be used to hide some errors, just like the @ operator. That won’t be the case in PHP 7, and it is wrong in PHP 5 anyway. One solution is to rely on the PHP error log: crank error reporting up to strict standards and search for ‘Strict Standards:’.
Alternatively, search for parentheses around the arguments of function calls and check whether they are useless:
<?php f( (array_pop($d))) ; echo ($d) ; ?>
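For the error log route, a small helper can count how often each message of interest shows up. This is only a sketch: countLogMessages is a made-up name, and the log format is assumed to be one message per line.

```php
<?php
// Counts how many log lines contain each of the given message fragments.
function countLogMessages(array $logLines, array $needles)
{
    $counts = array_fill_keys($needles, 0);
    foreach ($logLines as $line) {
        foreach ($needles as $needle) {
            if (strpos($line, $needle) !== false) {
                $counts[$needle]++;
            }
        }
    }
    return $counts;
}
```

It could be fed with the result of file() on the error log, whose location depends on the error_log setting of the server.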
Array elements or object properties that are automatically created by reference
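Spot all array elements that are created by a reference assignment (= &, also written =&). The order in which the elements are created may change. Be careful when searching: &= looks similar but is the bitwise AND assignment, not a reference assignment. It is still worth a look, since applying it to a missing element reports PHP Notice: Undefined index: in the logs, which means that an element is used without having a value. Otherwise, search for reference assignments to an array element, such as $a['b'] = & $a['a'] ; and, for the lookalike, $a['b'] &= $a['a'] ;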
list() will no longer assign variables in reverse order
Spot array appends within a list call. list($a[], $a[]) = $something ;
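Once such a call is found, one possible rewrite is to take the appends out of list(), so that the order is spelled out explicitly. This is a sketch with hypothetical data:

```php
<?php
$something = ['first', 'second'];

// Order-dependent form:
// PHP 5 fills $a as ['second', 'first'], PHP 7 as ['first', 'second'].
// list($a[], $a[]) = $something;

// Order-independent rewrite: unpack into plain variables, then append.
list($first, $second) = $something;
$a = [];
$a[] = $first;
$a[] = $second;
```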
Empty list() assignments are no longer allowed
list() now requires at least one valid argument. It can’t be called without arguments, with only empty slots, or with only an empty list() nested inside. A single element in the list is enough. Search for list(), and refine with double commas, ‘(,’ or ‘,)’. Then review.
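The search for list() calls that contain nothing but commas can be sketched with a regular expression. The function name findEmptyLists is made up for this example, and the regex is blind to context: it will also match a method named list, or text inside strings and comments.

```php
<?php
// Returns every list(...) whose parentheses hold only commas and whitespace.
function findEmptyLists($source)
{
    preg_match_all('/\blist\s*\(\s*(?:,\s*)*\)/i', $source, $matches);
    return $matches[0];
}
```

A list() that only wraps another empty list() is not reported as a whole, but the inner empty call is, which is enough to lead the review to the right line.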
list() no longer supports unpacking strings
There is no easy way to find a string in the right operand of an assignment to list(), unless it is a literal. As with the previous item, search for list() and review the other side.
list() is now always guaranteed to work with ArrayAccess
As with the previous item, there is no easy way to refine beyond searching for list() itself and reviewing the right operand.
Iteration with foreach() no longer has any effect on the internal array pointer
This affects code placed after a foreach loop that breaks at some point and then continues processing from that position. Search for foreach loops, and check whether they are followed by calls to current(), next() or prev() on the source array.
<?php
$a = [1, 2, 3, 4];
foreach ($a as $b) {
    if ($b == 2) {
        break;
    }
}
$b = current($a); // 3 in PHP 5.6, 1 in PHP 7
?>
Search for foreach(), and review the code after.
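When the review finds such a dependency, one way out is to stop relying on the internal pointer at all. The sketch below, with hypothetical data, picks up the element that follows the break point inside the loop itself, so it gives the same result on PHP 5 and PHP 7.

```php
<?php
$a = [1, 2, 3, 4];
$next = null;
$found = false;
foreach ($a as $b) {
    if ($found) {
        // First element after the one we were looking for.
        $next = $b;
        break;
    }
    if ($b == 2) {
        $found = true;
    }
}
```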
When iterating arrays by-value, foreach will now always operate on a copy of the array
Any by-value (no reference) foreach that uses current(), prev() or next() on the source array inside the loop is impacted. Search for foreach() with a by-value (no &) value variable, then check the loop body.
<?php
$a = [1, 2, 3, 4];
$c = 0;
// count half the array
foreach ($a as $b) {
    next($a);
    $c++;
}
?>
When iterating arrays by-reference, modifications to the array will continue to influence the iteration
Spot modifications of the source array inside a by-reference loop. Any such modification has to be reviewed, especially when something is appended to the source array or when it is merged with another array. Without a stop condition, this is an infinite loop.
<?php
$a = [1];
foreach ($a as &$b) {
    $a[] = 3;
}
// PHP 7 infinite loop
?>
Iteration of plain (non-Traversable) objects by-value or by-reference will behave like by-reference iteration of arrays.
This didn’t work before, so PHP 5 code either has a workaround for it or avoids doing it. No need to search.
It is no longer possible to define two function parameters with the same name.
Indeed, this happens. It is easy to spot the method definition, but harder to find the duplicated argument, since we need to find the names first. Isolating the arguments is difficult in itself, since default values may be constant scalar expressions, as shown below. That requires a real parser to understand the various parts of the definition.
<?php function f($a, $a = (2 + 3) * 4) {} ?>
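Short of a real parser, PHP's tokenizer already gets quite far here, because default values are constant expressions and cannot contain variables. The sketch below relies on that: any variable token between the parentheses that follow the function keyword is taken to be a parameter name. The function name findDuplicateParameters is made up for this example.

```php
<?php
// Returns [name, line] pairs for parameters declared twice in the same signature.
function findDuplicateParameters($source)
{
    $tokens = token_get_all($source);
    $count = count($tokens);
    $found = [];
    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i]) || $tokens[$i][0] !== T_FUNCTION) {
            continue;
        }
        $depth = 0;
        $seen = [];
        for ($j = $i + 1; $j < $count; $j++) {
            $token = $tokens[$j];
            if ($token === '(') {
                $depth++;
            } elseif ($token === ')') {
                $depth--;
                if ($depth <= 0) {
                    break; // end of the parameter list
                }
            } elseif ($depth === 0 && ($token === ';' || $token === '{')) {
                break; // not a signature, e.g. "use function foo;"
            } elseif ($depth > 0 && is_array($token) && $token[0] === T_VARIABLE) {
                if (isset($seen[$token[1]])) {
                    $found[] = [$token[1], $token[2]];
                }
                $seen[$token[1]] = true;
            }
        }
    }
    return $found;
}
```

Since token_get_all() only splits the source into tokens, it can be run on code that PHP 7 would refuse to compile.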
The func_get_arg() and func_get_args() functions will no longer return the original value
Spot functions that use func_get_arg() and func_get_args(). Then, review whether the arguments have been modified before the call. Calling func_get_arg() first thing in the function is probably OK.
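As an illustration of the safe pattern, here is a sketch of a function that reads the argument before modifying it. The function name is hypothetical; the point is only the order of the two statements.

```php
<?php
function incrementAndReport($x)
{
    // Read the argument before touching $x: same result on PHP 5 and PHP 7.
    $original = func_get_arg(0);
    $x++;
    return [$original, $x];
}
```

If the two statements are swapped, PHP 5 still reports the value that was passed in, while PHP 7 reports the incremented one.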
Exception backtraces no longer display the original value
Same as above. Search for ‘getTraceAsString’ non-static method usage, with -> and empty arguments.
Invalid octal literals
Any number starting with 0 has to be checked, such as $x = 0783 ; Most of the time, octals are only used with mkdir and chmod, so you may spot them there, with some classic values of 0777, 0666, 0755, etc. Otherwise, a regex like [^"0-9']0[0-9]+ should bring up a short list of candidates to review.
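The regex can be tightened to report only the literals that are actually invalid, that is, those containing an 8 or a 9. The following is a sketch: findInvalidOctals is a made-up name, and the pattern is a variation on the one above, not a replacement for a review.

```php
<?php
// Returns the literals that start with 0 and contain an 8 or a 9.
// The lookbehind skips digits inside larger numbers, decimals, identifiers,
// variables and the start of quoted strings.
function findInvalidOctals($source)
{
    preg_match_all('/(?<![\w.$\'"])0[0-7]*[89][0-9]*\b/', $source, $matches);
    return $matches[0];
}
```

Like any text search, it still looks inside comments and in the middle of strings.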
Bitwise shifts by negative numbers
Shifts are done with the << and >> operators. Search for them, then review the right operand.
Large bitwise shifts
Same as above.
Strings that contain hexadecimal numbers are no longer considered to be numeric
Hexadecimals look like 0x[0-9a-fA-F]+. They may be standalone, or they may be inside a string. When in a string, make sure they are at the beginning of the string, right after the opening ', ", <<<HEREDOC or <<<'NOWDOC': anything deeper in a string is not interpreted by PHP.
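Where the review shows that a hexadecimal string really has to be converted to a number, filter_var() with the FILTER_FLAG_ALLOW_HEX flag gives the same result on both versions. A minimal sketch, with a hypothetical input value:

```php
<?php
$input = '0xff';

// is_numeric($input) is true on PHP 5 and false on PHP 7,
// so the conversion is made explicit instead.
$value = filter_var($input, FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);

// filter_var() returns false when the string is not a valid integer.
$invalid = filter_var('not hex', FILTER_VALIDATE_INT, FILTER_FLAG_ALLOW_HEX);
```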
Codepoint Escape Syntax
Lint the code, and it will report any problems with invalid ‘\u{‘ in strings.
Removed support for static calls to non-static methods
Statically calling a non-static method is deprecated in PHP 7, and $this is no longer defined when such a call is made from an incompatible context. Searching for :: is probably going to yield a lot of results, and finding the problematic calls manually is made difficult by class hierarchies and use statements. This is better left to the PHP logs, or to the exakat engine.
The yield language construct no longer requires parentheses
The precedence of yield has changed. It may now be used without parentheses within an expression, though some parentheses may be needed to keep the meaning the expression had in PHP 5. Search for yield and review anything that has more than a single variable before the semicolon (that is, anything more complex than yield $y ;).
$HTTP_RAW_POST_DATA is no longer available
Easy textual search.
Removed support for assigning the result of new by reference
$x = &new someClass() ; or $x =& new someClass() ; An & and a new on the same line are the target. PHP 5 emits a ‘Deprecated’ error and PHP 7 a ‘Parse error’, so linting is the solution here.
Removed support for /e
About 85 % of calls to the preg_* functions use a hard-coded regex or an in-place concatenation. Search for preg_replace, and see if the regex uses e as an option. Of the remaining 15 %, most use variables that are defined close by, so review them.
Finally, this will generate a warning that you can check for in the logs: PHP Warning: preg_replace(): The /e modifier is no longer supported.
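Once a /e call is located, the usual replacement is preg_replace_callback(), which takes a function instead of a string of code. Here is a sketch with a hypothetical upperCaseWords function; the commented line shows the PHP 5 form it stands in for.

```php
<?php
function upperCaseWords($text)
{
    // PHP 5 only: preg_replace('/(\w+)/e', 'strtoupper("$1")', $text);
    return preg_replace_callback(
        '/(\w+)/',
        function ($matches) {
            // $matches[1] is what "$1" referred to in the /e version.
            return strtoupper($matches[1]);
        },
        $text
    );
}
```

This form runs on PHP 5.3 and later, so the change can be made before the upgrade.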
Removed string category support in setlocale()
Check that setlocale() no longer receives a string (' or ") as first argument. Searching for setlocale(" or setlocale(' should yield some results.
usort does not return the same result
When using a custom function for sorting, the order of the elements that compare as equal may have changed. Check the code here: https://bugs.php.net/bug.php?id=69158&edit=3.
Search for ‘usort’ and ‘uksort’, and then review the associated callback function. If the callback never returns 0 (a tie), then it is safe. Otherwise, the order of the tied values may change.
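When the order of tied values matters, one option is to make sure the callback never returns 0, by falling back on the original position of each element. The following is a sketch under that assumption; stableUsort is a made-up helper, not a PHP function, and it reindexes the array just like usort() does.

```php
<?php
// Sorts $items with $compare, breaking ties by original position,
// so the result does not depend on how the engine orders equal elements.
function stableUsort(array &$items, $compare)
{
    $decorated = [];
    $position = 0;
    foreach ($items as $item) {
        $decorated[] = [$position++, $item];
    }
    usort($decorated, function ($left, $right) use ($compare) {
        $result = call_user_func($compare, $left[1], $right[1]);
        if ($result != 0) {
            return $result;
        }
        // Tie: positions are unique, so this never returns 0.
        return $left[0] < $right[0] ? -1 : 1;
    });
    $items = [];
    foreach ($decorated as $pair) {
        $items[] = $pair[1];
    }
}
```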
Wrap up
Textual search is a good first step. It provides quick results and also brings attention to parts of the code that may be in need of spring cleaning. The large number of false positives will slow the process down, though.
Often, textual search is blind to the context of the code, and misses the actual semantic value of PHP tokens. The exakat engine version 0.3.8 (coming up) covers more than 50% of the previous issues, and, of course, new or removed functions, classes, keywords and interfaces.
Issue | Done | To do | Not
Indirect variable | | |
The global keyword | | |
Parentheses around variables or function calls | | |
Array elements or object properties that are automatically created by reference | | |
Empty list() assignments are no longer allowed | | |
list() will no longer assign variables in reverse order | | |
list() no longer supports unpacking strings | | |
list() is now always guaranteed to work with ArrayAccess | | |
Iteration with foreach() no longer has any effect on the internal array pointer | | |
When iterating arrays by-value, foreach will now always operate on a copy of the array | | |
Iteration of plain (non-Traversable) objects by-value or by-reference will behave like by-reference iteration of arrays. | | |
It is no longer possible to define two function parameters with the same name. | | |
The func_get_arg() and func_get_args() functions will no longer return the original value | | |
Exception backtraces no longer display the original value | | |
Invalid octal literals | | |
Bitwise shifts by negative numbers | | |
Left bitwise shifts by a number of bits beyond | | |
Strings that contain hexadecimal numbers | | |
Codepoint Escape Syntax | | |
The yield language construct no longer requires parentheses | | |
$HTTP_RAW_POST_DATA is no longer available | | |
Removed support for assigning the result of new by reference | | |
Removed support for /e | | |
Removed string category support in setlocale() | | |
usort does not return the same result | | |
New Functions | | |
Removed Functions | | |
New Classes | | |
Removed Classes | | |
New Constants | | |
Totals: 30 | 18 | 10 | 2