14 PHP arguments that are not used enough
PHP has a few thousand functions under its belt, and even more arguments. Just like functions, some of those arguments are seldom used, although they provide a useful extension to the original behavior of the function. To catch up with them, here are 14 PHP arguments that are not used enough.
The poster child of the unknown arguments is the second argument of dirname(), which applies the function multiple times to the first argument. No need for a long list of /../ or for repeated calls to go up several directories.
<?php
$path = '/a/b/c/d/e/f';

$root = dirname(dirname(dirname(dirname($path))));
$root = $path.'/../../../../';
$root = dirname($path, 4);
?>
Based on Exakat’s corpus of 2800+ PHP repositories, we checked all PHP standard functions and the usage of their parameters. Functions without optional arguments were omitted: there is no point in counting compulsory arguments. The rest shows which arguments are rarely used: we selected the most interesting among those used less than 1% of the time.
Here are several parameters that are rather unknown, yet so useful. There are golden nuggets for everyone!
- dirname() 2nd argument $levels
- explode() 3rd argument $limit
- count() 2nd argument $mode
- array_column() 3rd argument $key
- trim() 2nd argument $characters
- in_array() 3rd argument $strict
- array_keys() 2nd argument $value
- preg_replace() 5th argument $count
- str_replace() 4th argument $count
- range() 3rd argument $step
- iterator_to_array() 2nd argument $preserve_keys
- file() 2nd argument $flags
- mkdir() 3rd argument $recursive
- preg_match_all() 4th argument $flags
In no particular order, here are the details.
explode() 3rd argument $limit
explode()’s third argument is a limit: it caps the number of elements returned. When the limit is reached, the last element holds the rest of the string, unsplit. This is useful in multiple cases:
- when only the first elements have to be collected, and the rest may be ignored
- to speed up the code, as fewer exploded elements are faster to extract and return
- to avoid processing too many elements from an unchecked value
<?php
// limit of 3: $a and $b are clean,
// the third element ('c:d') holds the rest and is unused
list($a, $b) = explode(':', 'a:b:c:d', 3);

// avoid processing potential thousands of ':'
$details = explode(':', $_POST['x'], 2);
?>
I wonder if an $offset argument would be useful too, skipping the early elements. It wouldn’t speed up the processing, though, so maybe omitting them in the list() call is similar.
<?php
// skipping the first returned element
// the limit is one more than the last wanted element,
// so $c is 'c' and not 'c:d'
list(, $b, $c) = explode(':', 'a:b:c:d', 4);
?>
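Here is a small sketch of how the value of $limit shapes the result, using the same sample string as above (the variable names are only illustrative). A positive limit leaves the remainder in the last element, a negative limit drops elements from the end, and 0 behaves like 1.

```php
<?php
$s = 'a:b:c:d';

// Positive limit: at most 2 elements, the last one keeps the unsplit remainder
$head = explode(':', $s, 2);   // ['a', 'b:c:d']

// Negative limit: all the elements, except the last one
$body = explode(':', $s, -1);  // ['a', 'b', 'c']

// A limit of 0 is treated as 1: the string comes back whole
$whole = explode(':', $s, 0);  // ['a:b:c:d']

print_r($head);
print_r($body);
print_r($whole);
```

The negative limit is handy to drop a trailing element, such as a file extension or a checksum, without a separate call to array_pop().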
count() 2nd argument $mode
What can count() be upgraded to? To recursive arrays. With COUNT_RECURSIVE as second argument, PHP counts the elements of all the nested arrays too.
Somewhat surprisingly, this recursive mode follows a very literal definition. count() adds up the number of elements in each array, and the intermediate arrays are counted as elements too. Check the following code:
<?php
$a = [[1, 2], 3, []];

print count($a);                              // 3
print count($a, COUNT_RECURSIVE);             // 5
print count($a, COUNT_RECURSIVE) - count($a); // 2

// won't work here, as some elements are not arrays
print count(array_merge(...$a));
?>
$a has three elements, and the first of them is an array with two elements of its own: 3 + 2 = 5. Every element counts for 1, arrays included, and each array then adds the count of its own elements: the empty array counts for 1 and adds nothing.
When counting elements in a consistent array of arrays, it feels safer to use the regular count() with a loop or array_merge(). In the end, the only drawback of that alternative is that it is slower.
Recursive count() gets harder to use when counting the final elements in nested arrays of 3 levels or more: the intermediate counts need to be factored in.
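To illustrate how the intermediate arrays inflate the recursive count, here is a minimal sketch that counts only the final values. countLeaves() is a hypothetical helper written for this example, not a PHP native function; it relies on array_walk_recursive(), which calls its callback on non-array values only.

```php
<?php
// Hypothetical helper: counts only the final (non-array) values
function countLeaves(array $array): int
{
    $leaves = 0;
    // the callback is only called for values that are not arrays
    array_walk_recursive($array, function () use (&$leaves) {
        $leaves++;
    });

    return $leaves;
}

$a = [[1, [2, 3]], 4, []];

$recursive = count($a, COUNT_RECURSIVE);
$leaves = countLeaves($a);

print $recursive . PHP_EOL; // 7: 4 values + 3 nested arrays
print $leaves . PHP_EOL;    // 4: 1, 2, 3 and 4
```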
array_column() 3rd argument $key
array_column() is one of my beloved functions in PHP: it extracts one element from each row of an array of arrays (or of objects), by its name.
<?php
$a = [
    ['a' => 1, 'b' => 2],
    ['a' => 3, 'c' => 4],
    ['a' => 5],
];

print_r(array_column($a, 'a'));
/*
Array
(
    [0] => 1
    [1] => 3
    [2] => 5
)
*/
?>
This is already very convenient for extracting data from datasets (Database results, JSON/Yaml extract…) and turning them into a simple list. Non-existent values are skipped silently, so this also acts as a filter.
The third argument $key is used when building the returned array, to create a hash.
<?php
$a = [
    ['a' => 1, 'c' => 'A'],
    ['a' => 3, 'c' => 'B'],
    ['a' => 5],
];

print_r(array_column($a, 'a', 'c'));
/*
Array
(
    [A] => 1
    [B] => 3
    [0] => 5
)
*/
?>
When available, PHP uses the value as the key for the returned array. Otherwise, it defaults to a normal auto-generated index for the array. Guess what happens if that value is an integer…
This trick builds hash structures from values that are scattered in arrays.
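About the integer case mentioned above, here is a short sketch of what may go wrong. The rows and the names 'id' and 'name' are made up for the illustration: a row without key receives the next free integer index, and a later row with that same integer as key silently overwrites it.

```php
<?php
$rows = [
    ['id' => 1, 'name' => 'x'],
    ['name' => 'y'],             // no 'id': stored at the next free index, 2
    ['id' => 2, 'name' => 'z'],  // 'id' 2 overwrites the entry above
];

$names = array_column($rows, 'name', 'id');

print_r($names);
// 'y' is lost: only keys 1 ('x') and 2 ('z') remain
```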
I think PHP needs the reverse function: one that takes a hash, and produces or updates an array of arrays. Of course, it’s an easy loop, just like array_column(), right?
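As a sketch of that easy loop: array_uncolumn() below is a hypothetical name and signature, chosen to mirror array_column(); nothing like it exists in PHP. It only covers the "produce" half of the idea, not the update of an existing array.

```php
<?php
// Hypothetical helper, not a PHP native function:
// turns a hash back into a list of rows
function array_uncolumn(array $hash, string $column, string $key): array
{
    $rows = [];
    foreach ($hash as $k => $value) {
        // one row per entry: the hash key and the value, each under its name
        $rows[] = [$key => $k, $column => $value];
    }

    return $rows;
}

$hash = ['A' => 1, 'B' => 3];
$rows = array_uncolumn($hash, 'a', 'c');
print_r($rows);

// Round trip: array_column() rebuilds the original hash
$back = array_column($rows, 'a', 'c');
print_r($back);
```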
trim() 2nd argument $characters
trim(), and its close cousins ltrim() and rtrim(), remove whitespace from the ends of a string. The second argument $characters replaces that default list with other characters.
<?php
$name = ' john doe ';
print "'".trim($name)."'"; // 'john doe', without the spaces

$namespace = '\A\B\C\\';
print "'".trim($namespace, '\\')."'"; // 'A\B\C', ready for 'use'

// cleaning multiple useless + at the beginning of a number
$number = '+++-1';
print "'".intval(ltrim($number, '+'))."'"; // '-1', not '0'
?>
in_array() 3rd argument $strict
in_array() looks for a value inside an array. The third parameter makes the search strict, instead of the loose comparison used by default. It is the same difference as between switch() and match(), or == and ===: the type is also used to compare values.
This is very useful to avoid matching 0 with an empty string, and getting bitten by a PHP version change. In the examples below, the returned values differ between PHP 7 and PHP 8.
<?php
$a = ['', 1, 2, '3hello'];

var_dump(in_array(0, $a));       // true in PHP 7, false in PHP 8
var_dump(in_array(0, $a, true)); // always false

var_dump(in_array(3, $a));       // true in PHP 7, false in PHP 8
var_dump(in_array(3, $a, true)); // always false
?>
array_keys() 2nd argument $value
array_keys() with its second argument is the hidden child of in_array(): it even has its own version of in_array()’s 3rd argument, $strict.
The second argument tells array_keys() to return only the keys whose value matches that target. It basically filters the array on a value, and returns the keys.
This is more convenient than running array_count_values() to count the occurrences of every value, when only one of them matters: the number of returned keys is the number of occurrences.
<?php
$a = ['a' => 1, 'b' => 2, 'c' => 1];

print_r(array_keys($a, 1));
/*
Array
(
    [0] => a
    [1] => c
)
*/
?>
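A small sketch of the strict flag and of the occurrence count, with made-up data: the default comparison is loose, so the string '1' matches the integer 1 unless the third argument is set to true.

```php
<?php
$a = ['a' => 1, 'b' => '1', 'c' => 1, 'd' => 2];

// Loose comparison (default): the string '1' matches too
$loose = array_keys($a, 1);         // ['a', 'b', 'c']

// Strict comparison: integers only
$strict = array_keys($a, 1, true);  // ['a', 'c']

// Number of occurrences of one value, without array_count_values()
$occurrences = count($strict);      // 2

print_r($loose);
print_r($strict);
print $occurrences . PHP_EOL;
```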
preg_replace() 5th argument $count
preg_replace() already has a lot of parameters. Yet, the last two are quite interesting. The fourth is $limit, which caps the number of replacements. Just like explode(), there is no offset, so it replaces the first occurrences, then stops.
Then, the fifth argument is $count, which is passed by reference, and receives the number of replacements that were made. Without a limit, this is the same number as the one returned by preg_match_all(), which counts the occurrences found. Since preg_replace() returns the modified string, this count must be returned another way. This value is important to monitor what happened.
<?php
$s = 'ac bc ac dc';

print preg_replace('/[ab]c/', 'z', $s, 2, $count); // z z ac dc
print PHP_EOL;
print $count . " strings changed\n"; // 2 strings changed: we hit the requested limit
?>
str_replace() 4th argument $count
str_replace(), just like preg_replace(), reports the number of replacements that were made, through its fourth argument $count, passed by reference. There is no limit with this function, though.
<?php
$s = 'ac bc ac dc';

print str_replace('ac', 'z', $s, $count); // z bc z dc
print PHP_EOL;
print $count . " strings changed\n"; // 2 strings changed
?>
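As a short sketch of two uses of $count, on the same sample string: with an array of searches, $count is the total over all of them, and a value of 0 is a cheap way to detect that nothing was replaced.

```php
<?php
$s = 'ac bc ac dc';

// With an array of searches, $count is the total for all of them
$clean = str_replace(['ac', 'bc'], 'z', $s, $total);
print $clean . PHP_EOL;  // z z z dc
print $total . PHP_EOL;  // 3

// $count is set again on each call: 0 means nothing was found
$same = str_replace('xy', 'z', $s, $none);
if ($none === 0) {
    print "nothing to replace\n";
}
```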
range() 3rd argument $step
range() is already full of tricks: initially, it is a function that generates an interval, from the first argument to the second.
Then, it is a function to generate characters, for example the alphabet, by using strings as first and last arguments.
And then, the 3rd argument is a step, which skips some of the generated elements. It works both for numbers and for characters.
<?php
// 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
$digits = range(0, 9);

// 0, 2, 4, 6, 8
$evenDigits = range(0, 9, 2);

// a, b, c, d, e, f, g, h, i, j...y, z
$alphabet = range('a', 'z');

// a, c, e, g, i, k... y
$evenAlphabet = range('a', 'z', 2);

// Chinese characters
// 一, 丁, 丂, 七, 丄, 丅, 丆, 万, 丈, 三, 上, 下, 丌, 不, 与, 丏, 丐
$chinese = range(0x4E00, 0x4E10);
$chinese = array_map('mb_chr', $chinese);
?>
Note that range() does not support multibyte characters: it takes some extra processing with mb_chr() to reach them.
iterator_to_array() 2nd argument $preserve_keys
iterator_to_array() turns an iterator into an array. This is sometimes desirable, to hand the values to a method which requires an array, for example. By default, this function also returns the keys in the array, just as yield emits them.
Instead of calling array_values() afterwards, it is useful to switch off this feature. It is also a bit faster to auto-index the values, rather than fetch the keys.
<?php
function foo() {
    foreach(range('a', 'd') as $i => $l) {
        yield $l => $i + 1;
    }
}

print_r(iterator_to_array(foo()));
/*
Array
(
    [a] => 1
    [b] => 2
    [c] => 3
    [d] => 4
)
*/

print_r(iterator_to_array(foo(), false));
/*
Array
(
    [0] => 1
    [1] => 2
    [2] => 3
    [3] => 4
)
*/
?>
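Switching off the keys is also a matter of safety. The sketch below uses two made-up generators, inner() and outer(): yield from does not reset the keys, so several values share the same key, and they overwrite each other when the keys are preserved.

```php
<?php
function inner()
{
    yield 1; // key 0
    yield 2; // key 1
    yield 3; // key 2
}

function outer()
{
    yield 0;            // key 0
    yield from inner(); // keys 0, 1 and 2 again
    yield 4;            // key 1
}

// Keys are preserved: values with the same key overwrite each other
$withKeys = iterator_to_array(outer());

// Keys are dropped: every value is kept
$withoutKeys = iterator_to_array(outer(), false);

print_r($withKeys);    // only 3 values left
print_r($withoutKeys); // all 5 values
```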
file() 2nd argument $flags
file() reads a file and returns it as an array, one line per element. This is not memory efficient, as the whole file has to fit in memory. But the worst is that each line read includes its final new line: in many cases, it must be cleaned with a call to trim() or rtrim().
The second argument of file() is a set of options, one of which removes them automatically: pass it the constant FILE_IGNORE_NEW_LINES.
<?php
$list = file('alphabet.txt');
print join(', ', $list);
// a
// , b
// , c ...
// New lines are still here

$list = file('alphabet.txt', FILE_IGNORE_NEW_LINES);
print join(', ', $list);
// a, b, c, d, ...
// this is the intended result
?>
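Since $flags is a bit mask, several options can be combined with |. Here is a minimal sketch, which writes its own temporary file (the content and the 'alphabet' prefix are only for the illustration), and adds FILE_SKIP_EMPTY_LINES to drop the empty lines too.

```php
<?php
// Temporary file, for illustration only
$path = tempnam(sys_get_temp_dir(), 'alphabet');
file_put_contents($path, "a\nb\n\nc\n");

// Default: each element keeps its final new line
$raw = file($path);

// New lines removed, the empty line is still there
$trimmed = file($path, FILE_IGNORE_NEW_LINES);

// New lines removed, then empty lines skipped
$compact = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

unlink($path);

print join(', ', $compact) . PHP_EOL;
```

FILE_SKIP_EMPTY_LINES is mostly useful together with FILE_IGNORE_NEW_LINES: without it, a blank line still holds its new line character, so it is not considered empty.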
mkdir() 3rd argument $recursive
Creating directories easily turns into a recursive task: when storing directories deep in the file system, there might be several directories to create before reaching the target one.
This is where the 3rd argument of mkdir() is handy: $recursive, which is set to false by default. With it, mkdir() creates all the intermediate folders, as they are needed, until the full path exists.
No need for checks, nor for loops, to make the missing folders.
<?php
// No such file or directory: /tmp probably exists,
// but /tmp/a probably doesn't and needs to be created
mkdir('/tmp/a/b/c/', 0755);

// create any missing folder
mkdir('/tmp/a/b/c/', 0755, true);
?>
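The recursive mode takes care of the intermediate folders, but mkdir() still returns false with a warning when the target folder already exists. The sketch below shows one common way to guard the call; the folder names are made up, and everything is created under the system temporary directory, then removed.

```php
<?php
// Hypothetical location, under the system temporary directory
$base = sys_get_temp_dir() . '/mkdir_demo_' . uniqid();
$path = $base . '/a/b/c';

// is_dir() is checked before and after, so that a folder that already
// exists, or that was created in the meantime, is not reported as a failure
if (!is_dir($path) && !mkdir($path, 0755, true) && !is_dir($path)) {
    throw new RuntimeException("Could not create $path");
}

$created = is_dir($path);

// Clean up, deepest folder first: rmdir() is not recursive
rmdir($path);
rmdir($base . '/a/b');
rmdir($base . '/a');
rmdir($base);

$removed = !is_dir($base);
```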
preg_match_all() 4th argument $flags
preg_match_all()’s 4th argument configures the layout of the result set.
preg_match_all() looks for all the occurrences of a pattern in a string. With capturing sub-patterns, each occurrence yields several values, and two representations are possible: by column or by row.
By column is the default representation: preg_match_all() stores each sub-pattern in one array dimension. $matches[0] contains all the strings that match the whole regex, $matches[1] contains all the strings that match the first sub-pattern, etc. The results are related across the columns by their index: $matches[1][0] is the first sub-pattern of the whole match in $matches[0][0], and $matches[2][0] is the second sub-pattern of that same match. This is PREG_PATTERN_ORDER.
If this matching via the index is not convenient, or looks too different from database-related extensions, there is PREG_SET_ORDER, which gathers all the sub-patterns per match. Then, the results may be processed like database rows.
<?php
preg_match_all('/a([bc])/', 'ab ac ad', $matches, PREG_PATTERN_ORDER);
print_r($matches);
/*
Array
(
    [0] => Array
        (
            [0] => ab
            [1] => ac
        )

    [1] => Array
        (
            [0] => b
            [1] => c
        )

)
*/

preg_match_all('/a([bc])/', 'ab ac ad', $matches, PREG_SET_ORDER);
print_r($matches);
/*
Array
(
    [0] => Array
        (
            [0] => ab
            [1] => b
        )

    [1] => Array
        (
            [0] => ac
            [1] => c
        )

)
*/
?>
On a related topic, database interfaces mostly provide the equivalent of PREG_SET_ORDER: one row is one array. Sometimes, the resulting dataset has to be processed by column: there, a call to array_column() is often necessary.
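The same array_column() call bridges the two layouts of preg_match_all(). As a small sketch, with the regex of the previous example and illustrative variable names: extracting column 1 from the rows of PREG_SET_ORDER gives back the column 1 of PREG_PATTERN_ORDER.

```php
<?php
preg_match_all('/a([bc])/', 'ab ac ad', $byColumn, PREG_PATTERN_ORDER);
preg_match_all('/a([bc])/', 'ab ac ad', $byRow, PREG_SET_ORDER);

// Column 1 of the rows is the first sub-pattern of every match
$firstGroup = array_column($byRow, 1);

print_r($firstGroup);                     // b, c
var_dump($firstGroup === $byColumn[1]);   // bool(true)
```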
dirname() 2nd argument $levels
This was shown in the introduction, so it won’t be repeated.
Enjoy better PHP coding
Those arguments are rarely used, and should be better known. They often help in classic situations, and make the code faster and less error-prone.
The same statistical approach could be used with frameworks, components and PHP extensions: they often provide extra features that are not always used, due to the learning curve, habits or simple (un)popularity. Knowing which arguments are used or unused helps in cleaning the code and adapting the API to real-world usage.