Recently, I delved into a piece of code that involved a case-insensitive comparison of a character to a specific letter. The code, though straightforward, had room for optimization. Not being a fan of lengthy logical expressions, I was curious to explore ways to make it more concise and potentially faster. What is the fastest case-insensitive char comparison in PHP?
<?php
$singleChar = 'a'; // or another value

if ($singleChar == 'a' || $singleChar == 'A') {
    doSomething();
}
?>
Let’s embark on a brief journey through a collection of alternatives to the original syntax, followed by a comparative analysis of their performance.
Before going in, a warning: this is another micro-optimisation. The final lessons will not be about speeding up PHP code: real performance gains are found elsewhere.
Case insensitive comparison of characters
There are several options to check whether a character $singleChar matches a given literal value in a case-insensitive manner. The value itself is not important, and we’ll stick to the Latin alphabet in this article (feel free to test with Cyrillic or Armenian).
- $singleChar == 'a' || $singleChar == 'A' is the obvious candidate. It expresses the requirement exactly, and it is quite readable. It tends to be harder to update, as each extra option lengthens the expression significantly.
- $singleChar === 'a' || $singleChar === 'A' is the same as before, with a strict comparison.
- in_array($singleChar, ['a', 'A']) is the next candidate, and it comes in several variants.
- in_array($singleChar, ['a', 'A'], true) is the same as the previous one, with a strict comparison.
- in_array($singleChar, $letters) is the same as the previous one, with the array in a variable rather than hard-coded.
- $letters = ['a' => 1, 'A' => 1]; isset($letters[$singleChar]) is a close cousin of in_array(). It costs an extra array too.
- strtolower($singleChar) == 'a' removes the casing of the string and simplifies the comparison.
- mb_strtolower($singleChar) == 'a' is the same as before, but in a multi-byte environment. mb_strtolower() is a drop-in replacement for strtolower(), but it processes the string in a different way.
- strcasecmp($singleChar, 'a') is a lesser-known native PHP function which compares strings in a case-insensitive manner. Unlike with strtolower(), we do not need to choose the expected case. It also returns the opposite of what one might expect: 0 means that both strings are equal.
- stripos($singleChar, 'a') is another lesser-known native PHP function, which searches for a string inside another one in a case-insensitive manner. It returns the position of the match, so here 0 means the letter was found and false means it was not.
- match($singleChar) { 'a', 'A' => true, default => false} lists both cases in a single arm, plus a default. === is used here.
- ord($singleChar) == 65 || ord($singleChar) == 97 makes use of ASCII codes. It does require some initial research to get the code right.
- ($d = ord($singleChar)) == 65 || $d == 97 is an optimized version of the above, with a local cache.
- preg_match('/a/i', $singleChar) is our final candidate in the line-up. The regex engine can be configured for a case-insensitive search, and so fulfills our requirements.
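To make the differences in return values concrete, here is a minimal sketch that wraps a few of the candidates in boolean helpers. The function names are hypothetical and only serve the illustration, and every helper assumes that its argument is a single character.

```php
<?php
// Hypothetical helpers: each returns true when $c is 'a' or 'A'.
// All of them assume $c holds exactly one character.

function isAWithOr(string $c): bool
{
    return $c === 'a' || $c === 'A';
}

function isAWithInArray(string $c): bool
{
    return in_array($c, ['a', 'A'], true);
}

function isAWithIsset(string $c): bool
{
    $letters = ['a' => 1, 'A' => 1];
    return isset($letters[$c]);
}

function isAWithStrcasecmp(string $c): bool
{
    // strcasecmp() returns 0 when both strings are equal, ignoring case.
    return strcasecmp($c, 'a') === 0;
}

function isAWithStripos(string $c): bool
{
    // stripos() returns the offset of the match (0 here) or false,
    // so the result has to be checked with ===.
    return stripos($c, 'a') === 0;
}

function isAWithOrd(string $c): bool
{
    // 65 is 'A' and 97 is 'a' in ASCII.
    $d = ord($c);
    return $d === 65 || $d === 97;
}

function isAWithMatch(string $c): bool
{
    // match requires PHP 8.0 or later and compares strictly.
    return match ($c) {
        'a', 'A' => true,
        default => false,
    };
}
```

Note that strcasecmp() and stripos() both signal success with 0, which is falsy in PHP: using either result directly in an if would invert or break the test.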
Performances
We’ll run each candidate with PHP 8.2 and compare their timings. Since these are very small operations, we’ll need to run them 10 million times to get a significant time difference.
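The measurement script itself is not shown in this article, so the following is only a sketch of how such a timing loop could look. The helper name timeCandidate() and the choice of closures are assumptions; wrapping each expression in a closure adds call overhead, so absolute numbers will differ from the ones reported here.

```php
<?php
// Hypothetical timing helper: runs $candidate $iterations times
// and returns the elapsed wall-clock time in milliseconds.
function timeCandidate(callable $candidate, string $char, int $iterations): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds (PHP 7.3+)
    for ($i = 0; $i < $iterations; ++$i) {
        $candidate($char);
    }
    return (hrtime(true) - $start) / 1e6;
}

// Three of the candidates, wrapped in arrow functions (PHP 7.4+).
$candidates = [
    'or'         => fn(string $c): bool => $c == 'a' || $c == 'A',
    'in_array'   => fn(string $c): bool => in_array($c, ['a', 'A']),
    'strcasecmp' => fn(string $c): bool => strcasecmp($c, 'a') === 0,
];

// The article uses 10 million iterations; this is lowered to keep the run short.
$iterations = 100000;

foreach ($candidates as $name => $candidate) {
    printf("%-12s %8.2f ms\n", $name, timeCandidate($candidate, 'a', $iterations));
}
```

Passing 'C' instead of 'a' would time the case where nothing matches.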
Expression | Time (ms) |
---|---|
isset | 378 |
array_key_exists | 378 |
in_array | 380 |
match() | 383 |
or | 390 |
in_array strict | 391 |
ord(), single call and == | 454 |
strcasecmp | 422 |
stripos | 432 |
or strict | 450 |
ord(), single call and === | 454 |
chr() | 454 |
strtolower | 553 |
in_array variable | 686 |
in_array constant | 726 |
preg_match | 786 |
mb_strtolower | 1163 |
So, as anticipated, the gap between the best and the worst candidate is 785 ms over 10 million iterations, or roughly 79 nanoseconds per call. None of the solutions is really bad.
in_array(), isset() and or steal the show, with a very narrow margin over strcasecmp() and strtolower().
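Interestingly, strict comparison costs very little within in_array() (391 ms against 380 ms here, and it is even faster than the loose version in the per-version runs further down), while it is clearly slower when used with or (450 ms against 390 ms).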
in_array() is noticeably faster when the list of values is a literal rather than a variable or a constant (380 ms against 686 ms and 726 ms). This was a bit surprising, though PHP may have to fetch the array every time, unlike with a literal.
preg_match() is near the bottom of the list. It may not shine with 2 alternatives, but may rise to prominence with more options.
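As a sketch of how the regex candidate extends to more options, a character class can hold several letters in a single pattern. The vowel set and the isVowel() name are illustrative assumptions, and the \A and \z anchors are an addition so that only a one-character string matches.

```php
<?php
// Hypothetical example: case-insensitive test against five letters at once.
function isVowel(string $char): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('/\A[aeiou]\z/i', $char) === 1;
}

var_dump(isVowel('E')); // bool(true)
var_dump(isVowel('x')); // bool(false)
```

Adding a letter only grows the character class by one character, whereas the or expression grows by a full comparison each time.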
mb_strtolower() is much slower than the others, due to the multi-byte support.
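What that multi-byte support buys can be sketched with a non-ASCII letter. This example assumes UTF-8 strings and guards the mb_strtolower() call, since the mbstring extension is optional; strtolower() works byte by byte and, since PHP 8.2, only converts ASCII letters.

```php
<?php
// "É" takes two bytes in UTF-8, so it lies outside the ASCII range.
$upper = "\u{00C9}"; // É
$lower = "\u{00E9}"; // é

if (function_exists('mb_strtolower')) {
    // mbstring decodes the string as UTF-8 and maps É to é.
    var_dump(mb_strtolower($upper, 'UTF-8') === $lower); // bool(true)
}

// For plain ASCII letters, strtolower() is enough.
var_dump(strtolower('A') === 'a'); // bool(true)
```

For the single Latin letters tested in this article, the extra decoding work brings no benefit, which is consistent with the timing above.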
Performances when failing to find anything
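This is a complementary scenario: this time, the string C is used for the comparison, so no candidate finds a match. The results are very similar, with some variation. The or expression loses significant performance when nothing matches: assuming the success run tested a, the first comparison succeeds and the second one is skipped, whereas a failure always evaluates both.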
Expression | Success (ms) | Failure (ms) |
---|---|---|
isset | 378 | 405 |
in_array | 380 | 348 |
or | 390 | 626 |
in_array strict | 391 | 369 |
strcasecmp | 422 | 437 |
or strict | 450 | 670 |
strtolower | 553 | 741 |
in_array variable | 686 | 827 |
in_array constant | 726 | 760 |
preg_match | 786 | 722 |
mb_strtolower | 1163 | 1277 |
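The extra cost of or on failure is consistent with short-circuit evaluation. The sketch below counts how many comparisons the expression performs; countComparisons() is a hypothetical instrumentation helper, and it assumes the success run tested the value 'a'.

```php
<?php
// Hypothetical instrumentation: count how many comparisons the
// "or" expression evaluates for a given input.
function countComparisons(string $char): int
{
    $count = 0;
    $equals = function (string $a, string $b) use (&$count): bool {
        ++$count;
        return $a == $b;
    };

    // || short-circuits: the right operand only runs if the left one is false.
    $equals($char, 'a') || $equals($char, 'A');

    return $count;
}

echo countComparisons('a'), "\n"; // 1: the first test matches, the second is skipped
echo countComparisons('A'), "\n"; // 2
echo countComparisons('C'), "\n"; // 2: both tests run before giving up
```

With more alternatives in the chain, the gap between an early match and a miss would widen accordingly.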
Performances over PHP versions
Expression | PHP 8.3 | PHP 8.0 | PHP 7.3 |
---|---|---|---|
isset | 378 | 382 | 445 |
in_array strict | 355 | 386 | 419 |
strcasecmp | 407 | 425 | 521 |
in_array | 412 | 401 | 440 |
or | 541 | 600 | 657 |
in_array constant | 632 | 761 | 768 |
or strict | 660 | 661 | 694 |
strtolower | 661 | 777 | 808 |
in_array variable | 749 | 866 | 953 |
preg_match | 786 | 911 | 1502 |
mb_strtolower | 1103 | 1465 | 1844 |
The same tasks were run with 3 versions of PHP: 8.3, 8.0 and 7.3. The overall ranking of the solutions is the same. Yet it is visible that PHP performance is improving over the years, with a gain of 18 to 30%.
The fastest case insensitive char comparison
There are several interesting tips to learn here.
- in_array() with strict comparison is on par with the loose version, and faster in the per-version runs. Another good reason to use that parameter by default.
- in_array() is faster than a list of or, even with 2 elements.
- strcasecmp() is a nice PHP native function for this case.
- There are at least 11 ways to compare 2 characters in PHP.
My personal favorite is in_array(), as it is more readable, and its performance degrades less with the size of the array. I’m a bit disappointed that storing that same array in a remote container (variable, constant, property…) degrades its performance.