Detecting dead code in PHP
PHP applications are under constant evolution. The code tends to grow bigger and more complex and, eventually, to collect dust: this is also known as dead code. Dead code is actual code that is deployed but never used in production. It is important to remove it.
Missing code is actually easy to spot: at some point, a part of the application will fail, and die hard, with an error message. On the other hand, superfluous code just sits idle, doing nothing. Unused code never reports bugs, and it may stay there for a long time.
The nasty effect of dead code is felt at the developer level: more code to read, more code to maintain, more clutter to wade through when a bug strikes somewhere. It also has an impact on PHP execution time and memory consumption, since PHP still has to compile any dead code that sits in a loaded file.
Here is a list of the types of dead code that are worth removing.
The comment
PHP code in comments happens quite often. When debugging, it is easy to comment code out. When attention shifts to another part of the code, those comments are left behind. They shouldn’t be there. The best is to remove them as soon as the debugging is done, and whenever they are spotted in the code. The VCS makes it possible to go back to previous code, so there is no reason to keep it twice.
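Spotting such comments can be partly automated. Here is a minimal sketch, built on PHP's tokenizer (token_get_all()). The function name findCommentedCode() and the regular expression that decides what "looks like code" are assumptions made for the illustration: the heuristic is crude, and will report some false positives.

```php
<?php
// Hypothetical helper: lists comments whose content looks like PHP code.
// Heuristic (an assumption): the comment ends with ";" and contains
// either a variable or something shaped like a function call.
function findCommentedCode(string $source): array
{
    $suspects = [];
    foreach (token_get_all($source) as $token) {
        // Docblocks (T_DOC_COMMENT) are skipped: they are documentation.
        if (!is_array($token) || $token[0] !== T_COMMENT) {
            continue;
        }
        // Strip the comment markers: //, #, /* and */
        $text = trim(preg_replace('~^(//|#|/\*)|\*/$~', '', trim($token[1])));
        if (preg_match('/\$\w+.*;\s*$|\b\w+\s*\(.*\)\s*;\s*$/s', $text)) {
            $suspects[] = ['line' => $token[2], 'comment' => $text];
        }
    }
    return $suspects;
}

$source = <<<'PHP'
<?php
// $total = compute($a, $b);
// Compute the total once the cart is loaded.
$total = 0;
/* var_dump($total); */
PHP;

foreach (findCommentedCode($source) as $suspect) {
    echo "line {$suspect['line']}: {$suspect['comment']}\n";
}
```

On this sample, the comments on lines 2 and 5 are reported, and the plain English comment on line 3 is left alone.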
The thoughtlessness
This kind of code is actually a coding error. It may take the following shape:
function x () {
return 1 ;
$a++ ;
}
The increment will never be reached, as the function has already returned.
This kind of situation happens with various keywords: return, break, continue, die, exit. It sometimes also happens with structures such as if or while, with constant values in the condition: this is a variation of the same problem. At a higher level, it also occurs after function calls or method calls, when the called code itself calls die or exit.
When all the code is at the same level, like in the above example, it may be cleaned easily: either remove the unreachable code, or remove the terminating instruction.
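The other variants are less visible than the first example. The sketch below shows one of each; all the function names (firstEven(), debugLevel(), abort(), readConfig()) are hypothetical and only serve the illustration.

```php
<?php
// Hypothetical functions, each showing one variant of unreachable code.

function firstEven(array $numbers): ?int
{
    foreach ($numbers as $number) {
        if ($number % 2 === 0) {
            return $number;
            break;              // unreachable: follows a return
        }
    }
    return null;
}

function debugLevel(): int
{
    if (false) {                // constant condition: the block never runs
        return 2;
    }
    return 0;
}

function abort(string $message): void
{
    fwrite(STDERR, $message . PHP_EOL);
    exit(1);
}

function readConfig(string $file): string
{
    if (!is_file($file)) {
        abort("Missing $file");
        return '';              // unreachable: abort() always exits
    }
    return file_get_contents($file);
}

var_dump(firstEven([3, 4, 5]));  // int(4)
var_dump(debugLevel());          // int(0)
```

PHP compiles and runs all of this without any warning, which is why these lines tend to survive for a long time.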
Obsolete definitions
Constants, functions, classes, interfaces or traits may be defined but not used. As the code evolves, some of those structures are created, used, then abandoned. The trick is that PHP emits no error for an unused definition.
Each of those structures has specific ways of being used, and all of them must be checked.
- Constants are directly used in code, or may be reached with the constant() function.
- Functions are called in the code. They may be dynamically called with call_user_func() (and similar), or with variables, as in $function().
- Traits are used in classes or in other traits, with the use keyword.
- Interfaces are implemented in classes, extended by other interfaces, or checked with instanceof.
- Classes are used in new, extends, catch structures, typehint structures, or instanceof calls. Some of those usages may also be dynamic, such as new or instanceof with a class name held in a variable.
To remove those structures, all the code has to be checked. If no occurrence of a structure is found, then it may be removed.
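For functions, this check may be sketched with the tokenizer. The helper below, findUnusedFunctions(), is hypothetical, and it relies on strong assumptions: unqualified function names, in the global namespace, and no methods in the audited code. To stay on the safe side, any string literal that matches a function name is counted as a usage, since it may be a callback given to call_user_func().

```php
<?php
// Hypothetical helper: reports functions that are declared but never mentioned again.
// Assumptions: unqualified names, global namespace, no methods in the audited sources.
function findUnusedFunctions(array $sources): array
{
    $declared = [];
    $mentioned = [];
    foreach ($sources as $source) {
        // Drop whitespace and comments so that neighbouring tokens are adjacent.
        $tokens = array_values(array_filter(
            token_get_all($source),
            fn ($t) => !is_array($t) || !in_array($t[0], [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)
        ));
        foreach ($tokens as $i => $token) {
            if (!is_array($token)) {
                continue;
            }
            $previous = $tokens[$i - 1] ?? null;
            if ($token[0] === T_STRING) {
                $name = strtolower($token[1]);   // function names are case-insensitive
                if (is_array($previous) && $previous[0] === T_FUNCTION) {
                    $declared[$name] = $token[1];
                } else {
                    $mentioned[$name] = true;
                }
            } elseif ($token[0] === T_CONSTANT_ENCAPSED_STRING) {
                // A string literal may be a callback name: count it as a usage.
                $mentioned[strtolower(trim($token[1], '\'"'))] = true;
            }
        }
    }
    return array_values(array_diff_key($declared, $mentioned));
}

$code = <<<'PHP'
<?php
function used() { return 1; }
function viaCallback() { return 2; }
function forgotten() { return 3; }
echo used() + call_user_func('viaCallback');
PHP;

print_r(findUnusedFunctions([$code]));
```

On this sample, only forgotten() is reported. Names built at execution time, such as $function() with a concatenated string, remain invisible to this kind of audit and still need a manual review.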
Class features
Class constants, properties and methods behave like the above structures, but they are defined and used within a limited scope. Finding this scope is difficult in PHP but, once it is found, the same tactics as in the previous section may be used.
Unused private methods only have to be unused by the class itself. Unused protected methods must not be used by the class, its parents or its child classes. Public methods have the same requirements, plus no usage on objects, from outside the class. The first two may be checked with static auditing. Public methods require a full application scan, and are more difficult to find. Too bad: PHP makes all methods public by default. This is definitely costly in the long run.
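The private case is the easiest one to audit, because the whole scope is the class itself. A rough sketch follows: findUnusedPrivateMethods() and the Invoice class are hypothetical, and the helper assumes that calls are written literally, as $this->name(), self::name() or static::name(). Dynamic calls and callables such as [$this, 'name'] are not detected.

```php
<?php
// Hypothetical helper: private methods of a single-class source that are never called.
// Assumption: calls are written literally, with -> or :: followed by the method name.
function findUnusedPrivateMethods(string $classSource): array
{
    preg_match_all('/\bprivate\s+(?:static\s+)?function\s+&?\s*(\w+)/i', $classSource, $matches);
    $unused = [];
    foreach ($matches[1] as $method) {
        $call = '/(?:->|::)\s*' . preg_quote($method, '/') . '\s*\(/i';
        if (!preg_match($call, $classSource)) {
            $unused[] = $method;
        }
    }
    return $unused;
}

$classSource = <<<'PHP'
<?php
class Invoice
{
    public function total(): int
    {
        return $this->subtotal() + 2;
    }

    private function subtotal(): int
    {
        return 10;
    }

    private function legacyDiscount(): int
    {
        return 0;
    }
}
PHP;

print_r(findUnusedPrivateMethods($classSource));
```

Here, legacyDiscount() is the only method reported. For protected methods, the same search would have to cover the parents and the child classes too.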
Unused inclusion
If the application relies on autoload, the classes will be loaded just in time. When PHP needs a class, as stated above in the Obsolete definitions section, it will call the registered autoloader to find it and load it. There is no waste of included code with this approach. An unused class is simply a class that is never mentioned in the code. This is a pretty straightforward process.
On the other hand, if files are included explicitly, we need to check whether any of the structures defined in that file are used. For example, tool libraries gather a large number of utility functions. If any one of those functions is used in the calling script, then the file is not dead code, and the inclusion must stay. Otherwise, it may be dropped. Cleaning those libraries must be done at the function level, as presented above.
Beware of inclusions that contain global code: such code is executed at inclusion time, and removing the inclusion, even when none of the defined structures are used, might break some code.
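To illustrate this trap, the sketch below writes a small, hypothetical library to a temporary file, then includes it. The slugify() function is never called, yet the inclusion still changes the behaviour of the script, because of the global code that sits next to the function.

```php
<?php
// Hypothetical library file, written to a temporary location for the demonstration.
$library = tempnam(sys_get_temp_dir(), 'lib');
file_put_contents($library, <<<'PHP'
<?php
// Global code: runs at inclusion time, whether or not slugify() is ever called.
date_default_timezone_set('UTC');

function slugify(string $title): string
{
    return strtolower(str_replace(' ', '-', $title));
}
PHP
);

date_default_timezone_set('Europe/Paris');
echo date_default_timezone_get(), PHP_EOL;   // Europe/Paris

require $library;                            // slugify() is never used below...
echo date_default_timezone_get(), PHP_EOL;   // UTC: ...but the global code has run

unlink($library);
```

Dropping the require because slugify() is unused would silently switch every date in the script back to another timezone. Before removing an inclusion, it is worth reading the file for anything that lives outside a function or a class.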
Never included files
Whole source files may also be dead code, simply because they are never included, nor directly called. It is easy to detect such files, as they end up empty when pruning obsolete classes or functions: when cleaning a file, if you’re removing too many functions or classes, you start to wonder whether the whole file is still needed.
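A complementary approach, which is not the pruning described above, is to record which files PHP actually loads, then to compare that record with the list of files in the project. The sketch below relies on get_included_files() and register_shutdown_function(); the helper names, the log location and the file paths are assumptions made for the illustration.

```php
<?php
// Hypothetical helper: appends the files loaded by the current request to a log,
// once the request is over. The log path is chosen by the caller.
function logIncludedFiles(string $logFile): void
{
    register_shutdown_function(function () use ($logFile) {
        file_put_contents($logFile, implode(PHP_EOL, get_included_files()) . PHP_EOL, FILE_APPEND);
    });
}

// Hypothetical helper: project files that never appear in the collected log.
function neverIncluded(array $projectFiles, array $loggedFiles): array
{
    return array_values(array_diff($projectFiles, array_unique($loggedFiles)));
}

$project = ['/app/index.php', '/app/lib/tools.php', '/app/lib/old_export.php'];
$logged  = ['/app/index.php', '/app/lib/tools.php', '/app/index.php'];
print_r(neverIncluded($project, $logged));
```

With those sample lists, only /app/lib/old_export.php is reported. Such a log only proves that a file was not loaded while the log was being collected: a file used once a year by a cron job will look dead for a long time, so the result is a list of suspects, not a verdict.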
Unit tests
Finally, when cleaning code, do not forget to remove the associated tests. It is a common pitfall: code is kept in the repository because it is used in another part of the application, and that part is… the unit tests! This ends up as a chicken-and-egg problem: unit tests need to call the class, so the class is used, and needs… more tests.
Finally
Removing dead code is a matter of daily hygiene. The most difficult part is to overcome the fear: fear of ridicule, which is the terrible reward for too much removal and too little checking. I like to clean code by little touches: when I spot some that looks dead, I review the above checklists and make a specific commit for it, so I can revert it quickly. I manage to remove 5 to 20 lines of code every day, in one commit. And it does feel as good as brushing one’s teeth!