Auditing code is really just a fancy word for reading code. Sometimes I even wonder whether programmers read code more often than they write it. A good piece of code is written once and seldom changed afterwards. A bad piece of code is reviewed, re-read for debugging, and sometimes forgotten: this is when we spend far more time than we want trying to understand what the hell the original author tried to do. Even when the author is oneself.
Proofreading code means that the reader checks as many things as possible. It’s very difficult to stay focused while doing so, because there are so many aspects of the code to take into account. Let’s make a short list of them, knowing it will never be complete:
- Coding conventions
- PHP recommendations
- Framework guides
- Security
- Performance
- Algorithms
- Business rules
Each of those subjects contains lots of rules. Sometimes they are written down in a reference document, and they may even be tooled: for example, Symfony and Zend Framework have coding conventions and use PHP_CodeSniffer to check them. The PHP manual has a lot of warnings and recommendations, but they are scattered all over the manual, and there is no tool to check them.
This lack of an explicit reference leads to an ever-rising level of entropy in the code. At the beginning, the project is small enough to allow for review. Later, with an ever larger code base and more people contributing, reviewing tends to get long and boring. It is both a necessary task and a chore.
Static auditing provides soothing help with this problem. Static means that the code is not run, and as such, it is not a competitor to unit testing. The code is read just as you and I would read it, right from the source. And this is an automated process, so the auditor never gets bored doing it. It is now possible to run it as often as needed.
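To make "reading without running" concrete, here is a minimal sketch built on PHP's own tokenizer, `token_get_all()`. The helper name `findCalls()` and the sample source are hypothetical, chosen only for illustration. The check is deliberately naive: it would also match method calls and function definitions with the same name.

```php
<?php
// Hypothetical helper: list the lines where a given function name is
// followed by "(". The source is only tokenized, never executed.
function findCalls(string $source, string $functionName): array
{
    $tokens = token_get_all($source);
    $count  = count($tokens);
    $lines  = [];

    foreach ($tokens as $i => $token) {
        // Named tokens are arrays [id, text, line]; punctuation is a plain string.
        if (!is_array($token) || $token[0] !== T_STRING) {
            continue;
        }
        if (strcasecmp($token[1], $functionName) !== 0) {
            continue;
        }
        // Skip whitespace to reach the next meaningful token.
        $j = $i + 1;
        while ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_WHITESPACE) {
            $j++;
        }
        if ($j < $count && $tokens[$j] === '(') {
            $lines[] = $token[2];
        }
    }

    return $lines;
}

// Sample code to audit (assumption: any PHP source string works here).
$code = <<<'CODE'
<?php
$user = load();
var_dump($user);
// var_dump() in a comment is not a call
echo 'var_dump() in a string is not a call';
CODE;

$found = findCalls($code, 'var_dump'); // only line 3 is reported
```

Because the tokenizer knows the difference between code, comments and strings, only the real call is found, which a plain text search could not guarantee.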
It also means that we can audit much larger code bases, as the auditor works without losing its edge even on ten million lines. Such applications are probably the largest in the PHP world; nevertheless, I like the idea that every line of code gets the same treatment. Maybe this is auditing neutrality, or simply a way to avoid human bias: the etc folder? Nah, there won’t be code in there...
The auditor will also tirelessly apply a large number of analyses to the code. Once experience has been acquired, it is registered in the auditor’s library and may be reused the next time. This way, the same problem won’t slip through again in the same circumstances. Even if the situation is a rather rare one, it is possible to apply previous experience to any new code, or to a separate project, and check whether the same situation arises again.
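The idea of a library of analyses can be sketched as a simple array of named rules, each one a callable that receives the tokens and returns the offending lines. Everything here (`tokenRule()`, `audit()`, the two rule names) is hypothetical and only illustrates the principle; a real auditor would hold far richer rules.

```php
<?php
// Hypothetical rule factory: report every line where a given token appears.
function tokenRule(int $tokenId): callable
{
    return function (array $tokens) use ($tokenId): array {
        $lines = [];
        foreach ($tokens as $token) {
            if (is_array($token) && $token[0] === $tokenId) {
                $lines[] = $token[2];
            }
        }
        return $lines;
    };
}

// The "library": experience acquired once, reusable on any code base.
$library = [
    'eval() is used'         => tokenRule(T_EVAL),
    'global keyword is used' => tokenRule(T_GLOBAL),
];

// Apply every rule of the library to one source string.
function audit(string $source, array $library): array
{
    $tokens = token_get_all($source);
    $report = [];
    foreach ($library as $name => $rule) {
        $lines = $rule($tokens);
        if ($lines !== []) {
            $report[$name] = $lines;
        }
    }
    return $report;
}

// Sample code to audit (assumption: illustrative only).
$code = <<<'CODE'
<?php
function load() {
    global $db;
    return eval($db);
}
CODE;

$report = audit($code, $library);
```

Adding a new lesson learned is one more entry in `$library`, and the next project gets it for free.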
The last advantage, which I tremendously appreciate, is its summarizing capability. We mentioned earlier that code bases tend to grow without limit. Running analyses on such a code base makes sense, both for the level of detail they reach and for the aggregated data they gather for us. phpinfo() is one of the most popular PHP functions, and it is time we had an ‘appinfo()’ function that collects various aspects of the application and turns them into a summary. It is useful for devops to know which configs are expected or which extensions are needed, and it is also good for programmers to realize which PHP features are used. Making code simpler to understand is definitely a major advantage.
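No such ‘appinfo()’ exists in PHP; the following is only a rough sketch of what one small part of it might look like, assuming PHP 7.4 or later. It counts the native function calls found in a source string and groups them by the extension that provides them, using `ReflectionFunction::getExtensionName()`. One loud assumption: names are resolved against the extensions loaded on the machine running the audit, so functions from extensions missing there are silently ignored, as are namespaced calls.

```php
<?php
// Hypothetical appinfo(): which extensions do the called functions come from?
function appinfo(string $source): array
{
    // Drop whitespace so that neighbours in the array are meaningful tokens.
    $tokens = array_values(array_filter(
        token_get_all($source),
        fn($t) => !is_array($t) || $t[0] !== T_WHITESPACE
    ));

    // A name after one of these is a method, a definition or a class, not a function call.
    $skipAfter  = [T_OBJECT_OPERATOR, T_DOUBLE_COLON, T_FUNCTION, T_NEW];
    $extensions = [];

    foreach ($tokens as $i => $token) {
        $isName = is_array($token) && $token[0] === T_STRING;
        $isCall = $isName && ($tokens[$i + 1] ?? null) === '(';
        $prev   = $tokens[$i - 1] ?? null;
        $prevId = is_array($prev) ? $prev[0] : null;

        if (!$isCall || in_array($prevId, $skipAfter, true)) {
            continue;
        }
        if (!function_exists($token[1])) {
            continue; // unknown on this machine: cannot be attributed
        }
        $extension = (new ReflectionFunction($token[1]))->getExtensionName();
        if ($extension !== false) { // false means a user-defined function
            $extensions[$extension] = ($extensions[$extension] ?? 0) + 1;
        }
    }

    ksort($extensions);
    return ['extensions' => $extensions];
}

// Sample code to summarize (assumption: illustrative only).
$code = <<<'CODE'
<?php
$name = trim(str_replace('-', '_', $raw));
echo strlen($name);
$obj->strlen();
CODE;

$info = appinfo($code);
```

Here `trim()` and `str_replace()` are attributed to the standard extension and `strlen()` to Core, while the `$obj->strlen()` method call is left out. The same loop could be extended to collect configuration directives or language features.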