Largest PHP applications
When testing the Exakat static analysis engine, I need to run it on real code. Open source projects are a real blessing there, since they come in all shapes and stripes. Some projects date back to PHP 3 and have evolved ever since, some are PHP 7.2 only; some are fully OOP, while others are fully functional; some apply the ‘East programming’ paradigm, others use the ‘bazaar way’. Some are in weird languages…
Nowadays, code bases tend to be smaller than older applications. Components are the norm, and they affect both how the application is developed and how it is extended. Usually, the application takes advantage of Composer and Packagist: large portions of the code are not developed internally. On the other hand, the application acts as a platform, providing hooks for modules and, sometimes, a marketplace.
For this survey, we collected 1885 open source applications and counted only their tokens. Tokens are the atomic elements of PHP that are needed to understand and run code. Comments, whitespace and delimiters were not counted, leaving only the useful tokens. The more tokens, the larger the application.
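To make the metric concrete, here is a minimal sketch of such a count using PHP's built-in tokenizer (`token_get_all()`). It is not the survey's actual implementation: the function name `countUsefulTokens` is hypothetical, and the list of ignored token types is an assumption, in particular reading "delimiters" as the opening and closing PHP tags.

```php
<?php
declare(strict_types=1);

/**
 * Count the "useful" tokens in a PHP source string.
 * Assumption: whitespace, comments, open/close tags and inline HTML
 * are skipped; the exact filter used for the survey may differ.
 */
function countUsefulTokens(string $source): int
{
    // Token types treated as noise (an illustrative choice).
    $ignored = [
        T_WHITESPACE,
        T_COMMENT,
        T_DOC_COMMENT,
        T_OPEN_TAG,
        T_CLOSE_TAG,
        T_INLINE_HTML,
    ];

    $count = 0;
    foreach (token_get_all($source) as $token) {
        // Named tokens are arrays [id, text, line]; single characters
        // such as ';' or '=' are returned as plain strings and are counted.
        if (is_array($token) && in_array($token[0], $ignored, true)) {
            continue;
        }
        $count++;
    }

    return $count;
}

// Counts $a, =, 1, +, 2 and ; but not the tag, spaces or comment.
echo countUsefulTokens('<?php $a = 1 + 2; // sum'), "\n"; // prints 6
```

Summing this count over every PHP file of a project, for instance by reading each file with `file_get_contents()`, would give a project-level figure comparable in spirit to the ones in the tables.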
# | Project | Tokens |
---|---|---|
1 | wikia | 15228275 |
2 | ilias | 13721259 |
3 | moodle | 13259355 |
4 | limesurvey | 12263127 |
5 | dolibarr | 11506248 |
6 | magento2 | 11252679 |
7 | x2crm | 8497433 |
8 | webtrees | 7467258 |
9 | xcart | 7272215 |
10 | openatrium | 6173802 |
Some interesting results
- Modern frameworks tend to be broken down into interoperable components, and are rarely published with everything included. This means they rank lower: laravel is #387, symfony4 #52, wordpress #230, cakephp #287.
- Drupal actually ranks #11, while openatrium, based on drupal, ranks #10.
- libphonenumber-for-php ranks high, thanks to its internal database of phone numbers and area codes. The arrays are counted as tokens, though there is actually little code.
- Static analysis tools are also part of this study: they are the worst code bases, since they include lots of weird code for testing purposes. Phan is #46, exakat is #280, and the ‘PHP vulnerability test suite’, a set of 42000 PHP scripts to test SCA, actually ranks #19.
- There are several repositories collecting PHP backdoors.
- There are a lot of unusual applications out there, including tools to manage a cemetery (OpenCimetiere), a church (OpenChurch), a farm (Tania), and a full JavaScript parser in PHP (js2php).
- The majority of published components have fewer than 25k tokens, and 75% of them have fewer than 250k tokens.
- Only the top 6 are beyond 10M tokens, though company code, including the underlying framework or platform, is usually near that size.
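The size distribution figures above (the majority under 25k tokens, 75% under 250k) are percentiles of the token counts. As a rough sketch of how such thresholds can be computed, the hypothetical helper below uses the nearest-rank method. The survey does not say which method it used, and the sample data is only the ten-project table at the top of this post, not the full set of 1885 projects.

```php
<?php
declare(strict_types=1);

/**
 * Nearest-rank percentile: the smallest value such that at least
 * $p percent of the values are less than or equal to it.
 */
function percentile(array $values, float $p): int
{
    if ($values === [] || $p <= 0 || $p > 100) {
        throw new InvalidArgumentException('Need a non-empty list and 0 < p <= 100.');
    }
    sort($values); // ascending order, list re-indexed from 0
    $rank = (int) ceil($p / 100 * count($values));

    return $values[$rank - 1];
}

// Token counts from the ten-project table above (illustration only).
$top10 = [
    15228275, 13721259, 13259355, 12263127, 11506248,
    11252679, 8497433, 7467258, 7272215, 6173802,
];

echo percentile($top10, 50), "\n"; // prints 11252679
echo percentile($top10, 75), "\n"; // prints 13259355
```

Run over the full list of project sizes, the same function would return the thresholds below which 50% and 75% of the projects fall.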
Top 100 largest PHP applications
# | Project | Tokens |
---|---|---|
1 | wikia | 15228275 |
2 | ilias | 13721259 |
3 | moodle | 13259355 |
4 | limesurvey | 12263127 |
5 | dolibarr | 11506248 |
6 | magento2 | 11252679 |
7 | x2crm | 8497433 |
8 | webtrees | 7467258 |
9 | xcart | 7272215 |
10 | iboxgento2 | 7024030 |
11 | openatrium | 6173802 |
12 | drupal | 5919413 |
13 | tcpdf | 5744856 |
14 | mediboard | 5413723 |
15 | horde | 5386885 |
16 | pmb | 5337785 |
17 | garp3 | 5276866 |
18 | tikiwiki | 5151860 |
19 | php-vulnerability-test-suite | 5054092 |
20 | epesi | 5026234 |
21 | yii | 4920417 |
22 | magento | 4668605 |
23 | sndtrack | 4605822 |
24 | tuleap | 4501781 |
25 | kaltura | 4315292 |
26 | efront | 3919036 |
27 | vtiger | 3816561 |
28 | libphonenumber-for-php | 3758145 |
29 | zurmo | 3608028 |
30 | suitecrm | 3588387 |
31 | inoerp | 3524053 |
32 | tableless | 3516734 |
33 | bitrix | 3496216 |
34 | matacms | 3482393 |
35 | typo3 | 3411981 |
36 | joomla | 3401841 |
37 | fengoffice | 3285536 |
38 | thelia | 3282726 |
39 | phabricator | 3247181 |
40 | civicrm | 3187209 |
41 | mediawiki | 3177478 |
42 | chamilo | 3150857 |
43 | aws-sdk-php | 3142650 |
44 | craft | 3133206 |
45 | groupoffice | 3060838 |
46 | phan | 2999688 |
47 | nukeviet | 2891856 |
48 | zf1 | 2779468 |
49 | openemr | 2742813 |
50 | opencart | 2661819 |
51 | precurio | 2507782 |
52 | symfony | 2499162 |
53 | revolution | 2434566 |
54 | discuz | 2430595 |
55 | b2evolution | 2378927 |
56 | melisa | 2376508 |
57 | geeklog | 2348603 |
58 | mahara | 2343338 |
59 | yetiforcecrm | 2338122 |
60 | phpt | 2324530 |
61 | fabereo | 2273026 |
62 | shopware | 2225386 |
63 | exponent | 2155793 |
64 | modocms | 2152250 |
65 | tine | 2128822 |
66 | ez | 2106443 |
67 | sonerezh | 2103011 |
68 | cerb | 2102416 |
69 | incomm | 2097245 |
70 | dachuwang | 2073564 |
71 | axiom | 2021678 |
72 | centurion | 2019055 |
73 | phpadsnew | 2018344 |
74 | tomatocart | 1990678 |
75 | wplib-box | 1985108 |
76 | absis | 1974195 |
77 | dokuwiki | 1963236 |
78 | prestashop | 1957719 |
79 | zftest | 1927880 |
80 | apigility | 1899026 |
81 | mautic | 1897530 |
82 | jaws | 1894371 |
83 | sugarcrm | 1891029 |
84 | pim | 1876572 |
85 | address-books | 1872943 |
86 | livehelperchat | 1856258 |
87 | roundcube | 1840959 |
88 | crew | 1805065 |
89 | nextcloud | 1801205 |
90 | sootherp | 1774907 |
91 | zf3 | 1766583 |
92 | phpbb | 1739167 |
93 | hubzilla | 1698190 |
94 | concrete5 | 1663313 |
95 | thirtybees | 1638854 |
96 | clarolineFull | 1625437 |
97 | edusoho | 1616253 |
98 | quizzmp | 1592375 |
99 | review | 1576658 |
100 | glpi | 1498311 |
Updates
Since the initial publication, the following updates have been made:
- concrete5 was introduced at rank #94
Any missing project?
Exakat needs to identify other large PHP projects so as to add them to the top 10. Suggestions may be sent to contact@exakat.io or to @exakat on Twitter!
The previous survey was done in 2016: “Largest PHP code bases”