Netcraft began its Web Server Survey in 1995 and has tracked the deployment of a wide range of scripting technologies across the web since 2001. One such technology is PHP, which Netcraft presently finds on well over 200 million websites.
The first version of PHP was named Personal Home Page Tools (PHP Tools) when it was released by Rasmus Lerdorf in 1995. PHP 1 can still be downloaded today from museum.php.net. Weighing in at only 26 kilobytes in size, php-108.tar.gz is diminutive by today’s standards, yet it was capable of allowing users to implement guestbooks and other form-processing applications.
PHP 2 introduced built-in support for accessing databases, cookie handling, and user-defined functions. It was released in 1997, and by the following year, around 1% of sites on the internet were using PHP.
However, PHP 3 was the first release to closely resemble today’s incarnation of PHP. A rewrite of the underlying parser by Andi Gutmans and Zeev Suraski led to what was arguably a different language; accordingly, it was renamed to simply PHP, which was a recursive acronym for "PHP: Hypertext Preprocessor". This was released in 1998 and the ease of extending the language played a large part in its tremendous success, as this aspect attracted dozens of developers to submit a variety of modules.
Andi Gutmans and Zeev Suraski continued to rewrite PHP’s core, primarily to improve performance and increase the modularity of the codebase. This led to the creation of the Zend Engine, which was used by PHP 4 when it was released in 2000. As well as offering better performance, PHP 4 could be used with more web servers, supported HTTP sessions, output buffering and several new language constructs.
By September 2001, Netcraft’s Web Server Survey found 1.8M sites running PHP.
PHP 5 was released in 2004, and remains the most recent major version release today (5.4.11 was released on 17 January 2013). Zend Engine 2.0 forms the core of this release.
By January 2013, PHP was being used by a remarkable 244M sites, meaning that 39% of sites in Netcraft’s Web Server Survey were running PHP. Of sites that run PHP, 78% are served from Linux computers, followed by 8% on FreeBSD. Precompiled Windows binaries can also be downloaded from windows.php.net, which has helped Windows account for over 7% of PHP sites.
Popular web applications that use PHP include content management systems such as WordPress, Joomla and Drupal, along with several popular ecommerce solutions like Zencart, osCommerce and Magento. In January 2013, these six applications alone were found running on a total of 32M sites worldwide.
PHP also demonstrates a strong installation base across web-facing computers that are found as part of Netcraft’s Computer Counting survey. Just as an individual IP address is capable of hosting many websites, an individual computer can also be configured to have multiple IP addresses. This survey allows us to identify unique web-facing computers and which operating systems they use regardless of how many sites or IP addresses they have. As of January 2013, 2.1M out of 4.3M web-facing computers are running PHP.
PHP has also become a victim of its own success in some respects: With so many servers running PHP, and with so many different web applications authored in PHP, hackers are presented with a huge and rather attractive attack surface. Because it is so easy to get started with programming in PHP, it attracts all levels of developers, many of whom may produce insecure applications through lack of experience and attention to detail. Netcraft’s anti-phishing services find wave upon wave of phishing attacks hosted on compromised PHP applications, and the U.S. NVD (National Vulnerability Database) contains several thousand unique vulnerabilities that relate either to PHP itself, or to applications written in PHP.
Methodology
The full list of hostnames from the Netcraft Web Server Survey forms the basis of our technology tracking. We make requests to each of these sites, or if there is a large number of sites hosted on a single IP address, we employ a proportional sampling technique. The content of each page and its HTTP headers are analysed to determine which technologies are being used. For PHP, we look for references to .php filename extensions or the existence of HTTP response headers like "X-Powered-By: PHP". Additional signature tests are used to identify particular PHP applications, such as WordPress.
Each metric is then calculated as follows:
Hostnames
For each IP address, we estimate the total number of PHP sites it serves by calculating the product of the proportion of sampled hostnames that are running PHP and the total number of hostnames on that IP address. In cases where the IP address is serving 100 or fewer sites, all sites will be sampled and thus be representative of the entire population for that IP address.
Active sites
To provide a more meaningful metric which counts the number of human-generated sites actively using PHP, our active site count excludes spam sites or other computer-generated content. This methodology is described in more detail here.
IP addresses
This metric counts the number of unique IP addresses where at least one hostname in its sample set was found to be running PHP.
Computers
A single physical or virtual computer may have more than one IP address. We are able to identify unique computers that are exposed to the internet via multiple IP addresses. If an IP address is running PHP, then the computer associated with it is marked as running PHP. Further details of this methodology are explained in our Hosting Provider Server Count.