Web development in PHP abstracts a lot of the underlying networking and communication issues away from developers. This is great in most cases, but sometimes an application may want more control over when and how responses are sent. The output buffering mechanism is built into the PHP interpreter to satisfy those more advanced requirements.
What is output buffering?
When sending responses from a PHP script, the output is typically sent immediately. This is advantageous, as the script can quickly free up the memory needed to hold the data to be sent, so script execution takes less memory and data is sent as soon as possible. Output buffering puts a stop to this: it stores the entire response in memory, and sends it all at once when all the data is available.
Basic output buffering usage
Output buffering must be started explicitly using the ob_start() function. Every piece of output, for example from echo or print statements, will then be buffered. The buffered response is sent to the user once one of the flushing functions ob_flush(), ob_end_flush() or ob_get_flush() is called, or when the script finishes.
<?php
// start output buffering
ob_start();
// this will not be sent yet, but stored in the buffer
print("Hello world!");
// send buffered message
ob_flush();
?>
The buffered response can also be modified before it is sent. It can be read using ob_get_contents() or deleted with ob_clean(). This makes it possible to discard previous output entirely:
<?php
// start output buffering
ob_start();
// this will not be sent yet, but stored in the buffer
print("Hello world!");
// read currently buffered output
$previous_output = ob_get_contents();
// delete buffered output
ob_clean();
// write new output
print("I am the new content");
// strings are concatenated with the . operator in PHP
print("Previous output was " . $previous_output);
// send output buffer
ob_flush();
?>
Note that multiple calls to ob_start() will create nested output buffers, so you can safely use it at any point in the script, even if a previous buffer is still active.
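As a minimal sketch of how nesting can be put to use, the following example wraps ob_start() and ob_get_clean() in a helper that captures whatever a callback prints and returns it as a string. The helper name capture_output() is hypothetical and not part of PHP; only the buffering functions themselves are built in.

```php
<?php
// Hypothetical helper: run $callback and return everything it printed.
function capture_output(callable $callback): string
{
    // open a new buffer level; this nests if another buffer is already active
    ob_start();
    try {
        $callback();
    } catch (Throwable $e) {
        // discard only the level opened above, then pass the error on
        ob_end_clean();
        throw $e;
    }
    // return the contents of this level and close it
    return (string) ob_get_clean();
}

$outer = capture_output(function () {
    print("outer start, ");
    // a second buffer is opened while the first one is still active
    $inner = capture_output(function () {
        print("inner");
    });
    print("inner captured: " . strtoupper($inner));
});

// nothing has been sent so far; this sends the combined result
print($outer);
?>
```

Each call only touches the buffer level it opened itself, which is why the inner capture does not disturb the output already collected by the outer one.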
Performance considerations
The main benefit of output buffering is the additional control over the output, especially valuable in larger applications. While it provides a considerable benefit on a structural level, it is important to remember the drawbacks: the entire output is stored in memory until it is flushed. This can significantly increase resource usage - consider this example:
<?php
ob_start();
readfile("archive.zip");
ob_flush();
?>
The script uses readfile() to output the contents of the file archive.zip to the user. This can be problematic depending on the size of that file: if archive.zip is 15 GB in size, the script now needs 15 GB of memory for every user requesting it. If output buffering were off, readfile() could instead read and send the file in small chunks (8 KB by default), holding only one chunk in memory at a time, sending it, and repeating with the next chunk until the file is transferred.
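When a buffer may already be active, for example one started elsewhere in the application, the chunking can also be written out by hand. The following is a rough sketch rather than a drop-in solution: the function name stream_file() and its parameters are assumptions made for illustration, and the 8192-byte default simply mirrors the chunk size mentioned above.

```php
<?php
// Hypothetical helper: send a file in fixed-size chunks and return the
// number of bytes written, so only one chunk is held in memory at a time.
function stream_file(string $path, int $chunk_size = 8192): int
{
    $handle = fopen($path, "rb");
    if ($handle === false) {
        throw new RuntimeException("Cannot open " . $path);
    }
    $sent = 0;
    try {
        while (!feof($handle)) {
            $chunk = fread($handle, $chunk_size);
            if ($chunk === false || $chunk === "") {
                break;
            }
            print($chunk);
            $sent += strlen($chunk);
            // if a buffer is active, push this chunk out of the innermost level
            if (ob_get_level() > 0) {
                ob_flush();
            }
            // ask PHP to hand the data on to the web server
            flush();
        }
    } finally {
        fclose($handle);
    }
    return $sent;
}
?>
```

A call such as stream_file("archive.zip") would then replace the readfile() call in the example above. Note that ob_flush() only empties the innermost buffer, so with several nested buffers the data would still collect in the outer levels.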
HTTP Headers and output buffering
While output buffering will prevent premature sending of response contents, this does not affect headers. Consider this example:
<?php
print("Hello world");
header("Content-Type: text/plain");
?>
This script would result in the error Warning: Cannot modify header information - headers already sent by [...], and the header would not be set in our response. This is due to the nature of the HTTP protocol: it expects messages to begin with metadata, such as the HTTP status code indicating the success of the request, followed by headers that provide information about the type, size and language of the response data, among other things. After the headers comes a blank line followed by the actual response content. The HTTP protocol was not designed to provide headers out of order: once the response content starts, the response cannot switch back to providing more headers in between (this has been addressed to some extent in recent years through the addition of trailing headers, but we will ignore those for simplicity).
Output buffering can help mitigate this situation by ensuring no content is sent before all headers have been set:
<?php
// start output buffering
ob_start();
// is stored in buffer, not sent yet
print("Hello world");
// headers bypass the output buffer; this one is queued and goes out ahead of the body
header("Content-Type: text/plain");
// now the buffered print output gets sent, after the headers
ob_flush();
?>
Now that the output buffer has delayed the output of the print() call until after the header() call, the header is set before any content is sent, and the previous error no longer appears.
Preventing incomplete pages on errors
By default, an error in a PHP script will stop execution at the point it occurred, allowing a partial response to be sent to the user.
Consider this example:
<?php
// set custom exception handler
set_exception_handler(function($e){
print($e->getMessage());
});
// will be displayed
print("Hello ");
// simulate an error
throw new Exception("Something went wrong!");
// will not be displayed
print("World");
?>
The script returns the partial result Hello Something went wrong!, where the first print() call is visible to the user, but the second isn't. Output buffering is commonly used to mitigate this partial response, by buffering the page and replacing it with an error page if something goes wrong:
<?php
// set custom exception handler
set_exception_handler(function($e){
// turn off output buffering and delete buffered content
ob_end_clean();
print($e->getMessage());
});
// enable output buffering
ob_start();
// written to output buffer
print("Hello ");
// simulate an error
throw new Exception("Something went wrong!");
// cannot be reached because of error
print("World");
ob_flush();
?>
Now only the custom exception handler's output is shown, and the previous half-rendered page gets discarded.
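If buffers are nested, a single ob_end_clean() call only removes the innermost level, so half-rendered output could survive in the outer ones. The sketch below shows one possible way to handle this, under the assumption that the handler should discard every level above a chosen baseline. The helper name discard_buffers_down_to() is hypothetical, and setting a 500 status code is an illustrative choice rather than something the example above requires.

```php
<?php
// Hypothetical helper: discard all output buffer levels above $level.
function discard_buffers_down_to(int $level): void
{
    while (ob_get_level() > $level) {
        // stop if a buffer refuses to be removed, to avoid looping forever
        if (!ob_end_clean()) {
            break;
        }
    }
}

set_exception_handler(function (Throwable $e) {
    // throw away every buffered level, not just the innermost one
    discard_buffers_down_to(0);
    // the status code can only be changed while no headers have gone out
    if (!headers_sent()) {
        http_response_code(500);
    }
    print($e->getMessage());
});
?>
```

Passing a level other than 0 leaves the lower buffers untouched, which may be preferable when an outer layer of the application manages its own buffer.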
When used properly, output buffering can be a powerful tool, enabling developers to build more flexible and robust web applications. While its advantages are desirable, it is important to remember its drawbacks and turn it off for larger response contents to ensure scripts don't use system memory excessively.