Downloading files is an essential feature of the modern web. Although it may appear simple at first glance, implementing file downloads involves a lot of unexpected complexity once you support the features users expect today, such as resumable downloads and forcing the browser to download a file rather than display it.
Basic file serving
To simply serve a file from the server to the user, we need several things: the size of the file, the type of the file's contents and the file contents themselves.
<?php
function download_file(string $file_path) {
    header('Content-Type: ' . mime_content_type($file_path));
    header('Content-Length: ' . filesize($file_path));
    echo file_get_contents($file_path);
}
download_file("archive.zip");
To make usage easier, we package all the downloading logic into a function download_file(), so that every time we need to offer file downloads, we can simply call this function with the path of the file to download. While the current function will serve the file as expected, it has a problem: file_get_contents() will read the entire file into memory. For large files, that will quickly lead to memory exhaustion (i.e. going over the memory limit configured for the server, or using more memory than the server has).
To remedy this, we need to read the file in chunks and send each chunk to the user before reading the next one:
<?php
function download_file(string $file_path) {
    header('Content-Type: ' . mime_content_type($file_path));
    header('Content-Length: ' . filesize($file_path));
    $file = fopen($file_path, 'rb');
    while (!feof($file)) {
        echo fread($file, 8192);
        flush();
    }
    fclose($file);
}
download_file("archive.zip");
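The new version opens the file in binary read mode, then reads and sends 8 KB chunks to the client until the end of the file is reached. With this code in place, the script holds no more than about 8 KB of file data in memory at a time, no matter how large the file is, provided PHP's output buffering is not collecting the response (flush() does not empty PHP's own output buffers).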
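The chunked loop does not depend on where the bytes end up, so it can be tried out without a web server. The following is a minimal sketch, assuming a hypothetical helper copy_in_chunks() that is not part of the article's download_file(); it copies between two stream resources and reports how many bytes were written.

```php
<?php
// Hypothetical helper (an assumption for illustration, not the article's function):
// copies $in to $out in fixed-size chunks and returns the number of bytes written.
function copy_in_chunks($in, $out, int $chunk_size = 8192): int {
    $written = 0;
    while (!feof($in)) {
        $chunk = fread($in, $chunk_size);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing left to read
        }
        fwrite($out, $chunk);
        $written += strlen($chunk);
    }
    return $written;
}

// Usage: 20000 bytes pass through a read buffer of at most 8192 bytes.
$in = fopen('php://memory', 'w+b');
fwrite($in, str_repeat('x', 20000));
rewind($in);
$out = fopen('php://memory', 'w+b');
$copied = copy_in_chunks($in, $out);
```

In a web context, $out could be fopen('php://output', 'wb'). If output buffering is active, ending the buffers first (for example with ob_end_clean() in a loop while ob_get_level() is greater than zero) keeps the chunks from accumulating in memory anyway.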
Forcing file downloads and custom file names
The basic version works for presenting the file to the user, but it won't necessarily behave as a download: files that can't be displayed in a browser will trigger a download, but files that can be displayed in the browser, like text or images, will be displayed instead. In order to force every file to behave as a download, we need to send the Content-Disposition: attachment HTTP header with the response. This header can also contain a second part: a name for the downloaded file. Without this, the browser will either name the downloaded file after the URL path it was served from or give it a generated name.
<?php
function download_file(string $file_path, string $file_name) {
    header('Content-Type: ' . mime_content_type($file_path));
    header('Content-Length: ' . filesize($file_path));
    header('Content-Disposition: attachment; filename="' . $file_name . '"');
    $file = fopen($file_path, 'rb');
    while (!feof($file)) {
        echo fread($file, 8192);
        flush();
    }
    fclose($file);
}
download_file("archive.zip", "download.zip");
Since we added the Content-Disposition header, we also added a parameter $file_name to the download_file() function to allow providing a custom name for the file download. With this little change, the browser saves every file as a download under the name we provide.
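Because $file_name is placed inside a quoted header value, a name containing quotes, line breaks or non-ASCII characters can produce a malformed header. The following is a minimal sketch of one way to build the value, assuming a hypothetical helper content_disposition() that is not part of the function above. It emits a conservative ASCII fallback plus the UTF-8 filename* form described in RFC 6266.

```php
<?php
// Hypothetical helper (an assumption for illustration): builds the value of a
// Content-Disposition header for a forced download.
function content_disposition(string $file_name): string {
    // Remove control characters, including CR and LF.
    $clean = preg_replace('/[\x00-\x1F\x7F]/', '', $file_name);
    // ASCII fallback: anything outside a small whitelist becomes an underscore.
    $fallback = preg_replace('/[^A-Za-z0-9._ -]/', '_', $clean);
    // filename* carries the percent-encoded UTF-8 name for browsers that support it.
    return 'attachment; filename="' . $fallback . '"'
        . "; filename*=UTF-8''" . rawurlencode($clean);
}

$value = content_disposition('report "final".pdf');
// $value is: attachment; filename="report _final_.pdf"; filename*=UTF-8''report%20%22final%22.pdf
```

Inside download_file(), the result could be sent with header('Content-Disposition: ' . content_disposition($file_name)).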
Supporting resumable downloads
Users downloading files on the modern web intuitively expect downloads to be resumable in case they need to pause one or a network issue interrupts it. To support this feature on the server side, the code needs to properly handle HTTP range requests. Simply speaking, a range request asks for only a specific range of the file's bytes, rather than the entire file.
Supporting this feature requires multiple steps. First, the server needs to send the Accept-Ranges: bytes HTTP header to advertise that partial requests are supported; without it, browsers may disable resuming on the assumption that the server does not support it. For requests that contain a Range HTTP header, the server then needs to respond with a 206 Partial Content status code and a Content-Range header indicating which part of the file is being returned. If the requested range of bytes cannot be served, for example because it starts beyond the end of the file, the status 416 Range Not Satisfiable needs to be returned.
<?php
function download_file(string $file_path, string $file_name) {
    $file_size = filesize($file_path);
    $mime_type = mime_content_type($file_path);
    if (!isset($_SERVER['HTTP_RANGE'])) {
        // no range, serve the whole file
        header('Accept-Ranges: bytes');
        header('Content-Type: ' . $mime_type);
        header('Content-Length: ' . $file_size);
        header('Content-Disposition: attachment; filename="' . $file_name . '"');
        $file = fopen($file_path, 'rb');
        while (!feof($file)) {
            echo fread($file, 8192);
            flush();
        }
        fclose($file);
        return;
    }
    // range request, serve only the first requested range
    list($size_unit, $range_orig) = array_pad(explode('=', $_SERVER['HTTP_RANGE'], 2), 2, '');
    $range = trim(explode(',', $range_orig, 2)[0]);
    // accepted forms: "500-999", "500-" (from byte 500 to the end), "-500" (last 500 bytes)
    $valid = trim($size_unit) === 'bytes'
        && preg_match('/^(\d*)-(\d*)$/', $range, $m) === 1
        && ($m[1] !== '' || $m[2] !== '');
    $range_start = 0;
    $range_end = $file_size - 1;
    if ($valid && $m[1] === '') {
        // suffix form: count backwards from the end of the file
        $range_start = max(0, $file_size - intval($m[2]));
    } elseif ($valid) {
        $range_start = intval($m[1]);
        if ($m[2] !== '') {
            // an end past the last byte is clamped to the last byte
            $range_end = min(intval($m[2]), $file_size - 1);
        }
    }
    // check if range is valid (a single byte, start == end, is allowed)
    if (!$valid || $range_start > $range_end) {
        header('HTTP/1.1 416 Range Not Satisfiable');
        header('Content-Range: bytes */' . $file_size);
        return;
    }
    // serve requested range
    $remaining = $range_end - $range_start + 1;
    header('HTTP/1.1 206 Partial Content');
    header('Accept-Ranges: bytes');
    header('Content-Type: ' . $mime_type);
    header('Content-Length: ' . $remaining);
    header('Content-Disposition: attachment; filename="' . $file_name . '"');
    header('Content-Range: bytes ' . $range_start . '-' . $range_end . '/' . $file_size);
    $file = fopen($file_path, 'rb');
    fseek($file, $range_start);
    // never send more bytes than the range contains
    while ($remaining > 0 && !feof($file)) {
        $chunk = fread($file, min(8192, $remaining));
        if ($chunk === false || $chunk === '') {
            break;
        }
        echo $chunk;
        flush();
        $remaining -= strlen($chunk);
    }
    fclose($file);
}
download_file("archive.zip", "download.zip");
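A single byte range can take three forms: an explicit start and end, a start only (everything from that byte onward, which is what a resumed download typically sends), and a suffix length (the last N bytes). The following is a minimal sketch of resolving these against a file size, assuming a hypothetical helper parse_byte_range() that is separate from download_file() and only looks at the first range in the header.

```php
<?php
// Hypothetical helper (an assumption for illustration): resolves a Range header
// against a file size. Returns [first_byte, last_byte] (both inclusive),
// or null if the range cannot be served.
function parse_byte_range(string $header, int $file_size): ?array {
    if (!preg_match('/^bytes=\s*(\d*)-(\d*)\s*(,|$)/', trim($header), $m)) {
        return null; // unknown unit or malformed range
    }
    [, $first, $last] = $m;
    if ($first === '' && $last === '') {
        return null; // "bytes=-" names no bytes at all
    }
    if ($first === '') {
        // suffix form "-N": the last N bytes of the file
        $start = max(0, $file_size - (int) $last);
        $end = $file_size - 1;
    } else {
        $start = (int) $first;
        // open-ended "N-" runs to the last byte; an oversized end is clamped
        $end = $last === '' ? $file_size - 1 : min((int) $last, $file_size - 1);
    }
    return $start <= $end ? [$start, $end] : null;
}

// Usage with an assumed file size of 1000 bytes:
$a = parse_byte_range('bytes=0-499', 1000);  // [0, 499]
$b = parse_byte_range('bytes=500-', 1000);   // [500, 999]
$c = parse_byte_range('bytes=-200', 1000);   // [800, 999]
$d = parse_byte_range('bytes=1000-', 1000);  // null, starts past the end
```

A null result corresponds to the 416 response; any other result gives the two numbers needed for the Content-Range header, and the difference plus one gives the Content-Length.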
The final function is several times larger than the tiny snippet we started out with, because serving downloads involves a lot of complexity that is not obvious to a developer writing it for the first time. The function could be further improved by throwing exceptions instead of quietly returning on errors, but it is functionally complete and ready to serve any kind and size of file, with support for resuming failed or paused downloads from a browser or download manager.
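As a rough illustration of the exception-based variant, the following sketch validates the file before any header is sent. The helper name assert_downloadable() and the example path are hypothetical, and the choice of RuntimeException is an assumption, not something the function above prescribes.

```php
<?php
// Hypothetical guard (an assumption for illustration): checks the path up front
// and throws instead of failing silently. Returns the file size on success.
function assert_downloadable(string $file_path): int {
    if (!is_file($file_path) || !is_readable($file_path)) {
        throw new RuntimeException('File is missing or not readable: ' . $file_path);
    }
    $file_size = filesize($file_path);
    if ($file_size === false) {
        throw new RuntimeException('Could not determine size of: ' . $file_path);
    }
    return $file_size;
}

// Usage with a path that is assumed not to exist:
$error = null;
try {
    $size = assert_downloadable('/nonexistent/archive.zip');
} catch (RuntimeException $e) {
    $error = $e->getMessage(); // a real handler might respond with a 404 here
}
```

Running the check before the first header() call leaves the caller free to send an error status, since no part of the download response has been emitted yet.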
As a last note, this functionality can also be exploited: a download manager may download a file with multiple range requests at once, possibly circumventing bandwidth limits if the server is not configured to account for parallel connections.
Getting downloads right in a modern web environment can be tricky and is often underestimated by developers. It is crucial to make time to understand the parts of HTTP that each feature relies on, or to fall back on tried and tested libraries if this is not feasible.