Designing REST APIs for long-running tasks

HTTP-based REST APIs have long been a standard for communication between programs. Beloved by programmers for their intuitive use and by administrators for relying solely on the common HTTP protocol, they are a great fit for exposing service APIs over the internet. But the simple request-response pattern quickly falls apart for long-running tasks.

A sample API service

To illustrate the design pattern, we will use a sample API service. This API lets users submit video files and converts them, for example for use on websites or into a compressed form for long-term storage. Depending on video size, a conversion can easily take anywhere from seconds to hours. For a service such as this, the typical request-response pattern of receiving the result immediately after making an HTTP request does not work.

To handle this use case, we opt for two endpoints instead:

/convert: Receives a video file and adds it to the internal queue. It returns a job_id, which can be used with the second endpoint. A job_id can be anything that uniquely identifies the queued job, for example an SQL primary key or a UUID.
/job/<job_id>: Is used to interact with the job, that is, to check its status, receive its result, or cancel it.

The service will use a queue internally to convert videos in the order they were received, which prevents resource exhaustion when too many requests arrive at once.
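
The internal queue described above could be sketched as follows. This is a minimal in-memory illustration, not a production design: the class name JobQueue, the convert callback, the job record fields, and the "failed" state are all assumptions made for this example, and a real service would likely use a persistent queue and a database.

```python
import queue
import threading
import uuid


class JobQueue:
    """Illustrative FIFO job queue processed by a single worker thread."""

    def __init__(self, convert):
        self._convert = convert        # hypothetical conversion function
        self._pending = queue.Queue()  # FIFO, so jobs run in submission order
        self._lock = threading.Lock()
        self.jobs = {}                 # job_id -> job record
        threading.Thread(target=self._work, daemon=True).start()

    def submit(self, video_path):
        """Register a job and return its job_id without waiting for it."""
        job_id = str(uuid.uuid4())
        with self._lock:
            self.jobs[job_id] = {"state": "queued", "input": video_path, "result": None}
        self._pending.put(job_id)
        return job_id

    def _work(self):
        # One worker means only one conversion runs at a time,
        # no matter how many requests arrive.
        while True:
            job_id = self._pending.get()
            with self._lock:
                job = self.jobs[job_id]
                job["state"] = "running"
            try:
                result = self._convert(job["input"])
                with self._lock:
                    job["result"] = result
                    job["state"] = "completed"
            except Exception as exc:  # assumed "failed" state, not from the article
                with self._lock:
                    job["result"] = str(exc)
                    job["state"] = "failed"
            finally:
                self._pending.task_done()

    def wait(self):
        """Block until every submitted job has been processed."""
        self._pending.join()
```

Because the queue has a single consumer, the number of simultaneous conversions is bounded by the number of worker threads rather than by the number of incoming requests.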

Adding a video to be converted

The first step is to get a video file queued for conversion. This is done by sending an HTTP POST request to the /convert endpoint. This request should contain the video file and possibly additional parameters, for example quality settings for the resulting video.

You should also consider adding a way to reach the user once conversion finishes, for example by letting them provide an email address or webhook, to notify them of the final result of the created job.

The endpoint should return an appropriate HTTP status code in response, for example 202 Accepted (the request was accepted, but processing has not finished yet) or 201 Created (a new job resource was created).
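
A framework-independent sketch of such a handler might look like this. The function name handle_convert, the in-memory JOBS store, the response fields, and the Location header pointing at the job are assumptions for illustration. The point is only that the handler registers the job and answers immediately instead of waiting for the conversion.

```python
import json
import uuid
from http import HTTPStatus

JOBS = {}  # hypothetical in-memory job store: job_id -> job record


def handle_convert(video_bytes, params=None):
    """Sketch of a POST /convert handler.

    Returns a (status, headers, body) tuple instead of talking to a real
    web framework, so the example stays self-contained.
    """
    if not video_bytes:
        error = json.dumps({"error": "no video file provided"})
        return HTTPStatus.BAD_REQUEST, {"Content-Type": "application/json"}, error

    job_id = str(uuid.uuid4())
    JOBS[job_id] = {
        "state": "queued",
        "params": params or {},  # e.g. quality settings
        "size": len(video_bytes),
    }
    headers = {
        "Content-Type": "application/json",
        "Location": f"/job/{job_id}",  # where the client can check on the job
    }
    body = json.dumps({"job_id": job_id, "state": "queued"})
    return HTTPStatus.ACCEPTED, headers, body
```

A client would read the job_id from the response body (or follow the Location header) and use it with the /job/<job_id> endpoint.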

Getting information about a queued job

To check the status of a queued job, users can issue an HTTP GET request to the /job/<job_id> endpoint with the job_id their job was assigned earlier. This endpoint should reflect both the current state of the job (queued, running, completed, etc.) and the final result of the job if it has already completed. The final result for our sample service could be a download URL for the converted output video file.

Completed jobs could also include more metadata, such as the creation date, the time needed to complete, and other metrics that might be of importance (or interest) to the user.
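
One possible shape for the status response is sketched below. The function name describe_job, the field names (created, started, finished, download_url, duration_seconds), and the sample data are assumptions for this example. The idea is that the state is always reported, while the result and timing metadata only appear once the job has completed.

```python
from datetime import datetime, timezone


def describe_job(jobs, job_id):
    """Sketch of a GET /job/<job_id> handler returning (status_code, payload)."""
    job = jobs.get(job_id)
    if job is None:
        return 404, {"error": "unknown job_id"}

    payload = {
        "job_id": job_id,
        "state": job["state"],
        "created": job["created"].isoformat(),
    }
    if job["state"] == "completed":
        # Result and timing metadata only exist for completed jobs.
        payload["download_url"] = job["download_url"]
        payload["duration_seconds"] = (job["finished"] - job["started"]).total_seconds()
    return 200, payload


# Hypothetical sample data: one completed job and one still waiting in the queue.
EXAMPLE_JOBS = {
    "j1": {
        "state": "completed",
        "created": datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc),
        "started": datetime(2024, 1, 1, 0, 0, 5, tzinfo=timezone.utc),
        "finished": datetime(2024, 1, 1, 0, 1, 5, tzinfo=timezone.utc),
        "download_url": "/files/j1.mp4",
    },
    "j2": {
        "state": "queued",
        "created": datetime(2024, 1, 1, 0, 0, 30, tzinfo=timezone.utc),
    },
}
```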

Interacting with jobs

Especially for long-running jobs, it is important to provide a way to cancel a job when the user no longer needs its result, because letting it run would tie up resources for a long time for no reason. While this is rarely a concern for REST APIs with short-lived requests, it is very important for those that run time-consuming tasks.

This can be facilitated by allowing users to send an HTTP DELETE request to the /job/<job_id> endpoint, which could have one of two effects:

If the job is currently queued or running, it will be cancelled and removed.
If the job has completed, it (and its output files) will be removed.

This gives users full control over how long their content is present on your API servers, while also enabling them to cancel jobs that are no longer required.
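
The two DELETE cases could be handled roughly like this. It is a sketch under stated assumptions: the function name delete_job, the cancel_requested flag (which a worker would need to check between processing steps), the remove_output callback, and the 204 and 404 status codes are illustrative choices, not something the pattern prescribes.

```python
def delete_job(jobs, job_id, remove_output=lambda job: None):
    """Sketch of a DELETE /job/<job_id> handler returning an HTTP status code.

    `jobs` maps job_id -> job record; `remove_output` is a hypothetical
    callback that deletes the converted files of a finished job.
    """
    job = jobs.get(job_id)
    if job is None:
        return 404  # nothing to cancel or remove

    if job["state"] in ("queued", "running"):
        # Assumption: the worker holds a reference to this record and
        # stops (or skips the job) once it sees the flag.
        job["cancel_requested"] = True
    else:
        # Job already finished: clean up its output files.
        remove_output(job)

    del jobs[job_id]
    return 204  # success, no response body needed
```

Either way the job disappears from the job store, so a later GET request for the same job_id would no longer find it.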