Encoding videos for the modern web using ffmpeg

About video codecs and containers

Video files consist of three separate components that define how data is stored and decoded:

Container format (for example mp4 or webm): Stores the video and audio streams, along with extras such as subtitles and metadata.
Video stream: The data of the video itself (just visual frames, no audio), encoded using a video codec like h264 or vp9.
Audio stream: The audio data encoded using an audio codec like aac or opus.

A single video file will be a specific combination of these 3 components. For web videos, we typically have vp9-encoded video streams with opus-encoded audio streams in a webm container, and h264 video streams with aac audio in mp4 containers as a fallback for older devices.
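
To check which combination a given file actually uses, ffprobe (shipped with ffmpeg) can print the container and the codec of each stream. This is a minimal sketch; the function name list_streams is an illustrative helper, not part of ffmpeg:

```shell
# Hypothetical helper: print the container format and the codec of each stream.
list_streams() {
  ffprobe -v error \
    -show_entries format=format_name:stream=index,codec_type,codec_name \
    -of default=noprint_wrappers=1 "$1"
}
```

Calling list_streams on a web-ready webm file should report a vp9 video stream and an opus audio stream.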

Encoding a video as WEBM

Our primary video format is VP9 video with Opus audio in a WebM container. It is recommended to encode VP9 using two passes, which enables some quality enhancements not available in single-pass mode. For most cases, you will want to encode video using VP9's Constant Quality encoding mode:

ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 0 -crf 33 -pass 1 -row-mt 1 -an -f null /dev/null
ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 0 -crf 33 -pass 2 -row-mt 1 -threads 16 -speed 2 -tile-columns 2 -tile-rows 2 -frame-parallel 1 -g 240 -auto-alt-ref 1 -lag-in-frames 25 -deadline good -c:a libopus -b:a 64k -f webm output.webm

The options explained (second pass only):

-i input.mp4: your input file
-c:v libvpx-vp9: encode video as VP9 using libvpx
-b:v 0: turn off video bitrate limiting to allow VP9 to vary the bitrate in order to maintain the target quality level
-crf 33: Constant rate factor (aka "Quality") setting. Valid values are 0-63, sane values are 15-35; 31 is recommended for 1080p video.
-row-mt 1: Enable row-based multithreading
-threads 16: Use up to 16 threads for encoding (4 tile rows * 4 tile columns)
-speed 2: Encoding speed from -16 to 16; higher is faster with lower quality. 1 is a good compromise between quality and speed, as its output is very close to 0 but much faster to produce. For resolutions of 720p or higher, use 2 (as in the command above), since the quality gain of 1 is not visible to the human eye at those sizes.
-tile-columns 2: Log2 number of VP9 tile columns to use from -1 to 6 (careful with log2 formatted values: 1 means 2 columns, 2 means 4 columns!). Values >0 and threads>1 enable multithreaded encoding. Maximum value for 1920x1080 video files is 2, larger horizontal resolutions allow higher column counts.
-tile-rows 2: Log2 number of VP9 tile rows to use from -1 to 2 (careful with log2 formatted values: 1 means 2 rows, 2 means 4 rows!). Maximum number of rows is 4 independent of video size.
-frame-parallel 1: Enables parallel frame decoding
-g 240: Number of frames allowed between keyframes. Larger values allow for more efficient placement of keyframes resulting in better quality.
-auto-alt-ref 1: Enables use of alternative reference frames (two-pass optimization feature)
-lag-in-frames 25: Number of frames to look ahead for alternative ref frames (see above)
-deadline good: Amount of time to spend encoding, affects quality and encoding time; values are good, best, realtime
-c:a libopus: encode audio as opus using libopus
-b:a 64k: set audio encoding target bitrate to 64k
-f webm: output to webm container format
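
Because the tile options are log2 values, it can help to compute the real counts before picking a value for -threads. The sketch below does that arithmetic in plain shell. Both function names are made up for illustration, and the column limit assumes VP9's minimum tile width of 256 pixels (four 64-pixel superblocks), which is consistent with the maximum of 2 for 1920-pixel-wide video mentioned above:

```shell
# Hypothetical helper: number of tiles (and useful threads) for the given
# log2 tile settings, e.g. -tile-columns 2 -tile-rows 2.
tile_threads() {
  cols_log2=$1
  rows_log2=$2
  echo $(( (1 << cols_log2) * (1 << rows_log2) ))
}

# Hypothetical helper: largest -tile-columns value for a given video width.
# Assumption: every tile must be at least 4 superblocks (4 * 64 px) wide.
max_tile_columns_log2() {
  width=$1
  sb_cols=$(( (width + 63) / 64 ))   # width in 64-pixel superblocks, rounded up
  log2=0
  while [ $(( sb_cols >> (log2 + 1) )) -ge 4 ]; do
    log2=$(( log2 + 1 ))
  done
  echo "$log2"
}

tile_threads 2 2            # prints 16
max_tile_columns_log2 1920  # prints 2
```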

If you need to stay below a specific bitrate at the expense of quality, you can use the VP9 Constrained Quality mode by setting a specific bitrate target for -b:v. If you are not sure whether you need this mode, then you likely don't.

ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 2M -crf 33 -pass 1 -an -f null /dev/null
ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 2M -crf 33 -pass 2 -threads 8 -speed 2 -tile-columns 6 -frame-parallel 1 -auto-alt-ref 1 -lag-in-frames 25 -deadline good -c:a libopus -b:a 64k -f webm output.webm
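
Since both modes differ only in the value of -b:v, the two passes can be wrapped in one small shell function. This is a sketch rather than a finished script: the name encode_webm and its argument order are assumptions, and the flags are the ones from the Constant Quality commands above.

```shell
# Hypothetical helper: two-pass VP9/Opus encode into a webm container.
# Usage: encode_webm INPUT OUTPUT [CRF] [BITRATE]
#   BITRATE 0 (default) = Constant Quality, e.g. 2M = Constrained Quality
encode_webm() {
  input=$1
  output=$2
  crf=${3:-33}
  bitrate=${4:-0}

  # Pass 1: analysis only; audio is skipped and the output is discarded.
  ffmpeg -i "$input" -c:v libvpx-vp9 -b:v "$bitrate" -crf "$crf" \
    -pass 1 -row-mt 1 -an -f null /dev/null || return 1

  # Pass 2: the actual encode, reusing the statistics written by pass 1.
  ffmpeg -i "$input" -c:v libvpx-vp9 -b:v "$bitrate" -crf "$crf" \
    -pass 2 -row-mt 1 -threads 16 -speed 2 -tile-columns 2 -tile-rows 2 \
    -frame-parallel 1 -g 240 -auto-alt-ref 1 -lag-in-frames 25 \
    -deadline good -c:a libopus -b:a 64k -f webm "$output"
}
```

For example, encode_webm input.mp4 output.webm runs in Constant Quality mode, while encode_webm input.mp4 output.webm 33 2M caps the bitrate at roughly 2 Mbit/s. The second pass only starts if the first one succeeded.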

Encoding a fallback video as MP4

As a fallback video format for older devices, we use the widely supported h264 video codec with AAC audio in an MP4 container. This encoding does not need two passes unless you want to use constant-bitrate encoding, which is a bad choice unless you have already calculated a specific target bitrate that you need to maintain at the cost of quality. For our encoding, we use Constant Quality encoding (CRF), as we did for VP9 above:

ffmpeg -i inputfile.mkv -c:v libx264 -vf format=yuv420p -crf 18 -preset veryslow -movflags +faststart -c:a aac outputfile.mp4

The options explained:

-i inputfile.mkv: your input file
-c:v libx264: encode video as h264 using the libx264 library
-vf format=yuv420p: use the YUV 4:2:0 pixel format
-crf 18: constant rate factor (visual quality setting). Sane values are between 18 and 28 where 18 is visually lossless and 28 may produce minor visual artifacts
-preset veryslow: encoding preset (heavily impacts encoding speed; faster presets encode more quickly but compress less efficiently). Recommended settings: general optimization: medium / slow; long term storage: veryslow; short-lived content: faster / fast / medium; transcoding: ultrafast / superfast / veryfast / faster / fast / medium
-movflags +faststart: Moves the MOOV atom to the beginning of the file, eliminating unnecessary seek times
-c:a aac: sets audio codec to AAC
-f mp4: output to mp4 container format (optional; the command above omits it because ffmpeg infers the container from the .mp4 file extension)
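
The preset recommendations above can be folded into a small wrapper so that the use case, not the preset name, is what you pass in. This is only a sketch: preset_for, encode_mp4, and the use-case labels are invented for illustration, and each label simply picks one preset from the matching list above.

```shell
# Hypothetical helper: map a use case to one x264 preset from the list above.
preset_for() {
  case "$1" in
    archive)     echo veryslow ;;  # long term storage
    short-lived) echo fast ;;      # short-lived content
    transcode)   echo veryfast ;;  # transcoding
    *)           echo slow ;;      # general optimization
  esac
}

# Hypothetical helper: single-pass h264/AAC fallback encode.
# Usage: encode_mp4 INPUT OUTPUT [USE_CASE]
encode_mp4() {
  input=$1
  output=$2
  preset=$(preset_for "${3:-general}")
  ffmpeg -i "$input" -c:v libx264 -vf format=yuv420p -crf 18 \
    -preset "$preset" -movflags +faststart -c:a aac "$output"
}
```

For example, encode_mp4 inputfile.mkv outputfile.mp4 archive reproduces the command above with -preset veryslow.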

Additional flags for specific use cases

In addition to the general video processing above, you sometimes need to edit the video further:

-an: Mutes audio (removes audio stream). Reduces filesize, good for videos playing muted in the background or that have no audio.
-r 30: Sets the output framerate to 30 FPS, dropping or duplicating frames as needed. Lowering the framerate can drastically reduce filesize.
-vf "scale=1920:720,setsar=1": Scale video to 1920x720 dimensions. Remember that both sides need to remain divisible by 2 to use the YUV420 pixel format. Set width or height to -2 to scale only one side while maintaining the original video's aspect ratio.
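
The divisible-by-2 rule is easy to get wrong when the target size comes from somewhere else, such as a script argument. A minimal sketch of a helper that builds the filter string and rounds odd sizes down to the next even number; scale_filter is a hypothetical name, and -2 is passed through untouched:

```shell
# Hypothetical helper: build a scale filter with even dimensions.
# Usage: scale_filter WIDTH HEIGHT   (either one may be -2)
scale_filter() {
  w=$1
  h=$2
  # Round positive sizes down to an even number; leave -2 as it is.
  if [ "$w" -gt 0 ]; then w=$(( w / 2 * 2 )); fi
  if [ "$h" -gt 0 ]; then h=$(( h / 2 * 2 )); fi
  echo "scale=${w}:${h},setsar=1"
}

scale_filter 1281 -2   # prints scale=1280:-2,setsar=1
```

The result can be passed straight to ffmpeg, for example -vf "$(scale_filter 1281 -2)".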

If you have already processed a video and just want to use one of those additional options on it, you don't need to re-encode it entirely. You can just copy the stream you are not changing (using -vcodec copy for video or -acodec copy for audio).

For example, to mute an already processed mp4 video without touching video/container data:

ffmpeg -i input.mp4 -vcodec copy -an output.mp4

This will remove the audio stream while only copying the video stream (thus not doing the expensive video encoding again).
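
To confirm that the audio stream is really gone, you can count the audio streams with ffprobe. A minimal sketch; audio_stream_count is an illustrative name, not an ffmpeg feature:

```shell
# Hypothetical helper: print the number of audio streams in a file.
audio_stream_count() {
  ffprobe -v error -select_streams a \
    -show_entries stream=index -of csv=p=0 "$1" | wc -l | tr -d ' '
}
```

After the mute command above, audio_stream_count output.mp4 should print 0.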
