About video codecs and containers
Video files consist of 3 separate components that define how data is stored and decoded:
- Container format (for example mp4 or webm): Stores the video/audio streams, as well as some metadata like subtitles.
- Video stream: The data of the video itself (just visual frames, no audio), encoded using a video codec like h264 or vp9.
- Audio stream: The audio data encoded using an audio codec like aac or opus.
A single video file will be a specific combination of these 3 components. For web videos, we typically have vp9-encoded video streams with opus-encoded audio streams in a webm container, and h264 video streams with aac audio in mp4 containers as a fallback for older devices.
Encoding a video as WEBM
Our primary video format is VP9 video with Opus audio in a WebM container. Encoding VP9 in two passes is recommended, as it enables quality enhancements that are not available in single-pass mode. In most cases, you will want to encode video using VP9's Constant Quality encoding mode:
ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 0 -crf 33 -pass 1 -row-mt 1 -an -f null /dev/null
ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 0 -crf 33 -pass 2 -row-mt 1 -threads 16 -speed 2 -tile-columns 2 -tile-rows 2 -frame-parallel 1 -g 240 -auto-alt-ref 1 -lag-in-frames 25 -deadline good -c:a libopus -b:a 64k -f webm output.webm
The options explained (second pass only):
-i input.mp4: your input file
-c:v libvpx-vp9: encode video as VP9 using libvpx
-b:v 0: turn off video bitrate limiting to allow VP9 to vary the bitrate in order to maintain the target quality level
-crf 33: Constant rate factor (aka "Quality") setting. Valid values are 0-63, where lower values mean higher quality; 31 is recommended for 1080p video.
-row-mt 1: Enable row-based multithreading
-threads 16: Use up to 16 threads for encoding (4 tile rows * 4 tile columns)
-speed 2: Encoding speed, up to 16; higher is faster with lower quality. 1 is a good compromise between quality and speed, as its output is very close to 0 but encodes much faster. Resolutions of 720p or higher should set this to 2, as the quality gain of 1 is not perceptible at those sizes.
-tile-columns 2: Log2 number of VP9 tile columns to use, from 0 to 6 (careful with log2 formatted values: 1 means 2 columns, 2 means 4 columns!). Values >0 combined with threads >1 enable multithreaded encoding. The maximum value for 1920x1080 video files is 2; larger horizontal resolutions allow higher column counts.
-tile-rows 2: Log2 number of VP9 tile rows to use, from 0 to 2 (careful with log2 formatted values: 1 means 2 rows, 2 means 4 rows!). The maximum number of rows is 4, independent of video size.
-frame-parallel 1: Enables parallel frame decoding
-g 240: Number of frames allowed between keyframes. Larger values allow for more efficient placement of keyframes resulting in better quality.
-auto-alt-ref 1: Enables use of alternative reference frames (double-pass optimization feature)
-lag-in-frames 25: Number of frames to look ahead for alternative ref frames (see above)
-deadline good: Amount of time to spend encoding, affects quality and encoding time; values are best, good and realtime
-c:a libopus: encode audio as opus using libopus
-b:a 64k: set audio encoding target bitrate to 64 kbit/s
-f webm: output to the WebM container format
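The tile and thread arithmetic from the option list above can be sketched in plain shell. This is only an illustration: the function names are hypothetical, and the 256 pixel minimum tile width is an assumption based on VP9's tile size limits, chosen because it reproduces the stated maximum of 2 for 1920 pixel wide video.

```shell
#!/bin/sh
# Assumption: a VP9 tile must be at least 256 pixels wide.
MIN_TILE_WIDTH=256

# Print the largest log2 tile-column value (0..6) for a given frame width.
max_tile_columns_log2() {
    width=$1
    n=0
    # Keep doubling the column count while each tile stays wide enough.
    while [ "$n" -lt 6 ] && [ $(( width >> (n + 1) )) -ge "$MIN_TILE_WIDTH" ]; do
        n=$(( n + 1 ))
    done
    echo "$n"
}

# Print the number of tiles (useful upper bound for -threads)
# for given log2 column and log2 row values.
tile_threads() {
    echo $(( (1 << $1) * (1 << $2) ))
}

max_tile_columns_log2 1920   # prints 2
tile_threads 2 2             # prints 16
```

With -tile-columns 2 and -tile-rows 2 this gives 4 * 4 = 16 tiles, which is where the -threads 16 value in the command comes from.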
If you need to stay below a specific bitrate at the expense of quality, you can use the VP9 Constrained Quality mode by setting a specific bitrate target for -b:v. If you are not sure whether you need this mode, then you likely don't.
ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 2M -crf 33 -pass 1 -an -f null /dev/null
ffmpeg -i input.mp4 -c:v libvpx-vp9 -b:v 2M -crf 33 -pass 2 -threads 8 -speed 2 -tile-columns 6 -frame-parallel 1 -auto-alt-ref 1 -lag-in-frames 25 -deadline good -c:a libopus -b:a 64k -f webm output.webm
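Both VP9 modes follow the same two-pass pattern and differ only in the -b:v value, so they can be wrapped in one small helper. The following is a minimal sketch, not a definitive script: encode_vp9, run and the DRY_RUN variable are hypothetical names, and the tuning options (threads, speed, tiles and so on) are left out for brevity.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
# Note: the printed form loses quoting, so it is for inspection only.
run() {
    if [ -n "${DRY_RUN:-}" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Usage: encode_vp9 INPUT OUTPUT [CRF] [BITRATE]
# BITRATE 0 selects Constant Quality; a value like 2M selects
# Constrained Quality.
encode_vp9() {
    src=$1
    dst=$2
    crf=${3:-33}
    rate=${4:-0}
    # Pass 1 only gathers statistics: no audio, output is discarded.
    run ffmpeg -i "$src" -c:v libvpx-vp9 -b:v "$rate" -crf "$crf" -pass 1 -an -f null /dev/null &&
    # Pass 2 produces the actual file and runs only if pass 1 succeeded.
    run ffmpeg -i "$src" -c:v libvpx-vp9 -b:v "$rate" -crf "$crf" -pass 2 -c:a libopus -b:a 64k -f webm "$dst"
}

# Preview the Constrained Quality commands without encoding anything.
DRY_RUN=1 encode_vp9 input.mp4 output.webm 33 2M
```

Calling encode_vp9 without DRY_RUN set runs the two ffmpeg passes for real.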
Encoding a fallback video as MP4
As a fallback video format for older devices we use the widely-supported h264 video codec with AAC audio encoding in an MP4 container. This encoding does not need a double-pass unless you want to use constant-bitrate encoding, which is a bad choice unless you already calculated a specific target bitrate that you need to maintain at the cost of quality. For our encoding, we use Constant Quality Encoding (CRF), as we did for VP9 above:
ffmpeg -i inputfile.mkv -c:v libx264 -vf format=yuv420p -crf 18 -preset veryslow -movflags +faststart -c:a aac outputfile.mp4
The options explained:
-i inputfile.mkv: your input file
-c:v libx264: encode video as h264 using the fast, community-maintained libx264 library
-vf format=yuv420p: use yuv 4-2-0 pixel encoding
-crf 18: constant rate factor (visual quality setting). Lower values mean higher quality: 18 is visually lossless, while 28 may produce minor visual artifacts
-preset veryslow: encoding preset (heavily impacts encoding speed; faster = faster encoding but worse quality). Recommended settings: slow for general optimization and veryslow for long-term storage; short-lived content can use a faster preset
-movflags +faststart: Moves the MOOV atom to the beginning of the file, eliminating unnecessary seek times
-c:a aac: sets audio codec to AAC
-f mp4: output to mp4 container format. The command above omits it because ffmpeg infers the format from the .mp4 file extension
Additional flags for specific use cases
In addition to the general video processing above, you sometimes need to edit the video further:
-an: Mutes audio (removes audio stream). Reduces filesize, good for videos playing muted in the background or that have no audio.
-r 30: Limits max framerate to 30 FPS. Limiting framerate can drastically reduce filesize.
-vf "scale=1920:720,setsar=1": Scale video to 1920x720 dimensions. Remember that both sides must remain divisible by 2 to use the YUV420 pixel format. Set either width or height to -2 to have it computed from the other side, maintaining the original video's aspect ratio and keeping the result divisible by 2.
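To see what a -2 dimension resolves to before encoding, the calculation can be approximated in shell arithmetic. This is a rough sketch under the assumption that the scale filter rounds to the nearest multiple of 2; even_scaled_height is a hypothetical helper name, not part of ffmpeg.

```shell
#!/bin/sh
# Print the height that keeps the source aspect ratio for a new width,
# rounded to the nearest multiple of 2 (approximates scale=WIDTH:-2).
# Usage: even_scaled_height SRC_WIDTH SRC_HEIGHT NEW_WIDTH
even_scaled_height() {
    src_w=$1
    src_h=$2
    dst_w=$3
    # Round (dst_w * src_h / src_w) / 2 to the nearest integer, then double it.
    echo $(( (dst_w * src_h + src_w) / (2 * src_w) * 2 ))
}

even_scaled_height 1920 1080 1280   # prints 720
even_scaled_height 1920 1080 854    # prints 480
```

For a 1920x1080 source, -vf "scale=854:-2" would therefore be expected to produce an 854x480 video.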
If you have already processed a video and just want to apply one of those additional options to it, you don't need to re-encode it entirely. You can just copy the stream you are not changing (using -vcodec copy for video or -acodec copy for audio).
For example, to mute an already processed mp4 video without touching video/container data:
ffmpeg -i input.mp4 -vcodec copy -an output.mp4
This will remove the audio stream while only copying the video stream (thus not doing the expensive video encoding again).
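The opposite case, changing the video while leaving the audio untouched, works the same way with -acodec copy. The sketch below is an assumption-laden illustration: downscale_keep_audio, run and DRY_RUN are hypothetical names, and the 720 pixel target height is only an example value.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN is set (hypothetical helper).
# Note: the printed form loses quoting, so it is for inspection only.
run() {
    if [ -n "${DRY_RUN:-}" ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Re-encode only the video stream, scaled to a height of 720 pixels,
# and copy the existing audio stream without re-encoding it.
# Usage: downscale_keep_audio INPUT OUTPUT
downscale_keep_audio() {
    run ffmpeg -i "$1" -c:v libx264 -crf 18 -vf "scale=-2:720,format=yuv420p" -acodec copy "$2"
}

# Preview the command without encoding anything.
DRY_RUN=1 downscale_keep_audio input.mp4 output.mp4
```

Because the video is filtered here, it has to be re-encoded; only the audio stream can be copied in this case.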