Quick Start Guide to FFmpeg: Learn to Use the Open Source
Multimedia-Processing Tool like a Pro
V. Subhash
Chennai, Tamil Nadu, India
Acknowledgments .......................................................... xvii
Introduction .............................................................. xix
Table of Contents
Remove Logo ............................................................... 103
Fade into Another Video (And in Audio Too) ................................ 105
Crop a Video .............................................................. 107
Blur or Sharpen a Video ................................................... 109
Blur a Portion of a Video ................................................. 110
Draw Text ................................................................. 112
Draw a Box ................................................................ 113
Speed Up a Video .......................................................... 115
Slow Down a Video ......................................................... 116
Summary ................................................................... 117
Index ..................................................................... 271
About the Author
V. Subhash is an invisible Indian writer,
programmer, and illustrator. In 2020, he
wrote one of the biggest jokebooks of all
time and then ended up with over two dozen
mostly nonfiction books including Linux
Command-Line Tips & Tricks, CommonMark
Ready Reference, PC Hardware Explained,
Cool Electronic Projects, and How To Install
Solar. He wrote, illustrated, designed, and
produced all of his books using only open source software. Subhash has
programmed in more than a dozen languages (as varied as assembly
and Java); published software for desktop (NetCheck), mobile (Subhash
Browser & RSS Reader), and the Web (TweetsToRSS); and designed several
websites. As of 2022, he is working on a portable JavaScript-free CMS using
plain-jane PHP and SQLite. Subhash also occasionally writes for the Open
Source For You magazine and CodeProject.com.
About the Technical Reviewer
Gyan Doshi has been with the FFmpeg project as a developer and
maintainer since 2018. During this time, he has focused on FFmpeg
filters, formats, and command-line tools. From his experience in video
postproduction stages such as editing and motion graphics, Gyan has
learned how FFmpeg can be used in multimedia workflows as a valuable
addition or as a substitute for expensive tools. Aside from being engaged as
a multimedia/FFmpeg consultant, Gyan also troubleshoots FFmpeg issues
on online forums such as Stack Exchange and Reddit.
Gyan builds the official Windows binary packages of FFmpeg (ffmpeg,
ffprobe, and ffplay) and other tools (ffescape, ffeval, graph2dot, etc.)
and offers them for download from his website at www.gyan.dev.
Acknowledgments
The author would like to thank:
Introduction
FFmpeg is a free and open source program for editing audio and video files
from the command line. You may already know FFmpeg as a nifty program that can do simple conversions such as:
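For example, a single short command (with illustrative file names) converts a video from one container format to another:

```shell
# Convert a QuickTime MOV to an MP4; FFmpeg picks sensible default settings
ffmpeg -i input.mov output.mp4
```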
FFmpeg is much more capable than this, but it is this intuitive interface
and support for a wide variety of formats that has won it millions of users.
The FFmpeg project was originally started by a French programmer
named Fabrice Bellard in the year 2000. It is now being developed by a
large team of open source software developers spread around the world.
This book can serve as an easy FFmpeg tutorial, a hack collection, and a ready reference. However, it is not possible for one book to cover everything that FFmpeg can do. FFmpeg has very extensive online documentation, with which you may have to craft your commands. While this book may seem more than enough for most users, the documentation will open up vastly more possibilities. Do not avoid going through the documentation.
Before you go further into the book, you should be aware that the FFmpeg project creates two types of software: end-user command-line programs (such as ffmpeg, ffprobe, and ffplay) and developer libraries (the libav libraries).
In this book, we will ignore the libav libraries and instead focus on the
ffmpeg command-line program.
• www.vsubhash.in/ffmpeg-book.html
CHAPTER 1
Installing FFmpeg
In the Introduction, I mentioned that FFmpeg was an “end-user program.”
It is actually three command-line end-user programs, or executables:
1. ffprobe
2. ffplay
3. ffmpeg
The executables for these programs are available for Linux, Mac,
Windows, and other operating systems (OSs). When you go to the FFmpeg
website (www.ffmpeg.org), you will have two download options:
If you are unfamiliar with building executables from source code (as most people are), you should choose the first option.
https://ffmpeg.org/download.html
© V. Subhash 2023
V. Subhash, Quick Start Guide to FFmpeg, https://doi.org/10.1007/978-1-4842-8701-9_1
Figure 1-2. There may be more than one “build” option for the
downloads
Figure 1-3. The downloaded archive file contains three EXE files.
Copy them to a folder specified in your PATH environment variable
Copy the EXE files to some folder that is already included in your
operating system’s PATH environment variable. If you copy them to a new
folder, then add the folder’s full location to the PATH variable.
If you do not do the above, you will need to type the full path of the
executable in your commands in the Command Prompt window.
Let us assume that you have extracted the EXE files to the folder C:\MyInstalls\ffmpeg\bin. Launch the Command Prompt window with Administrator privileges. Then, permanently append this folder's location to the PATH environment variable with this command.
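One way to do that (a sketch; setx /M writes the machine-wide PATH, and the change takes effect only in newly opened Command Prompt windows):

```shell
:: Append the FFmpeg folder to the machine-wide PATH (Administrator required)
setx /M PATH "%PATH%;C:\MyInstalls\ffmpeg\bin"
```

Alternatively, you can edit the PATH variable from the System Properties dialog.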
ffmpeg -version
If you do not modify the environment variable, then you will have to
type the full path whenever you want to use the program.
C:\MyInstalls\ffmpeg\bin\ffmpeg -version
In Windows, the name of the executable is not case-sensitive, but the options are.

@ Causes error
FFMPEG -VERSION

@ Causes no error
FFMPEG -version
ffmpeg -version
You should avoid writing anything after the backslash or the caret. Invisible trailing space(s) can also cause a command to fail. (This happens often with copy-pasted commands.)
https://trac.ffmpeg.org/wiki/CompilationGuide
configure --help
In your Linux package manager app, search for and install the development packages (with a -dev suffix) that have names similar to the external libraries. You may not be able to install development packages for all of the libraries. But, for whatever libraries you can install or already have installed, add the relevant --enable options to the configure compilation step. Here are a few:
...
--enable-chromaprint --enable-frei0r \
--enable-libbluray --enable-libbs2b --enable-libcdio \
--enable-libflite --enable-libfontconfig \
--enable-libfreetype --enable-libfribidi \
--enable-libmp3lame --enable-libsmbclient \
--enable-libv4l2 --enable-libvidstab \
...
Run the FFmpeg build statement with these changes, and eventually all three binary executable files will be created in your $HOME/bin directory. Then, save a copy of the documentation from the ffmpeg_build directory so that you can read it whenever it is required.
If you have an old OS on which the latest FFmpeg executable does not run or cannot be compiled, go to https://johnvansickle.com/ffmpeg/ and download pre-built statically linked executables (not including ffplay). On my old Ubuntu 10 Fiendish Frankenstein installation, I could neither run the latest FFmpeg pre-built executable nor build the source, but these statically linked executables worked. (Even the C library is statically linked.) That is how I was able to finish the 2020 version of this book on the old OS.
https://trac.ffmpeg.org/wiki/CompilationGuide/macOS
Summary
Although originally designed as a Linux program, FFmpeg is also available
for Windows and Mac operating systems. In this chapter, you learned how
to obtain pre-built FFmpeg executables specific to your OS from the official
FFmpeg site. You also learned how to build your own customized FFmpeg
executables from source.
In the next chapter, you will learn how to start using the executables.
CHAPTER 2
Starting with FFmpeg
The FFmpeg project provides several end-user programs. This book will
focus on three command-line programs – ffprobe, ffplay, and ffmpeg.
You will be using ffmpeg most of the time, but ffprobe and ffplay can
help you as well. In this chapter, you will gain an introduction to all three.
All three have an annoying “feature” – they display a build-information
banner that is as big as the state of Texas. If you create the following aliases
in your $HOME/.bashrc file, then you do not have to suffer the annoyance.
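The book's exact aliases are not reproduced here; a minimal version, assuming you only want to suppress the banner, uses FFmpeg's -hide_banner option:

```shell
# In $HOME/.bashrc: always suppress the build-information banner
alias ffmpeg='ffmpeg -hide_banner'
alias ffprobe='ffprobe -hide_banner'
alias ffplay='ffplay -hide_banner'
```

Run `source ~/.bashrc` (or open a new terminal) for the aliases to take effect.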
Some command examples in this book will have the suffix 2>/dev/null or >/dev/null. Such redirections were necessary to prevent information clutter.
ffprobe
If you want to find out useful information about an audio or video file, one way is to use ffmpeg with the -i option. With ffprobe, you do not even need the option.
ffmpeg -i tada.wav
ffprobe tada.wav
ffprobe can reveal much more information than this if you use
the -show_streams option. You can filter the output of this command for
use in your shell scripts. In a later chapter, you will find a sample output of
this command.
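For instance, something like the following (a sketch; the field names come from ffprobe's -show_entries syntax, and tada.wav is the sample file used earlier) prints script-friendly key=value lines for the first audio stream:

```shell
# Print selected properties of the first audio stream, one key=value per line
ffprobe -v error -select_streams a:0 \
  -show_entries stream=codec_name,sample_rate,channels \
  -of default=noprint_wrappers=1 tada.wav
```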
ffplay
If you want to play a video file directly from the command line, just type
ffplay and the file name. ffplay is a tiny media player. It does not have
a context menu system or other interface. It responds to some keys and
mouse clicks but does nothing more.
ffplay solar.mp4
ffmpeg
The executables ffprobe, ffplay, and ffmpeg have several common
command-line options (arguments, switches, or parameters). You can list
most of them with the -h option.
ffmpeg -h
ffmpeg -h long
ffmpeg -h full > ffmpeg-help-full.txt
ffmpeg -formats
ffmpeg -encoders
ffmpeg -decoders
ffmpeg -codecs
ffmpeg -filters
ffmpeg -h demuxer=mp3
ffmpeg -h encoder=libmp3lame
ffmpeg -h filter=drawtext
Summary
In this chapter, you gained an introduction to the three FFmpeg
executables. Before venturing into what FFmpeg can do for you, you need
to learn a few things about multimedia formats and codecs. The next
chapter will help you with that.
CHAPTER 3
Formats and Codecs
An MP3 audio file can be identified by its “.mp3” file extension. Similarly, an MP4 video file can be identified by the “.mp4” extension. However, the file extensions of multimedia files do not provide any kind of guarantee about the format. Even the format name is merely a notion. If you need to process audio and video content, you need to go beyond file extensions. You need to be familiar with multimedia concepts such as containers, codecs, encoders, and decoders. In this chapter, you will gain some basic information about all that and more.
Containers
Multimedia files such as MP4s or MP3s are just containers – containers
for some audio and/or video content. An MP4 file is a container for some
video content written using the H.264 codec and some audio content
written using the AAC codec. It need not be like that for all MP4 files. Some
MP4 files may have their video content written using the Xvid codec and
the audio content written using the MP3 codec. Similarly, AVI, MOV, WMV,
and 3GP are popular containers for audio/video content. Codecs can differ
from file to file even if their extensions are the same. A multimedia file may
have the wrong extension because of some human error. You can expect all
sorts of combinations in the wild.
When the codecs are not what is usually expected in a container, you
may encounter annoying format errors in playback devices. Sometimes,
you may be able to fix the error by simply renaming the file with the correct
extension. At other times, you will have to re-encode the file using codecs
supported by the device. So, what does it mean when a device says it only
supports certain “codecs”?
ffmpeg -i uncompressed-stereo.wav \
-c:a libmp3lame -b:a 128k -ac 2 -ar 44100 \
compressed.mp3
☞ You will learn more about these settings in later chapters, but
for now just be aware that they are often required.
Demuxers and Muxers
I have been using FFmpeg for years without knowing what demuxers
and muxers were. Even now, I cannot care less. Well… maybe a little.
A demuxer is a software component that can read a multimedia input
file so that a decoder can work on it. Similarly, a muxer writes data to a
multimedia output file after it has been processed by an encoder. Between
a decoder and encoder, some processing work may be done, or it may even
pass directly to the other end. Here is all that you need to know:
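In outline, the chain looks like this (a simplified sketch of FFmpeg's standard processing pipeline):

```
input file -> demuxer -> decoder -> [optional filters] -> encoder -> muxer -> output file
```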
For example, to read and write to the MP4 format, an MP4 demuxer
and an MP4 muxer are required. FFmpeg automatically takes care of
muxers and demuxers so that you do not have to bother with them.
However, there may come situations when you do have to explicitly
address them.
Summary
In this chapter, you learned some theoretical concepts about multimedia
formats, containers, and codecs. In the next chapter, we will delve deeper
into the container and learn how to refer to its constituents from the
command line using index numbers.
CHAPTER 4
Media Containers
and FFmpeg
Numbering
In the previous chapter, you learned that a multimedia file is actually a
container. On the inside, it encloses multimedia streams and metadata. In
this chapter, you will learn what streams and metadata are and how you
can access them from the command line. The sections in this chapter are
arranged for easy access and completeness. It may not be possible for you
to understand all of it on your first read. Return to this chapter a few times
to get a full understanding.
Containers
A container can have several streams. A stream could be audio, video,
subtitles, or a file attachment.
In an MP4 video file or container, you will usually find a video stream
and an audio stream. In an MP3 file, you will find an audio stream and
maybe some IDv3 tags (such as title, album, and artist) as metadata.
If you have one of those rare multi-angle DVDs, then each camera
angle will be represented by a separate video stream. Multi-language
videos will have an audio stream for each language. DVD subtitles for
Container Internals
Logically, the internals of a multimedia file look like this. A container
needs to have at least one stream. Everything else is optional. It is all
right for a video file to not have album art, subtitles, custom fonts, or tags
(global metadata), but one video stream and one audio stream are usually
expected.
From this logical representation, you will note that a multimedia file
container may have some global metadata and that each stream in the
container can have stream-specific metadata too.
You can use ffprobe to display these details for any multimedia file.
In this ffprobe output, the global metadata for the MP3 file shows ID3
tags such as title, album, and artist. It also includes a “comment” metadata entry that I added after I bought the music. The metadata for the audio stream
shows that it was encoded using the LAME encoder by the music vendor.
The album art is shown as a video stream but it has only one frame. More
importantly, you should note that FFmpeg refers to the input files and
streams using index numbers starting from 0 (zero), instead of 1 (one).
Here is another example; this one is for a video file.
☞ ffmpeg can also read from streams and write to them. The
streams can be piped from/to another command and also transported
over a network protocol. For more information, read the official
documentation on protocols.
Figure 4-4. The output video is the input video with the overlaid
input image
The two input files were specified using the -i option. An MP4 video file is input file #0, and a PNG image file is input file #1. The output file, as always, has been specified last.
Figure 4-5. The output of the command shows the index numbers
used for the input files and streams
The output of the command shows that the first stream in the first
input file is a video stream and is numbered #0:0. The second stream in
that file is an audio stream and is numbered #0:1. The first stream in the
second input file (the PNG image file) is considered a video stream even
though it has only one (image) frame and is identified as #1:0.
You can refer to streams by their type. In the previous command, the
streams were as follows:
For this to become clear, spend some time studying the screenshot in
Figure 4-5.
Suppose that a multi-language DVD video file had one video stream and two audio language streams. The streams can be referred to as follows:
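A sketch of those references, using FFmpeg's InputFile:StreamType:StreamIndex numbering (the stream order is an assumption):

```
0:0 (or 0:v:0) - the video stream
0:1 (or 0:a:0) - the first audio language stream
0:2 (or 0:a:1) - the second audio language stream
```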
As you may have guessed, the stream-type identifier for video is v and
a for audio. There are others as given in Table 4-1.
Stream type           Identifier
Audio                 a
Video                 v
Video (not images)    V
Subtitles             s
File attachments      t
Data                  d
After displaying the information about the input files and streams,
ffmpeg will list how the input streams will be processed and mapped to
intermediate and final streams. Then, it will list the final output files and
their streams. In a bash terminal, you can press the key combination Ctrl+S if you wish to pause and study this information (and Ctrl+Q to resume). Otherwise, all of this
information will quickly flash past your terminal as ffmpeg will then post a
huge log of informational, warning, and error messages as it performs the
actual processing of the input data.
Maps
With multiple input files, FFmpeg will use an internal logic to choose
which input streams will end up in the output file. To override that, you
can use the -map option. Maps enable you to specify your own selection
and order of streams for the output file. You can specify stream mapping in
several ways:
-map InputFileIndex
all streams in file with specified index
-map InputFileIndex:StreamIndex
the stream with specified index in file with specified index
-map InputFileIndex:StreamTypeIdentifier
all streams of specified type in file with specified index
-map InputFileIndex:StreamTypeIdentifier:StreamIndex
among streams of specified type in file with specified index, the
stream with specified index
Figure 4-6. The audio of this video had gramophone sound artifacts
# Swap the existing audio track with the mp3 fixed by Audacity
ffmpeg -i Stopmotion-hot-wheels.mp4 \
-i Stopmotion-hot-wheels-fixed.mp3 \
-map 0:0 -map 1:0 \
-codec copy \
Stopmotion-hot-wheels-fixed.mp4
In the first command, I included a map for the second stream (0:1) in
the MP4 file and saved it as an MP3 file. (I assumed that the second stream
was an audio stream. It need not be.) I then corrected errors in the MP3 file
using Audacity. In the second command, the first input file (the MP4 file)
had two streams – (0:0) and (0:1) – same as in the first command. (More
assumptions.) The second input file (the “fixed” MP3) had one stream
(1:0). In the second command, I used the first file’s first stream (0:0) and
the second file’s first and only stream (1:0). Alternatively, I could have
typed the command by mapping to the first file’s first video stream (0:v:0)
and the second file’s first audio stream (1:a:0).
ffmpeg -i Stopmotion-hot-wheels.mp4 \
-i Stopmotion-hot-wheels-fixed.mp3 \
-map 0:v:0 -map 1:a:0 \
-codec copy \
Stopmotion-hot-wheels-fixed.mp4
The audio stream in the original MP4 – (0:1) or (0:a:0) – gets discarded because it was not included in any of the maps. If I wanted to retain the original audio stream, I could add another map for it as a second audio stream. The fixed audio track will be played by default by media players. I could then manually select the second audio track with the remote or a menu option to hear the unfixed original audio.
ffmpeg -i Stopmotion-hot-wheels.mp4 \
-i Stopmotion-hot-wheels-fixed.mp3 \
-map 0:v:0 -map 1:a:0 -map 0:a:0 \
-codec copy \
Stopmotion-hot-wheels-fixed-n-restored.mp4
You can use maps when generating multiple output files with one
command.
ffmpeg -i solar.mp4 \
-map 0:1 -c:a libmp3lame -b:a 128k solar-high.mp3 \
-map 0:1 -c:a libmp3lame -b:a 64k solar-low.mp3
The -map options provide a new set of streams available for options
specified after them. Options such as -codec or -ac will only affect streams
specified by the -map options before them, not the streams available in the
input files.
Metadata
Metadata means data about data. When using FFmpeg, metadata is read
by the demuxer and/or written by the muxer. The data is usually specified
as key-value pairs. For a media file, the metadata can be global (for
the entire file) or specific to a stream in the file. Each container format
specifies a limited set of metadata keys. The MP3 format, for example,
supports metadata keys such as title, artist, album, and copyright. You can
specify metadata for individual streams as follows:
-metadata:s:StreamIndex or
-metadata:s:StreamTypeIdentifier:StreamIndex
ffmpeg -y -i raisa.mp3 \
-map 0 -c copy \
-metadata:s:v:0 title='raisa.png' \
raisa2.mp3 # Smooth!
This command makes no changes to the MP3 except for the value of
the incriminating title metadata of the album art.
ffmpeg -i Stopmotion-hot-wheels.mp4 \
-i Stopmotion-hot-wheels-fixed.mp3 \
-map 0:v:0 -map 1:a:0 -map 0:a:0 \
-codec copy \
-metadata:s:a:0 language="eng" \
-metadata:s:a:1 language="fre" \
Stopmotion-hot-wheels-fixed-n-restored.mp4
-metadata:s:s:0 language="eng" \
-metadata:s:s:1 language="fre" \
Metadata Maps
Have you noticed that when you convert MP3 files, the album art or the
meta tags get lost? This is because of improper or no metadata mapping.
Metadata can get lost when you convert files or create new files from
multiple input files. The -map_metadata option helps you correctly route
metadata from input files to output files. Its value is specified in a rather
twisted manner. The left is the destination and the right is the source.
-map_metadata InputFileIndex:MetadataSpecifier or
-map_metadata:g InputFileIndex:MetadataSpecifier or
-map_metadata:MetadataSpecifier InputFileIndex:⏎
MetadataSpecifier
Where MetadataSpecifier is g (global metadata), s:StreamSpecifier (stream metadata), c:ChapterIndex (chapter metadata), or p:ProgramIndex (program metadata).
Yeah, it made my head spin too! Take your time. Nobody does
metadata mapping on their first excursion into FFmpeg. Take the
slow lane.
The following example copies global metadata from the second input file (-map_metadata 1) as the global metadata for the output file. This ensures that the MP3 tags are copied as the video’s metadata.
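A minimal sketch of such a command, assuming a video file and an MP3 whose tags should be carried over (the file names are hypothetical):

```shell
# Copy global metadata from input #1 (the MP3) to the output file
ffmpeg -i video.mp4 -i song.mp3 \
  -map 0:v -map 1:a -c copy \
  -map_metadata 1 \
  combined.mp4
```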
The next example copies global metadata from the second input file
both globally (:g) and to the audio stream (:s:a). The global metadata
from the second input file can be specified either as 1:g or simply as 1.
Global output metadata can be typed as -map_metadata:g (as below) or
simply as -map_metadata (as above).
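A minimal sketch under the same assumptions (hypothetical file names):

```shell
# Copy input #1's global metadata both globally and to the audio stream
ffmpeg -i video.mp4 -i song.mp3 \
  -map 0:v -map 1:a -c copy \
  -map_metadata:g 1:g \
  -map_metadata:s:a 1:g \
  combined.mp4
```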
Figure 4-11. The global metadata has been duplicated to the audio
stream metadata as well
Channel Maps
Audio streams can have one or more channels. Monaural audio has only
one channel. Stereo music has two channels – left and right. DVD movies can have two, six, or eight channels for playback on both stereo and
surround speaker systems.
To pin down the channels exactly as you want in the output file, you
need to use the -map_channel option. It can be specified as follows:
-map_channel
InputFileIndex.StreamIndex.ChannelIndex
or as
-map_channel -1
if you want the channel muted.
The -map_channel options specify the input audio channels and the
order in which they are placed in the output file.
Imagine that the audio channels in an MP4 file are mixed up. When you wear headphones, you hear in either ear the voices of the people on the opposite side in the video. You can fix it as follows:
ffmpeg -i wrong-channels.mp4 \
-c:v copy \
-map_channel 0.1.1 -map_channel 0.1.0 \
fine-channels.mp4
ffmpeg -i moosic.mp3 \
-map_channel 0.0.0 -map_channel -1 \
moosic4lefty.mp3
☞ No, you should not make it mono. Mono audio will be heard on
both sides.
In some videos, the left and right audio channels are independent tracks. What these content creators do is place the original audio on one channel and the most annoying royalty-free music on the other. Instead of discarding a channel, you can split the two channels into separate streams:
ffmpeg -y -i zombie.mp4 \
-map 0:0 -map 0:1 -map 0:1 -map 0:1 \
-map_channel 0.1.0:0.1 -map_channel 0.1.1:0.2 \
-c:v copy \
zombie-tracks.mp4
The first stream in the output file will be the original video (0.0). The
left channel (0.1.0) will be the second stream (0.1). The right channel
(0.1.1) will be the third stream (0.2). The original stereo audio will
become the fourth stream. (Yes, the second and third streams will be
mono audio.)
What about the numbers after the colon? That is explained by the full
definition for channel maps:
-map_channel InputFileIndex.InputFileStreamIndex.⏎
ChannelIndex:OutputFileIndex.OutputFileStreamIndex
How do you like them apples? The second part beginning with the
colon is optional. It is for placing the mapped input audio channel on a
specified output stream.
Summary
In this chapter, you learned about how to access streams and metadata.
You also learned how to pick and choose what streams and metadata you
would like to have in the output file(s).
As mentioned in the beginning of this chapter, it is not necessary
that you grasp every detail in this chapter on the first go. As you read
forthcoming chapters, certain things mentioned in this chapter will
become clearer. If not, you can always return to this chapter.
CHAPTER 5
Format Conversion
The main reason that so many people use ffmpeg is its amazing ability to
convert files from one format to another. ffmpeg supports so many formats
that I doubt there is any competition even from paid software. In this
chapter, you will learn how to perform these conversions and customize
them to extract the best quality from the source files.
No-Brainer Conversions
The default output formats in many Linux multimedia programs are OGV (video) and OGG (audio). Sadly, very few consumer electronic devices support these
two formats. I use gtk-recordMyDesktop to screen capture my computer
demos, and it creates OGV video files. Before I can play the files on my TV,
I need to convert them to MP4 format.
FFmpeg can guess the output format based on the file extension
you have used for the output file. It will automatically apply some good
preset conversion settings (defaults). You can specify custom conversion
settings too.
Conversion Options
Table 5-1 lists a few FFmpeg options that are useful when converting files.
You will learn how to use them in the rest of this chapter.
Obsolete/Incorrect Options
FFmpeg is fault-tolerant to an extent, but do not be sloppy in typing the options. You should avoid using -r:a instead of -ar (audio sampling rate). Instead of old conventions such as -acodec and -vcodec, you should use -c:a and -c:v. Support for such old practices may be removed in the future.
Codec Option
The -codec option is used to specify an encoder (when used before an
output file). When used before an input file, it refers to the decoder.
(ffmpeg may have more than one decoder and encoder for a particular
codec.) Choose the correct name from the output of the command
ffmpeg -encoders or ffmpeg -decoders, and not from that of
ffmpeg -codecs.
The -codec option can also be specified for all streams of a particular
type, such as -codec:a for all audio streams or -codec:s for all subtitle
streams or for a particular stream using its index. For each stream, only the
last applicable -codec option will be considered. If you use the value copy
for the encoder, ffmpeg will copy applicable streams as is without using an
encoder.
How do you know which codec (encoder name) you need to use for a
particular format? For an MP3 file, you could try the following:
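For example, you could filter the encoder list and then ask for the details of a likely candidate (a sketch using the query options shown earlier in this book):

```shell
# List encoders whose names or descriptions mention MP3
ffmpeg -encoders 2>/dev/null | grep -i mp3

# Show the options supported by the libmp3lame encoder
ffmpeg -h encoder=libmp3lame
```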
ffmpeg -i net-video.mp4 \
-s 320x240 \
-c:v mpeg4 -b:v 200K -r 24 \
-c:a libmp3lame -ac 2 -b:a 96K \
portable-video.mp4
The output video stream uses MPEG4 codec with qvga (320x240)
dimensions, 200K bitrate, and a 24 frames-per-second rate. The output
audio stream uses MP3 codec (Lame encoder) with two-channel audio
(stereo) and 96K bitrate.
☞ You will know what values to use for each setting only if you make
it a habit to use ffprobe on new types of files that you encounter.
Multi-pass Conversion
In multi-pass encoding, ffmpeg processes the video stream multiple times
to ensure the output video is close to the specified bitrate. ffmpeg creates
a log file for each pass. In the initial passes, the audio is not processed, and the video output is not saved (it is dumped on the null device). In the final pass,
however, you will have to specify the audio conversion settings and the
output file. In the next example, the conversion from the previous section
is performed using two passes.
This is the first pass.
ffmpeg -y -i net-video.mp4 \
-s 320x240 -c:v mpeg4 -b:v 194k -r 24 \
-pass 1 -passlogfile /tmp/ffmpeg-log-net-video \
-an -f mp4 /dev/null
This is the second pass.
ffmpeg -y -i net-video.mp4 \
-s 320x240 -c:v mpeg4 -b:v 194k -r 24 \
-pass 2 -passlogfile /tmp/ffmpeg-log-net-video \
-c:a libmp3lame -ac 2 -b:a 96K \
portable-video.mp4
☞ When the streams meet the specified bitrates, you will also know
exactly how big the file will be. Just multiply the total bitrate by the duration
of the video. The reverse is also true. You can target a particular file size
(allowing for some deviation) by specifying a proportional bitrate for both
the audio and video. Conversion with a constant bitrate was popular when
DVD videos were encoded (“ripped”) to fit on a CD.
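To check this arithmetic quickly, here is a small Python sketch; the bitrates and the 10-minute duration are hypothetical examples, not taken from any particular file:

```python
# Estimate output file size from target bitrates and duration.
# Bitrates are in kilobits per second; duration is in seconds.
def estimate_size_bytes(video_kbps, audio_kbps, duration_s):
    total_bits = (video_kbps + audio_kbps) * 1000 * duration_s
    return total_bits / 8  # 8 bits per byte

# A 10-minute video at a 200K video bitrate and a 96K audio bitrate
size = estimate_size_bytes(200, 96, 600)
print(round(size))           # 22200000 bytes
print(round(size / 1e6, 1))  # 22.2 (megabytes)
```

The container itself adds a small overhead on top of this estimate.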
With the H.264 codec, you can achieve the required quality and compression
in one pass using the -crf (CRF or Constant Rate Factor) option and by
specifying a processing “preset.” The -crf option controls quality.
Figure 5-2. This extract from the output of an old script shows preset
and tuning variables supported by the H.264 encoder
ffmpeg -i solar.mp4 \
-c:v libx264 -crf 21 -preset fast \
-c:a copy \
solar-CONVERTED.mp4
Figure 5-3. The ffmpeg output stream details will tell you which pixel
format has been used
The CRF range is from 0 (lossless) to 63 (worst) for 10-bit pixel formats
(such as yuv420p10le) and 0 to 51 for 8-bit pixel formats (such as yuv420p).
You can determine the pixel format from the ffmpeg output of a similar
file conversion. A sensible middle value is 21 for an 8-bit encoder and 31
for a 10-bit encoder.
What the heck is a pixel format? All that you need to know about
pixel format (at this stage) is that it is a data-encoding scheme used
to specify the colors of each pixel (dots) in a video frame. FFmpeg
supports these pixel formats: monob, rgb555be, rgb555le, rgb565be,
rgb565le, rgb24, bgr24, 0rgb, bgr0, 0bgr, rgb0, bgr48be,
uyvy422, yuva444p, yuva444p16le, yuv444p, yuv422p16, yuv422p10,
yuv444p10, yuv420p, nv12, yuyv422, and gray.
In addition to the processing preset, you can also specify a -tune
option depending on the kind of video that you have selected. The
values psnr and ssim are used to generate video quality metrics and are
not normally used in production. zerolatency output can be used for
streaming. fastdecode can be used for devices that do not have a lot of
processing power. grain helps the encoder preserve the look of grainy
videos instead of smoothing the grain away.
Audio Conversion
This command uses the Lame MP3 encoder to convert an Ogg audio file to
a 128K-bitrate two-channel (stereo) MP3 file.
ffmpeg -i alarm.ogg \
-c:a libmp3lame \
-ac 2 \
-b:a 128K \
alarm.mp3
Audio Extraction
Some video files have great sound. Music videos are good examples. How
do you extract their audio? Well, drop the video stream and copy the audio
stream to an audio file.
# Matroska audio
ffmpeg -i music-video.mp4 -c:a copy music-video.mka
# MPEG4 audio - FFmpeg flounders
ffmpeg -i music-video.mp4 -vn -c:a copy music-video.m4a
☞ Without -vn, the video stream will get copied to the m4a file!
Hurray for redundant options! The paranoid survive!
ffmpeg -i music-video.mp4 \
-c:a libmp3lame -b:a 128K -ac 2 \
music-video.mp3
ffmpeg -i music-video.mp4 \
-vn \
-map 0:a -c:a libmp3lame -b:a 128K music-high.mp3 \
-map 0:a -c:a libmp3lame -b:a 64K music-low.mp3
To extract video frames as image files, you need to use the -f image2
option. The numbering of the output images is specified in the name of
the output file. The format mask of the output file is similar to that of the
printf function in the C programming language. In the mask used in
the next command, % begins the format specifier, 0 pads the number with
zeros instead of spaces, 3 is the minimum number of digits, and d stands
for a decimal integer.
☞ Most videos are encoded with a frame rate of 24, 25, 30, or
even 60 frames per second. Be careful with your extraction rate and
length of the video, or you will quickly run out of space.
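A quick Python estimate shows how fast extracted frames pile up; the frame rate, duration, and per-image size below are assumptions for illustration:

```python
# Rough count and disk usage of images extracted with -f image2.
fps = 30            # extraction rate (frames per second)
duration_s = 120    # two minutes of video
kb_per_image = 200  # assumed size of one extracted JPEG

frames = fps * duration_s
print(frames)                        # 3600 images
print(frames * kb_per_image / 1000)  # 720.0 (approximate megabytes)
```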
Use the -r option to restrict the number of images generated for each
second of the source video. You can omit the -r option to extract all frames
(letting the extraction rate be determined by the frame rate of the source video).
Image-Conversion Settings
Table 5-2 lists some FFmpeg conversion options that are useful when
working with image files. Although this book will describe how to use
them, more comprehensive information can be found in the official
FFmpeg documentation.
ffmpeg -r 1 -i frames%03d.jpg \
-s qvga -pix_fmt yuv420p \
Stopmotion-hot-wheels-reconstituted.mp4 2> /dev/null
ffplay -autoexit \
Stopmotion-hot-wheels-reconstituted.mp4 2> /dev/null
cat *.png | \
ffmpeg -y -f image2pipe \
-framerate 1/3 -i - \
-filter:v \
"scale=eval=frame:w=640:h=360:
force_original_aspect_ratio=decrease,
pad=640:360:(ow-iw)/2:(oh-ih)/2:black" \
-c:v libx264 -r 24 -s nhd -pix_fmt yuv420p \
slide2.mp4
You need to experiment a lot with the filters to understand what will
work and what will not. A set of values that optimizes the file size well
for one source video may do poorly for another video. GIF optimization is
extremely unpredictable. Learn more from this article:
https://engineering.giphy.com/how-to-make-gifs-with-ffmpeg/
ffmpeg -y -i bw.m4v \
-filter:v "fps=10,scale=w=320:h=-1:flags=lanczos" \
-c:v ppm \
-f image2pipe - | \
convert -delay 10 - \
-loop 0 \
-layers optimize \
bw.gif
APNG
A better alternative to GIF animations is APNG. This format has limited
support from image-viewing and image-editing applications but has
near-universal support from desktop and mobile web browsers. Like PNG
and unlike GIF, APNG supports millions of colors. This means that its
colors will not have to be downsampled and will be very close to those
in the source content. APNG animation files are typically bigger than
animated GIFs.
If you are converting GIF animations to APNGs, then ImageMagick is
the tool you should use, not ffmpeg.
If you are converting a video to APNG, then you can use ffmpeg.
ffmpeg -i bw.m4v \
-vf "scale=w=250:h=-2, hqdn3d, fps=6" \
-dpi 72 -plays 0 \
bw.apng
This command generates only one image frame in the MP4. The image
frame is not encoded as a regular video stream for the entire duration of
the audio.
Figure 5-6. This video does not really have any video, just one frame
from an image
However, not all media players will accept this trickery. On my
computer, Totem media player does not show the image at all and plays it
like a regular audio file. VLC displays the image because it uses FFmpeg
internally. If your player shirks its duty, you will have to encode the image
for the full duration of the audio.
ffmpeg -y -i blobfish.mp3 \
-loop 1 -framerate 12 -i Blobfish_face.jpg \
-shortest -s qvga -c:a copy \
-c:v libx264 -pix_fmt yuv420p \
"Weird Fins - 17 - The Blobfish (no tricks).mp4"
# The album art image loops forever so the
# podcast audio creates the shortest output stream
https://youtube-dl.org
wget https://yt-dl.org/downloads/latest/youtube-dl \
-O ~/bin/youtube-dl
chmod +x ~/bin/youtube-dl
youtube-dl --version
You can make youtube-dl use ffmpeg to convert the downloaded files.
Many audio podcasts are posted to online video sites. To only listen to
them in the Audacious media player, I use a command like this:
youtube-dl will not only download and convert the audio (from
AAC) to MP3 (using ffmpeg), but it will also launch a command when the
conversion process is complete. That command can be for your media
player. youtube-dl will replace {} in the command string with the name of
the output (MP3) file.
ffmpeg -f lavfi \
-i "flite=textfile=speech.txt:voice=slt" \
speech.mp3
I like the male-only espeak utility better. The defaults are good, and
you can change several settings.
☞ If you want to extract still images from movies, optical media
is usually the best source.
Summary
In this chapter, you learned how to convert multimedia content in the form
of audio, video, image, and text. You also learned to customize conversion
settings to suit different formats, coder/decoders, and mediums. In the
next chapter, you will learn how to edit videos using ffmpeg.
CHAPTER 6
Editing Videos
I used to save DVDs as ISO files (whole-DVD backups) so that I could play
them on my media player box. Each ISO took up several gigabytes (GBs)
on my hard disk, and I eventually ran out of space. Now, I use FFmpeg and
store DVDs as MP4s of around just one GB.
While FFmpeg makes it very easy to convert multimedia files, as you
learned in the previous chapter, storing them in their entirety is not always
feasible or required. Sometimes, you need just a few clips, not the whole
shebang. You may want to combine parts of one video with parts of other
videos. You can also downsize the videos to conserve space. In ffmpeg
terms, you want to cut, concatenate, and resize videos. In this chapter, you
will learn to do just that.
Resize a Video
You can resize a video using the -s option. The dimension of a video
is usually specified as WidthxHeight. That is an “x” as in “x-mas” in the
middle. When editing or converting videos, you will have to specify the
video dimension using this syntax. The next command resizes a VGA-size
(640x480) video to a VCD-size (352x288) video.
ffmpeg -i dialup.mp4 \
-s 352x288 \
dialup.mpg
© V. Subhash 2023
V. Subhash, Quick Start Guide to FFmpeg, https://doi.org/10.1007/978-1-4842-8701-9_6
Table 6-1. FFmpeg option and values for setting the dimensions
of a video
ffmpeg -i "distorted.mpg" \
-vf setdar=dar=4/3 \
restored.mpg
Figure 6-1. The distortion in the background video was fixed using a
filter that changed the DAR (display aspect ratio)
These ratios may seem similar but there are subtle differences, as
presented in Table 6-2.
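The relationship between the frame size, the sample (pixel) aspect ratio, and the display aspect ratio can be sketched in a few lines of Python. The 720x576 frame with 16/15 pixels used here is a common PAL example, chosen as an assumption:

```python
from fractions import Fraction

def display_aspect_ratio(width, height, sar=Fraction(1, 1)):
    """DAR = storage aspect ratio (width/height) x sample aspect ratio."""
    return Fraction(width, height) * sar

# A PAL frame: 720x576 pixels that are not square (SAR 16/15)
print(display_aspect_ratio(720, 576, Fraction(16, 15)))  # 4/3
```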
Editing Options
Some often-used video- and audio-editing options are listed in Table 6-3.
☞ Use the -ss option before the -i option so that ffmpeg can
quickly jump to the location of the specified timestamp. If you place
it after the input file and before the output file, there will be a delay
as ffmpeg decodes all the data from the beginning to the timestamp
and then discards it (as it is not wanted)!
20 20 seconds
1:20 One minute and 20 seconds
02:01:20 Two hours, 1 minute, and 20 seconds
02:01:20.220 Two hours, 1 minute, 20 seconds, and 220 milliseconds
20.020 20 seconds and 20 milliseconds
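All of these forms reduce to a number of seconds; here is a small Python sketch of the conversion:

```python
def timestamp_to_seconds(ts):
    """Convert an FFmpeg-style [[HH:]MM:]SS[.mmm] timestamp to seconds."""
    total = 0.0
    for part in ts.split(":"):
        total = total * 60 + float(part)
    return total

print(timestamp_to_seconds("1:20"))          # 80.0
print(timestamp_to_seconds("02:01:20.220"))  # about 7280.22
```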
If the video segment that you want to remove is the ending, then use -t
option to specify the duration of the content that needs to be copied from
the beginning.
ffmpeg -i long-tail.mp4 \
-t 00:01:00 \
no-monkey.mp4
If you want to cut from the middle, then you need to use both options.
Figure 6-2. The ffprobe output shows settings that you can use for
the next ffmpeg task
There are disadvantages with this option too. The entirety of the audio and
video information may not be present at the timestamps you have specified
for FFmpeg to make a clean cut. A few seconds of the video may have to be
sacrificed or go out of sync. Out-of-sync audio by one or two seconds is not
really a problem in videos where the speaker remains in the background.
Use -codec copy only when the container of the output file supports
the existing codec of the input stream you are trying to copy. You cannot
copy streams from an OGV file to an MP4 file, but you can do that with an
MKV output file. First, check whether input codecs are among the default
codecs listed by the muxer of the output container.
These commands list the default extensions and codecs used by some
popular containers.
file '/tmp/video.mp4'
file '/home/yourname/Desktop/video1.mp4'
file '/media/USB1/DCIM/DS00002.mp4'
Ideally, the file locations should be relative to the current directory and
have simple file names. Because these files do not satisfy that condition,
I have used the option -safe 0 in this ffmpeg command. The next
command will re-encode the preceding input files using the specified MP4
settings.
ffmpeg -f concat \
-safe 0 \
-i list.txt \
-c:v libx264 -r 24 -b:v 266k -s qvga \
-c:a libmp3lame -r:a 44000 -b:a 64k -ac 2 \
mixology.mp4
I advise against the use of the -f concat demuxer. The output files have a
tendency to confuse and crash media players. If input videos are not of
the same type, the concatenation will fail or the output file will not be
playable. The same thing can happen if some of the input files are -codec
copy veterans. You are lucky if conversion starts at all. If you are forced to
use the concat demuxer, then read about it in the official documentation.
The text file supports other directives (not just file) to make it more
informative to the demuxer.
For more resilient concatenations, use the concat filter as described in
Chapter 7.
Whether you use -codec copy or the concat filter, all the input files
should be of the same type (same dimensions, codecs, frame rates, etc.).
Summary
FFmpeg provides some very neat options to edit multimedia files from
the command line. With some files, you may be able to -codec copy the
streams. With others, you will have to re-encode them. Both methods have
advantages and disadvantages.
In the next chapter, you will finally learn about the ffmpeg filters that I
have been teasing you with all along.
CHAPTER 7
Filter Construction
In an ffmpeg command, a filter is used to perform advanced processing
on the multimedia and metadata data decoded from the input file(s).
A simple filter consumes an input stream, processes it, and generates an
output stream. The input and output will be of the same type. An audio
filter (used with the option -filter:a or -af) consumes an audio stream
and outputs an audio stream. A video filter (specified by a -filter:v or -vf
option) consumes a video stream and outputs a video stream.
You can daisy-chain multiple simple filters to create a filter chain. In
such a filter chain, the output of one filter is consumed by a subsequent filter.
Thus, as a whole, the filter chain will also have one input and one output.
When such a linear filter chain is not possible, you need to use a
complex filtergraph (with the option -filter_complex). A complex
filtergraph can contain several filters or filter chains. The constituent filters
can have zero to several inputs. They can consume streams of different
types and output streams of different types. The number of inputs need not
match the number of outputs. It is not necessary for a filter to consume the
output of the previous filter.
Chapter 7 Using FFmpeg Filters
Some filters known as source filters do not have inputs. There are also
sink filters that do not generate any outputs.
In an ffmpeg command, you specify a filter in this fashion:
-filter:v "filter-name=option1=value1:option2=value2"
☞ There are lots of filters and you need to pore over pages of
documentation to find the one that will work for you.
Filter Errors
Sometimes, you will encounter a “No such filter” error. This is probably
because (out of habit) you placed a semicolon after the last filter. Some
filters have an exact number of inputs or outputs. If you fail to specify one
of them, ffmpeg will throw an error. Other common filter errors are caused
when a labeled input or output is not consumed. If you use an output label
more than once, you will get an “Invalid stream specifier” error. An output
stream can only be labeled once and used once. If you want to use a filter
output stream as input for more than one filter, use the split or asplit
filters to duplicate the stream.
PI (22/7)
E (Euler’s number or exp(1) ~ 2.718)
PHI (golden ratio or (1+sqrt(5))/2 ~ 1.618)
QP2LAMBDA 118
Several filters define their own constants. These are actually real-
time variables whose values can change depending on the input files, the
processing options, or even time. You need to look at the documentation
for each filter to see what these filter constants represent.
☞ When you specify a filter within double quotes (" "), the
commas separating the parameters of a function will have to
be escaped as \, to prevent ffmpeg from interpreting them as
delimiters used to separate two filters.
This command may also be written without the option names, using
only the values of the filter options.
☞ If you encounter such commands, they will seem very cryptic.
You will have to look up the filter in the official documentation or the
help output (ffmpeg -help filter=scale) and ascertain the
order of the used filter options.
The scale filter specifies actual width and height values (150:150)
to which the inset video needs to be resized. The overlay filter specifies
x- and y-coordinates of the top-left corner of the inset video on the news
report video. The x-coordinate uses a filter expression (W-w-20) with filter
constants W (width of the background video) and w (width of the inset
video) to correctly inset the video 20 pixels away from the right edge of the
background video. The y-coordinate is specified with the actual value, that
is, 20 pixels from the top edge.
The input for the scale filter is the inset video ([1:v] or the video stream
of the second input file). Its output is labeled [inset]. The inputs for the
overlay filter are the news report ([0:v] or the video stream of the first file)
and the output of the scale filter labeled previously as [inset]. The overlay
filter has one output and it is labeled [v]. This overlaid video and the original
audio of the news report (0:a:0) are then mapped into the output file.
To construct a filter expression with useful filter constants, you need to
refer to the documentation of the filter. If these expressions try to hurt your
brain (they will initially), you can specify explicit values. The preceding
command can be rewritten as follows:
Figure 7-2. The scale filter was used to reduce the height of the first
video. The pad filter has been used to expand the frame of the scaled
video. The overlay filter has been used to place the second video in
the empty area of the expanded frame
After the scale filter, the frame size of the scaled video is expanded
sideways so that the second video can be placed in the new empty area.
The pad filter uses the expression iw+332 to arrive at the new expanded
size of the frame. It then places the scaled video at the top-left corner (0:0)
of the new frame. That is, the scaled video will be on the left side of the
expanded frame.
In the empty area on the right side of the expanded frame ([frame]),
we place the second input file ([1:v]) using the overlay filter.
Without using filter expressions, the last ffmpeg command can be
rewritten with actual values as follows:
If you do not want the news video to be downscaled, then you could
put some white space (or, in this case, yellow space) around the second
video. In the next command, filter expressions and actual values have
been used to correctly position the second video in the middle of the
expanded frame.
☞ It is much easier and faster to use the filters hstack
and vstack. However, these filters require the input videos to have
the same pixel format (the data-encoding scheme of pixel color) and the
same dimensions (height for hstack and width for vstack).
☞ This will re-encode the input files, as will any other filter.
Specify the video and audio streams of the input clips or segments
in the order that they need to be appended by the filter. [0:v:0][0:a:0]
refers to the video and audio streams of the first input clip. [1:v:0]
[1:a:0] refers to the video and audio streams of the second clip. The
filter option n refers to the number of input clips. v refers to the number of
output video streams, and a refers to the number of output audio streams.
The concatenated video and audio streams are the filter outputs labeled as
[vo] and [ao]. These labeled outputs are then mapped to the output file.
ffmpeg -y -i barbara.mp4 \
-filter_complex \
"[0:v:0]trim=start=0:end=16, setpts=PTS-STARTPTS[lv];
[0:v:0]trim=start=36:end=44, setpts=PTS-STARTPTS[rv];
[0:a:0]atrim=start=0:end=16, asetpts=PTS-STARTPTS[la];
[0:a:0]atrim=start=36:end=44, asetpts=PTS-STARTPTS[ra];
[lv][rv]concat=n=2:v=1:a=0[v];
[la][ra]concat=n=2:v=0:a=1[a]" \
-map '[v]' -map '[a]' barb-cut.mp4
Rotate a Video
Some videos that people take from a mobile phone are rotated by 90
or 180 degrees from normal. You can manually fix them by specifying a
transpose filter.
# Rotate to right
ffmpeg -i slt.mp4 \
-filter:v "transpose=1" \
slt-rotated-1.mp4
# Rotate to left
ffmpeg -i slt.mp4 \
-filter:v "transpose=2" \
slt-rotated-2.mp4
For the transpose filter option dir, a value of 1 or 2 turns the video 90
degrees right or left. Values 0 and 3 turn the video left or right and also
vertically flip them. Mobile phone users should stick with the first two values.
Figure 7-4. These still images show dir values that can be used with
the transpose filter
ffmpeg -y -i malampuzha-lake.mp4 \
-filter_complex \
"rotate=angle=16*PI/180:fillcolor=brown" \
malampuzha-lake-tilt-16-chopped.mp4
# Rotates video but corners get cut off
As FFmpeg requires that the new width and height be even numbers,
that is, divisible by 2, the calculated dimensions are first divided by 2,
truncated, and then multiplied by 2.
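The bounding-box arithmetic can be sketched in Python; the 1280x720 frame and the 16-degree angle are assumptions for illustration:

```python
import math

def rotated_dimensions(w, h, degrees):
    """Width and height of the box enclosing a rotated frame,
    truncated to even numbers as FFmpeg requires."""
    a = math.radians(degrees)
    new_w = abs(w * math.cos(a)) + abs(h * math.sin(a))
    new_h = abs(w * math.sin(a)) + abs(h * math.cos(a))
    # divide by 2, truncate, and multiply by 2 -> even dimensions
    return int(new_w / 2) * 2, int(new_h / 2) * 2

print(rotated_dimensions(1280, 720, 16))  # (1428, 1044)
```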
Figure 7-5. The first video has the original dimensions, but the
rotated content has chopped-off corners. The second video has bigger
dimensions to accommodate the extruding corners
Flip a Video
Some videos are flipped for some reason. Use vflip or hflip to set
them right.
Figure 7-6. These still images show which filter to use for what effect
ffmpeg -i exhibit.mp4 \
-filter:v "vflip" \
exhibit-upside-down.mp4
ffmpeg -i exhibit.mp4 \
-filter:v "hflip" \
exhibit-half-crazy.mp4
ffmpeg -i exhibit.mp4 \
-filter:v "hflip,vflip" \
exhibit-totally-flipped.mp4
ffmpeg -y -i barbet.mp4 \
-filter_complex \
"[0:v]pad=(iw*2):ih:0:0[frame];
[0:v]eq=brightness=0.2[bright];
[bright]eq=saturation=3[color];
[color]eq=contrast=2[dark];
[frame][dark]overlay=W/2:0[out]" \
-map '[out]' -map 0:a \
barbet-test.mp4
Option Minimum Maximum Default
Brightness -1 1.0 0
Contrast -1000 1000 1
Saturation 0 3 1
Gamma 0.1 10 1
ffmpeg -y -i barbet.mp4 \
-filter_complex \
"[0:v]eq=brightness=0.2[bright];
[bright]eq=saturation=3[color];
[color]eq=contrast=2[dark]" \
-map '[dark]' -map 0:a \
barbet-bright.mp4
The test video has a color pattern, a scrolling gradient, and a changing
timestamp. The audio is a low white noise. I do not know who needs this
video, but if it floats your boat, then here is the command to create it.
ffmpeg -f lavfi \
-i "testsrc=size=320x260[out0];
anoisesrc=amplitude=0.06:color=white[out1]" \
-t 0:0:30 -pix_fmt yuv420p \
test.mp4
Remove Logo
In 2019, a newspaper in New York published an opinion alleging bias
against women in government experiments. NASA’s Apollo Space Program
was then celebrating its 50th anniversary.
ffmpeg -i apollo-program.mp4 \
-filter:v "delogo=x=520:y=10:w=100:h=50" \
apollo-program-you-are-dead.mp4
☞ After applying the filter, the logo has disappeared from the top-
right corner.
Mixing these two videos can be done with one command, but for
clarity, I have split it into four commands. (You should combine the filters
to avoid multiple re-encoding.) The crossfade effect is performed by the
fade filter for video and the afade filter for audio. The trim and atrim
filters are used to divide the video and audio tracks into two parts – one
where the stream plays normally and another where the fade filters take
effect. I used overlay and amix filters to mix the second parts. After
that, the concat filter was used to put three segments together – normal
playback from the first file, crossfade effect from both files, and then
normal playback from the second file.
[v1][fading][v2]concat=n=3:v=1:a=0[v]" \
-map '[v]' -pix_fmt yuv420p \
aliens-r-us-v.mp4
Crop a Video
For some screenshots in the beginning of this chapter, I needed a public-
domain video of a sign-language translator. I found one but it was too big.
I grabbed a still image from the video using a media player and edited it
in GIMP.
Figure 7-11. First, take a screengrab from the video. Then, use
an image-editing program to identify the location (150,12) and
dimensions (332,332) of the region you want to cut out
I then selected the region that I wanted to cut out. I noted down the
coordinates and dimensions of the region from GIMP’s Tool Options panel.
I used the details from GIMP in the options for a crop filter that I used on
the video.
ffmpeg -i how-to-vote.mp4 \
-filter:v "crop=332:332:150:12" \
accessibility.mp4
ffmpeg -i LED-Flip-Flop-Circuit.mp4 \
-filter:v
"smartblur=luma_radius=5:luma_strength=1.0:
luma_threshold=30" \
LED-Flip-Flop-Circuit-blurred.mp4
ffmpeg -i LED-Flip-Flop-Circuit.mp4 \
-filter:v
"smartblur=luma_radius=5.0:luma_strength=-1.0:
luma_threshold=30" \
LED-Flip-Flop-Circuit-sharpen.mp4
The smartblur filter can blur or sharpen videos without affecting the
outlines. It works on the brightness of the pixels. The luma_radius (0.1 to 5)
represents the variance of the Gaussian blur filter. luma_strength (-1 to 1)
varies from sharpening (negative values) to blurring (positive values).
luma_threshold (-30 to 30) shifts the focus of the filter from the edges to
the interior, flatter areas.
ffmpeg -y -i stilt.mp4 \
-filter_complex \
"[0:v]crop=260:80:400:550[c1];
[0:v]crop=100:60:1:550[c2];
[c1]boxblur=6:6[b1];
[c2]boxblur=6:6[b2];
[0:v][b1]overlay=400:550[v1];
[v1][b2]overlay=1:550[v]" \
-map '[v]' -map 0:a -c:a copy \
stilt-masked.mp4
☞ To avoid any doubt or confusion, I would like to state that I have
masked faces of private individuals (even in public-domain content) in
several screenshots using an image-editing program. In this screenshot,
however, the effect was achieved using the ffmpeg filter boxblur.
Draw Text
To draw text on video, you need to use the drawtext filter and also specify
the location of the font file. When you are drawing several pieces of text, it
is better to daisy-chain your texts (using commas, not semicolons).
ffmpeg -y -i color-test.mp4 \
-filter_complex \
"[0:v:0]drawtext=x=(w-tw)/2:y=10:fontcolor=white: \
shadowx=1:shadowy=1:text='Detonation Sequence': \
fontsize=25: fontfile=AllertaStencil.ttf, \
drawtext=x=(w-tw)/2:y=60:fontcolor=white: \
shadowx=1:shadowy=1: \
text='This TV will self-destruct in t seconds.': \
fontsize=15:fontfile=Exo-Black.ttf[v]" \
-map '[v]' -map 0:a:0 -pix_fmt yuv420p \
idiot-box-1.mp4
Draw a Box
You can use the drawbox filter to render all kinds of boxes, filled or bound,
with all sorts of colors and transparencies.
ffmpeg -y -i color-test.mp4 \
-filter_complex \
"[0:v:0]drawbox=x=20:y=3:w=280:h=36:[email protected]:
t=fill, \
drawbox=x=11:y=49:w=294:h=40:color=lime:t=1, \
drawtext=x=(w-tw)/2:y=10:fontcolor=white: \
shadowx=1:shadowy=1:text='Detonation Sequence': \
fontsize=25: fontfile=AllertaStencil.ttf, \
drawtext=x=(w-tw)/2:y=60:fontcolor=white: \
shadowx=1:shadowy=1: \
text='This TV will self-destruct in t seconds.': \
fontsize=15:fontfile=Exo-Black.ttf[v]" \
-map '[v]' -map 0:a:0 -pix_fmt yuv420p \
idiot-box-2.mp4
The part of the color value after the @ symbol refers to the transparency
level. It ranges from 0 (fully transparent) to 1 (opaque). If you specify the
value fill for the filter option t or thickness, then the box will be filled
with that color. Otherwise, it applies to the border.
Speed Up a Video
When you increase the playback speed of a video, its duration decreases.
When you slow down a video, its duration increases. There is no one filter
that changes the speed of both the audio and the video. You need to use
two different filters – one for video and one for audio. The two filters do not
work in the same way. The two need to be calibrated correctly so that the
same effect is achieved on both the audio and the video.
For the video, you need to set the setpts video filter to a fraction of the
PTS filter constant. If you want to double the speed of the video, divide PTS
by 2. If you want the video to be four times as fast, then divide PTS by 4. For
the audio, you need to use the atempo filter. The range of this filter is from
half the speed to 100 times. The following command fast-forwards a video
by four times (4x).
ffmpeg -y -i barb.mp4 \
-filter_complex \
"[0:v]setpts=PTS/4[v];
[0:a]atempo=4[a]" \
-map '[v]' -map '[a]' \
barb-speed.mp4
ffmpeg -y -i tom.mp4 \
-filter_complex \
"[0:v]setpts=PTS*4[v];
[0:a]atempo=0.5, atempo=0.5[a]" \
-map '[v]' -map '[a]' \
possessed-doll.mp4
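Because a single atempo instance cannot go below 0.5, slow-downs past half speed are built by chaining. Here is a Python sketch of the decomposition:

```python
def atempo_chain(factor):
    """Split a tempo factor below 0.5 into chained atempo values,
    each within the filter's supported 0.5..100 range."""
    chain = []
    while factor < 0.5:
        chain.append(0.5)
        factor /= 0.5
    chain.append(round(factor, 6))
    return chain

print(atempo_chain(0.25))  # [0.5, 0.5] -> "atempo=0.5, atempo=0.5"
print(atempo_chain(0.3))   # [0.5, 0.6]
```

A chain like [0.5, 0.6] would be written in the filter string as atempo=0.5, atempo=0.6.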
the same using FFmpeg and apply the change to the video as well. My
calculation became easier when I used seconds. The original video was
114 seconds, and my slowed-down audio was 128 seconds.
The links to these videos and those used in other examples in this book
are available online:
www.vsubhash.in/ffmpeg-book.html
Summary
The examples in this chapter would have amply demonstrated that a lot
of useful and powerful multimedia-processing abilities are hidden in
the filters functionality. You need to read the relevant documentation to
make full use of a filter. Filter expressions using built-in real-time variables
(filter constants) and functions give command-line users a versatility and
extensibility that would otherwise have been available only to programmers
who use the libav libraries.
In this book, the teaching portion about FFmpeg functionality ends
here. The subsequent chapters are topic-specific for those who want quick
answers to a particular type of problem and do not want to read through
dense explanatory text before finding the answer. You will find some
information repeated or not mentioned at all.
CHAPTER 8
The -Ow option makes Timidity output the playback in WAVE format. Its -o
option is used to specify the output file. Instead of an output file, we use -
to make it write to the standard output. The Timidity output is then piped
over to an FFmpeg command, where it is captured from the standard input
with yet another - (hyphen).
Change Volume
FFmpeg can increase the loudness of an audio file using its volume filter.
The filter accepts a multiple either as a number (scalar) or in decibels
(logarithmic).
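The two forms are related by the usual 20*log10 rule; here is a quick Python sketch:

```python
import math

def db_to_multiple(db):
    """Convert a decibel change to a linear volume multiple."""
    return 10 ** (db / 20)

def multiple_to_db(multiple):
    """Convert a linear volume multiple to decibels."""
    return 20 * math.log10(multiple)

print(round(db_to_multiple(16), 2))  # 6.31 -> volume=16dB
print(round(multiple_to_db(3), 1))   # 9.5  -> volume=3 is about 9.5 dB
```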
I had an audio file that continued to have low volume, even after
trebling the levels. I opened it in Audacity and found the reason.
Chapter 8 All About Audio
The volumedetect filter shows that we can safely increase the volume
by 16 dB. If we raised the volume by 17 dB or more, normalization would
cut into the waveform, and the peaks would get attenuated or chopped off.
At 17 dB, the six loudest sound samples in the waveform would be lost.
ffmpeg -i sarah.mp3 \
-af 'volume=16dB' -f ogg \
sarah-normalized.ogg
Figure 8-3. Audacity confirms that the volume has been increased
without cutting into the waveform
This is fine. Now, how do you decrease the volume? Well, choose a fraction
between 0 and 1 for the volume filter. For example, to decrease the volume
by two-thirds, set the multiplier to 0.33 (that is, ⅓).
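As a sketch, lowering the volume looks like this (the lavfi test tone and the filenames here are my own, not from the book):

```shell
# Generate a short test tone (assumes ffmpeg is installed)
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=2" tone.wav
# Reduce the volume to a third of the original
ffmpeg -y -i tone.wav -af 'volume=0.33' tone-quieter.wav
```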
ffmpeg -i sarah.mp4 \
-c:v copy \
-af 'volume=3' \
-c:a libmp3lame -b:a 128k \
sarah-more.mp4
ffmpeg -i sarah.mp4 \
-af 'volumedetect' \
-vn \
-f null \
/dev/null
# Displays that the loudest samples are at -17 dB
ffmpeg -y -i train-trip-low.mp3 \
-filter:a dynaudnorm=gausssize=3 \
train-trip-low-dynaudnormalized.mp3
Channels
An audio stream can have one or more channels. A channel is an
independent sequence of audio. All channels in an audio stream are of the
same length, and they are played back simultaneously. The idea of having
separate channels is to let different musical instruments or sounds play
on different speakers. Audio content creators may move sounds back and
forth between channels at different volume levels. This can be useful in
creating a 2D or 3D effect in the sound.
Typically, each channel in an audio stream is assigned to a particular
speaker. This composition of channels in a multichannel stream is known
as its channel layout. When the number of speakers is less than the
number of channels, some channels may not be heard, or the device may
downmix the channels so that the excess channels are heard on the
existing speakers.
Monaural audio has only one channel. Stereo music has two channels –
left and right. Movies can have two, six, seven, eight, or more channels.
When working with channels, you will need to use filters such as amerge,
channelmap, channelsplit, and pan. These filters make use of certain IDs
for channels and channel layouts. Table 8-1 and Table 8-2 list these IDs.
You can specify the channel settings using the map filter option in
this format:
input_channel_id-output_channel_id|input_channel_id-output_channel_id|...
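As a sketch of this map format, the following swaps the left and right channels of a stereo file (the lavfi-generated input and the filenames are assumptions of mine):

```shell
# Make a stereo test file (assumes ffmpeg is installed)
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=2" -ac 2 swap-src.wav
# Swap the channels: input FR goes to output FL, and vice versa
ffmpeg -y -i swap-src.wav \
  -af "channelmap=map=FR-FL|FL-FR:channel_layout=stereo" \
  swapped.wav
```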
The filter option l is used to specify the channel layout. After that, you
have to specify how much of what channel (in the input stream) you need
for each channel in the output audio stream. For specifying that proportion
or the gain, you can specify a multiple or a fraction. If you omit the gain, it
implies that you want that channel as is or that the gain is equal to 1 (one).
If you use 0 (zero), it means that you want that channel totally attenuated.
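A minimal pan sketch, downmixing a stereo stream to mono with a gain of 0.5 on each input channel (the lavfi test input and filenames are my own):

```shell
# Make a stereo test file (assumes ffmpeg is installed)
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=2" -ac 2 pan-src.wav
# Mix half of each input channel into the single output channel
ffmpeg -y -i pan-src.wav -af "pan=mono|c0=0.5*c0+0.5*c1" pan-mono.wav
```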
Because the first two of the mapped output audio streams need to be
freshly encoded as mono streams and the last mapped audio stream just
needs to be copied without re-encoding, encoder (-c) and channel count
(-ac) need to be specified on a per-stream basis.
# Downmix to mono
ffmpeg -i uncompressed-stereo.wav \
-ac 1 \
mono.mp3
# Upmix mono to two identical channels
ffmpeg -i mono.mp3 \
-ac 2 \
stereo-kind-of.mp3
# Downmix multichannel audio to stereo
ffmpeg -i AAC-LC-Channel-ID.mp4 \
-ac 2 \
stereo.mp3
# Slow MP4 was 128 seconds. The original was 114 seconds.
ffmpeg -i Laurie-Lennon-Slow.mp4 \
-i Laurie-Lennon-Original.mp4 \
-loop 1 -i bg.png \
-filter_complex \
"[0:v:0]scale=320:180[v1];
[1:v:0]scale=320:180[v2];
[2:v:0][v1]overlay=320:90[v3];
[v3][v2]overlay=0:90[v];
[0:a:0]channelsplit=channel_layout=mono[right];
[1:a:0]channelsplit=channel_layout=mono,apad[left];
[left][right]join=inputs=2:channel_layout=stereo[a]" \
ffmpeg -y -i dialup-modem.mp4 \
-filter_complex \
"[0:a]showwaves=s=160x90:mode=line[waves];
[0:v]drawbox=x=(iw-20-w):y=(ih-20-h):w=160:h=90:
c=black@0.5:t=fill[bg];
[bg][waves]overlay=x=(W-20-w):y=(H-20-h)[over]" \
-map '[over]' -map 0:1 \
dialup-modem-handshake.mp4
ffmpeg -i The-most-annoying-DIY-electronic-alarm.mp3 \
-filter_complex \
"showfreqs=s=640x320:mode=bar[v]" \
-map '[v]' -map 0:a:0 \
-c:v mpeg4 -b:v 466k -r 24 \
The-most-annoying-DIY-electronic-alarm.mp4
There are a few other filters similar to this one. Check the
documentation. These filters are very interesting.
Detect Silence
I have a shell script for censoring movies. (It uses FFmpeg, of course.) I
use it to protect kids from foul dialog and unsuitable scenes. It asks for
timestamps where the audio needs to be silenced and the video needs to
be blacked out. After it does the job, I need to double-check these locations
before the grand première on the TV. I use this command:
ffmpeg -i edited-movie.mp4 \
-filter:a "silencedetect" \
-vn -f null -
Silence the Video
Heck, you do not want sound at all! Just remove the audio stream.
ffmpeg -i music-video.mp4 \
-an \
-c:v copy \
sound-of-silence.mp4
ffmpeg -f lavfi \
-i "flite=textfile=speech.txt:voice=slt" \
speech.mp3
This library has an option for a female voice, but I like the male-only
espeak better. You can find other options for the flite filter option voice
by typing the following:
ffmpeg -f lavfi -i flite=list_voices=1
On my computer, this command lists awb, kal, kal16, rms, and slt as the
supported voices.
ffmpeg -i Stopmotion-hot-wheels.mp4 \
-filter:a "lowpass=frequency=1000" \
-codec:v copy \
Stopmotion-hot-wheels-audio-passed-low.mp4
The default option in Audacity was 1000 Hz for the frequency and 6 dB
per octave for the roll-off. The roll-off specifies how steeply the frequencies
are attenuated. The lowpass filter can apply a 3 dB roll-off if you set its
poles option to 1. The default 2 applies a 6 dB roll-off, and I did not have to
explicitly specify it in the above command.
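A sketch of the gentler roll-off mentioned above (the 2 kHz lavfi test tone and the filenames are assumptions of mine):

```shell
# Make a 2 kHz test tone (assumes ffmpeg is installed)
ffmpeg -y -f lavfi -i "sine=frequency=2000:duration=2" tone-2k.wav
# Apply a low-pass filter with a single pole for a 3 dB/octave roll-off
ffmpeg -y -i tone-2k.wav -af "lowpass=frequency=1000:poles=1" tone-2k-lp.wav
```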
Summary
In this chapter, you learned how to perform several tasks with audio
content. You may find it helpful to initially use Audacity to understand
audio problems. As you get more familiar with what ails audio content,
you can rely on FFmpeg entirely. FFmpeg has a ton of audio filters, and
this chapter used just a few of them. Check the FFmpeg documentation on
audio filters, and you will find more exciting things you can do with audio.
CHAPTER 9
Chapter 9 All About Subtitles
Did you notice something else with the above command? I subtitled
the movie in two formats (MP4 and MKV) using one command. With the
MP4, I had to encode the OGV streams because its codecs are not native
to the MP4 container. With the MKV, I could use -codec copy. The MKV
container supports a wide variety of codecs including those supported by
OGV and MP4. If you are backing up DVDs for long-term storage, choose
MKV. It is the best.
Figure 9-1. Subtitles burned into a video cannot be turned off with
the remote or a menu option
ffmpeg -i 2020-Jokebook1.ogv \
-filter_complex \
"drawbox=w=250:h=100:x=360:y=90:c=black@0.5:t=fill,
subtitles=2020-Jokebook1.ass" \
-c:v libx264 -r 24 \
2020-Jokebook1.mp4
☞ The black box was unnecessary. SSA has built-in support for
dynamic background boxes, as you will learn later.
☞ You should place the font file in the current directory or specify
its full path.
Figure 9-2. When subtitles are added as a stream, the viewer can
turn them on/off using the remote or with a menu option
However, I prefer not to do that. I download the SRT file, open it in
a GUI program called Gnome Subtitles, and save it as an SSA file. After this,
I run a Bash script on the .ass file to change its style statement. The style
statements generated by ffmpeg and Gnome Subtitles refer to Windows
fonts. These fonts are not available in Linux, and the resultant subtitles do
not look cool. My script uses a better style statement with a font I already
have installed in Linux.
ffmpeg version:
Style: Default,Arial,16,&Hffffff,&Hffffff,&H0,&H0,0,0,0,0,100,100,0,0,1,1,0,2,10,10,10,0
Style: Default,Tahoma,24,&H00FFFFFF,&H00FFFFFF,&H00FFFFFF,&H00C0C0C0,-1,0,0,0,100,100,0,0.00,1,2,3,2,20,20,20,1
My version:
Style: Default,Headline,20,&H00FFFFFF,&H006666EE,&H00000000,&HAA00EEEE,-1,-1,0,0,100,100,0,0.00,1,4,0,2,20,20,20,1
When I used this style in the book-read video, the subtitles…
Figure 9-3. In this video, the subtitles have a text outline. (This
eliminated the need to render a black box behind the subtitles using
an FFmpeg filter. SSA subtitles support multiple such styles in the
same file.) The subtitle shadow has been zeroed
Name refers to a subtitle display style. You can define and use many
different styles, not just the Default. The colors are in hexadecimal
AABBGGRR format. (Ese, are they loco? No. It is allegedly to help with
video-to-text conversion.) PrimaryColour is the color of the subtitle text.
OutlineColour is for the outline of the text. BackColour is the color of the
shadow behind the text. SecondaryColour and OutlineColour will be
automatically used when timestamps collide. Bold, italic, et al. are -1
for true and 0 for false. (Yeah, I know. The bash shell does the same.)
ScaleX and ScaleY specify magnification as a percentage (100 is the
original size). Spacing is additional pixel space between letters. Angle
specifies rotation (0-360), and its origin depends on Alignment.
BorderStyle uses 1 (outlined and drop-shadowed text), 3 (outline box and
shadow box), and 4 (outlined text and drop-shadow box). Outline represents
the border width (1-4) of the outline or the padding around the text in the
outline box. Shadow represents the offset (1-4) of the shadow from the text
or the space around the text in the shadow box. Alignment takes 1 (left),
2 (center), and 3 (right). If you add 4 to them, the subtitle appears at the
top of the screen. If you add 8, it goes to the middle. Then, we have the
margins from the left, right, and bottom edges of the screen. Encoding is 0
for ANSI Latin and 1 for Unicode (I think).
To really go bonkers with subtitles, I say we render subtitles with a
miasma of colors, location, and tilt.
Style: Default,Headline,22,&H6600FFFF,&H006666EE,&H660000FF,&H220066EE,-1,-1,0,0,100,100,0,25.00,3,4,4,2,20,20,120,1
Figure 9-4. This is truly subtitles gone wild. SSA subtitle format
offers the most control and options. There is a yellow shadow to the
red outline. Because the colors are translucent, their intersection
appears orange
☞ The codes that you can use for setting the language are
further described in Chapter 10.
ffprobe 2020-Jokebook1-subtitled-en-fr.mkv
ffmpeg -i dvd-movie-subtitled.mp4 \
dvd-movie-subtitle-default.ass
If the video has multiple subtitle streams, you need to specify mapping. The
next command saves the second subtitle stream in the input file as an SSA file.
ffmpeg -i 2020-Jokebook1-subtitled-en-fr.mkv \
-map 0:s:1 \
2020-Jokebook1-subtitle-fr.ass
Summary
Subtitles are available in several formats including SRT, Substation Alpha,
and MPEG-4 Timed Text. Substation Alpha is the most versatile
subtitle format, and MKV seems to be the best container for it. The style
specification for the Substation Alpha format may seem intimidating at
first but is accommodating enough to customize subtitles for a variety of
use cases.
CHAPTER 10
ffmpeg -y \
-i Uthralikavu-Pooram.mp3 \
-i Uthralikavu-Pooram-festival-fireworks.png \
-i Uthralikavu-Pooram-festival-crowds.png \
There are several options for the comment key, as defined in the ID3 tag
specification.
https://id3.org/id3v2.3.0
Chapter 10 All About Metadata
Figure 10-1. The album art displayed by different media players for
the same MP3 file can be different
ffmpeg -y -i Uthralikavu-Pooram-festival-fireworks.mp3 \
-map 0 \
-metadata title="Uthralikavu Pooram Festival" \
-metadata artist="V. Subhash" \
-metadata \
subject="Fireworks and crowds" \
-metadata album="Pooram festival fireworks" \
-metadata date="2013-12-26" \
-metadata genre="Event" \
☞ MP3 tag metadata gets added at the global level. It is not
stream-specific.
Figure 10-2. Media player support for MP3 tags may be buggy or
not 100%. Do not break your head just because some tags do not get
displayed by a media player
Export Metadata
You can export metadata to a text file using the -f ffmetadata option.
ffmpeg -i Kerala-Uthralikavu-Pooram-festival-fireworks.mp3 \
-f ffmetadata \
mp3-meta.txt
Import Metadata
Let us imagine that I modified the metadata in the text file (from the
previous section) using a text editor. Now, I want the updated metadata to
be imported back into the audio file. How can I do it?
ffmpeg -y \
-i Kerala-Uthralikavu-Pooram-festival-fireworks.mp3 \
-i mp3-meta-modified.txt \
-codec copy \
-map_metadata 1 \
Kerala-Uthralikavu-Pooram.mp3
Here, -map_metadata 1 refers to the second input file, that is, the
modified metadata file. (-map_metadata 0 would have simply copied
the metadata from the first input file, that is, the MP3 file. We did not
want that.)
Figure 10-4. An MP3 audio file and the album art extracted from it
If there is more than one album art image, you need to check the ffprobe
output and then extract the album art using a map.
The crowds image is identified as a video stream with index 0:2 (third
among all streams). To extract it, I should use the map 0:2. To be safer, I
refer to it as 0:v:1 (second video stream).
ffmpeg -i Kerala-Uthralikavu-Pooram-festival-fireworks.mp3 \
-map 0:v:1 \
crowds.png
ffmpeg -i "Sign_Language_-_How_To_Vote.mp4" \
-codec copy \
-map_metadata -1 \
how-to-vote.mp4
I have had portable media players that do not play MP3 files if they
have album art. Album art cannot be removed as metadata because it is
encoded as a video stream. So, I use -codec copy and specify a -map for
the audio stream. By omitting the video streams, the output file will not
have any album art.
ffmpeg -i Kerala-Uthralikavu-Pooram.mp3 \
-map 0:a \
-codec copy \
pooram.mp3
# You can also use -vn instead of the -map option
Figure 10-6. This video has audio tracks in three languages. The
metadata for the audio streams helps identify the languages
The following command sets the language names using ISO codes and
makes the menus a lot more informative.
ffmpeg -i how-to-create-a-speaker-instructions.mp4 \
-map 0 \
-metadata:s:a:0 language=eng \
-metadata:s:a:1 language=mal \
-metadata:s:a:2 language=tam \
-codec copy \
how-to-create-a-speaker-instructions-multilang.mp4
-map 0 includes all streams in the first input file (#0), that is, including
the video stream and the three audio streams. (If it is not used, there will
be just one video stream and one audio stream in the output file.)
-metadata:s is used to set metadata for a stream; the s stands for stream,
not subtitle.
www.loc.gov/standards/iso639-2/php/code_list.php
Summary
In this chapter, you learned to use ffmpeg to easily add, examine, edit,
export, import, and remove metadata. Metadata can be specified at the
container level (global) and for individual streams. This information can
greatly enrich the experience with media players. In their absence, media
players will try to make guesses and/or frustrate you with generic or wrong
interface choices. Media formats and software/hardware applications may
be picky and choosy about the kind of metadata they support.
With the end of this chapter, all that remains is a set of tips and tricks
that could not be accommodated anywhere else.
CHAPTER 11
FFmpeg Tips and Tricks
I like tips and tricks, and I cannot lie. I have written an entire book titled
Linux Command-Line Tips and Tricks. The tips and tricks in this chapter
are mostly about FFmpeg. If you spend a lot of time with FFmpeg, these
tips will be useful. Some advanced FFmpeg solutions, which could not be
accommodated elsewhere, are also included.
Customize the Terminal
FFmpeg commands tend to be very long. Modify the ~/.bashrc file to
ensure that you have enough real estate at the prompt.
PS1="\a\n\n\[\e[31;1m\]\u@\h on \d at \@\n\[\e[33;1m\]\w\[\e[0m\]\n\[\e[32;1m\]\$ \[\e[0m\]"
PS1="\a\n\n\[\e[31;1m\]\w\n\[\e[33;1m\]\$ \[\e[0m\]"
Chapter 11 FFmpeg Tips and Tricks
Figure 11-3. In the year 2020, the context menu options in the file
manager of my desktop became so heavy with FFmpeg automation
scripts that I decided to write this book
Figure 11-4. In the Mate desktop, the file manager is called Caja. You
can use the app Caja Actions Configuration to add your own custom
context-sensitive menu options inside Caja
Hide the Banner
In older versions of FFmpeg, you could hide the big banner that it displays
by redirecting it to the null device (2> /dev/null or 2> NUL). In new
versions of FFmpeg, you can use the new -hide_banner option. Typing this
option is a hassle, so it is better if you create command aliases as described
in Chapter 2.
The log output of FFmpeg is classified into several types of
information. You can use the -loglevel option to specify what types
you would like to see. You can also choose to see nothing. Check the
documentation for what you would like.
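A sketch of a quiet run combining both options (the lavfi test input and filename are my own):

```shell
# Suppress the banner and show only errors (assumes ffmpeg is installed)
ffmpeg -hide_banner -loglevel error -y \
  -f lavfi -i "sine=frequency=440:duration=1" quiet-test.wav
```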
ffplay can be prevented from displaying a window or console output
using the option -nodisp. This is useful when playing audio files in shell
scripts.
Figure 11-5. I built this boombox. The MP3 player module’s display
is limited to song numbers
I use a script that iterates through the MP3 files in a directory and
adds an espeak intro to the files. If a file is named “Fibber-n-Molly-Fibber-
Makes-His-Own-Chili-Sauce.mp3,” then espeak intelligently reads it out
as “Fibber n Molly Fibber Makes His Own Chili Sauce.” This audio is first
saved to a wave file and then concatenated to the MP3 file.
-map_metadata 1 -id3v2_version 3 \
-map "[outa]" \
-c:a libmp3lame \
"AnnD/${sFileName}-AnnD.mp3" 2> /dev/null
echo -e "\tsaved to AnnD/${sFileName}-AnnD.mp3"
rm "${sFileName}.wav"
done
Although FFmpeg can be used to tag MP3 files, I prefer to use EasyTAG
instead. It has sophisticated options to mass-rename files and set MP3
tags. (I am not an FFmpeg fanatic and neither should you be one.) After I
neatly tag and name the MP3 files, I let this FFmpeg script do its thing.
ffmpeg -i uncompressed.wav \
-c:a libmp3lame \
-qscale:a 4 \
good.mp3
To create an old-school CBR MP3 file, use the -b:a (bitrate) option.
ffmpeg -i uncompressed.wav \
-c:a libmp3lame \
-b:a 128k \
goot.mp3
Colors in Hexadecimal
Any color can be represented as a combination of three color channels:
red (R), green (G), and blue (B). Sometimes, a fourth value called alpha or
transparency is also specified. When alpha is present, the colors can range
from totally transparent to totally opaque. When alpha is not present, the
color can range from black to the full color.
Colors are often represented in hexadecimal. This is a numbering
system with a base of 16. It uses the numerals 0 to 9 and the letters A to F.
The alpha channel affects the other three channels. It needs to be at
its maximum value (F or FF) for a color to be totally opaque. If it is any
less, the entire color will have some transparency, and any object below
the color will show through it. If there is only one layer of color, the
alpha value has no relevance and will be ignored.
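A sketch of an eight-digit hexadecimal color (0xRRGGBBAA) in action: half-transparent red rendered to a single PNG frame (the filename and lavfi input are my own):

```shell
# Fill one frame with half-transparent red and keep the alpha channel
# (assumes ffmpeg is installed)
ffmpeg -y -f lavfi -i "color=c=0xFF000080:size=320x240,format=rgba" \
  -frames:v 1 red-half-alpha.png
```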
Colors in Literal
While you can specify colors in hexadecimal, FFmpeg also supports some
literal names, as listed in Table 11-1. These are the same nonstandard color
names that Microsoft introduced in Internet Explorer and that Firefox was
forced to adopt to maintain compatibility. In addition to these color names,
you can also use the literal random and let FFmpeg choose a color at random.
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High
codec_type=video
codec_tag_string=avc1
codec_tag=0x31637661
width=640
height=480
coded_width=640
coded_height=480
closed_captions=0
film_grain=0
has_b_frames=2
sample_aspect_ratio=4:3
display_aspect_ratio=16:9
pix_fmt=yuv420p
level=30
color_range=unknown
color_space=unknown
color_transfer=unknown
color_primaries=unknown
chroma_location=left
field_order=progressive
refs=1
is_avc=true
nal_length_size=4
id=0x1
r_frame_rate=90000/2999
avg_frame_rate=90000/2999
time_base=1/90000
start_pts=0
start_time=0.000000
duration_ts=1802399
duration=20.026656
bit_rate=488521
max_bit_rate=N/A
bits_per_raw_sample=8
nb_frames=601
nb_read_frames=N/A
nb_read_packets=N/A
extradata_size=40
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
DISPOSITION:captions=0
DISPOSITION:descriptions=0
DISPOSITION:metadata=0
DISPOSITION:dependent=0
DISPOSITION:still_image=0
TAG:language=eng
TAG:handler_name=VideoHandle
TAG:vendor_id=[0][0][0][0]
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
profile=LC
codec_type=audio
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=fltp
sample_rate=48000
channels=2
channel_layout=stereo
bits_per_sample=0
id=0x2
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=0
start_time=0.000000
duration_ts=960000
duration=20.000000
bit_rate=129267
max_bit_rate=N/A
bits_per_raw_sample=N/A
nb_frames=939
nb_read_frames=N/A
nb_read_packets=N/A
extradata_size=5
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
DISPOSITION:timed_thumbnails=0
DISPOSITION:captions=0
DISPOSITION:descriptions=0
DISPOSITION:metadata=0
DISPOSITION:dependent=0
DISPOSITION:still_image=0
TAG:language=eng
TAG:handler_name=SoundHandle
TAG:vendor_id=[0][0][0][0]
[/STREAM]
This output can also be produced in several other formats, such as INI and
CSV. For more information, check the documentation that came with your
version of FFmpeg.
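As a sketch, here is a duration printed as bare CSV (the lavfi test file and its name are my own):

```shell
# Make a 2-second test file, then print its duration without keys or wrappers
# (assumes ffmpeg and ffprobe are installed)
ffmpeg -y -f lavfi -i "sine=frequency=440:duration=2" probe-test.wav
ffprobe -v error -show_entries format=duration -of csv=p=0 probe-test.wav
```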
With the -sections option, you can find out how the streams
information is organized.
ffprobe -sections
Sections:
W.. = Section is a wrapper (contains other sections, no local entries)
.A. = Section contains an array of elements of the same type
..V = Section may contain a variable number of fields with variable keys
FLAGS NAME/UNIQUE_NAME
---
W.. root
.A. chapters
... chapter
..V tags/chapter_tags
... format
..V tags/format_tags
.A. frames
... frame
..V tags/frame_tags
.A. side_data_list/frame_side_data_list
... side_data/frame_side_data
.A. timecodes
... timecode
.A. components
... component
.A. pieces
... section
.A. logs
... log
... subtitle
.A. programs
... program
..V tags/program_tags
.A. streams/program_streams
... stream/program_stream
... disposition/program_stream_disposition
..V tags/program_stream_tags
.A. streams
... stream
... disposition/stream_disposition
..V tags/stream_tags
.A. side_data_list/stream_side_data_list
... side_data/stream_side_data
.A. packets
... packet
..V tags/packet_tags
.A. side_data_list/packet_side_data_list
... side_data/packet_side_data
... error
... program_version
.A. library_versions
... library_version
.A. pixel_formats
... pixel_format
... flags/pixel_format_flags
.A. components/pixel_format_components
... component
[STREAM]
duration=20.026656
[/STREAM]
20.026656
Now, the command output contains just the duration value. Similarly,
other details about an input file can be atomized. Your shell script or some
other program can capture these values for further processing.
ffmpeg -y -i train.mp4 \
-r 1 \
-f image2 \
nofilter-still%02d.jpg
Figure 11-9. The source video was taken from a moving train.
The first command took still images at regular intervals without
consideration for image quality. The second command only took the
I frames with maximal detail and less pixelation
This command takes a still after the 20-second mark. When there is a
lot of action at that timestamp, ffmpeg may not be able to find an I frame
there, and this may result in some inevitable pixelation.
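A timestamp-based grab can be sketched like this (the lavfi test clip and the filenames are assumptions of mine):

```shell
# Make a 30-second test video (assumes ffmpeg is installed)
ffmpeg -y -f lavfi -i "testsrc=duration=30:size=320x240:rate=24" seek-test.mp4
# Seek to the 20-second mark and save a single frame
ffmpeg -y -ss 20 -i seek-test.mp4 -frames:v 1 still-at-20s.jpg
```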
#!/bin/bash
# BASH script to create a 3x3 thumbnail gallery for a video
# Accepts the pathname of the video as argument ($1)
###########################################################
# Floating point number functions by Mitch Frazier
# Adapted from
# https://www.linuxjournal.com/content/floating-point-math-bash
###########################################################
NUMBER_OF_THUMBNAILS=9
MOVIE="$1"
COUNTER=0
# Number of seconds
MOVIE_DURATION=$(ffprobe \
-show_entries "format=duration" \
-of "default=nokey=1:noprint_wrappers=1" \
-i "$MOVIE" 2> /dev/null)
#echo $MOVIE_DURATION
TW=$(float_eval "$MOVIE_WIDTH/3")
TH=$(float_eval "$MOVIE_HEIGHT/3")
THUMB_WIDTH=${TW%.*}
THUMB_HEIGHT=${TH%.*}
#echo "$THUMB_WIDTH x $THUMB_HEIGHT"
"($i-0.5)*$MOVIE_DURATION/$NUMBER_OF_THUMBNAILS")
#echo $LOCATION_FLOAT
LOCATION_INT=${LOCATION_FLOAT%.*}
#echo $LOCATION_INT
Record from Microphone
A PC may have more than one sound device: the built-in sound card, the
webcam, HDMI output audio, and sometimes even a USB microphone. In
Linux, these sound cards are identified as hw:0, hw:1, and so on. You have
to find out which one you are using or which one can record audio through
its microphone. Check your desktop sound configuration utility, and
ensure that it is responding to noises in your room. After this,
• If you are then able to record from the device using the
number of the card with this command, then you are all
set to record from the microphone.
If you are still unable to record, then audio capture may have been
disabled.
Record from Webcam
Recording from a webcam or grabbing the screen output in Windows is not
easy. There is a FOSS tool called CamStudio that internally uses FFmpeg. If
you are able to use it, then follow the FFmpeg Wiki on the topic.
In Linux, things are very easy. Even then, install Cheese or a similar
webcam application before you use ffmpeg. Ensure that the device is
working properly. Check the preferences and leave it at the best settings.
Then, close it and try this ffmpeg command:
ffmpeg -y -f v4l2 \
-i /dev/video0 \
-s vga -r 12 -b:v 466k \
-t 0:0:10 \
webcam.ogv
☞ To tell you the truth, I do not use webcams anymore. This
command was tested on an ancient Logitech cam that still works
fine. Check the settings supported by your hi-res Hasselblad, and
update the size and bitrate options accordingly.
Screen Capture
The -f x11grab format option can be used to capture the video display
(-i :0.0, known as X in Linux). You need to specify the capture settings in
the order given here, that is, -f, -s, -i, and -b. The frame rate can be 12 at
the minimum. Otherwise, the output capture file will be very big.
☞ Replace the value for the -s option with the pixel resolution of
the screen you are trying to grab.
You can capture sound playing on the speakers while you grab the
screen. The input to capture (sound mixer) is not easy to nail down on my
computer, so this command uses the PulseAudio hack (-i pulse) again.
-ac 2 \
-t 0:0:10 \
-y screen2.ogv
ffmpeg -y -i subhash-browser-rss-demo.mp4 \
-ignore_loop 0 -i animation-download.gif \
-filter_complex \
"[0:v:0]overlay=(W-w-10):(H-h)/2:shortest=1[v]" \
-map '[v]' -map 0:a:0 \
-c:v libx264 -c:a copy \
subhash-browser-rss-demo-with-download-button.mp4
ffmpeg -y -i subhash-browser-rss-demo.mp4 \
-stream_loop -1 -i animation-download.gif \
-filter_complex \
"[0:v:0]overlay=(W-w-10):(H-h)/2:shortest=1[v]" \
-map '[v]' -map 0:a:0 \
-c:v libx264 -c:a copy \
subhash-browser-rss-demo-with-download-button.mp4
The shortest=1 option in the overlay filter ensures that the filter
processing ends when the output from the video file has been completed.
Otherwise, the endlessly looping GIF animation would continue the
processing forever.
Figure 11-12. This GIF is also animated over the video in the
background window
ffmpeg -y -i rollcage-video.mp4 \
-filter:v \
"drawtext=x=100:y=h-lh-100:
shadowcolor=FFFFFF66:shadowx=1:shadowy=2:
fontfile=Time.ttf:fontcolor=00000066:fontsize=70:
timecode=\'00\:00\:00\:00\':timecode_rate=29.91" \
race-timer.mp4
ffmpeg -f lavfi \
-i anullsrc \
-vn -t 0:0:12 -b:a 128k -c:a libmp3lame \
silent.mp3
This command uses the filter as a virtual input file. The anullsrc filter
does not require an input file. By default, it generates a 44100 Hz wave as
output. This one will have no sound though.
ffmpeg -f lavfi \
-i "sine=frequency=220:beep_factor=3:duration=20" \
sine.wav
ffmpeg -f lavfi \
-i aevalsrc='sin(1000*PI*t*lt(t-trunc(t)\,0.1))' \
-t 0:0:20 sine.wav
The “brown” noise is closer to the sound that a TV generates when its
CATV signal cable is unplugged.
ffmpeg -y -i barbara.mp4 \
-filter_complex \
"[0:v:0]noise=alls=100:allf=a+t:enable='between(t,6,12)'[v];
[0:a:0]atrim=start=0:end=6, asetpts=N/SAMPLE_RATE/TB[fa];
anoisesrc=color=brown:d=6[ma];
[0:a:0]atrim=start=12:end=20, asetpts=N/SAMPLE_RATE/TB[la];
[fa][ma][la]concat=n=3:v=0:a=1[a]" \
-map "[v]" -map "[a]" \
-t 0:0:20 \
barb-intermission.mp4
This command uses a video noise filter between seconds 6 and 12. In
the same interval, the aforementioned brown noise is used in place of the
original audio.
ffmpeg -y -i barbara.mp4 \
-filter_complex \
"[0:a:0]atrim=start=0:end=5, asetpts=N/SAMPLE_RATE/TB[a1];
sine=frequency=1000:duration=2[a2];
[0:a:0]atrim=start=7:end=10, asetpts=N/SAMPLE_RATE/TB[a3];
[a1][a2][a3]concat=n=3:v=0:a=1[a]" \
-map 0:v:0 -map '[a]' \
-t 0:0:10 \
barb-bleep.mp4
ffmpeg -y -i barbara.mp4 \
-filter_complex \
"[0:a:0]atrim=start=0:end=5, asetpts=N/SR/TB[a1];
[0:a:0]atrim=start=6:end=12, asetpts=N/SR/TB,
aecho=0.8:0.9:1000:0.3[a2];
[0:a:0]atrim=start=13:end=16, asetpts=N/SR/TB[a3];
[a1][a2][a3]concat=n=3:v=0:a=1[a]" \
Reverse a Video
In some of his movies, Jim Carrey does a live rewind of a shot. Does he
sound intelligible if you rewind that footage?
ffmpeg -y -i ace-ventura-reverse.mp4 \
-filter_complex \
"[0:v:0]reverse[v]; [0:a:0]areverse[a]" \
-map '[v]' -map '[a]' \
ace-ventura-reverse-reversed.mp4
Figure 11-15. Use the xfade filter to transition from one video to
another. Use the acrossfade filter to do the same for audio
If you get any time base or frame rate errors because of differences in
the videos, try this instead:
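One approach that typically clears these errors is to normalize the frame rate and timebase of both inputs before xfade. Here is a self-contained sketch of that idea; the lavfi test sources, sizes, and the offset value are stand-ins for your real clips (offset assumes a 5-second first clip with a 1-second fade):

```shell
# Normalize both inputs to the same frame rate, timebase, and pixel
# format before xfade; offset = (first clip duration) - (fade duration).
ffmpeg -y -hide_banner -loglevel error \
  -f lavfi -i testsrc=duration=5:size=320x240:rate=30 \
  -f lavfi -i testsrc2=duration=5:size=320x240:rate=25 \
  -filter_complex \
  "[0:v]fps=24,settb=AVTB,format=yuv420p[v0];
   [1:v]fps=24,settb=AVTB,format=yuv420p[v1];
   [v0][v1]xfade=transition=fade:duration=1:offset=4[v]" \
  -map '[v]' faded.mp4
```

The fps and settb filters give both streams identical frame rates and timebases, which is what xfade complains about when the inputs differ.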
ffmpeg -i sine.wav \
-filter_complex \
"showwaves=s=vga:mode=cline:draw=full:
colors=yellow[v]" \
-map '[v]' -map 0:a:0 \
-c:v mpeg4 -b:v 300K -r 24 \
sine-wave.mp4
ffmpeg -y -i ace-ventura-reverse.mp4 \
-lavfi "showwavespic=s=600x120:split_channels=1:
colors=yellow|red:scale=sqrt" \
ace-waveform.png
ffmpeg -y -i chenda-music.mp4 \
-filter_complex \
"[0:a:0]showfreqs=s=250x100:mode=bar:cmode=separate:
colors=orange|red[chartf];
[0:a:0]showvolume=w=250:h=50:p=0.6:dm=2:dmc=red[chartv];
[0:a:0]showwaves=s=250x100:mode=cline:draw=full:
colors=yellow|orange:split_channels=1[chartw];
color=color=black@0:size=vga[bg];
[bg][chartf]overlay=x=20:y=20[v1];
[v1][chartv]overlay=x=20:y=150[v2];
[v2][chartw]overlay=x=20:y=280[v3];
[0:v:0][v3]overlay[v]" \
-map '[v]' -map 0:a:0 -shortest \
chenda-music-sound-levels.mp4
The colorkey filter takes three parameters: the color to be made transparent, the similarity (how strictly shades close to the specified color are also made transparent), and the blend (by how much the transparent pixels should blend with the background).
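Here is a self-contained sketch of those three parameters, with lavfi test sources standing in for real green-screen footage:

```shell
# Key out green from a test pattern (colorkey=color:similarity:blend)
# and overlay the result on a gray background.
ffmpeg -y -hide_banner -loglevel error \
  -f lavfi -i color=c=gray:size=320x240:duration=1 \
  -f lavfi -i testsrc=duration=1:size=320x240:rate=25 \
  -filter_complex \
  "[1:v]format=rgba,colorkey=color=green:similarity=0.25:blend=0.1[fg];
   [0:v][fg]overlay[v]" \
  -map '[v]' keyed.mp4
```

A higher similarity removes a wider range of green shades; a higher blend feathers the edges of the keyed region instead of cutting them out abruptly.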
ffmpeg -i color-test.mp4 \
-filter:v "colorhold=yellow:similarity=0.2" \
hold-yellow.mp4
Figure 11-20. Using the colorhold filter, all colors in the original
video have been removed except yellow
I studied the build script and made a few changes to one of the files
extracted from the tarball (the downloaded compressed source code).
Then, I ran the make and make install commands to build the binaries.
Now, the version number is more meaningful. If I have to deal with multiple
ffmpeg binaries sometime in the future, this information will be useful.
Figure 11-22. The -version option displays the git label (for whatever it is worth), your build date, and the number of the last release version
This of course assumes that you will build the binaries on the same day
you downloaded the source.
Hardware Acceleration
Computer video cards have hardware encoders and decoders for some popular codecs built into their chips. These hardware encoders and decoders are faster than software encoders and decoders running on the CPU. You can offload the encoding and decoding operations of supported codecs from the processor (CPU) on your computer’s motherboard to the processor chip (GPU) on your graphics card. (AMD calls its chips that combine a CPU and a GPU APUs.)
What the heck is all that? Well, instead of encoding the video using your CPU with a software encoder, you can offload the processing to your video card by selecting one of its hardware encoders.
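For example (input.mp4 and output.mp4 are placeholder names; h264_nvenc, h264_amf, and h264_qsv are FFmpeg's hardware H.264 encoders for NVIDIA, AMD, and Intel cards, and whether each one works depends on your build and hardware):

```shell
# Software encoding on the CPU:
ffmpeg -i input.mp4 -c:v libx264 output.mp4

# Hardware-accelerated encoding on the GPU:
ffmpeg -i input.mp4 -c:v h264_nvenc output.mp4   # NVIDIA
# or
ffmpeg -i input.mp4 -c:v h264_amf output.mp4     # AMD
# or
ffmpeg -i input.mp4 -c:v h264_qsv output.mp4     # Intel Quick Sync
```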
Is that not cool? Well, to use such an exotic option, you need an FFmpeg build with support for your video card, such as one built from source code forked by one of the participating video card manufacturers. You can find more information on this topic at the following links:
https://docs.nvidia.com/video-technologies/video-codec-sdk/ffmpeg-with-nvidia-gpu/
https://trac.ffmpeg.org/wiki/HWAccelIntro
Beware that not all GPU models are supported. In some cases, performance may be inferior, or there may be additional restrictions. nVidia seems to have shown more interest and openness in this field than AMD or Intel. I have AMD hardware and could not find enough documentation to build from source.
It is better if you can get statically linked builds created by someone
else. For Windows users, the builds provided by the reviewer on his
website (www.gyan.dev) had support for hardware-accelerated encoders
and decoders in AMD and nVidia GPUs.
☞ No, wine will not work. I used it only to take this screenshot of
the encoder listing.
☞ The hevc encoders are for the newer H.265 codec. Try ffmpeg -hwaccels to see what hardware-accelerated options you have.
Apart from encoders and decoders, you can install some hardware-
accelerated filters when you build from source.
Finis
All right! What does this command do?
ffmpeg \
-f image2 -loop 1 -i BG-Collage.png \
-f mp4 -i idiot-box-2.mp4 -i chenda-music-sound-levels.mp4 \
-i Delphine-with-accessibility.mp4 \
-i race-timer.mp4 -i slide.mp4 \
-i watermarked-solar.mp4 \
-filter_complex \
"[0:v:0]drawtext=x=(w-tw)/2:y=15:
fontcolor=red:alpha=0.6:shadowx=1:shadowy=2:
This command creates a video that has six downscaled videos playing simultaneously on a background image. The audio from the five input files that had audio streams was downmixed to stereo. (The slideshow had no audio.) Even the text on the background was rendered by ffmpeg.
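The tiling at the heart of such a collage can be sketched in a self-contained way. This hypothetical example tiles four lavfi test sources in a 2×2 grid with xstack; a full collage like the one described would additionally overlay the tiles on a background image, draw text, and downmix the audio:

```shell
# Tile four equally sized video streams in a 2x2 grid with xstack.
# The layout string gives each tile's top-left corner: w0 and h0 are
# the width and height of the first input.
ffmpeg -y -hide_banner -loglevel error \
  -f lavfi -i testsrc=duration=3:size=320x240:rate=25 \
  -f lavfi -i testsrc2=duration=3:size=320x240:rate=25 \
  -f lavfi -i smptebars=duration=3:size=320x240:rate=25 \
  -f lavfi -i color=c=blue:size=320x240:duration=3:rate=25 \
  -filter_complex \
  "[0:v]format=yuv420p[v0];[1:v]format=yuv420p[v1];
   [2:v]format=yuv420p[v2];[3:v]format=yuv420p[v3];
   [v0][v1][v2][v3]xstack=inputs=4:layout=0_0|w0_0|0_h0|w0_h0[v]" \
  -map '[v]' collage.mp4
```

For inputs of different sizes, insert a scale filter before each format filter so that every tile matches the layout.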
Figure 11-25. This video collage was created using several FFmpeg
techniques described in this book
This video and several others used in this book are available in an
online video playlist. You can find its link on these sites:
www.apress.com/9781484287002
www.vsubhash.in/ffmpeg-book.html
What Next…
Well, you have finished the book. What else can you do?
• https://superuser.com/questions/tagged/ffmpeg
• https://video.stackexchange.com/questions/
tagged/ffmpeg
http://ffmpeg.org/donations.html
CHAPTER 12
Annexures
Annexure 1: Sample List of Codecs
This annexure contains sample output for the command ffmpeg -codecs.
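The full listing runs to many pages. Assuming ffmpeg is on your PATH, you can filter the output for what you need instead of scrolling through it:

```shell
# Search the codec list for a particular codec family (PCM here):
ffmpeg -hide_banner -codecs | grep -i pcm
```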
Codecs:
D..... = Decoding supported
.E.... = Encoding supported
..V... = Video codec
..A... = Audio codec
..S... = Subtitle codec
..D... = Data codec
..T... = Attachment codec
...I.. = Intra frame-only codec
....L. = Lossy compression
.....S = Lossless compression
-------
D.VI.S 012v Uncompressed 4:2:2 10-bit
D.V.L. 4xm 4X Movie
D.VI.S 8bps QuickTime 8BPS video
.EVIL. a64_multi Multicolor charset for Commodore 64 (encoders:
↳ a64multi)
.EVIL. a64_multi5 Multicolor charset for Commodore 64, extended with
↳ 5th color (colram) (encoders: a64multi5)
D.V..S aasc Autodesk RLE
D.V.L. agm Amuse Graphics Movie
D.VIL. aic Apple Intermediate Codec
DEVI.S alias_pix Alias/Wavefront PIX image
DEVIL. amv AMV Video
D.V.L. anm Deluxe Paint Animation
D.V.L. ansi ASCII/ANSI art
DEV..S apng APNG (Animated Portable Network Graphics) image
D.V.L. arbc Gryphon's Anim Compressor
D.V.L. argo Argonaut Games Video
DEVIL. asv1 ASUS V1
DEVIL. asv2 ASUS V2
D.VIL. aura Auravision AURA
Annexure 2: Sample List of Formats
This annexure contains sample output for the command ffmpeg -formats.
DE ivf On2 IVF
D ivr IVR (Internet Video Recording)
D j2k_pipe piped j2k sequence
DE jacosub JACOsub subtitle format
D jpeg_pipe piped jpeg sequence
D jpegls_pipe piped jpegls sequence
D jpegxl_pipe piped jpegxl sequence
D jv Bitmap Brothers JV
D kux KUX (YouKu)
DE kvag Simon & Schuster Interactive VAG
E latm LOAS/LATM
D lavfi Libavfilter virtual input device
D libcdio
D libgme Game Music Emu demuxer
D libmodplug ModPlug demuxer
D libopenmpt Tracker formats (libopenmpt)
D live_flv live RTMP FLV (Flash Video)
D lmlm4 raw lmlm4
D loas LOAS AudioSyncStream
DE lrc LRC lyrics
D luodat Video CCTV DAT
D lvf LVF
D lxf VR native stream (LXF)
DE m4v raw MPEG-4 video
E matroska Matroska
D matroska,webm Matroska / WebM
D mca MCA Audio Format
D mcc MacCaption
E md5 MD5 testing
D mgsts Metal Gear Solid: The Twin Snakes
DE microdvd MicroDVD subtitle format
DE mjpeg raw MJPEG video
D mjpeg_2000 raw MJPEG 2000 video
E mkvtimestamp_v2 extract pts as timecode v2 format, as
↳ defined by mkv toolnix
DE mlp raw MLP
D mlv Magic Lantern Video (MLV)
D mm American Laser Games MM
DE mmf Yamaha SMAF
D mods MobiClip MODS
D moflex MobiClip MOFLEX
E mov QuickTime / MOV
D mov,mp4,m4a,3gp,3g2,mj2 QuickTime / MOV
E mp2 MP2 (MPEG audio layer 2)
DE mp3 MP3 (MPEG audio layer 3)
E mp4 MP4 (MPEG-4 Part 14)
D mpc Musepack
D mpc8 Musepack SV8
DE mpeg MPEG-1 Systems / MPEG program stream
E mpeg1video raw MPEG-1 video
E mpeg2video raw MPEG-2 video
DE mpegts MPEG-TS (MPEG-2 Transport Stream)
D mpegtsraw raw MPEG-TS (MPEG-2 Transport Stream)
DE srt SubRip subtitle
D stl Spruce subtitle format
E stream_segment,ssegment streaming segment muxer
E streamhash Per-stream hash testing
D subviewer SubViewer subtitle format
D subviewer1 SubViewer v1 subtitle format
D sunrast_pipe piped sunrast sequence
DE sup raw HDMV Presentation Graphic Stream subtitles
D svag Konami PS2 SVAG
E svcd MPEG-2 PS (SVCD)
D svg_pipe piped svg sequence
D svs Square SVS
DE swf SWF (ShockWave Flash)
D tak raw TAK
D tedcaptions TED Talks captions
E tee Multiple muxer tee
D thp THP
D tiertexseq Tiertex Limited SEQ
D tiff_pipe piped tiff sequence
D tmv 8088flex TMV
DE truehd raw TrueHD
DE tta TTA (True Audio)
E ttml TTML subtitle
D tty Tele-typewriter
D txd Renderware TeXture Dictionary
D ty TiVo TY Stream
DE u16be PCM unsigned 16-bit big-endian
DE u16le PCM unsigned 16-bit little-endian
DE u24be PCM unsigned 24-bit big-endian
DE u24le PCM unsigned 24-bit little-endian
DE u32be PCM unsigned 32-bit big-endian
DE u32le PCM unsigned 32-bit little-endian
DE u8 PCM unsigned 8-bit
E uncodedframecrc uncoded framecrc testing
D v210 Uncompressed 4:2:2 10-bit
D v210x Uncompressed 4:2:2 10-bit
D vag Sony PS2 VAG
D vbn_pipe piped vbn sequence
DE vc1 raw VC-1 video
DE vc1test VC-1 test bitstream
E vcd MPEG-1 Systems / MPEG program stream (VCD)
D vfwcap VfW video capture
DE vidc PCM Archimedes VIDC
DE video4linux2,v4l2 Video4Linux2 output device
D vividas Vividas VIV
D vivo Vivo
D vmd Sierra VMD
E vob MPEG-2 PS (VOB)
D vobsub VobSub subtitle format
DE voc Creative Voice
D vpk Sony PS2 VPK
D vplayer VPlayer subtitles
D vqf Nippon Telegraph and Telephone Corporation (NTT) TwinVQ
DE w64 Sony Wave64
DE wav WAV / WAVE (Waveform Audio)
D wc3movie Wing Commander III movie
E webm WebM
E webm_chunk WebM Chunk Muxer
DE webm_dash_manifest WebM DASH Manifest
E webp WebP
D webp_pipe piped webp sequence
DE webvtt WebVTT subtitle
DE wsaud Westwood Studios audio
D wsd Wideband Single-bit Data (WSD)
D wsvqa Westwood Studios VQA
DE wtv Windows Television (WTV)
DE wv raw WavPack
D wve Psion 3 audio
D x11grab X11 screen capture, using XCB
D xa Maxis XA
D xbin eXtended BINary text (XBIN)
D xbm_pipe piped xbm sequence
D xmv Microsoft XMV
D xpm_pipe piped xpm sequence
E xv XV (XVideo) output device
D xvag Sony PS3 XVAG
D xwd_pipe piped xwd sequence
D xwma Microsoft xWMA
D yop Psygnosis YOP
DE yuv4mpegpipe YUV4MPEG pipe