I would like to clarify that I understand how to use -ss, -t, -to, stream copy, and the difference between stream copying and reencoding/transcoding.
What I don't understand is how seeking/cutting/splitting (these words can be used interchangeably, right ?) works with regard to keyframes.
The FFmpeg doc says that, for the -ss option :
When used as an input option (before -i), seeks in this input file to position. Note that in most formats it is not possible to seek exactly, so ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved.
When used as an output option (before an output url), decodes but discards input until the timestamps reach position.
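To make my reading of this paragraph concrete, here is a minimal Python sketch. It only models what the doc text says, not what ffmpeg actually does internally. The keyframe timestamps and the function names are made up for illustration, and it assumes that "seek point" means "keyframe", which is precisely what my first question below asks.

```python
from bisect import bisect_right


def seek_point_before(keyframes, position):
    """Return the latest keyframe timestamp that is <= position.

    `keyframes` is assumed to be a non-empty, sorted list of timestamps in
    seconds (hypothetical data, not read from a real file).
    """
    i = bisect_right(keyframes, position)
    if i == 0:
        # Position lies before the first keyframe: fall back to the first one.
        return keyframes[0]
    return keyframes[i - 1]


def expected_output_start(keyframes, position, transcoding, accurate_seek=True):
    """Where the output should start if the doc is taken literally.

    - transcoding with -accurate_seek: the extra segment is decoded and
      discarded, so the output starts at `position`.
    - stream copy or -noaccurate_seek: the extra segment is preserved, so
      the output starts at the seek point.
    """
    if transcoding and accurate_seek:
        return position
    return seek_point_before(keyframes, position)


# Hypothetical file with a keyframe every 2.5 s.
keyframes = [0.0, 2.5, 5.0, 7.5, 10.0]
print(seek_point_before(keyframes, 6.0))                          # 5.0
print(expected_output_start(keyframes, 6.0, transcoding=True))    # 6.0
print(expected_output_start(keyframes, 6.0, transcoding=False))   # 5.0
```

If this model were right, a stream-copied cut requested at 6.0 s would start at 5.0 s, one second earlier than the transcoded cut. That is the behaviour I compare my tests against further down.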
The FFmpeg Wiki says that :
Input seeking
Specify -ss before -i:
ffmpeg -ss 00:23:00 -i "Mononoke.Hime.mkv" -frames:v 1 "out1.jpg"
The demo produces 1 image frame at 23 min from the beginning of the movie. The input will be parsed by keyframe, which is very fast.
As of FFmpeg 2.1, when transcoding with ffmpeg (i.e. not stream copying): -ss is also "frame-accurate" even as input option. Previous behavior (seek only to nearest preceding keyframe, despite inaccuracy) can be restored with -noaccurate_seek.
Output seeking
Specify -ss after -i:
ffmpeg -i "Mononoke.Hime.mkv" -ss 23:00 -frames:v 1 "out2.jpg"
The demo also produces 1 image frame precisely at 23 min from the beginning of the movie.
Here, the input is decoded (and discarded) until it reaches the position indicated by -ss. This will be done relatively slow, frame-by-frame.
Seeking while codec copy
Using -ss with -c copy alike may not be accurate: since ffmpeg may only split on I-frame (keyframe independently decodable) alike. Though it may, if applicable: auto-adjust the stream's start time to negative to compensate.
E.g. (with typical video) requested timestamp 157 s; but no keyframe until 159 s: It shall include ~ 2 s audio (no video) at the start, and start from the 1st keyframe.
Since it is not clear which parts of the text are outdated, and the writing style is sometimes quite bad, I am not sure how reliable this wiki page is.
I did some tests, and saw that when I use stream copy :
- the duration of the resulting video is the exact same as when I don't use stream copy
- the end of the resulting video is the exact same as when I don't use stream copy
- the start of the resulting video is, audio-wise, the exact same as when I don't use stream copy ; however, video-wise :
  - if I used -ss as an input option : the first seconds are pixelated
  - if I used -ss as an output option : the first seconds are a still image
The commands I used :
ffmpeg -ss POSITION -t DURATION -i INPUT -map 0 [+ -c copy] OUTPUT
ffmpeg -ss POSITION -i INPUT -t DURATION -map 0 [+ -c copy] OUTPUT
ffmpeg -i INPUT -ss POSITION -t DURATION -map 0 [+ -c copy] OUTPUT
ffmpeg -ss POSITION -to POSITION -i INPUT -map 0 [+ -c copy] OUTPUT
ffmpeg -ss POSITION -i INPUT -to POSITION -map 0 [+ -c copy] OUTPUT
ffmpeg -i INPUT -ss POSITION -to POSITION -map 0 [+ -c copy] OUTPUT
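For anyone who wants to reproduce the tests, the command variants above can be generated with a small Python helper. This is only a sketch: build_command and its parameters are names I made up, and it does nothing more than place -ss and -t/-to before or after -i in the same order as the commands listed above.

```python
import shlex


def build_command(position, input_path, output_path, *, duration=None, end=None,
                  seek_as_input=True, limit_as_input=False, stream_copy=False):
    """Build one ffmpeg command line as a list of arguments.

    Exactly one of `duration` (-t) or `end` (-to) must be given.
    `seek_as_input` / `limit_as_input` decide whether -ss and -t/-to are
    placed before -i (input option) or after it (output option).
    """
    if (duration is None) == (end is None):
        raise ValueError("give exactly one of duration or end")

    seek = ["-ss", position]
    limit = ["-t", duration] if duration is not None else ["-to", end]

    input_opts, output_opts = [], []
    (input_opts if seek_as_input else output_opts).extend(seek)
    (input_opts if limit_as_input else output_opts).extend(limit)

    cmd = ["ffmpeg", *input_opts, "-i", input_path, *output_opts, "-map", "0"]
    if stream_copy:
        cmd += ["-c", "copy"]
    cmd.append(output_path)
    return cmd


# First command of the list, without stream copy.
print(shlex.join(build_command("POSITION", "INPUT", "OUTPUT", duration="DURATION",
                               seek_as_input=True, limit_as_input=True)))
# ffmpeg -ss POSITION -t DURATION -i INPUT -map 0 OUTPUT

# Third command of the list, with stream copy.
print(shlex.join(build_command("POSITION", "INPUT", "OUTPUT", duration="DURATION",
                               seek_as_input=False, stream_copy=True)))
# ffmpeg -i INPUT -ss POSITION -t DURATION -map 0 -c copy OUTPUT
```

The returned list can be passed directly to subprocess.run, which avoids any shell quoting issues with file names.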
Here are my questions :
The doc says : "in most formats it is not possible to seek exactly, so ffmpeg will seek to the closest seek point before position". Here, does "seek point" mean "keyframe" (or something "keyframe"-like) ? Or something else ?
When -ss is used as an input option, the input is parsed keyframe by keyframe ; while when -ss is used as an output option, the input is decoded (whether we do stream copy or not) and parsed frame by frame. Is that right ?
The doc says : "ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved." However, the wiki says, about using -ss with stream copy : "E.g. (with typical video) requested timestamp 157 s; but no keyframe until 159 s: It shall include ~ 2 s audio (no video) at the start, and start from the 1st keyframe." Doesn't this contradict the doc about the preservation of the extra segment between the seek point and position ? And doesn't it also contradict the claim that ffmpeg seeks to the closest seek point BEFORE position ?
Same thing with my tests : I saw that doing stream copy changes neither the starting point nor the duration of the resulting video, but it makes the beginning pixelated or still. Doesn't this also contradict the doc ?
I would have thought that if ffmpeg was seeking to the closest seek point before position, and the extra segment between the seek point and position was preserved when doing stream copy, then the video I obtained with stream copy should start with a keyframe, and shouldn't be pixelated or still at its start (and shouldn't start at the exact same timestamp as when I don't use stream copy).
From what I understand, the pixelated/still beginning when using stream copy comes from the fact that I chose a starting point (for -ss) that was not a keyframe, which means the first seconds can't be decoded properly ; but ffmpeg didn't seem to preserve anything compared to not using stream copy.
The wiki says : "Using -ss with -c copy alike may not be accurate: since ffmpeg may only split on I-frame (keyframe independently decodable) alike". However, as in my question above, from my tests and from my understanding, ffmpeg can accurately split/seek/cut anywhere : it's just that, with stream copy, the start of the video will generally be pixelated or still. Is that right ?
With stream copy, regarding the first seconds of the resulting video, why are they a still image when I use -ss as an output option, instead of being pixelated like when I use -ss as an input option ?
The doc says : "ffmpeg will seek to the closest seek point before position. When transcoding and -accurate_seek is enabled (the default), this extra segment between the seek point and position will be decoded and discarded. When doing stream copy or when -noaccurate_seek is used, it will be preserved." However, I have read elsewhere that "-noaccurate_seek option uses nearest keyframe and is about a brazillian times faster". How can that option be so much faster, when the doc suggests that all it does is preserve the extra segment between the seek point and position ?
I also did some tests with -noaccurate_seek. Compared to a basic command without stream copy, if I add -noaccurate_seek, the start of the resulting video is the exact same ; however, the end of the video is truncated by a few seconds. Why is that ? If there was a difference, shouldn't it be at the start of the video ? I never had such truncation at the end when I tested with stream copy alone (without -noaccurate_seek).
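To check where the keyframes of my input actually fall relative to the position I pass to -ss, I believe something like the following Python sketch can be used. It relies on ffprobe being installed and on its -skip_frame nokey and -show_entries frame=pts_time options (as far as I know, older builds call the field pkt_pts_time instead, so this is an assumption about the installed version). The function names are my own.

```python
import subprocess


def keyframe_probe_command(input_path):
    """ffprobe arguments that print one timestamp per video keyframe.

    Assumes a recent ffprobe where the frame field is named `pts_time`.
    """
    return ["ffprobe", "-v", "error",
            "-select_streams", "v:0",         # first video stream only
            "-skip_frame", "nokey",           # decode keyframes only
            "-show_entries", "frame=pts_time",
            "-of", "csv=p=0",                 # bare values, one frame per line
            input_path]


def parse_keyframe_times(ffprobe_output):
    """Turn ffprobe's CSV output into a sorted list of floats (seconds)."""
    times = []
    for line in ffprobe_output.splitlines():
        field = line.split(",")[0].strip()    # ignore any trailing CSV fields
        if not field:
            continue
        try:
            times.append(float(field))
        except ValueError:
            continue                          # skip values such as "N/A"
    return sorted(times)


def list_keyframes(input_path):
    """Run ffprobe on `input_path` and return its keyframe timestamps."""
    result = subprocess.run(keyframe_probe_command(input_path),
                            capture_output=True, text=True, check=True)
    return parse_keyframe_times(result.stdout)
```

Calling list_keyframes("INPUT") and comparing the result with POSITION should tell whether the position I chose falls on a keyframe or between two of them.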