
The Usage

I extract images from videos using ffmpeg.

I dump one down-scaled image every 10 seconds (inclusive), and combine the images into montages with ImageMagick. The montages are then used to show a preview of the video when hovering the scrubber in a web-based video player (the player calculates which image in the montage to show).
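For illustration, the lookup on the player side can be sketched in shell (a hypothetical helper; the function name is made up, and the 10-second interval and the 5-wide, 50-image montages match the montage commands further down):

```shell
#!/bin/sh
# Hypothetical lookup: map a hover time (in seconds) to the montage file
# and the tile position inside it. Assumes one snapshot every 10 s,
# 5 tiles per row, 50 tiles per montage.
sec_snap_interval=10
tiles_per_row=5
tiles_per_montage=50

tile_for_time() {
    t=$1
    idx=$(( t / sec_snap_interval ))            # 0-based snapshot index
    montage=$(( idx / tiles_per_montage + 1 ))  # montage01.jpg, montage02.jpg, ...
    tile=$(( idx % tiles_per_montage ))
    row=$(( tile / tiles_per_row ))
    col=$(( tile % tiles_per_row ))
    printf 'montage%02d.jpg row=%d col=%d\n' "$montage" "$row" "$col"
}

tile_for_time 0     # montage01.jpg row=0 col=0
tile_for_time 495   # montage01.jpg row=9 col=4
tile_for_time 500   # montage02.jpg row=0 col=0
```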

The Command

After playing around I ended up with the following command where the idea is speed over quality:

ffmpeg \
    -loglevel error \
    -hwaccel cuvid \
    -hwaccel_output_format cuda \
    -c:v h264_cuvid \
    -i "$video_file" \
    -r 0.1 \
    -filter:v "scale_cuda=w=-1:h=100,thumbnail_cuda=2,hwdownload,format=nv12" \
    -color_range 2 \
    f%09d.jpg

This seemed to work fine. The shots are off by about ±0.5–1 sec here and there, but that is livable.

The Problem

The issue is that ffmpeg produces one extra image at the start of each video. E.g. the files are:

file             time
f000000001.jpg   00:00:00
f000000002.jpg   00:00:00
f000000003.jpg   00:00:10
f000000004.jpg   00:00:20
f000000005.jpg   00:00:30
...

Sometimes the first and second images are off from each other by a few milliseconds.

As I know it (now), I can simply delete the first image and proceed with the rest, but I am not sure why this happens, or whether it is a bug or something else.

Put another way: I need to know whether this "two first frames" effect is reliable, so that I can safely delete the first image in other versions of ffmpeg as well.

Since I use the images to show 10-second snapshots of the video at a specified time, everything is off by 10 seconds if I do not delete the first generated image. If for some reason a duplicate were not created at the start (in another version, or whatever), deleting the first image would create the same issue.
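To spell out the off-by-one: with the duplicate present, image n (for n ≥ 2) actually shows time (n − 2) · 10 s rather than the expected (n − 1) · 10 s. A throwaway shell sketch of the two mappings:

```shell
# With the duplicate first frame, image number n (n >= 2) shows time
# (n - 2) * 10 s instead of the expected (n - 1) * 10 s.
interval=10
expected_time() { echo $(( ($1 - 1) * interval )); }   # assuming no dupe
actual_time()   { echo $(( ($1 - 2) * interval )); }   # with the dupe

expected_time 4   # 30
actual_time 4     # 20 -- every lookup is off by one full interval
```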

Montage

(If of interest, the montages are created something like this):

montage -tile 5x -geometry +0+0 -background none [file1  - file50 ]  montage01.jpg
montage -tile 5x -geometry +0+0 -background none [file51 - file100]  montage02.jpg
...
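The batching could be scripted as a loop; here is a sketch that just prints the montage commands (a dry run), assuming snapshots named snap000000001.jpg onwards and GNU seq's -f option:

```shell
# Dry run: print one montage command per block of 50 snapshots.
# Assumes files snap000000001.jpg, snap000000002.jpg, ...
total=120   # total number of snapshots (example value)
batch=50
i=1
m=1
while [ "$i" -le "$total" ]; do
    last=$(( i + batch - 1 ))
    [ "$last" -gt "$total" ] && last=$total
    files=$(seq -f 'snap%09g.jpg' "$i" "$last")   # snap000000001.jpg ...
    # word splitting of $files below is intended
    echo montage -tile 5x -geometry +0+0 -background none $files \
        "$(printf 'montage%02d.jpg' "$m")"
    i=$(( last + 1 ))
    m=$(( m + 1 ))
done
```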

The command I use now, based on the answer (shell):

# Set on call or global:
file_in=sample.mp4
pix_fmt=yuvj420p
sec_snap_interval=10
nr_start=1
pfx_out=snap


ffmpeg \
    -loglevel warning \
    -hwaccel cuvid \
    -hwaccel_output_format cuda \
    -c:v h264_cuvid \
    -i "$file_in" \
    -pix_fmt "$pix_fmt" \
    -filter:v "
        scale_cuda=
            w = -1 :
            h = 100,
        thumbnail_cuda = 2,
        hwdownload,
        format = nv12,
        select = 'bitor(
            gte(t - prev_selected_t, $sec_snap_interval),
            isnan(prev_selected_t)
        )'
    " \
    -vsync passthrough \
    -color_range 2 \
    -start_number "$nr_start" \
    "$pfx_out%09d.jpg"

1 Answer

Using -r 0.1 sets the output framerate to 0.1 Hz, but it is not guaranteed to grab a frame from the input video at exactly every 10 seconds (I am not sure why).

One way of solving it is using the select filter.

Example (without GPU acceleration):

ffmpeg -i input.mp4 -vf "select=bitor(gte(t-prev_selected_t\,10)\,isnan(prev_selected_t))" -vsync 0 f%09d.jpg

  • gte(t-prev_selected_t\,10) is 1 when the difference between "passed" timestamps is greater than or equal to 10 seconds.
    When the expression evaluates to 1, the frame is "selected" and passed to the output.
  • bitor with isnan(prev_selected_t) passes the first frame, where prev_selected_t is NaN (has no value yet).
  • -vsync 0 applies "passthrough": each frame is passed with its timestamp from the demuxer to the muxer.
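The select logic can be replayed outside ffmpeg to see which timestamps pass. A small sketch with awk standing in for the expression (frame timestamps one per second here, so denser than the 10-second interval):

```shell
# Mirror bitor(gte(t - prev_selected_t, 10), isnan(prev_selected_t)):
# a frame passes when no frame has been selected yet, or when at least
# 10 s have elapsed since the last selected one.
selected=$(seq 0 29 | awk '!have || $1 - prev >= 10 { print $1; prev = $1; have = 1 }')
echo "$selected"   # 0, 10 and 20 pass; everything in between is dropped
```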

Here is an example with scale_cuda and thumbnail_cuda:

ffmpeg \
    -loglevel error \
    -hwaccel cuvid \
    -hwaccel_output_format cuda \
    -c:v h264_cuvid \
    -i "$video_file" \
    -filter:v "scale_cuda=w=-1:h=100,thumbnail_cuda=2,hwdownload,format=nv12,select=bitor(gte(t-prev_selected_t\,10)\,isnan(prev_selected_t))" \
    -vsync 0 \
    -color_range 2 \
    f%09d.jpg

  • Due to the usage of the thumbnail_cuda filter, we have to place the select filter at the end.

Testing:
Build a synthetic video with a frame counter at 10 fps:

ffmpeg -y -f lavfi -r 10 -i testsrc=size=128x72:rate=1:duration=1000 -vf setpts=N/10/TB -vcodec libx264 -pix_fmt yuv420p input.mp4

Output frames after executing the above command:

[Output frames: ten thumbnails showing the synthetic frame counter at 10-second intervals]

As you can see, the selected frames are exactly every 10 seconds.

  • Simply awesome 👍 🏆. Though it does not address the question "why is the first image a duplicate?" (or perhaps I'm missing something), this is better as it 1) does not create a dupe at the first "frame", 2) is spot-on accurate, and 3) does not have any (big) speed penalty. Only wish I had dug into this before processing 4,000+ hours of video, haha.
    – Moba
    Commented Jun 22, 2022 at 18:15
  • I could change the title to reflect answer more. E.g. something like How to extract frames at exact time-intervals with or w/o cuda? – but there are perhaps duplicates of that. Have not found any with this solution, but my SE search skills are flawed.
    – Moba
    Commented Jun 22, 2022 at 18:31
  • You're welcome. "extracting frames every N seconds" is great - there is a chance that someone finds the post in Google. You have a spelling mistake: replace evey with every. "Why is first image a duplicate?" is a good question, but I don't know the answer... According to the documentation "vsync is deprecated". Maybe there is a solution using -fps_mode.
    – Rotem
    Commented Jun 22, 2022 at 20:30
  • I'm on ffmpeg version N-107017 and it is there. I did a more thorough check and it looks like there is some offset added per frame. E.g. at the 01:00:00 mark the snapshot is from around 01:00:03, at 01:30:00 it's around 01:30:05, etc. Now the pedant in me has awoken and I have to look more at it :P Same for both the cuda and non-cuda methods.
    – Moba
    Commented Jun 23, 2022 at 0:21
  • OK. I ended up writing it in JavaScript. Got the time used reduced by about 65%, plus the shots are off by about 0.05 sec at most :) It is a lot more heavy on the CPU, though. Have not bothered with JS/GPU. Considering it produces finished montages, the time reduction is even greater.
    – Moba
    Commented Jun 23, 2022 at 7:39
