4

I have a use case where I take a few MPEG-4 files, trim them, and concatenate them into one file. In a second use case, one of those files is trimmed and also cropped/scaled, so it must be re-encoded.

The problem with the second use-case is that I end up with a mix of two different file layouts:

Trim only:

Format                         : MPEG-4
Format                         : AVC
Format/Info                    : Advanced Video Codec
Format profile                 : [email protected]
Format settings, CABAC         : Yes
Format settings, ReFrames      : 4 frames
Codec ID                       : avc1
Codec ID/Info                  : Advanced Video Coding
Duration                       : 28s 17ms
Bit rate                       : 3 362 Kbps
Width                          : 1 280 pixels
Height                         : 720 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Variable
Frame rate                     : 60.000 fps
Minimum frame rate             : 58.824 fps
Maximum frame rate             : 62.500 fps
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 8 bits
Scan type                      : Progressive
Bits/(Pixel*Frame)             : 0.061
Stream size                    : 11.2 MiB (95%)
Color primaries                : BT.709
Transfer characteristics       : sYCC
Matrix coefficients            : BT.709

Trim + crop/scale (re-encode):

Format                         : MPEG-4
Format                         : AVC
Format/Info                    : Advanced Video Codec
Format profile                 : High 4:4:4 [email protected]
Format settings, CABAC         : No
Format settings, ReFrames      : 1 frame
Codec ID                       : avc1
Codec ID/Info                  : Advanced Video Coding
Duration                       : 29s 0ms
Bit rate                       : 24.8 Mbps
Width                          : 1 280 pixels
Height                         : 720 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Constant
Frame rate                     : 60.000 fps
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 8 bits
Scan type                      : Progressive
Bits/(Pixel*Frame)             : 0.448
Stream size                    : 85.7 MiB (99%)
Writing library                : x264 core 144 r96 40bb568
Encoding settings              : cabac=0 / ref=1 / deblock=0:0:0 / analyse=0:0 / me=dia / subme=0 / psy=0 / mixed_ref=0 / me_range=16 / chroma_me=1 / trellis=0 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=0 / chroma_qp_offset=0 / threads=3 / lookahead_threads=1 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=0 / weightp=0 / keyint=250 / keyint_min=25 / scenecut=0 / intra_refresh=0 / rc=cqp / mbtree=0 / qp=0

scale/crop command:

ffmpeg -ss 05 -i test.mp4 -c:a copy -vf "crop=w=(in_w/1000)*%d:h=(in_h/566)*%d:x=(in_w/1000)*%d:y=(in_h/566)*%d,scale=in_w:in_h" out-scale-crop.mp4

Per LordNeckbeard's request, FFmpeg's console output has been added:

ffmpeg version N-43527-gb23a866-   http://johnvansickle.com/ffmpeg/    Copyright (c) 2000-2015 the FFmpeg developers
  built on Jan 13 2015 01:29:05 with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --enable-gpl --enable-version3 --disable-shared --disable-debug --enable-runtime-cpudetect --enable-libmp3lame --enable-libx264 --enable-libx265 --enable-libwebp --enable-libspeex --enable-libvorbis --enable-libvpx --enable-libfreetype --enable-fontconfig --enable-libxvid --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libtheora --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-gray --enable-libopenjpeg --enable-libopus --disable-ffserver --enable-libass --enable-gnutls --cc=gcc
  libavutil      54. 16.100 / 54. 16.100
  libavcodec     56. 20.100 / 56. 20.100
  libavformat    56. 18.101 / 56. 18.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5.  7.100 /  5.  7.100
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/rohan/render_cache/v4033205_745.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.18.101
  Duration: 00:00:35.77, start: 0.000000, bitrate: 2143 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 2001 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 131 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
[libx264 @ 0x360bd20] using cpu capabilities: MMX2 SSE Cache64
[libx264 @ 0x360bd20] profile High 4:4:4 Predictive, level 3.0, 4:2:0 8-bit
[libx264 @ 0x360bd20] 264 - core 144 r96 40bb568 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=0 ref=1 deblock=0:0:0 analyse=0:0 me=dia subme=0 psy=0 mixed_ref=0 me_range=16 chroma_me=1 trellis=0 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=0 chroma_qp_offset=0 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=0 weightp=0 keyint=250 keyint_min=25 scenecut=0 intra_refresh=0 rc=cqp mbtree=0 qp=0
Output #0, mp4, to '/home/rohan/render_cache/v4033205_745_1_cut.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.18.101
    Stream #0:0(und): Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 690x384, q=-1--1, 30 fps, 15360 tbn, 30 tbc (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc56.20.100 libx264
    Stream #0:1(und): Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, 131 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame=   15 fps=0.0 q=0.0 size=     633kB time=00:00:00.55 bitrate=9409.0kbits/s
frame=   65 fps= 65 q=0.0 size=    2986kB time=00:00:02.22 bitrate=11002.2kbits/s
frame=  117 fps= 77 q=0.0 size=    5507kB time=00:00:03.96 bitrate=11380.3kbits/s
frame=  182 fps= 91 q=0.0 size=    7832kB time=00:00:06.12 bitrate=10476.3kbits/s
frame=  244 fps= 97 q=0.0 size=   10248kB time=00:00:08.19 bitrate=10250.0kbits/s
frame=  290 fps= 96 q=0.0 size=   12275kB time=00:00:09.72 bitrate=10342.1kbits/s
frame=  337 fps= 96 q=0.0 size=   14408kB time=00:00:11.27 bitrate=10464.6kbits/s
frame=  401 fps= 99 q=0.0 size=   17318kB time=00:00:13.41 bitrate=10575.5kbits/s
frame=  458 fps=101 q=0.0 size=   20332kB time=00:00:15.31 bitrate=10872.8kbits/s
frame=  477 fps= 90 q=0.0 size=   21308kB time=00:00:15.94 bitrate=10946.6kbits/s
frame=  541 fps= 93 q=0.0 size=   24973kB time=00:00:18.08 bitrate=11314.0kbits/s
frame=  601 fps= 95 q=0.0 size=   28271kB time=00:00:20.07 bitrate=11534.2kbits/s
frame=  654 fps= 96 q=0.0 size=   31201kB time=00:00:21.84 bitrate=11701.2kbits/s
frame=  714 fps= 97 q=0.0 size=   34484kB time=00:00:23.86 bitrate=11837.8kbits/s
frame=  769 fps= 98 q=0.0 size=   37860kB time=00:00:25.69 bitrate=12069.0kbits/s
frame=  814 fps= 97 q=0.0 size=   40593kB time=00:00:27.18 bitrate=12232.8kbits/s
frame=  840 fps= 97 q=-1.0 Lsize=   42204kB time=00:00:28.02 bitrate=12338.7kbits/s dup=1 drop=0
video:41726kB audio:453kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.058509%
[libx264 @ 0x360bd20] frame I:4     Avg QP: 0.00  size:132759
[libx264 @ 0x360bd20] frame P:836   Avg QP: 0.00  size: 50474
[libx264 @ 0x360bd20] mb I  I16..4: 100.0%  0.0%  0.0%
[libx264 @ 0x360bd20] mb P  I16..4: 15.1%  0.0%  0.0%  P16..4: 40.7%  0.0%  0.0%  0.0%  0.0%    skip:44.2%
[libx264 @ 0x360bd20] coded y,uvDC,uvAC intra: 99.5% 98.9% 98.7% inter: 36.3% 39.9% 39.5%
[libx264 @ 0x360bd20] i16 v,h,dc,p: 56% 44%  0%  0%
[libx264 @ 0x360bd20] i8c dc,h,v,p:  0% 44% 55%  0%
[libx264 @ 0x360bd20] kb/s:12207.68

trim command:

ffmpeg -ss 05 -i test.mp4 -codec copy trimmed.mp4

ffmpeg version N-43527-gb23a866-   http://johnvansickle.com/ffmpeg/    Copyright (c) 2000-2015 the FFmpeg developers
  built on Jan 13 2015 01:29:05 with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --enable-gpl --enable-version3 --disable-shared --disable-debug --enable-runtime-cpudetect --enable-libmp3lame --enable-libx264 --enable-libx265 --enable-libwebp --enable-libspeex --enable-libvorbis --enable-libvpx --enable-libfreetype --enable-fontconfig --enable-libxvid --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libtheora --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-gray --enable-libopenjpeg --enable-libopus --disable-ffserver --enable-libass --enable-gnutls --cc=gcc
  libavutil      54. 16.100 / 54. 16.100
  libavcodec     56. 20.100 / 56. 20.100
  libavformat    56. 18.101 / 56. 18.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5.  7.100 /  5.  7.100
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/rohan/render_cache/v4033205_6295.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.18.101
  Duration: 00:00:36.02, start: 0.000000, bitrate: 2142 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 2001 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 131 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Output #0, mp4, to '/home/rohan/render_cache/v4033205_6295_0_cut.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf56.18.101
    Stream #0:0(und): Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 1920x1080, q=2-31, 2001 kb/s, 30 fps, 30 tbr, 90k tbn, 90k tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, 131 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame=  841 fps=0.0 q=-1.0 Lsize=    7498kB time=00:00:28.02 bitrate=2192.1kbits/s

When I play a set of trimmed files concatenated into one file, everything seems to play back correctly in MPC-HC and on YouTube. But when I add scaled/cropped files and concatenate them with the non-scaled/cropped files, both players freeze when switching to the next segment.

I suspect the switch between variable and constant frame rate is the culprit.

The easy solution would be to re-encode everything to the same constant frame rate, but I'm hoping I don't have to: -codec copy is fast and keeps the quality, and the source I receive these files from could use different frame rates, etc.

Preferably I'd like to perform the crop/scale re-encode using the exact same settings as the input file so I don't run into this mismatch in output files when concatenating. Is this possible?

Edit #2, @occvtech

I am indeed concatenating with codec copy, using a merge_list.txt like so (strictly speaking, -f concat is FFmpeg's concat demuxer rather than the concat protocol):

./ffmpeg -f concat -i merge_list.txt -codec copy output.mp4

And its output:

ffmpeg version 2.6.2-   http://johnvansickle.com/ffmpeg/    Copyright (c) 2000-2015 the FFmpeg developers
  built with gcc 4.9.2 (Debian 4.9.2-10)
  configuration: --enable-gpl --enable-version3 --disable-shared --disable-debug --enable-runtime-cpudetect --enable-libmp3lame --enable-libx264 --enable-libx265 --enable-libwebp --enable-libspeex --enable-libvorbis --enable-libvpx --enable-libfreetype --enable-fontconfig --enable-libxvid --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libtheora --enable-libvo-aacenc --enable-libvo-amrwbenc --enable-gray --enable-libopenjpeg --enable-libopus --enable-libass --enable-gnutls --enable-libvidstab --enable-libsoxr --cc=gcc-4.9
  libavutil      54. 20.100 / 54. 20.100
  libavcodec     56. 26.100 / 56. 26.100
  libavformat    56. 25.101 / 56. 25.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 11.102 /  5. 11.102
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, concat, from '/home/rohan/render_cache/merge_list.txt':
  Duration: N/A, start: 0.000000, bitrate: 2208 kb/s
    Stream #0:0: Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 2076 kb/s, 30 fps, 30 tbr, 90k tbn, 60 tbc
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 131 kb/s
Output #0, mp4, to '/home/rohan/render_cache/sivhd_final.mp4':
  Metadata:
    encoder         : Lavf56.25.101
    Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 1920x1080, q=2-31, 2076 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc
    Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, 131 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (copy)
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[concat @ 0x3204d60] DTS 92955 < 541117 out of order
[mp4 @ 0x322eb80] Non-monotonous DTS in output stream 0:0; previous: 92351, current: 15864; changing to 92352. This may result in incorrect timestamps in the output file. x 2000

frame=  421 fps=0.0 q=-1.0 Lsize=   13507kB time=00:00:14.07 bitrate=7862.8kbits/s
video:13268kB audio:226kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.096110%

The "Non-monotonous DTS" warning is printed quite a lot (240 times). I assumed DTS refers to audio, and the audio in the output file sounds fine up until the cropped/trimmed part.

Also, I have tried your suggestion to change the frame rate (and combinations with different vsync parameters) at both the trim/crop ffmpeg step and the concat step, but they only seem to have an effect when I'm re-encoding, not when stream copying.

--

Since I started writing this second edit, I have been trying every -vsync, -filter:v fps, -framerate, and -r argument for a few hours, and the only working solution so far has been re-encoding everything at 60 fps (even files that are not trimmed/cropped) and concatenating the results. With -preset ultrafast and -qp 0 it feels fast enough; the file sizes are quite big, but I'm not keeping the files anyway.
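The workaround described above can be sketched as a small script that only builds the ffmpeg argument list and the merge list text; it does not run ffmpeg. The helper names `reencode_args` and `concat_list` and all file names are hypothetical. The flags are the ones mentioned in this post (`-r 60`, `-preset ultrafast`, `-qp 0`), plus `-c:a copy` carried over from the crop command as an assumption.

```python
import shlex

def reencode_args(src, dst, fps=60, vf=None):
    """Build an ffmpeg argv that re-encodes the video at a constant frame rate."""
    args = ["ffmpeg", "-i", src]
    if vf:  # optional crop/scale filter chain
        args += ["-vf", vf]
    args += ["-r", str(fps),            # output frame rate
             "-c:v", "libx264",
             "-preset", "ultrafast",
             "-qp", "0",                # constant QP 0, as used in the post
             "-c:a", "copy",            # assumption: audio is left untouched
             dst]
    return args

def concat_list(paths):
    """Text of a list file for `-f concat`: one `file '...'` line per segment.

    Limitation: paths containing a single quote are not escaped here.
    """
    return "".join("file '{}'\n".format(p) for p in paths)

# Hypothetical file names, for illustration only.
cmd = reencode_args("seg0.mp4", "seg0_60fps.mp4")
print(" ".join(shlex.quote(a) for a in cmd))
print(concat_list(["seg0_60fps.mp4", "seg1_60fps.mp4"]), end="")
```

Building the argument list separately from running it makes it easy to check that every segment goes through exactly the same encoder settings before the segments are concatenated.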

So it seems codec copy is a no-go when there's at least one cropped/scaled file in there.

If anybody else has a solution that doesn't require re-encoding everything, I'll gladly accept it; otherwise I'll award the bounty to @occvtech.

  • You should show the complete console output from each command.
    – llogan
    Commented Apr 17, 2015 at 3:00
  • The output has been added!
    – Rohan
    Commented Apr 17, 2015 at 10:36
  • I know considerably less about this than you do, but you may find this interesting: rainnic.altervista.org/content/….
    – Joe
    Commented Apr 22, 2015 at 16:51
  • There are a few different ways to concatenate within FFmpeg. I suspect you are using the concatenate protocol since you are using codec copy, but it would help to know for sure. Can you please include the command and console for that process as well? Assuming you are using the protocol - I think your assessment is correct that FFmpeg is having trouble switching from variable framerate to constant framerate. That said, if your final goal is to upload to Youtube, the max fps is 60fps. Anything above that will drop frames and reduce to 60fps anyway, so why not just change framerate within FFmpeg?
    – occvtech
    Commented Apr 24, 2015 at 16:18
  • You might want to write down an answer @occvtech for me to attribute the bounty.
    – Rohan
    Commented Apr 27, 2015 at 12:10

2 Answers

4

I had similar issues with the concat filter and I think it's due to the differing time bases used for the inputs.

I overcame it with the concat protocol method.

I think it's showing a variable frame rate for your trim-only output because, with a 1/90000 time base, a 60 Hz video expects a frame every 1500 ticks, but the concat filter may have joined two videos with a 1530-tick gap between the last and first frames (58.824 fps) and a 1440-tick gap somewhere else (62.500 fps). Also, for whatever reason, ffmpeg decided to set the output time base to 1/15360.

When you cropped, it re-evaluated the time base for the whole run length of the output, which is why you get the constant frame rate. Notice that your first video shows a tbn of 15360, while the second is 90k.

In my case, I noticed that with the concat filter, the PTS/DTS values were being set in terms of the first input's time base even when the time bases of the other input video streams weren't the same. With the first video using 1/25 and the second using 1/90000, I ran this on the output (which had a new time base of 1/12800):
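The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch of the calculation; the function name `fps_from_gap` is made up for illustration, and the 1/90000 time base is the one assumed in the paragraph above.

```python
from fractions import Fraction

def fps_from_gap(gap_ticks, time_base=Fraction(1, 90000)):
    """Instantaneous frame rate implied by the tick gap between two frames."""
    return 1 / (gap_ticks * time_base)

# 1500 ticks is the nominal 60 fps spacing; 1530 and 1440 are the
# irregular gaps that would explain the reported min/max frame rates.
for gap in (1500, 1530, 1440):
    print(gap, round(float(fps_from_gap(gap)), 3))
```

A single irregular gap at a join is enough for a tool that reports minimum and maximum frame rates to classify the whole stream as variable frame rate.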
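One thing that might be worth trying (not something tested in this answer) is to re-encode the cropped segment so that its stream parameters line up with the stream-copied segments. The sketch below only builds the argument list. The values 30 fps, 90000, High profile and yuv420p are read from the question's trim log; `crf=18`, the helper name `matched_reencode_args`, the file names and the filter string are assumptions. It uses `-crf` rather than `-qp 0` because the question's log shows that the lossless encode ended up in the High 4:4:4 Predictive profile. The filter string scales to an explicit size because the question's log shows a 690x384 output from a 1920x1080 input. Matching these values does not guarantee that a stream-copy concat will then play correctly.

```python
import shlex

def matched_reencode_args(src, dst, vf, fps=30, timescale=90000, crf=18):
    """Build an ffmpeg argv whose output mimics the stream-copied segments."""
    return [
        "ffmpeg", "-i", src,
        "-vf", vf,
        "-c:v", "libx264",
        "-profile:v", "high",                      # same profile as the source
        "-pix_fmt", "yuv420p",
        "-crf", str(crf),                          # assumption: any lossy quality target
        "-r", str(fps),                            # constant output frame rate
        "-video_track_timescale", str(timescale),  # mp4 muxer option; sets the track time base
        "-c:a", "copy",
        dst,
    ]

# Hypothetical crop values and file names; scale restores an explicit frame size.
cmd = matched_reencode_args("in.mp4", "out.mp4", "crop=960:540:0:0,scale=1920:1080")
print(" ".join(shlex.quote(a) for a in cmd))
```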

ffprobe -hide_banner -show_frames -i output.mp4 2>&1 | grep -A 21 video | grep ts=

At the transition point between videos this happens:

pkt_pts=253952
pkt_dts=253952
pkt_pts=254464
pkt_dts=254464
pkt_pts=254976
pkt_dts=254976
pkt_pts=255488
pkt_dts=255488
pkt_pts=256000
pkt_dts=923443200
pkt_pts=925286400
pkt_dts=925286400
pkt_pts=923443200
pkt_dts=1045094400
pkt_pts=952934400
pkt_dts=1046937600
pkt_pts=954777600
pkt_dts=1048780800
pkt_pts=956620800
pkt_dts=1050624000
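A jump like this can also be located automatically. The sketch below parses `pkt_pts=`/`pkt_dts=` lines such as the ones above and reports the first packet whose step from its predecessor is non-positive or much larger than the previous step. The helper names and the threshold `factor=10` are arbitrary choices for illustration.

```python
SAMPLE = """\
pkt_pts=253952
pkt_dts=253952
pkt_pts=254464
pkt_dts=254464
pkt_pts=254976
pkt_dts=254976
pkt_pts=255488
pkt_dts=255488
pkt_pts=256000
pkt_dts=923443200
pkt_pts=925286400
pkt_dts=925286400
pkt_pts=923443200
pkt_dts=1045094400
pkt_pts=952934400
pkt_dts=1046937600
pkt_pts=954777600
pkt_dts=1048780800
pkt_pts=956620800
pkt_dts=1050624000
"""

def parse_field(lines, key):
    """Collect the integer values of all `key=value` lines."""
    return [int(l.split("=", 1)[1]) for l in lines if l.startswith(key + "=")]

def first_discontinuity(ts, factor=10):
    """Index of the first timestamp whose step is <= 0 or > factor * previous step."""
    prev_step = None
    for i in range(1, len(ts)):
        step = ts[i] - ts[i - 1]
        if step <= 0:
            return i
        if prev_step is not None and step > factor * prev_step:
            return i
        prev_step = step
    return None

dts = parse_field(SAMPLE.splitlines(), "pkt_dts")
idx = first_discontinuity(dts)
print(idx, dts[idx - 1], dts[idx])
```

Before the jump the DTS step is 512 ticks; after it, several steps are 1843200 ticks, which equals 3600 × 512. That would be consistent with 90 kHz timestamps (3600 ticks per frame, if the second video is 25 fps) being interpreted in the first input's 1/25 time base and then rescaled to 1/12800, though the frame rate of the second video is an assumption here.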
  • You were right all along!
    – Rohan
    Commented Sep 12, 2015 at 18:38
1

You cannot codec copy when applying a filter.

Also, YouTube has a maximum frame rate of 60 fps, so if that is your end goal, you will be dropping frames from your variable-frame-rate file at some point in the transcoding chain anyway.

It's hard to give better advice without having more specific knowledge of your file type, and what your desired end format is.

That said, if you are getting the file to YouTube, I would recommend transcoding your files, and getting them into a static framerate. You could do it all in one step using the concatenate filter instead of the concatenate protocol as well - that way you don't need to produce intermediary files.

ffmpeg -i [INPUT1] -i [INPUT2] -filter_complex "[0:v] [0:a] [1:v] [1:a] concat=n=2:v=1:a=1 [v] [a]" -map "[v]" -map "[a]" ... [OUTPUT]
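For more than two inputs, the filtergraph string follows the same pattern and can be generated. This is a sketch that only builds the argument list; the helper names and file names are hypothetical, and the encoder options that the command above leaves out ("...") are omitted here too. Note that the concat filter expects all segments to share the same video size, so cropped segments need to be scaled to a common resolution first.

```python
import shlex

def concat_filtergraph(n):
    """Filtergraph joining n inputs, each with one video and one audio stream."""
    pads = " ".join("[{0}:v] [{0}:a]".format(i) for i in range(n))
    return "{} concat=n={}:v=1:a=1 [v] [a]".format(pads, n)

def concat_filter_args(inputs, output):
    """Build an ffmpeg argv that concatenates `inputs` with the concat filter."""
    args = ["ffmpeg"]
    for path in inputs:
        args += ["-i", path]
    args += ["-filter_complex", concat_filtergraph(len(inputs)),
             "-map", "[v]", "-map", "[a]", output]
    return args

# Hypothetical file names, for illustration only.
cmd = concat_filter_args(["a.mp4", "b.mp4", "c.mp4"], "joined.mp4")
print(" ".join(shlex.quote(a) for a in cmd))
```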

If quality is your main objective, and you are not keeping your intermediary files before uploading to YouTube, you could transcode to an uncompressed format using -c:v rawvideo. The file sizes will be far bigger, so again, without knowing your end goal more specifically, I'm not sure uncompressed is the best option for you.
