
This is a follow-up question to "audio error when concatenating clips using ffmpeg".

What I'd like to achieve is the following:

  1. Cut an input video into chunks
  2. Transcode each chunk (video and audio) individually using x264 and libfdk_aac into a .ts container
  3. Concatenate the transcoded chunks and output them as a single .ts file
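
The first two steps above can be sketched as a dry run that only prints one ffmpeg command per chunk. This is a minimal sketch, not a tested pipeline: the function name print_chunk_commands, the zero-padded segmentNNN.ts names, and the restriction to whole-second durations are assumptions made for illustration. The encoder options mirror the ones used later in this question.

```shell
#!/bin/sh
# Dry-run sketch: print (do not run) one ffmpeg command per chunk.
# Usage: print_chunk_commands INPUT TOTAL_SECONDS CHUNK_SECONDS
# The function name and output file names are hypothetical.
print_chunk_commands() {
  input=$1
  total=$2
  chunk=$3
  start=0
  index=0
  while [ "$start" -lt "$total" ]; do
    # The last chunk may be shorter than CHUNK_SECONDS.
    remaining=$((total - start))
    if [ "$remaining" -lt "$chunk" ]; then
      length=$remaining
    else
      length=$chunk
    fi
    printf 'ffmpeg -i %s -ss %d -t %d -vcodec libx264 -acodec libfdk_aac -f mpegts segment%03d.ts\n' \
      "$input" "$start" "$length" "$index"
    start=$((start + chunk))
    index=$((index + 1))
  done
}
```

Calling print_chunk_commands input.mp4 12 5 prints three commands covering 0-5 s, 5-10 s and 10-12 s; piping the output to sh would execute them.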

I can cut the segments and concatenate them using the trim/concat solution as suggested here

ffmpeg -i input.mp4 -filter_complex "[0:v]trim=duration=5[av];[0:a]atrim=duration=5[aa];\
[0:v]trim=start=5:end=10,setpts=PTS-STARTPTS[bv]; [0:a]atrim=start=05:end=10,asetpts=PTS-STARTPTS[ba];\
[av][bv]concat[cv];[aa][ba]concat=v=0:a=1[ca];\
[0:v]trim=start=10:end=15,setpts=PTS-STARTPTS[dv];\
[0:a]atrim=start=10:end=15,asetpts=PTS-STARTPTS[da];\
[cv][dv]concat[outv];[ca][da]concat=v=0:a=1[outa]" -map [outv] -map [outa] output.mp4

However, this solution is missing the "transcode each segment individually" step.

I tried another approach: cut each segment and transcode it in the same command

ffmpeg -i input.mp4 -ss 00 -t 10 -vcodec libx264 -acodec libfdk_aac -f mpegts segment0.ts

then concat

printf "file '%s'\n" ./*.ts > mylist.txt

ffmpeg -f concat -i mylist.txt -vcodec copy -bsf:a aac_adtstoasc output.mp4
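
One detail to watch with the list step: the glob ./*.ts expands in lexical order, so with more than ten segments segment10.ts would be listed before segment2.ts. A minimal sketch that writes the list by index instead is shown here; the helper name write_concat_list is hypothetical, and it assumes the segments are named segment0.ts, segment1.ts, and so on, as in the command above.

```shell
#!/bin/sh
# Write a concat-demuxer list for segment0.ts .. segment(COUNT-1).ts
# in numeric order, independent of how the shell sorts a glob.
# Usage: write_concat_list COUNT LISTFILE   (hypothetical helper)
write_concat_list() {
  count=$1
  list=$2
  : > "$list"   # create or truncate the list file
  i=0
  while [ "$i" -lt "$count" ]; do
    printf "file 'segment%d.ts'\n" "$i" >> "$list"
    i=$((i + 1))
  done
}
```

For example, write_concat_list 3 mylist.txt produces the same three-line list as the printf command would for segment0.ts to segment2.ts.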

However, already after the segment/transcode step, each segment has a brief silence at the beginning of its audio stream, which is then audible at each glue point in the concatenated video. I tried a number of test videos and there was only one where I couldn't hear this introduced silence.
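
To put a number on the leading silence rather than judging it by ear, ffmpeg's silencedetect audio filter can be run on a segment and its log parsed. The following is a rough sketch under assumptions: the helper name leading_silence_end is made up, the noise threshold and minimum duration are guesses that would need tuning, and the parser assumes log lines of the form "silence_end: <seconds> | silence_duration: <seconds>".

```shell
#!/bin/sh
# Read an ffmpeg log on stdin and print the end time (in seconds) of the
# first silent interval reported by silencedetect. This is only the
# *leading* silence if the matching silence_start is at or near 0.
# The helper name is hypothetical.
leading_silence_end() {
  sed -n 's/.*silence_end: \([0-9.]*\).*/\1/p' | head -n 1
}

# Example invocation (requires ffmpeg, so it is left commented out;
# the -60dB threshold and 0.005 s minimum duration are assumptions):
# ffmpeg -i segment0.ts -af silencedetect=noise=-60dB:d=0.005 -f null - 2>&1 \
#   | leading_silence_end
```

Comparing the value for the untranscoded chunk with the value for the transcoded one would show how much silence the transcode step adds.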

Now I'm wondering if there is a way to use the trim/concat solution above but include a "transcode each segment separately before concat" step. Maybe this can't be done in a single command and has to be split into three: trim into segments, transcode each segment, then concat the transcoded segments.

With the approach where I used seeking and transcoding, these audio problems occurred, and I suspect they are introduced during the transcoding of each segment. I'm not sure whether this is specific to that approach and could be avoided by using e.g. the trim filter (or another one?) followed by a transcode, and I'm at a loss for a command or sequence of commands that would achieve this.


Trying this command

ffmpeg -i mtb.mp4 -ss 05 -t 5 -c:v libx264 -c:a aac -strict -2 transcoded3.mp4

Yields the following console output

ffmpeg -i mtb.mp4 -ss 05 -t 5 -c:v libx264 -c:a aac -strict -2 transcoded3.mp4
ffmpeg version 2.4.git Copyright (c) 2000-2014 the FFmpeg developers
built on Nov 20 2014 12:45:24 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
configuration: --prefix=/home/tobi/ffmpeg_build --extra-cflags=-I/home/tobi/ffmpeg_build/include --extra-ldflags=-L/home/tobi/ffmpeg_build/lib --bindir=/home/tobi/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-nonfree --enable-x11grab
libavutil      54. 14.100 / 54. 14.100
libavcodec     56. 12.101 / 56. 12.101
libavformat    56. 14.100 / 56. 14.100
libavdevice    56.  3.100 / 56.  3.100
libavfilter     5.  2.103 /  5.  2.103
libswscale      3.  1.101 /  3.  1.101
libswresample   1.  1.100 /  1.  1.100
libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'mtb.mp4':
Metadata:
major_brand     : mp42
minor_version   : 1
compatible_brands: mp41mp42isom
creation_time   : 2014-06-07 13:05:13
Duration: 00:01:40.35, start: 0.263317, bitrate: 3231 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 3027 kb/s, 29.97 fps, 29.97 tbr, 60k tbn, 59.94 tbc (default)
Metadata:
  creation_time   : 2014-06-07 13:05:13
  handler_name    : Core Media Video
Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, stereo, fltp, 191 kb/s  (default)
Metadata:
  creation_time   : 2014-06-07 13:05:13
  handler_name    : Core Media Audio
[libx264 @ 0x25ec3a0] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
[libx264 @ 0x25ec3a0] profile High, level 3.1
[libx264 @ 0x25ec3a0] 264 - core 142 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'transcoded3.mp4':
Metadata:
major_brand     : mp42
minor_version   : 1
compatible_brands: mp41mp42isom
encoder         : Lavf56.14.100
Stream #0:0(und): Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 1280x720, q=-1--1, 29.97 fps, 30k tbn, 29.97 tbc (default)
Metadata:
  creation_time   : 2014-06-07 13:05:13
  handler_name    : Core Media Video
  encoder         : Lavc56.12.101 libx264
Stream #0:1(und): Audio: aac ([64][0][0][0] / 0x0040), 44100 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
  creation_time   : 2014-06-07 13:05:13
  handler_name    : Core Media Audio
  encoder         : Lavc56.12.101 aac
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> h264 (libx264))
Stream #0:1 -> #0:1 (aac (native) -> aac (native))
Press [q] to stop, [?] for help
frame=  150 fps=5.3 q=29.0 Lsize=    2092kB time=00:00:05.01 bitrate=3416.7kbits/s
video:2007kB audio:79kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.283190%
[libx264 @ 0x25ec3a0] frame I:1     Avg QP:24.42  size: 53959
[libx264 @ 0x25ec3a0] frame P:123   Avg QP:24.87  size: 15050
[libx264 @ 0x25ec3a0] frame B:26    Avg QP:28.35  size:  5760
[libx264 @ 0x25ec3a0] consecutive B-frames: 70.7% 13.3% 16.0%  0.0%
[libx264 @ 0x25ec3a0] mb I  I16..4:  3.2% 80.6% 16.1%
[libx264 @ 0x25ec3a0] mb P  I16..4:  1.7%  8.2%  1.4%  P16..4: 44.2% 20.7%  6.6%  0.0%  0.0%        skip:17.1%
[libx264 @ 0x25ec3a0] mb B  I16..4:  0.4%  1.6%  0.3%  B16..8: 48.8%  5.7%  0.6%  direct: 0.8%      skip:41.9%  L0:53.1% L1:45.4% BI: 1.5%
[libx264 @ 0x25ec3a0] 8x8 transform intra:72.8% inter:76.3%
[libx264 @ 0x25ec3a0] coded y,uvDC,uvAC intra: 64.3% 36.4% 1.5% inter: 23.5% 15.1% 0.0%
[libx264 @ 0x25ec3a0] i16 v,h,dc,p: 32% 21% 25% 22%
[libx264 @ 0x25ec3a0] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 17% 18% 20%  6%  8%  6%  9%  6%  9%
[libx264 @ 0x25ec3a0] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 21% 14%  7% 10%  8% 11%  6%  8%
[libx264 @ 0x25ec3a0] i8c dc,h,v,p: 69% 15% 14%  2%
[libx264 @ 0x25ec3a0] Weighted P-Frames: Y:2.4% UV:0.0%
[libx264 @ 0x25ec3a0] ref P L0: 77.7% 20.1%  1.6%  0.5%  0.0%
[libx264 @ 0x25ec3a0] ref B L0: 99.0%  0.9%  0.1%
[libx264 @ 0x25ec3a0] ref B L1: 98.8%  1.2%
[libx264 @ 0x25ec3a0] kb/s:3284.53
  • Your requirement is a little confusing. The suggested solution you refer to does transcode each segment. However, it was meant to extract specific sections from a video and not chop into slices. For your purpose you could simply do (for example) three separate transcodes for three separate start and end points and then concatenate. Also, there should not be any "silence" in the beginning if all goes correctly. And do provide console output of that command.
    – Rajib
    Commented Nov 21, 2014 at 4:30
  • I just sliced a chunk out of the video and used a separate command to transcode it. I added all sample files here if you want to have a look, dropbox.com/sh/h09r1pm1f4mv5fh/AABJOg-zfXdJDNtwqiQjhHQEa?dl=0 including 2 screenshots that show the audio stream in Audacity of the chunk and of the chunk after transcoding it. There appears to be added silence at the start of the transcoded file, not sure what I'm doing wrong. The original video is mtb.mp4, the chunk is chunk1.mp4 and the transcoded chunk is transcoded1.mp4. I also added a .txt with commands & console output.
    – user342545
    Commented Nov 21, 2014 at 7:28
  • Why not try ffmpeg -i mtb.mp4 -ss 00 -t 5 -c:v libx264 -c:a aac -strict -2 transcoded1.mp4 at one go? (You don't need start time in this case though). Also, please provide the console output here instead of dropbox.
    – Rajib
    Commented Nov 21, 2014 at 10:41
  • thanks, I tried your command using the native aac instead of libfdk_aac, see console output above. Alas there is still this introduced very brief silence at the beginning of each clip I extract & transcode this way, which is audible when I concat the segments back together. Any idea how this could be solved?
    – user342545
    Commented Nov 24, 2014 at 0:47
  • @Rajib, this here seems to describe the problem I'm experiencing though doesn't include a solution for it, lists.ffmpeg.org/pipermail/ffmpeg-user/2013-December/…
    – user342545
    Commented Nov 24, 2014 at 3:56

1 Answer


You are essentially talking about separate encodes for separate chunks of video, which can be done as separate runs, as in your command ffmpeg -i mtb.mp4 -ss 05 -t 5 -c:v libx264 -c:a aac -strict -2 transcoded3.mp4.

To use the Trim filter to slice a chunk from say 5 seconds to 10 seconds, use this (this does use setpts and asetpts):

ffmpeg -i mtb.mp4 -filter_complex \
"[0:v]trim=start=05:end=10,setpts=PTS-STARTPTS[tv]; \
[0:a]atrim=start=05:end=10,asetpts=PTS-STARTPTS[ta]" \
-map [tv] -map [ta] segment2.ts

Note that I have not included any encoding parameters (quality) at all.
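
To repeat the trim command for other chunks without editing the filter graph by hand, the graph string can be generated from a start and end time. This is only a sketch: the helper name trim_graph is hypothetical, and the [tv]/[ta] labels are simply the ones used in the command above.

```shell
#!/bin/sh
# Print a filter_complex graph that trims video and audio to START..END
# (in seconds) and resets timestamps, using the labels [tv] and [ta].
# Usage: trim_graph START END   (hypothetical helper)
trim_graph() {
  printf '[0:v]trim=start=%s:end=%s,setpts=PTS-STARTPTS[tv];' "$1" "$2"
  printf '[0:a]atrim=start=%s:end=%s,asetpts=PTS-STARTPTS[ta]\n' "$1" "$2"
}

# Example use (requires ffmpeg, so it is left commented out):
# ffmpeg -i mtb.mp4 -filter_complex "$(trim_graph 5 10)" \
#   -map '[tv]' -map '[ta]' segment2.ts
```

Quoting the -map labels, as in the commented example, keeps the shell from treating the square brackets as a glob pattern.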

When you concatenate ts files you should be able to simply say:

cat segment1.ts segment2.ts .... segmentn.ts > fullfile.ts

  • When I use the trim filter like this, each resulting slice has some silence at the beginning of its audio stream, same as with the command that uses seeking and then transcoding. If this added silence can't be prevented, this might not be the way to go for my use case as I need to be able to slice out pieces of video, transcode them and concat them so that the output file sounds and looks like the input file. Thanks for the help so far, in any case!
    – user342545
    Commented Nov 25, 2014 at 6:11
