I am creating a picture-in-picture video from three audio/video files.

The conversion completes, but the resulting file has no audible audio when I play it. I run this type of conversion often without problems, but this particular video is an exception, and I cannot tell why the audio stream is inaudible in the output.

ffmpeg -i 9318_segment_1_remote_0.mp4 -i 9318_segment_1_remote_1.mp4 -i 9318_segment_1_local_0.mp4 \
    -filter_complex \
    " [1:v]scale=203.33333333333:-1:flags=lanczos,setpts='if(eq(N,0),PTS,PTS+0.228/TB)',fps=30[rem1setpts]; \
    [2:v]scale=203.33333333333:-1:flags=lanczos[loc0]; \
    [0:v]setpts='if(eq(N,0),PTS,PTS+0.311/TB)',fps=30[1setpts]; \
    [1setpts][loc0]overlay=main_w-overlay_w-10:main_h-overlay_h-10[rem0]; \
    [rem0][rem1setpts]overlay=main_w-overlay_w-180:main_h-overlay_h-10[rem1]; \
    [0:a]adelay=311|311[0a]; \
    [1:a]adelay=228|228[1a]; \
    [0a][1a][2:a]amerge=inputs=3[a]" \
    -map "[rem1]" -map "[a]" -ac 3 \
    -vcodec libx264 \
    -ar 44100 -acodec aac \
    9318_segment_1.mp4

Results of command:

ffmpeg version n4.0.2-65-g938bc91 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.10) 20160609
  configuration: --prefix=/home/daryl/ffmpeg_build --pkg-config-flags=--static --extra-cflags=-I/home/daryl/ffmpeg_build/include --extra-ldflags=-L/home/daryl/ffmpeg_build/lib --extra-libs=-lpthread --bindir=/home/daryl/bin --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '9318_segment_1_remote_0.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:24.61, start: 0.000000, bitrate: 326 kb/s
    Stream #0:0(und): Audio: ac3 (ac-3 / 0x332D6361), 48000 Hz, mono, fltp, 96 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
    Side data:
      audio service type: main
    Stream #0:1(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 227 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
    Metadata:
      handler_name    : VideoHandler
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from '9318_segment_1_remote_1.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:24.54, start: 0.000000, bitrate: 400 kb/s
    Stream #1:0(und): Audio: ac3 (ac-3 / 0x332D6361), 48000 Hz, mono, fltp, 96 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
    Side data:
      audio service type: main
    Stream #1:1(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 301 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
    Metadata:
      handler_name    : VideoHandler
Input #2, mov,mp4,m4a,3gp,3g2,mj2, from '9318_segment_1_local_0.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:24.86, start: 0.000000, bitrate: 468 kb/s
    Stream #2:0(und): Audio: ac3 (ac-3 / 0x332D6361), 48000 Hz, mono, fltp, 96 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
    Side data:
      audio service type: main
    Stream #2:1(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 369 kb/s, 16.67 fps, 16.67 tbr, 12800 tbn, 33.33 tbc (default)
    Metadata:
      handler_name    : VideoHandler
Stream mapping:
  Stream #0:0 (ac3) -> adelay
  Stream #0:1 (h264) -> setpts
  Stream #1:0 (ac3) -> adelay
  Stream #1:1 (h264) -> scale
  Stream #2:0 (ac3) -> amerge:in2
  Stream #2:1 (h264) -> scale
  overlay -> Stream #0:0 (libx264)
  amerge -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[Parsed_amerge_10 @ 0x35b8d80] No channel layout for input 1
[Parsed_amerge_10 @ 0x35b8d80] Input channel layouts overlap: output layout will be determined by the number of distinct input channels
[libx264 @ 0x2cfd200] using SAR=1/1
[libx264 @ 0x2cfd200] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
[libx264 @ 0x2cfd200] profile High, level 3.0
[libx264 @ 0x2cfd200] 264 - core 148 r2643 5c65704 - H.264/MPEG-4 AVC codec - Copyleft 2003-2015 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=1 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
[aac @ 0x2d03740] Using a PCE to encode channel layout
Output #0, mp4, to '9318_segment_1.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
    Stream #0:0: Video: h264 (libx264) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], q=-1--1, 30 fps, 15360 tbn, 30 tbc (default)
    Metadata:
      encoder         : Lavc58.18.100 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1: Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 2.1, fltp, 144 kb/s (default)
    Metadata:
      encoder         : Lavc58.18.100 aac
frame=  746 fps= 49 q=29.0 Lsize=     982kB time=00:00:24.76 bitrate= 324.9kbits/s speed=1.62x
video:800kB audio:154kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 2.881939%
[libx264 @ 0x2cfd200] frame I:3     Avg QP:17.42  size: 23901
[libx264 @ 0x2cfd200] frame P:221   Avg QP:19.93  size:  2688
[libx264 @ 0x2cfd200] frame B:522   Avg QP:23.95  size:   293
[libx264 @ 0x2cfd200] consecutive B-frames:  6.3%  1.1%  0.4% 92.2%
[libx264 @ 0x2cfd200] mb I  I16..4: 29.7% 39.5% 30.8%
[libx264 @ 0x2cfd200] mb P  I16..4:  2.5%  3.0%  0.4%  P16..4: 31.9%  6.2%  4.2%  0.0%  0.0%    skip:51.8%
[libx264 @ 0x2cfd200] mb B  I16..4:  0.0%  0.1%  0.0%  B16..8: 19.4%  0.4%  0.1%  direct: 0.3%  skip:79.7%  L0:46.9% L1:52.2% BI: 0.9%
[libx264 @ 0x2cfd200] 8x8 transform intra:49.6% inter:79.6%
[libx264 @ 0x2cfd200] coded y,uvDC,uvAC intra: 31.4% 47.9% 15.2% inter: 3.0% 7.2% 0.2%
[libx264 @ 0x2cfd200] i16 v,h,dc,p: 26% 16% 16% 42%
[libx264 @ 0x2cfd200] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 19% 16% 45%  3%  3%  3%  4%  4%  4%
[libx264 @ 0x2cfd200] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 22% 27% 13%  5%  7%  7%  7%  6%  6%
[libx264 @ 0x2cfd200] i8c dc,h,v,p: 57% 24% 17%  3%
[libx264 @ 0x2cfd200] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x2cfd200] ref P L0: 67.9%  5.7% 17.3%  9.0%
[libx264 @ 0x2cfd200] ref B L0: 89.5%  8.0%  2.6%
[libx264 @ 0x2cfd200] ref B L1: 97.0%  3.0%
[libx264 @ 0x2cfd200] kb/s:263.43
[aac @ 0x2d03740] Qavg: 59644.461

The completed file has the following details.

ffmpeg -i 9318_segment_1.mp4
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '9318_segment_1.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.12.100
  Duration: 00:00:24.87, start: 0.000000, bitrate: 323 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 640x480 [SAR 1:1 DAR 4:3], 263 kb/s, 30 fps, 30 tbr, 15360 tbn, 60 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, 2.1, fltp, 51 kb/s (default)
    Metadata:
      handler_name    : SoundHandler

As I understand it, "amerge terminates with the shortest input", so I have tried adding apad to inputs 0 and 1 in various ways. However, all three files are similar in duration, so I should hear at least some audio.

  • 9318_segment_1_remote_0.mp4 - Duration: 00:00:24.61 (Starts at 0.311 seconds)
  • 9318_segment_1_remote_1.mp4 - Duration: 00:00:24.54 (Starts at 0.228 seconds)
  • 9318_segment_1_local_0.mp4 - Duration: 00:00:24.86 (Starts at 0.0000 seconds)

Any idea why the output file in this case has no audible audio in it?

  • Are you sure that 44.1 kHz and AAC is a valid combination? FFmpeg might easily accept it, but your player might not. Commented Dec 3, 2018 at 23:26
  • @EugenRieck I can't remember why I have "-ar 44100" in the command. I would need to do some testing to see if it causes any problems removing it. However, I went ahead and removed it from this particular example and it converted, but was still missing the audio.
    – Daryl
    Commented Dec 3, 2018 at 23:33
  • Try -ar 48000 just to make sure; that is a valid combination. Commented Dec 3, 2018 at 23:35
  • In addition to that, I see you use -ac 3 on mono-only inputs, which looks fishy. Do you want 3 audio tracks? 3 audio channels from the three inputs? -ac 3 will not achieve that! Commented Dec 3, 2018 at 23:39
  • @EugenRieck using "-ar 48000" had the same results.
    – Daryl
    Commented Dec 3, 2018 at 23:39

1 Answer

You've hit upon a limitation of the amerge filter. It assigns a channel layout based upon the output channel count. For 3 channels, the first layout available is 2.1, i.e. Front Left + Front Right + Low Frequency Effects, which is why your output stream is reported as "2.1". So the third input (from 9318_segment_1_local_0.mp4) will, when encoded, have most of its frequencies stripped off and its destination marked for the subwoofer :). There are better layouts for 3 channels, but amerge will pick the first one in this case.
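
To see which layout an output file actually ended up with, the audio stream can be inspected with ffprobe. This is only a small sketch; the function name show_audio_layout is made up for illustration.

```shell
# Print the channel count and channel layout of the first audio stream.
# show_audio_layout is a hypothetical helper name, not part of FFmpeg.
show_audio_layout() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=channels,channel_layout \
    -of default=noprint_wrappers=1 "$1"
}
```

Calling `show_audio_layout 9318_segment_1.mp4` on the file from the question should report three channels with a 2.1 layout, in line with the stream details shown there.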

A partial remedy is to switch the order of the inputs, so that the content-bearing streams come first or second and whichever stream matters least lands on the LFE channel.

A better remedy is to use the amix filter, which mixes all inputs into a single, audible channel: [0a][1a][2:a]amix=inputs=3[a]. Also remove the -ac 3.
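
Put together, the command from the question with only those two changes applied might look like the sketch below. Everything else is copied unchanged from the question. The FILTER variable and the mix_pip function name are just conveniences introduced here, and I have not run this against your files.

```shell
# Filtergraph from the question, with amerge replaced by amix.
# FILTER and mix_pip are illustrative names, not anything FFmpeg requires.
FILTER="[1:v]scale=203.33333333333:-1:flags=lanczos,setpts='if(eq(N,0),PTS,PTS+0.228/TB)',fps=30[rem1setpts]; \
[2:v]scale=203.33333333333:-1:flags=lanczos[loc0]; \
[0:v]setpts='if(eq(N,0),PTS,PTS+0.311/TB)',fps=30[1setpts]; \
[1setpts][loc0]overlay=main_w-overlay_w-10:main_h-overlay_h-10[rem0]; \
[rem0][rem1setpts]overlay=main_w-overlay_w-180:main_h-overlay_h-10[rem1]; \
[0:a]adelay=311|311[0a]; \
[1:a]adelay=228|228[1a]; \
[0a][1a][2:a]amix=inputs=3[a]"

# Same inputs, maps and codecs as the question; -ac 3 is dropped.
mix_pip() {
  ffmpeg -i 9318_segment_1_remote_0.mp4 -i 9318_segment_1_remote_1.mp4 -i 9318_segment_1_local_0.mp4 \
    -filter_complex "$FILTER" \
    -map "[rem1]" -map "[a]" \
    -vcodec libx264 \
    -ar 44100 -acodec aac \
    9318_segment_1.mp4
}
```

Run `mix_pip` in the directory that holds the three input files. Since the inputs are all mono, the mixed audio should come out as a single channel rather than the 2.1 layout.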

  • Now that I have switched from amerge to amix, I'm having issues with the commands that in the past used apad. I'd used apad to "indefinitely extend all inputs except one". Refer to a similar issue you'd helped me with in the past at superuser.com/questions/1306899/…. I've done some tests using amix and removing apad entirely. So far I'm not seeing the issues amerge had with "terminating audio with the shortest input". Do you see any concerns with my removing apad now that I am using amix instead of amerge?
    – Daryl
    Commented Dec 4, 2018 at 21:59
  • amix ends, by default, with the longest input, so apad is not needed.
    – Gyan
    Commented Dec 5, 2018 at 4:29
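
For anyone who prefers to make that behaviour explicit, amix has a duration option whose documented values are longest (the default), shortest and first. A sketch of the audio part of the filtergraph with the default spelled out; AUDIO_CHAIN is just an illustrative variable name:

```shell
# Audio part of the filtergraph only, with no apad anywhere.
# duration=longest is already amix's default; it is written out for clarity.
AUDIO_CHAIN="[0:a]adelay=311|311[0a]; \
[1:a]adelay=228|228[1a]; \
[0a][1a][2:a]amix=inputs=3:duration=longest[a]"
```

These lines would stand in for the three audio lines at the end of the -filter_complex string in the question.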
