
Caveat: I'm not very familiar with audio codecs and terminology, so it could be that I'm not using the correct words to describe what I want to do. I've attempted to make that clear.

I'm using ecamm's Call Recorder for Skype. It records to a QuickTime MOV file with two audio streams. One is the audio recorded from the mic and the other is the audio recorded from the speakers. They provide a tool called "Convert to Internet" that they ship with their Movie Tools distribution.

I have a meeting every week that I've been recording, and I'm looking for a way to automate the process of "flattening" these audio streams into a single stream so I can share the recordings with others. The "Convert to Internet" application only supports converting a single file at a time, and it's a GUI app that works by dragging and dropping the file to convert onto the app's main window.

I'd like to find a way to achieve the same end result that the "Convert to Internet" application gets, but by using something that I run on the command line (preferably from a Linux box that I can spin up to do the job and then spin down when it's done).

In the event that my rambling here is not all that clear, I think the product guide does a pretty good job of explaining the process that needs to happen.

I've dug into the documentation for FFmpeg, which I thought would be the best candidate for pulling this off. I found the amerge filter, but the docs make it sound like it'll just give me a single audio stream with 4 channels (left and right from the first stream and left and right from the second stream). Since that's not what I want, I looked at the amix filter, but that appears to work with streams from different input files, and not multiple streams from a single file.

Any help getting pointed in the right direction is greatly appreciated.

Edit:

I thought the output of ffprobe for the file that I'm working with might also be useful.

$ ffprobe input.mov
ffprobe version 2.8.3 Copyright (c) 2007-2015 the FFmpeg developers
  built with Apple LLVM version 7.0.0 (clang-700.1.76)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/2.8.3 --enable-shared --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --cc=clang --host-cflags= --host-ldflags= --enable-opencl --enable-libx264 --enable-libmp3lame --enable-libvo-aacenc --enable-libxvid --enable-vda
  libavutil      54. 31.100 / 54. 31.100
  libavcodec     56. 60.100 / 56. 60.100
  libavformat    56. 40.101 / 56. 40.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 40.101 /  5. 40.101
  libavresample   2.  1.  0 /  2.  1.  0
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.101 /  1.  2.101
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 537199360
    compatible_brands: qt  
    creation_time   : 2015-09-18 17:04:00
  Duration: 01:07:51.64, start: 0.000000, bitrate: 1503 kb/s
    Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 43 kb/s (default)
    Metadata:
      creation_time   : 2015-09-18 17:04:00
      handler_name    : Apple Alias Data Handler
    Stream #0:1(eng): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 63 kb/s (default)
    Metadata:
      creation_time   : 2015-09-18 17:04:00
      handler_name    : Apple Alias Data Handler
    Stream #0:2(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1280x720, 1381 kb/s, SAR 1:1 DAR 16:9, 19.63 fps, 14.42 tbr, 1k tbn, 50 tbc (default)
    Metadata:
      creation_time   : 2015-09-18 17:04:00
      handler_name    : Apple Alias Data Handler
      encoder         : H.264

2 Answers


Use amerge to combine the two audio streams into one, then downmix the result to a single channel with -ac 1:

ffmpeg -i input -filter_complex "[0:a:0][0:a:1]amerge=inputs=2[a]" \
-map 0:v -map "[a]" -c:v copy -ac 1 output

Also see FFmpeg Wiki: Audio Channel Manipulation.
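
Since the goal is to process a new recording every week, the command above can be wrapped in a small loop. This is only a minimal sketch, assuming the recordings sit in one directory as .mov files; the flatten_one helper, the default recordings directory, the -flat.mov suffix and the DRY_RUN switch are all made-up names for illustration, not part of FFmpeg:

```shell
#!/bin/sh
# Sketch: flatten the two audio streams of every .mov file in a directory.
# flatten_one, IN_DIR, DRY_RUN and the "-flat" suffix are illustrative names.

IN_DIR="${1:-recordings}"

flatten_one() {
    src="$1"
    dst="${src%.mov}-flat.mov"
    # Same command as above: merge both audio streams, copy the video,
    # and downmix the merged audio to a single channel.
    # -n makes ffmpeg refuse to overwrite an existing output file.
    set -- ffmpeg -n -i "$src" \
        -filter_complex "[0:a:0][0:a:1]amerge=inputs=2[a]" \
        -map 0:v -map "[a]" -c:v copy -ac 1 "$dst"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"      # only print the command
    else
        "$@"           # actually run ffmpeg
    fi
}

for f in "$IN_DIR"/*.mov; do
    [ -e "$f" ] || continue                      # directory empty or missing
    case "$f" in *-flat.mov) continue ;; esac    # skip earlier outputs
    flatten_one "$f"
done
```

Running it once with DRY_RUN=1 set in the environment prints the ffmpeg commands without executing them, which is a cheap way to check the file names before committing to a long run.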

  • Thanks! About an hour or so after I posted this question, I kept fiddling with amix and I got it to work (and I'll post it as a different answer). But I decided to give yours a try to see which was the superior approach. Yours ran much faster (about a minute), but the output file turned out to be larger than the original. With a 730M input file, I got a 735M output file with your approach. With the other one I got a 468M file, but it took about 15 minutes. Commented Dec 31, 2015 at 5:59
  • Ah! Looks like the reason my command resulted in the smaller file and the longer run time was that it was re-encoding the video, while yours was just doing a copy. Marking yours as the correct answer. Thanks! Commented Dec 31, 2015 at 6:40
  • @M.ScottFord Only the video stream is being stream copied, and the audio is being re-encoded (filtering requires encoding). I just made some assumptions about your input and ffmpeg build because at the time there was no console output provided. libvo-aac sucks, so you should add -c:a aac -strict -2 to use the native AAC encoder instead, although your build is old and does not have the recent AAC updates (you'll have to use a build from current git master to take advantage of that: see static builds).
    – llogan
    Commented Dec 31, 2015 at 9:31
  • I get an error: Unable to find a suitable output format for ' -map' -map: Invalid argument
    – Arthur
    Commented Jul 25, 2016 at 12:33
  • @Arthur Please use a pastebin site to show your actual command and the complete output. Then provide the link in a comment.
    – llogan
    Commented Jul 31, 2016 at 19:03

After fiddling around and doing more searching, I came up with the following command which appears to work.

ffmpeg -i input.mov -filter_complex "[0:0][0:1] amix=inputs=2[audio]" \
-map 0:v -map "[audio]" output.mov

It took much longer to run than the answer posted by LordNeckbeard, but it resulted in a smaller file. So I guess I now have to decide which I value more, time or (storage) space.
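
As the comments on the other answer point out, the longer run time comes from re-encoding the video, not from amix itself. A rough sketch of the same amix approach with the video stream copied instead, assuming the stream layout shown in the ffprobe output in the question (two audio streams and one video stream); flatten_amix is just an illustrative function name:

```shell
# Sketch: amix with the video stream copied instead of re-encoded.
# flatten_amix is an illustrative name, not part of FFmpeg.
flatten_amix() {
    # $1 = input file, $2 = output file
    ffmpeg -i "$1" \
        -filter_complex "[0:a:0][0:a:1]amix=inputs=2[audio]" \
        -map 0:v -map "[audio]" -c:v copy "$2"
}
```

It would be called as flatten_amix input.mov output.mov. The audio is still re-encoded, because filtering requires encoding, so the time versus space trade-off described above should mostly disappear once only the audio is being processed.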
