I am trying to extract audio snippets using command line tools. I get consistent, unexpected results and I believe this is due to how the audio files were created/encoded.

Note: I realise there are other approaches to share the content, I'm doing it this way to share the content with users who are either not very computer literate or geo-blocked from the raw content.

Problem Description / Reproduction steps:

  • I start off by using yt-dlp to download a podcast episode, such as the one at the URL below, with this command:
    yt-dlp -x --audio-format mp3 -o GQT_2012-10-14.mp3 https://www.bbc.co.uk/programmes/b01n6vnh

  • The file is downloaded and plays correctly. I would like to extract a snippet that starts at 20:48 and lasts 03:58, so it finishes at 24:46.

  • I tried this first using FFmpeg (version 4.2.7-0ubuntu0.1 on Ubuntu 20.04), with this command:
    ffmpeg -i "/home/user/GQT_2012-10-14.mp3" -ss 00:20:48 -t 00:03:58 GQT_2012-10-12_Snippet1.mp3
    This generates a file that is 3 minutes 58 seconds long but the start time corresponds to 20:28 in the original file.

  • Then I tried using Mp3Splt (version 2.6.2 on the same OS; I am aware that this is an old version), with this command:
    mp3splt "/home/user/GQT_2012-10-14.mp3" -o GQT_2012-10-12_Snippet1 20.48.00 24.46.00
    This generates the same output, a file that is the correct length but 20 seconds early in terms of the expected start time.
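The arithmetic behind the two commands above can be checked with a short shell sketch. ffmpeg is given a start and a duration (-ss, -t), while mp3splt is given a start and an end in minutes.seconds.hundredths. The helper names to_seconds and to_mp3splt are hypothetical, introduced only for this illustration.

```shell
# Convert HH:MM:SS or MM:SS to whole seconds (awk reads "08" as decimal 8).
to_seconds() {
    echo "$1" | awk -F: '{ t = 0; for (i = 1; i <= NF; i++) t = t * 60 + $i; print t }'
}

# Format whole seconds as mp3splt's minutes.seconds.hundredths.
to_mp3splt() {
    awk -v t="$1" 'BEGIN { printf "%d.%02d.00\n", int(t / 60), t % 60 }'
}

start=$(to_seconds 00:20:48)    # requested start of the snippet
length=$(to_seconds 00:03:58)   # requested duration
end=$((start + length))         # end point that mp3splt needs

echo "ffmpeg:  -ss $start -t $length"
echo "mp3splt: $(to_mp3splt "$start") $(to_mp3splt "$end")"
```

This prints 20.48.00 and 24.46.00 for mp3splt, so both tools are being asked for the same 238-second window and the offset is not caused by a mismatch between the two commands.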

Both command line tools give the same result, which suggests the issue lies with the input file. I tried to inspect it using ffprobe. Within the output, I saw this:

    Duration: 00:43:00.09, start: 0.025057, bitrate: 141 kb/s

I interpret this as the file being "tagged" as starting 25 milliseconds in, certainly not 20 seconds.

I tried to reset this to zero anyway, using variations of this answer, but I wasn't successful.

I'm looking to understand the root cause of the error in the extracted snippets and correct it.

1 Answer

I did some tests with the file you provided, and I believe your ffmpeg command actually cuts the file at the exact location you are asking it to.

I believe the actual problem here is that the players show the wrong timestamp when seeking (I tried both VLC and mplayer, and they seem to behave similarly). If I let VLC play the file from the start without seeking forward (I actually let it run in the background for 20 minutes!), then when it reaches 20:48 it is at exactly the position where the file produced by ffmpeg starts. If instead I start playing in VLC and skip forward, that location is presented as 20:28.

My guess is that seeking in those players just skips to the next keyframe (or something similar; I'm not very familiar with the internals of the mp3 format) and estimates the elapsed time from the bitrate, which is variable. You can demonstrate this effect very well by running VLC, seeking close to the end, and watching it continue playing past 43 minutes (I tried seeking to 42:42 and it played until 43:08).
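To illustrate the guess above, here is a toy calculation of what happens if a position is estimated from the average bitrate of a variable-bitrate file. All numbers are made up (60 s at 192 kb/s followed by 60 s at 96 kb/s), the function name estimate_position is hypothetical, and this is not a description of how VLC or mplayer actually seek.

```shell
# Toy VBR file: 60 s at 192 kb/s followed by 60 s at 96 kb/s (made-up numbers).
# Given a true playback time in seconds, print the time that a
# "bits consumed / average bitrate" estimate would report instead.
estimate_position() {
    awk -v true_s="$1" 'BEGIN {
        r1 = 192000; r2 = 96000; seg = 60       # bits per second, seconds per segment
        avg = (r1 + r2) * seg / (2 * seg)       # average bitrate over the whole file
        # Bits actually consumed up to the true playback time.
        bits = (true_s <= seg) ? true_s * r1 : seg * r1 + (true_s - seg) * r2
        printf "%d\n", bits / avg               # constant-bitrate estimate of the position
    }'
}

echo "true 60 s  -> estimated $(estimate_position 60) s"
echo "true 120 s -> estimated $(estimate_position 120) s"
```

In this toy file the estimate is 20 seconds off in the middle and correct again at the very end. Whether the estimate runs ahead or behind depends on where the high-bitrate sections sit, so an offset of the size seen in the question is at least plausible under this assumption.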

In summary, the timestamps shown by a player like VLC or mplayer don't seem to be a good way to get exact timings in an mp3. Instead, you can use an audio editing program like Audacity, which decodes the whole file when it opens it, so the timings should be accurate there. Of course, you can also use it for the cutting itself, in which case you wouldn't need ffmpeg at all.
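If you would rather stay on the command line, the same idea (decode everything first, then work with exact positions) can be sketched with ffmpeg alone. This is a sketch under assumptions: the function names and the file names in the usage note are hypothetical, and I have not run it against the file in the question. The FFMPEG variable exists only so the commands can be dry-run with echo.

```shell
# FFMPEG may be overridden (e.g. FFMPEG=echo for a dry run); defaults to ffmpeg.
FFMPEG=${FFMPEG:-ffmpeg}

# Decode the whole input to uncompressed PCM in a WAV file, where a
# timestamp maps directly to a sample position.
# Arguments: <input.mp3> <output.wav>
decode_to_wav() {
    "$FFMPEG" -i "$1" "$2"
}

# Cut <length> starting at <start> from the WAV and encode the snippet
# in whatever format the output file extension implies.
# Arguments: <input.wav> <start> <length> <output>
cut_snippet() {
    "$FFMPEG" -i "$1" -ss "$2" -t "$3" "$4"
}
```

For example, decode_to_wav GQT_2012-10-14.mp3 full.wav followed by cut_snippet full.wav 00:20:48 00:03:58 snippet.mp3. Playing full.wav and seeking to 20:48 should also be a quick way to check where that timestamp really falls, since seeking in uncompressed PCM does not rely on a bitrate estimate.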

  • Thank you for taking the time to look into this, and particularly for listening to 20+ minutes of a radio programme about gardening (not sure if it's your thing). I never for a second thought VLC could be at fault; it's a piece of software I've been using for years. I'm not going to mark your answer as accepted, I'm afraid. You have provided me with a valid workaround, but I still want to understand the root cause of this. I don't experience this issue with audio files I have downloaded (without yt-dlp). Although VLC isn't playing nice, I still think it's something about the file that's at play. Commented Feb 9, 2023 at 13:38
