AAC and MP3 are not the best choice of codecs for ultra-low bandwidth transmissions. HE-AAC v2 may be somewhat usable, but in your case I'd use a proper speech codec with higher efficiency.
Opus is the best option. It is available in FFmpeg through libopus. In fact, Opus is not just made for speech; it offers hybrid encoding for both speech and music.
Example:
ffmpeg -i <input> -c:a libopus -ac 1 -ar 16000 -b:a 8K -vbr constrained out.opus
Here, -ac sets the output to mono, -ar sets the sampling rate to 16 kHz, and -b:a sets the bitrate to 8 kBit/s. The constrained variable bitrate mode is used here. In principle, it's not strictly necessary to downsample and downmix to mono with ffmpeg, as that is something libopus will do on its own to reach the specified bitrate target.
Some further recommendations are given here. Note that with Opus, 6–8 kBit/s is a usable range for speech (mono, lower sample rate), but not for music.
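If you want to judge that range by ear, a small wrapper like the following could encode the same recording at 6 and 8 kBit/s. This is only a sketch: the function names encode_speech and encode_sweep, the FFMPEG override variable, and the speech_<rate>.opus output names are made up for this example; the ffmpeg options are the same ones used in the command above.

```shell
#!/bin/sh
# Sketch: encode one speech recording at several low Opus bitrates.
# encode_speech, encode_sweep and FFMPEG are hypothetical names for this example.
# Setting FFMPEG=echo prints the arguments instead of running the encoder.

encode_speech() {
    # $1 = input file, $2 = bitrate (e.g. 8k), $3 = output file
    "${FFMPEG:-ffmpeg}" -i "$1" -c:a libopus -ac 1 -ar 16000 \
        -b:a "$2" -vbr constrained "$3"
}

encode_sweep() {
    # $1 = input file; writes speech_6k.opus and speech_8k.opus
    for rate in 6k 8k; do
        encode_speech "$1" "$rate" "speech_${rate}.opus" || return 1
    done
}
```

Calling encode_sweep input.wav should leave you with two files to compare. Whether 6 kBit/s is still acceptable will depend on the speaker and the recording, so treat the two rates as starting points.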
You'll find an interesting comparison of different codecs and their bitrate/quality curve on the Opus website:
I should add that this figure is an indication only; it's compiled from different test results and anecdotal knowledge.