I have been able to use FFmpeg with OpenAI Whisper to transcribe an audio file to text.
What I would like to do is, instead of using a pre-recorded file, stream an RTMP feed into FFmpeg in order to transcribe in real-time.
I've had success with this shell script, which requires you to have ffmpeg, inotifywait, and whisper installed:
# Save 30-second segments to the current directory, named by Unix timestamp
ffmpeg -i STREAM_URL -f segment -segment_time 30 -strftime 1 %s.mp4 -v verbose 2>&1 |
  # Pull the timestamp out of each "Opening '<timestamp>.mp4' for writing" message
  grep -Po --line-buffered "Opening '\K\d+" |
  # Print the segment's start time, wait for ffmpeg to close it, then transcribe it
  xargs -I _ bash -c 'echo; date -d @_; inotifywait -qqe CLOSE _.mp4; whisper --model medium.en _.mp4'
The ffmpeg command saves 30-second segments to the current directory in MP4 format, named by Unix timestamp. Its output is then filtered for the verbose log messages that ffmpeg prints when it opens a segment file for writing. For each segment, the command prints the time the file was opened, waits for ffmpeg to close it, and then calls whisper on it.
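To see what the grep stage does in isolation, you can feed it a hand-written stand-in for ffmpeg's verbose log message (the bracketed prefix and the timestamp below are made up for illustration):

```shell
# Fake example of ffmpeg's "Opening '...' for writing" verbose message.
# \K discards the "Opening '" prefix from the match, so -o prints only the digits.
echo "[segment @ 0x5601] Opening '1700000000.mp4' for writing" |
  grep -Po "Opening '\K\d+"
# prints 1700000000
```

In the real pipeline, --line-buffered matters too: without it, grep may buffer its output and xargs would only see the timestamps in bursts rather than as each segment starts.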