On a small FLAC folder of 1593 files, ffprobe takes over 12 times longer to run than metaflac. How can I speed up the single-threaded performance of ffprobe?
I tried to tell ffprobe to do the same as metaflac and print only the VORBIS_COMMENT or TAG field by running ffprobe -show_entries format_tags .., but unfortunately this doesn't improve the performance.
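For reference, the restricted invocation could look roughly like the sketch below. The function name probe_tags is made up for illustration; the flags themselves (-v error, -show_entries format_tags, -of default=noprint_wrappers=1) are standard ffprobe options. It only narrows what gets printed, so it is not expected to change how long ffprobe takes to open and analyse each file.

```shell
# Print only the container-level tags of one file, one TAG:key=value line each.
# probe_tags is a hypothetical helper name used only in this sketch.
probe_tags() {
    # -v error               : silence the banner and stream report on stderr
    # -show_entries          : restrict output to the format section's tags
    # -of default=noprint... : drop the [FORMAT] / [/FORMAT] wrapper lines
    ffprobe -v error -show_entries format_tags \
        -of default=noprint_wrappers=1 "$1"
}

# Example use (file name is a placeholder):
# probe_tags "song.flac" | grep -i 'http\|www'
```

Because the tags arrive on stdout here, the 2>&1 redirection from the scripts below would not be needed with this variant.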
This script searches for possible URLs:
1. ffprobe:
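$ time find . -type f -iname "*.flac" -print0 | while IFS= read -r -d '' FILENAME; do
    # ffprobe writes its report to stderr, so merge it into stdout
    INFOS=$(ffprobe -hide_banner "$FILENAME" 2>&1)
    # search the captured report for anything that looks like a URL
    URL=$(grep -i 'http\|www' <<< "$INFOS")
    if [ -n "$URL" ]; then
        echo -e "$FILENAME\t$URL"
    fi
done | column -t -s $'\t'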
2. metaflac:
(only this line is different:)
...
INFOS=$(metaflac --list --block-type=VORBIS_COMMENT "$FILENAME" 2>&1)
...
Benchmarks:
The CPU is a Ryzen 5600X and the files are located on an external 850 EVO 1TB SSD.
Single-threaded benchmarks:
$ time find .. | |
---|---|
metaflac | real 0m2,579s |
mediainfo | real 0m14,222s |
ffprobe | real 0m32,379s |
exiftool | real 1m21,881s |
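To put the "over 12 times" figure in per-file terms, the timings above can be divided by the 1593 files. The sketch below does only that arithmetic; the function name per_file_cost is made up, and the numbers are the wall-clock values from the table with the decimal commas written as points. Keep in mind that each figure covers the whole loop iteration (process start, the tool itself, and grep), not the tool alone.

```shell
# Per-file cost and slowdown relative to metaflac, from the table above.
# per_file_cost is a hypothetical name used only in this sketch.
per_file_cost() {
    # LC_ALL=C keeps awk on point decimals regardless of the shell locale
    LC_ALL=C awk -v n=1593 'BEGIN {
        t["metaflac"]  = 2.579
        t["mediainfo"] = 14.222
        t["ffprobe"]   = 32.379
        t["exiftool"]  = 81.881    # 1m21,881s
        split("metaflac mediainfo ffprobe exiftool", order, " ")
        for (i = 1; i <= 4; i++) {
            tool = order[i]
            # milliseconds per file, then the ratio to the metaflac run
            printf "%-10s %6.1f ms/file  %5.2fx\n", tool,
                   t[tool] / n * 1000, t[tool] / t["metaflac"]
        }
    }'
}

per_file_cost
```

For ffprobe this works out to roughly 20 ms per file against under 2 ms for metaflac.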
Edit: To speed things up, I modified my code to run in parallel:
getURLs() {
    local FILENAME=$1
    local URL
    URL=$(mediainfo "$FILENAME" 2>&1 | grep -i 'http\|www')
    if [ -n "$URL" ]; then
        echo -e "$FILENAME\t$URL"
    fi
}
find . -type f -iname "*.flac" -print0 | {
    while IFS= read -r -d '' FILENAME; do
        getURLs "$FILENAME" &   # one background job per file
    done
    wait                        # wait in the subshell that started the jobs
} | column -t -s $'\t'
This parallelized script version utilizes my 5600X CPU fully and the performance seems to be CPU-bound:
Multi-threaded benchmarks:
$ time find .. | |
---|---|
metaflac | real 0m0,695s |
mediainfo | real 0m2,109s |
ffprobe | real 0m4,844s |
I'm asking for a single-threaded speed improvement of ffprobe, so I won't mark this as a solution, though it's a nice speed improvement indeed.
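As a side note on the parallel version: the background-job loop starts one process per file all at once, which is fine for 1593 files but unbounded in general. A bounded variant can be sketched with xargs -P (a GNU/BSD extension, not POSIX). The names scan_urls and JOBS are made up for this sketch, bash is assumed for the $'\t' quoting, and nproc is assumed to be available, with a fallback of 4 jobs if it is not.

```shell
# Bounded-parallel scan: at most JOBS mediainfo processes run at a time.
# JOBS and scan_urls are hypothetical names chosen for this sketch.
JOBS=${JOBS:-$(nproc 2>/dev/null || echo 4)}

scan_urls() {
    find "${1:-.}" -type f -iname '*.flac' -print0 |
        xargs -0 -n 1 -P "$JOBS" bash -c '
            # $1 is the file name handed over by xargs
            url=$(mediainfo "$1" 2>&1 | grep -i "http\|www")
            [ -n "$url" ] && printf "%s\t%s\n" "$1" "$url"
            exit 0   # a file without URLs is not an error
        ' _ |
        column -t -s $'\t'
}

# Example use: scan_urls .
```

The output order depends on which process finishes first, so it may differ between runs.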