I've updated the scripts based on the file name examples that you provided in your comment:
"Liam sur la moto (VHS) (2001) - Maison 13100.m2ts"
"M&L Plage 1080i (2012) - Camargue 30240.m2ts"
I came up with two methods to handle this naming convention.
The first is to assume that in every case the year is enclosed in parentheses. I updated the first script to reflect that case; it's simply an update to the regex pattern that is used:
regexPat='\(\K[0-9]{4,4}(?=\))'
The second script shows a different method for when we can't be sure the year is surrounded by parentheses. Here we read the grep matches into an array in case there are multiple matches, and then do a sanity check on each one, i.e. the year must be between 1970 and 2020; otherwise we assume it's not a year.
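To show why the sanity check matters, here is a minimal sketch of the idea on its own. The helper name plausible_years is made up for this illustration and is not part of the scripts below; it assumes GNU grep with -P support and uses the same 1970 to 2020 window.

```shell
#!/bin/bash
# plausible_years is a hypothetical helper used only for this illustration.
plausible_years() {
    local name=$1 yr
    # Candidates: any run of exactly four digits with a non-digit on each side
    grep -oP '(?<=[^0-9])[0-9]{4}(?=[^0-9])' <<<"$name" |
    while IFS= read -r yr; do
        # Keep only values inside the assumed 1970-2020 window
        if [ "$yr" -ge 1970 ] && [ "$yr" -le 2020 ]; then
            echo "$yr"
        fi
    done
}

# "1080" (from 1080i) is also a four-digit candidate, but it is out of range
plausible_years "M&L Plage 1080i (2012) - Camargue 30240.m2ts"   # prints 2012
```

Without the range check, the second example file name would yield both 1080 and 2012.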
Note that the readarray command (aka mapfile) is only available in Bash 4.x and later. At the bottom is a more portable version using just read. It can be tricky to parse the output of find without things breaking due to spaces or special characters in the filenames.
Script 1
#!/bin/bash
# Create test files
touch abcd\({2001,1985,1984,1931}\)efgh.m2ts
touch abcd{24001,198a5,19b84,1912331,1293}.m2ts
touch "abcd 1232 adffd.m2ts"
touch "Liam sur la moto (VHS) (2001) - Maison 13100.m2ts"
touch "M&L Plage 1080i (2012) - Camargue 30240.m2ts"
TestScriptResultFile="./CamCorderFindResult.file"
touch "$TestScriptResultFile"
# Match exactly four digits enclosed in parentheses; output only the digits
regexPat='\(\K[0-9]{4,4}(?=\))'
# -t strips the trailing newline from each array element
readarray -t fileList <<<"$(IFS=$'\n' ; find . -name "*.m2ts" -exec basename {} \;)"
for i in "${fileList[@]}"; do
    echo "Processing File: $i"
    if year=$(grep -oP "$regexPat" <<<"$i"); then
        if [ "$year" -le 1984 ]; then
            echo "1984 or earlier: $i" >> "$TestScriptResultFile"
        else
            echo "After 1984: $i" >> "$TestScriptResultFile"
        fi
    else
        echo "No valid year found in file $i"
    fi
done
1. Use the find command to get a file list and store it in an array using readarray.
- Set the field separator to a newline: IFS=$'\n'
- Use an -exec argument in find, which runs basename on each file to get only the filename and not the path.
- The find output is directed into an array by using command substitution, a 'Here String' (<<<), and the readarray command.
2. Loop over the array of filenames.
3. Use grep and regexPat to find the embedded year.
The regex pattern I used matches 6 characters in a string: an opening (, followed by exactly 4 digits, [0-9]{4,4}, and a closing ).
To output only the 4 digits in between (hopefully the year), the -P argument is given to grep for 'Perl Regex', which allows separating matched characters from output characters, among other things.
The \K causes grep to not output anything that matched before the \K in the pattern (it behaves much like a look-behind).
Finally, the closing ) is kept out of the output by using a look-ahead, (?=\)), which checks for the ) without including it in the match. You can use basically the same syntax for both ends; the bottom script shows the look-behind method, which doesn't use \K.
The -o flag tells grep to output only the matching portion of the string, which in our case will be a 4-digit number.
The rest of the script checks the number against 1984 and logs it accordingly.
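To see the pattern in isolation, here is a small sketch that runs it against one of the example file names. It assumes GNU grep built with -P support; the variable names are made up for the demo. The second pattern swaps \K for a look-behind, which should give the same output.

```shell
#!/bin/bash
# Both patterns are meant to print only the digits between "(" and ")".
with_K='\(\K[0-9]{4}(?=\))'               # \K drops the "(" from the output
with_lookbehind='(?<=\()[0-9]{4}(?=\))'   # look-behind checks for "(" without matching it

name="Liam sur la moto (VHS) (2001) - Maison 13100.m2ts"
year_K=$(grep -oP "$with_K" <<<"$name")
year_lb=$(grep -oP "$with_lookbehind" <<<"$name")
echo "$year_K $year_lb"   # prints: 2001 2001

# Without -o, grep prints the whole matching line instead of just the year
whole_line=$(grep -P "$with_K" <<<"$name")
echo "$whole_line"
```

Note that (VHS) is skipped because it does not contain four digits, and 13100 is skipped because it is not enclosed in parentheses.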
Here's another, more compact approach. Two things to notice here:
- find is given the argument -print0, which null-terminates each filename in its output.
- The read command is given the argument -d '', which tells it that its input is null-terminated. A null character is usually written \0 in plain text; in Bash you can use '' or $'\0'.
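As a rough illustration of why the null-terminated form is safer, the sketch below counts the same files two ways in a scratch directory. The directory and the file names are made up for the demo; one name deliberately contains a newline.

```shell
#!/bin/bash
# Scratch directory with one ordinary name and one name containing a newline
demo_dir=$(mktemp -d)
touch "$demo_dir/plain (2001).m2ts"
touch "$demo_dir/with"$'\n'"newline (1984).m2ts"

# Line-based reading: the embedded newline splits one name into two entries
line_count=0
while IFS= read -r f; do
    line_count=$((line_count + 1))
done < <(find "$demo_dir" -name "*.m2ts")

# Null-delimited reading: each file name arrives intact
null_count=0
while IFS= read -r -d '' f; do
    null_count=$((null_count + 1))
done < <(find "$demo_dir" -name "*.m2ts" -print0)

echo "line-based: $line_count, null-delimited: $null_count"
# prints: line-based: 3, null-delimited: 2
rm -rf "$demo_dir"
```

Process substitution (< <(...)) is used here instead of a pipe so that the counters are still set after the loops finish; a piped while loop runs in a subshell.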
Script 2
#!/bin/bash
TestScriptResultFile="./CamCorderFindResult.file"
touch "$TestScriptResultFile"
# Match exactly four digits with a non-digit character on each side
regexPat='(?<=[^0-9])[0-9]{4,4}(?=[^0-9])'
find . -name "*.m2ts" -print0 | while IFS= read -r -d '' k; do
    i="$(basename "$k")"
    echo "Processing File: $i"
    # Collect every match into an array; a name may contain several candidates
    if year=($(grep -oP "$regexPat" <<<"$i")); then
        for yr in "${year[@]}"; do
            # Sanity check: only values from 1970 to 2020 are treated as years
            if [ "$yr" -lt 1970 ] || [ "$yr" -gt 2020 ]; then
                echo " x Out of range year ($yr) parsed from $i"
            else
                echo " o Found year $yr"
                if [ "$yr" -le 1984 ]; then
                    echo "1984 or earlier: $i" >> "$TestScriptResultFile"
                else
                    echo "After 1984: $i" >> "$TestScriptResultFile"
                fi
            fi
        done
    else
        echo " x No valid year found in file $i"
    fi
done
1984. To read EXIF info, it's better to use exiftool or exiftran.