I have a directory of files whose names all start with 5 digits (e.g. 12345_a_b, 23456_s_a), and a text file in the same directory containing a list of file names in the same style; however, only the numbers match, not anything after the underscore (e.g. 12345_q_p, 23456_p_l). I want to copy files from the current directory into a new one, but only those whose first 5 digits match the first 5 digits of a filename in the text file, ignoring everything that comes after. It seems like I can use xargs, but I'm not sure how to match names partially. Can anyone help?
3 Answers
Iterate over the lines with _ as the IFS, get the leading portion containing the digits, and then copy the files starting with those digits:
shopt -s nullglob
while IFS=_ read -r i _; do [[ $i =~ ^[0-9]{5}$ ]] && echo cp -it dest/ "${i}"*; done <file.txt
Expanded:
#!/bin/bash
shopt -s nullglob    ## A glob with no match expands to nothing,
                     ## rather than to the literal pattern
while IFS=_ read -r i _; do    ## Iterate over the lines of file.txt,
                               ## with `_` as the `IFS` i.e. word splitting
                               ## happens on each `_` only, variable `i`
                               ## will contain the digits at start; `_` is a
                               ## throwaway variable containing the rest
    [[ $i =~ ^[0-9]{5}$ ]] &&    ## Check if the variable contains only 5 digits
        echo cp -it /destination/ "${i}"*    ## if so, copy the relevant files starting with those digits
done <file.txt
Replace file.txt with the actual source file, and /destination/ (dest/ in the one-liners) with your actual destination directory. Here echo is included to do a dry run; if you are satisfied with the commands that would be run, just get rid of echo:
shopt -s nullglob
while IFS=_ read -r i _; do [[ $i =~ ^[0-9]{5}$ ]] && cp -it dest/ "${i}"*; done <file.txt
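To see what the read step does on its own, here is a minimal sketch. The function name prefix_of and the variable name rest are made up for illustration and are not part of the answer above; it only assumes bash.

```shell
#!/bin/bash
# prefix_of is a hypothetical helper: it splits its argument on underscores
# and prints the first field only if that field is exactly five digits.
prefix_of() {
    local i rest
    IFS=_ read -r i rest <<<"$1"    # i gets the text before the first _
    [[ $i =~ ^[0-9]{5}$ ]] && printf '%s\n' "$i"
}

prefix_of 12345_q_p                      # prints 12345
prefix_of abcde_q_p || echo "skipped"    # prints skipped
```

A line whose first field is not exactly five digits is simply ignored by the loop, which is why a header or blank line in the list file does no harm.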
- Instead of IFS=, you could do IFS=_ read i crap and get some word splitting for free, since the format of the source file is known. Also nullglob might be useful. – muru, commented Jan 26, 2017 at 10:31
- @heemayl your comment regarding nullglob sounds more like a description of failglob - nullglob just expands to nothing. Commented Jan 27, 2017 at 12:36
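The difference between the default behaviour, nullglob, and failglob that these comments touch on can be checked with a small experiment. This is only a sketch: the pattern 99999* and the variable names are arbitrary, and it runs in a fresh scratch directory so that the pattern cannot match anything.

```shell
#!/bin/bash
# Work in an empty scratch directory so that the glob below matches nothing.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

shopt -u nullglob failglob
set -- 99999*              # default: the unmatched pattern is kept literally
default_count=$#
default_first=$1

shopt -s nullglob
set -- 99999*              # nullglob: the unmatched pattern expands to nothing
nullglob_count=$#

# failglob: an unmatched pattern is an error and the command is not run.
# A subshell keeps the option and the error away from the rest of the script.
( shopt -s failglob; echo 99999* ) 2>/dev/null
failglob_status=$?

echo "default:  $default_count word(s), first is '$default_first'"
echo "nullglob: $nullglob_count word(s)"
echo "failglob: exit status $failglob_status"
```

With nullglob set, a prefix from the list that matches no file leaves cp with no source arguments at all, so cp still reports an error for that line, just a different one.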
The following commands should do the trick. If you prefer, you can replace the variables assigned at the top with their literal values inside the while loop.
#!/bin/bash
source_dir=/directory/containing/files
target_dir=/new/directory
list=/full/path/to/number_list
reg='^[0-9]{5}$'
while IFS= read -r line; do
    line=${line:0:5}
    [[ "$line" =~ $reg ]] && cp -t "$target_dir" "$source_dir"/"$line"* 2>/dev/null
done < "$list"
- The read command will read your file line by line, storing the contents of each line in the variable line.
- The cp command uses -t to set the target directory to copy files to, and the glob pattern "$source_dir"/"$line"* will match any files in the source directory whose names start with the numerical value in the line variable.
- The while loop means that the read and cp commands are carried out for each line of the list file. IFS= means that leading and trailing whitespace on each line of your list file is preserved and included in the search string; it's not strictly necessary in this example, but it is generally useful when you want to read a file line by line.
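The loop above can be tried safely on throwaway data first. The following is a sketch only: copy_by_prefix is a hypothetical wrapper name, the file names are modelled on those in the question, and GNU cp is assumed because of the -t option.

```shell
#!/bin/bash
# copy_by_prefix is a hypothetical wrapper around the loop from the answer.
# Usage: copy_by_prefix LIST SOURCE_DIR TARGET_DIR   (assumes GNU cp for -t)
copy_by_prefix() {
    local list=$1 source_dir=$2 target_dir=$3 line
    local reg='^[0-9]{5}$'
    while IFS= read -r line; do
        line=${line:0:5}                  # keep only the first five characters
        if [[ $line =~ $reg ]]; then
            cp -t "$target_dir" "$source_dir"/"$line"* 2>/dev/null
        fi
    done < "$list"
    return 0
}

# Scratch data: three files, of which only two are named in the list.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/out"
touch "$demo/src/12345_a_b" "$demo/src/23456_s_a" "$demo/src/99999_x_y"
printf '%s\n' 12345_q_p 23456_p_l > "$demo/list.txt"

copy_by_prefix "$demo/list.txt" "$demo/src" "$demo/out"
ls "$demo/out"    # lists 12345_a_b and 23456_s_a
```

Because 2>/dev/null hides every cp error, not just "no such file", it may be worth dropping it while testing.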
- Thanks for the very sensible suggestions @RoVo, taken on board. Commented Jan 26, 2017 at 10:47
Run a loop over your files. Within the loop, use grep to check whether the start of each filename matches an entry in your text file. With -q, grep prints nothing and only sets its exit status, so && cp runs only if it finds something.
#!/bin/bash
while IFS= read -r file; do
    name=${file##*/}    # find prints ./name, so strip the directory part first
    grep -Eq "^${name:0:5}" your_text_file && cp "$file" /path/to/destination/
done < <(find . -type f -name '[0-9][0-9][0-9][0-9][0-9]*')
This will have some overhead when you have a huge number of files, since grep is run once per file, but the approach can be used for more complex tasks too.
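If the per-file grep becomes too slow, one possible variation (a different technique from the answer above, sketched under the assumption of bash 4 or later for associative arrays) is to read the list once into a lookup table and then test each file against it. The names copy_listed, wanted and key are hypothetical.

```shell
#!/bin/bash
# copy_listed is a hypothetical helper. Usage: copy_listed LIST TARGET_DIR
# It copies files from the current directory whose first five digits
# appear at the start of some line in LIST.
copy_listed() {
    local list=$1 target_dir=$2 line file key
    local -A wanted=()
    # Pass 1: remember every five-digit prefix found in the list.
    while IFS= read -r line; do
        key=${line:0:5}
        [[ $key =~ ^[0-9]{5}$ ]] && wanted[$key]=1
    done < "$list"
    # Pass 2: copy each regular file whose prefix was remembered.
    for file in [0-9][0-9][0-9][0-9][0-9]*; do
        key=${file:0:5}
        if [[ -f $file && -n ${wanted[$key]:-} ]]; then
            cp -- "$file" "$target_dir"/
        fi
    done
    return 0
}

# Scratch data modelled on the names in the question.
work=$(mktemp -d) && cd "$work" || exit 1
mkdir out
touch 12345_a_b 23456_s_a 99999_x_y
printf '%s\n' 12345_q_p 23456_p_l > list.txt

copy_listed list.txt out
ls out    # lists 12345_a_b and 23456_s_a
```

This reads the list file only once, at the cost of handling only files directly in the current directory rather than everything find would return.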