Copying certain files by partially matching their names to a list within a text file

Question

I have a directory of files whose names all start with 5 digits (e.g. 12345_a_b, 23456_s_a), and a text file in the same directory that lists file names in the same format. However, only the digits match, not anything after the underscore (e.g. 12345_q_p, 23456_p_l). I want to copy files from the current directory into a new one, but only those whose first 5 digits match the first 5 digits of a filename in the text file, ignoring everything that comes after. It seems like I can use xargs, but I'm not sure how to match names partially. Can anyone help?

Yes, it's one column with many rows. Each row contains a filename — user5654627, Commented Jan 26, 2017 at 10:19

heemayl · Accepted Answer · 2017-01-27 13:06:37Z

Iterate over the lines with _ as the IFS, get the desired first portion containing the digits, and then copy the files starting with those digits:

shopt -s nullglob
while IFS=_ read -r i _; do [[ $i =~ ^[0-9]{5}$ ]] && echo cp -it dest/ "${i}"*; done <file.txt

Expanded:

#!/bin/bash
shopt -s nullglob  ##A glob that matches nothing expands to nothing,
                   ##rather than to the literal pattern
while IFS=_ read -r i _; do  ##Iterate over the lines of file.txt,
                              ##with `_` as the `IFS`, i.e. word splitting
                              ##happens on each `_` only; variable `i`
                              ##will contain the digits at the start; `_` is a
                              ##throwaway variable containing the rest
    [[ $i =~ ^[0-9]{5}$ ]] &&  ##Check that the variable contains exactly 5 digits
      echo cp -it /destination/ "${i}"*  ##if so, copy the files starting with those digits
done <file.txt

Replace file.txt with the actual source file, and /destination/ with your actual destination directory. Here echo is included to do a dry run; once you are satisfied with the commands that would be run, just get rid of echo:

shopt -s nullglob
while IFS=_ read -r i _; do [[ $i =~ ^[0-9]{5}$ ]] && cp -it dest/ "${i}"*; done <file.txt
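To try the loop safely before pointing it at real data, the following sketch builds a throwaway sandbox with mktemp. The file names, the list name file.txt and the dest directory are made-up test data, and this is a variation on the answer's loop rather than a copy of it: -i is dropped so it runs unattended, cp is skipped when nothing matches, and the destination is given last instead of with -t.

```bash
#!/bin/bash
# Sandbox demo: all names below are hypothetical test data.
shopt -s nullglob
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
mkdir dest
touch 12345_a_b 23456_s_a 99999_x_y            # files in the current directory
printf '%s\n' 12345_q_p 23456_p_l > file.txt   # list: only the digits match

while IFS=_ read -r i _; do
    # Only act on lines whose first field is exactly five digits.
    if [[ $i =~ ^[0-9]{5}$ ]]; then
        matches=("$i"*)                        # empty array if nothing matches
        # Skip cp entirely when there is nothing to copy.
        (( ${#matches[@]} )) && cp -- "${matches[@]}" dest/
    fi
done < file.txt

ls dest   # lists 12345_a_b and 23456_s_a; 99999_x_y is left out
```

Collecting the glob into an array first is a design choice for this sketch: it avoids calling cp with no source files when a listed prefix has no counterpart in the directory.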

Instead of IFS=, you could do IFS=_ read i crap and get some word splitting for free, since the format of the source file is known. Also nullglob might be useful. — muru, Commented Jan 26, 2017 at 10:31
@heemayl your comment regarding nullglob sounds more like a description of failglob - nullglob just expands to nothing. — Tom Fenech, Commented Jan 27, 2017 at 12:36
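The point about nullglob in the comment above can be checked directly. A small sketch, using an empty scratch directory so that the pattern 12345* matches nothing; count_args is a hypothetical helper written for this demonstration:

```bash
#!/bin/bash
# Compare how an unmatched glob expands with and without nullglob.
cd "$(mktemp -d)" || exit 1          # empty scratch directory, so nothing matches

count_args() { echo "$#"; }          # prints how many arguments it received

shopt -u nullglob failglob
default_count=$(count_args 12345*)   # pattern is passed through literally: 1 argument

shopt -s nullglob
nullglob_count=$(count_args 12345*)  # pattern disappears: 0 arguments
shopt -u nullglob

echo "default: $default_count, nullglob: $nullglob_count"
```

With failglob set instead, bash prints an error and does not execute the command at all. Note that under nullglob an unmatched prefix leaves cp with a target but no source files, so GNU cp is likely to complain about a missing file operand for that line.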

Arronical · Accepted Answer · 2017-01-26 11:06:49Z

The following commands should do the trick. If you prefer, you can replace the variable assignments at the top with their literal values inside the while loop.

#!/bin/bash
source_dir=/directory/containing/files
target_dir=/new/directory
list=/full/path/to/number_list
reg='^[0-9]{5}$' 

while IFS= read -r line; do
   line=${line:0:5}
   [[ "$line" =~ $reg ]] && cp -t "$target_dir" "$source_dir"/"$line"* 2>/dev/null
done < "$list"

The read command reads your file line by line, storing the contents of each line in the variable 'line'. The expansion ${line:0:5} then keeps only the first five characters, and the regex test skips any line that does not start with five digits.
The cp command uses -t to set the target directory to copy files to, and the glob pattern "$source_dir"/"$line"* matches any files in the source directory whose names start with the numerical value in the line variable. Errors, such as the one cp prints when the pattern matches nothing, are discarded by 2>/dev/null.
The while loop means that the read and cp commands are carried out for each line of the list file. IFS= means that leading and trailing whitespace in each line of your list file is preserved rather than stripped, so it will be included in the search string. It's not strictly necessary in this example, but is generally useful when you want to read a file line by line.
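The interaction between the substring expansion and the regex check can be tried in isolation. In this sketch, first_five is a hypothetical helper that is not part of the script above; it applies the same two steps to a single string.

```bash
#!/bin/bash
# first_five is a hypothetical helper, not part of the answer's script.
reg='^[0-9]{5}$'
first_five() {
    local prefix=${1:0:5}            # first five characters of the argument
    [[ $prefix =~ $reg ]] && printf '%s\n' "$prefix"
}

first_five 12345_q_p     # prints 12345
first_five 1234_q_p      # prints nothing: "1234_" is not five digits
first_five ' 12345_q_p'  # prints nothing: the leading space is kept, as with IFS=
```

The last call shows why preserved whitespace matters: a line with a leading space yields the prefix " 1234", which fails the check, so such a line would be skipped rather than matched.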

Thanks for the very sensible suggestions @RoVo, taken on board. — Arronical, Commented Jan 26, 2017 at 10:47

pLumo · Accepted Answer · 2017-01-26 11:57:55Z

Run a loop on your files. Within the loop find out if the filenames match with the filenames in your textfile using grep.

Using -q makes grep print nothing and only report through its exit status whether it found a match, which lets you chain && cp.

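#!/bin/bash
while IFS= read -r file; do
  name=${file##*/}  # strip the directory part, e.g. ./12345_a_b -> 12345_a_b
  grep -q "^${name:0:5}" your_text_file && cp -- "$file" /path/to/destination/
# -regex matches the whole path, and {5} needs an extended regex type (GNU find)
done < <(find . -type f -regextype posix-extended -regex '.*/[0-9]{5}[^/]*')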

This will have some overhead when you have a huge number of files, since grep is run once per file, but the approach can be adapted to more complex tasks too.
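One possible way to reduce that overhead, which is a different technique from the grep loop above, is to read the list once into a bash associative array and test each file name against it in memory. The function name copy_by_prefix and its three arguments are assumptions made for this sketch, and it needs bash 4 or later for associative arrays.

```bash
#!/bin/bash
# Sketch only: copy_by_prefix is a hypothetical helper; needs bash 4+ (local -A).
copy_by_prefix() {
    local list=$1 src=$2 dst=$3 line file name
    local re='^[0-9]{5}$'
    local -A wanted=()
    # Read the list once and remember every valid five-digit prefix.
    while IFS= read -r line; do
        [[ ${line:0:5} =~ $re ]] && wanted[${line:0:5}]=1
    done < "$list"
    # Check each file against the in-memory set instead of running grep per file.
    for file in "$src"/*; do
        name=${file##*/}
        [[ -f $file && ${name:0:5} =~ $re && -n ${wanted[${name:0:5}]} ]] &&
            cp -- "$file" "$dst"/
    done
    return 0
}
```

It could be called as copy_by_prefix your_text_file . /path/to/destination. Unlike the find version, this sketch only looks at the top level of the source directory.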

edited Jan 26, 2017 at 11:57

answered Jan 26, 2017 at 10:26
