4

I have a directory of files whose names all start with 5 digits (e.g. 12345_a_b, 23456_s_a). In the same directory I have a text file listing file names in the same format, but only the digits match, not anything after the first underscore (e.g. 12345_q_p, 23456_p_l). I want to copy files from the current directory into a new one, but only those whose first 5 digits match the first 5 digits of a file name in the text file, ignoring everything that comes after. It seems like I can use xargs, but I'm not sure how to match names partially. Can anyone help?

1
  • Yes, it's one column with many rows. Each row contains a filename Commented Jan 26, 2017 at 10:19

3 Answers

4

Iterate over the lines with _ as the IFS, take the leading field containing the digits, and then copy the files starting with those digits:

shopt -s nullglob
while IFS=_ read -r i _; do [[ $i =~ ^[0-9]{5}$ ]] && echo cp -it dest/ "${i}"*; done <file.txt

Expanded:

#!/bin/bash
shopt -s nullglob  ##Expands to null string if no match while doing glob expansion,
                   ##rather than the literal
while IFS=_ read -r i _; do  ##Iterate over the lines of file.txt, 
                              ##with `_` as the `IFS` i.e. word splitting 
                              ##happens on each `_` only, variable `i` 
                              ##will contain the digits at start; `_` is a
                              ##throwaway variable containing the rest
    [[ $i =~ ^[0-9]{5}$ ]] &&  ##Check if the variable contains only 5 digits
      echo cp -it /destination/ "${i}"*  ##if so, copy the relevant files starting with those digits
done <file.txt

Replace file.txt with the actual list file, and /destination/ (dest/ in the one-liners) with your actual destination directory. Here echo is included for a dry run; once you are satisfied with the commands that would be run, remove echo:

shopt -s nullglob
while IFS=_ read -r i _; do [[ $i =~ ^[0-9]{5}$ ]] && cp -it dest/ "${i}"*; done <file.txt
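
One caveat with nullglob: when no file matches a prefix, the glob expands to nothing and cp is called with no source files, so it reports a missing file operand. A minimal sketch of one way to guard against that, assuming GNU cp and that it is run from the directory holding the files. The function name copy_listed and the array matches are hypothetical, and -i is left out so the sketch can run unattended:

```bash
#!/bin/bash
shopt -s nullglob

# copy_listed LIST DEST (hypothetical helper, not part of the answer above):
# copy files from the current directory whose 5-digit prefix appears in LIST.
copy_listed() {
    local list=$1 dest=$2 i _rest
    local -a matches
    while IFS=_ read -r i _rest; do
        [[ $i =~ ^[0-9]{5}$ ]] || continue   # skip lines without a 5-digit prefix
        matches=("$i"*)                      # empty array if nothing matches (nullglob)
        (( ${#matches[@]} )) && cp -t "$dest" -- "${matches[@]}"
    done <"$list"
    return 0
}

# Usage (paths are placeholders): copy_listed file.txt dest/
```

Collecting the matches in an array first means cp only runs when there is something to copy.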
4
  • Instead of IFS=, you could do IFS=_ read i crap and get some word splitting for free, since the format of the source file is known. Also nullglob might be useful.
    – muru
    Commented Jan 26, 2017 at 10:31
  • @muru hmmm, nullglob, added.
    – heemayl
    Commented Jan 26, 2017 at 11:02
  • @heemayl your comment regarding nullglob sounds more like a description of failglob - nullglob just expands to nothing.
    – Tom Fenech
    Commented Jan 27, 2017 at 12:36
  • @TomFenech Agreed, rephrased.
    – heemayl
    Commented Jan 27, 2017 at 13:06
2

The following commands should do the trick. If you prefer, you can replace the variables assigned at the top with their literal values inside the while loop.

#!/bin/bash
source_dir=/directory/containing/files
target_dir=/new/directory
list=/full/path/to/number_list
reg='^[0-9]{5}$' 

while IFS= read -r line; do
   line=${line:0:5}
   [[ "$line" =~ $reg ]] && cp -t "$target_dir" "$source_dir"/"$line"* 2>/dev/null
done < "$list"
  • The read command will read your file line by line, setting the contents of each line into the variable 'line'.
  • The cp command uses -t to set the target to copy files to, and the glob pattern "$source_dir"/"$line"* will find any files in the source directory which start with the numerical value in the line variable.
  • The while loop means that the read and cp commands are carried out for each line of the list file. IFS= stops read from stripping leading and trailing whitespace, so any such spaces in the list file are kept in the search string. It's not strictly necessary in this example, but is generally useful when you want to read a file line by line.
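
To see the substring and IFS= steps in isolation, here is a small sketch. The helper name first_five and the variables kept and trimmed are made up for illustration:

```bash
#!/bin/bash
reg='^[0-9]{5}$'

# first_five LINE (hypothetical helper): mirror the substring-and-test step.
first_five() {
    local line=${1:0:5}                  # keep only the first five characters
    [[ $line =~ $reg ]] && printf '%s\n' "$line"
}

first_five 12345_q_p                     # prints 12345
first_five 1234_q_p || echo "rejected"   # first five characters are "1234_"

# Effect of IFS= on read: leading whitespace is kept rather than stripped.
IFS= read -r kept <<<'  12345_q_p'
read -r trimmed <<<'  12345_q_p'
printf '[%s] [%s]\n' "$kept" "$trimmed"
```

Note that with IFS= a line with leading spaces keeps them, so its first five characters are no longer all digits and the line is skipped.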
1
  • Thanks for the very sensible suggestions @RoVo, taken on board.
    – Arronical
    Commented Jan 26, 2017 at 10:47
1

Run a loop over your files. Within the loop, use grep to check whether each file name matches an entry in your text file.

Using -q lets you chain && cp, so the copy runs only if grep finds something.

#!/bin/bash
while IFS= read -r file; do
  name=${file##*/}  # file name without the leading ./
  grep -q "^${name:0:5}" your_text_file && cp "$file" /path/to/destination/
done < <(find . -maxdepth 1 -type f -regextype posix-extended -regex '\./[0-9]{5}.*')

This will have some overhead when you have a huge number of files, but it can be used for more complex tasks too.
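
If the per-file grep becomes the bottleneck, one possible alternative is to read the list once into an associative array and then make a single pass over the files. This is only a sketch, assuming bash 4 or later; the function name copy_known_prefixes and its three arguments are hypothetical:

```bash
#!/bin/bash
# copy_known_prefixes LIST SRC DEST (hypothetical helper, needs bash 4+):
# copy regular files from SRC whose first five characters are a prefix seen in LIST.
copy_known_prefixes() {
    local list=$1 src=$2 dest=$3 line file name key
    local -A wanted=()
    # Pass 1: remember every valid 5-digit prefix from the list.
    while IFS= read -r line; do
        key=${line:0:5}
        [[ $key =~ ^[0-9]{5}$ ]] && wanted[$key]=1
    done <"$list"
    # Pass 2: one array lookup per file instead of one grep per file.
    for file in "$src"/*; do
        [[ -f $file ]] || continue
        name=${file##*/}
        key=${name:0:5}
        [[ -n ${wanted[$key]:-} ]] && cp -- "$file" "$dest"/
    done
    return 0
}

# Usage (paths are placeholders): copy_known_prefixes your_text_file . /path/to/destination
```

The trade-off is that this handles only the fixed 5-digit prefix rule, whereas the grep version is easier to adapt to other patterns.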
