
I have a large directory tree and want to copy only files with specific names. I also want to preserve the directory structure, incrementally update, and maintain metadata, so rsync seems like the natural choice. However, I can't get the syntax to work correctly. Consider the following example structure:

$ find .
.
./filter.txt
./dest
./source
./source/no
./source/fold1
./source/fold1/yes
./source/fold1/no
./source/fold2
./source/fold2/no
./source/fold2/yes

So I have a source folder with subfolders and files named yes (copy) and no (do not copy). Based on some other questions (e.g. https://serverfault.com/questions/770728/rsync-exclude-all-directories-except-a-few), I have been trying to use a filter file that includes the files I want and excludes everything else. I have tried several variations; a simple filter file that does not work is:

$ cat filter.txt 
+ /fold1/yes
- *

$ rsync -avv --progress --include-from=filter.txt source/ dest/
sending incremental file list
[sender] hiding file no because of pattern *
[sender] hiding directory fold1 because of pattern *
[sender] hiding directory fold2 because of pattern *
delta-transmission disabled for local transfer or --whole-file
total: matches=0  hash_hits=0  false_alarms=0 data=0

sent 59 bytes  received 86 bytes  290.00 bytes/sec
total size is 0  speedup is 0.00

Changing the order or using slightly different syntax (e.g. */yes, yes) does not change anything. In the final implementation, I would like to include all yes files without describing their folders. There are many other files besides no, so excluding them individually is not a good solution. It seems like the - * filter line excludes everything regardless of what has been included above or below it, but that is inconsistent with the other information I've seen.

How can I rsync only specific-named files (in subfolders) while excluding everything else?

I have seen some similar questions, but (as discussed above) they do not seem to solve my problem, e.g. https://serverfault.com/questions/770728/rsync-exclude-all-directories-except-a-few and https://serverfault.com/questions/1063730/rsync-exclude-all-files-in-dir-except-specific-files.

Edit 1: Per @harrymc, a filter that explicitly includes both the folder and the file works for that file:

$ cat filter2.txt 
+ /fold1/
+ /fold1/yes
- *

$ rsync -avv --progress --include-from=filter2.txt source/ dest/
sending incremental file list
[sender] hiding file no because of pattern *
[sender] showing directory fold1 because of pattern /fold1/
[sender] hiding directory fold2 because of pattern *
[sender] showing file fold1/yes because of pattern /fold1/yes
[sender] hiding file fold1/no because of pattern *
delta-transmission disabled for local transfer or --whole-file
[generator] risking file fold1/yes because of pattern /fold1/yes
./
fold1/
fold1/yes
              0 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=0/3)
total: matches=0  hash_hits=0  false_alarms=0 data=0

sent 157 bytes  received 178 bytes  670.00 bytes/sec
total size is 0  speedup is 0.00

Edit 2: Based on the accepted answer, a generic filter that satisfies my needs is:

+ */
+ */yes
- *

This automatically includes every yes file while excluding everything else.

  • Try writing filter.txt with line 1: + /fold1/ and line 2: + /fold1/yes.
    – harrymc
    Commented Apr 17, 2023 at 19:40
  • @harrymc Thanks for the response, see my edit. But I would like to extend that to include all yes files without specifically listing fold1/yes, fold2/yes, and so on. In my real use case I have several types of yes files, many no files, and many folders.
    – Ross
    Commented Apr 17, 2023 at 20:01
  • @harrymc Also, if you could explain (or point me toward) why + /fold1/ followed by + /fold1/yes behaves differently from + /fold1/yes alone, it might help me build my own solution.
    – Ross
    Commented Apr 17, 2023 at 20:03
  • See if the explanation in my answer can cast some light on the question. Is there a pattern for the yes?
    – harrymc
    Commented Apr 17, 2023 at 20:05
  • A pattern for yes would be /*/yes. In the real example, it could be /*/yes and /*/also: many files all with the same name(s).
    – Ross
    Commented Apr 17, 2023 at 20:15

1 Answer


The clue is in the verbose output, specifically the line "hiding directory fold1 because of pattern *".

A good explanation, originally from the rsync man page, is quoted in the post "rsync --include-from syntax":

If a pattern excludes a particular parent directory, it can render a deeper include pattern ineffectual because rsync did not descend through that excluded section of the hierarchy. This is particularly important when using a trailing ’*’ rule. For instance, this won’t work:

          + /some/path/this-file-will-not-be-found
          + /file-is-included
          - *

This fails because the parent directory "some" is excluded by the ’*’ rule, so rsync never visits any of the files in the "some" or "some/path" directories. One solution is to ask for all directories in the hierarchy to be included by using a single rule: "+ */" (put it somewhere before the "- *" rule), and perhaps use the --prune-empty-dirs option. Another solution is to add specific include rules for all the parent dirs that need to be visited. For instance, this set of rules works fine:

          + /some/
          + /some/path/
          + /some/path/this-file-is-found
          + /file-also-included
          - *

You may have to re-define your include file list.

The problem, then, is that the parent folder /fold1 is excluded by the - * rule, so rsync never descends into it. The folder needs to be included explicitly as well.

Following this explanation, filter.txt needs to be written as:

+ /fold1/
+ /fold1/yes
- *
  • That explanation really helps. So if I want to generalize to many folders (instead of writing out fold1/, and so on), I would need to use the + */ line?
    – Ross
    Commented Apr 17, 2023 at 20:17
  • Based on this answer, I was able to generalize the filter to my specific case; see the second edit to my main question. Thanks!
    – Ross
    Commented Apr 17, 2023 at 20:21
  • Bounty awarded, I was able to get my main problem working with your help.
    – Ross
    Commented Apr 19, 2023 at 3:14
