1

I'm using this command:

ls -R |find . -name "*.avi*" -or -name "*.mp4*" -or -name "*.mkv*" > movies.txt

The problem is the movies are listed twice! Example:

./_NEW/Rogue (2020)/Rogue.2020.1080p.BluRay.x264.AAC5.1-[YTS.MX].mp4
./_NEW/Rogue (2020)/._Rogue.2020.1080p.BluRay.x264.AAC5.1-[YTS.MX].mp4

How can I eliminate the extra line? (./_NEW/Rogue (2020)/._Rogue.2020.1080p.BluRay.x264.AAC5.1-[YTS.MX].mp4)

I've tried many commands using find and grep but they always get listed twice!

2 Answers

3

There are inaccuracies in your question and in your answer.

  1. In your original command (ls -R | find … > movies.txt) ls is only a burden because find ignores its standard input. Whatever gets to movies.txt is solely from find and does not depend on ls.

  2. You wrote "movies are listed twice". They are not. Files with names beginning with ._ are not movies. They hold metadata associated with the actual movies: a prosthetic way to keep such metadata on a filesystem that cannot store it strictly as metadata. They are hidden. I have no experience with Macs but I wouldn't be surprised if some macOS-specific software hid them from view even when asked to show hidden files. You may have never noticed them, because they are designed to be seen by your OS, not by you. Still, they are only prostheses, and any software not aware of their purpose will see them as regular files (which they are, after all). Your find does find them.

  3. You solved the problem by piping to grep -v "._". This is not a strict solution because grep interprets ._ as a regular expression in which . matches any character. The pattern ._ therefore matches any _ that is not the first character of a line. Every line your find . produces starts with a (literal) ., so a _ appearing anywhere in the relative path to a file will make your grep -v filter this path out. Anywhere: not necessarily in the basename of the file itself, possibly in the name of some directory.

    This problem can be fixed by escaping the dot in the pattern (\._) or by telling grep to treat the pattern literally (grep -F). If I were you I would also make sure ._ appears directly after /. The command would be like

    find … | grep -vF '/._'
    

    (Quotes are not needed in this particular case but I think it's a good habit to always quote patterns for grep, even if the shell wouldn't do anything to them while unquoted; because this way one won't forget to quote in a case when the shell can interfere.)

    Note this will filter out whole directories whose names happen to start with ._. Probably there are no such directories in your system; or at least you have no movies in such directories.

    Paths with newlines (if any) can mislead grep. This is a general problem, I won't elaborate. Most likely you don't use newline characters in paths (but technically you can).
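
To make the difference between the two grep patterns from point 3 concrete, here is a small sketch. The sample paths are made up for illustration (they only mimic the shape of what find . prints), and the variable names are hypothetical.

```shell
# Sample paths in the shape that `find .` prints (file names are made up).
paths='./_NEW/Rogue (2020)/Rogue.2020.mp4
./_NEW/Rogue (2020)/._Rogue.2020.mp4
./Old/Clip.avi
./Old/._Clip.avi'

# Regex "._": the dot matches any character, so "/_" in "./_NEW" matches too
# and every path under _NEW is dropped, real movie included.
regex_kept=$(printf '%s\n' "$paths" | grep -v '._')

# Fixed string "/._": drops only paths where a name starts with "._".
fixed_kept=$(printf '%s\n' "$paths" | grep -vF '/._')

printf 'grep -v "._" kept:\n%s\n\n' "$regex_kept"
printf 'grep -vF "/._" kept:\n%s\n' "$fixed_kept"
```

With the regex pattern only ./Old/Clip.avi survives, because the directory name _NEW contains an underscore. With the fixed-string pattern both real movies survive.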

Enough about inaccuracies. This is what I would do. I would exclude files with names beginning with ._ using find itself:

find . -type f ! -name '._*' '(' -name '*.avi' -o -name '*.mp4' -o -name '*.mkv' ')'

Notes:

  • -or is not specified by POSIX; -o is the portable spelling of the same OR operator.

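  • Parentheses are needed because the implicit AND between tests binds more tightly than -o. Without them, -type f and ! -name '._*' would apply only to the -name '*.avi' test. They are quoted so that the shell passes them to find literally.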

  • If your find supports -iname (case-insensitive analogue of -name) then you may want to use it. Alternatively -name '*.[Mm][Pp]4' etc.

  • Instead of identifying files by extensions (which are rather a Windows concept) one can implement a custom test in find to identify by content. In Ubuntu this would be like:

    find . -type f -exec sh -c '
       for f do
          file --brief --mime-type "$f" | grep -q "^video/" && printf "%s\n" "$f"
       done
     ' sh {} +
    
    • I don't know if file in macOS supports --mime-type.
    • The command spawns one file, one grep and (on average) some fractional part of sh per regular file. It's slow.
    • The command won't print paths to ._ files because such files are not videos. Adding ! -name '._*' will speed things up a little, so it's a good idea anyway.
    • The second sh is explained here: What is the second sh in sh -c 'some shell code' sh?
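
The note on parentheses can be checked in a scratch directory. This is only a sketch: it assumes mktemp -d is available, and the file and directory names are invented for the demonstration.

```shell
# Scratch directory with made-up names (assumption: mktemp -d is available).
paren_demo=$(mktemp -d)
mkdir "$paren_demo/dir.mkv"
touch "$paren_demo/a.avi" "$paren_demo/._a.avi" \
      "$paren_demo/b.mp4" "$paren_demo/._b.mp4"

# Without parentheses: -type f and ! -name '._*' bind only to '*.avi',
# so ._b.mp4 and the directory dir.mkv slip through the -o branches.
without=$(cd "$paren_demo" &&
  find . -type f ! -name '._*' -name '*.avi' -o -name '*.mp4' -o -name '*.mkv' |
  LC_ALL=C sort)

# With parentheses: both filters apply to all three extensions.
with=$(cd "$paren_demo" &&
  find . -type f ! -name '._*' '(' -name '*.avi' -o -name '*.mp4' -o -name '*.mkv' ')' |
  LC_ALL=C sort)

printf 'without parentheses:\n%s\n\nwith parentheses:\n%s\n' "$without" "$with"
rm -r "$paren_demo"
```

The unparenthesized form lists ./._b.mp4 and the directory ./dir.mkv in addition to the two real files; the parenthesized form lists only ./a.avi and ./b.mp4.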

I think Apple chose the ._ prefix to hide the files in question thanks to the leading dot. Personally I think maybe they should have added some postfix (i.e. a string at the end) as well. This would allow you to use -name '*.avi' and similar tests without finding these files (but not -name '*.avi*'). E.g. backup files with ~ at the end of their basename are nice in this context.
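
The point about a trailing marker can be illustrated with a backup-style name. Again a sketch under assumptions: mktemp -d is assumed to exist, and movie.avi and movie.avi~ are hypothetical file names.

```shell
# Scratch directory holding a file and its backup-style sibling (made-up names).
suffix_demo=$(mktemp -d)
touch "$suffix_demo/movie.avi" "$suffix_demo/movie.avi~"

# '*.avi' must match the whole basename, so the trailing ~ excludes the backup.
strict=$(cd "$suffix_demo" && find . -type f -name '*.avi' | LC_ALL=C sort)

# '*.avi*' allows anything after .avi, so the backup is found as well.
loose=$(cd "$suffix_demo" && find . -type f -name '*.avi*' | LC_ALL=C sort)

printf 'strict:\n%s\n\nloose:\n%s\n' "$strict" "$loose"
rm -r "$suffix_demo"
```

This is also why the trailing * in the patterns from the question is best dropped.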

0

Ah! I've found an answer although I don't think it is the most elegant of commands:

find . -name "*.avi*" -or -name "*.mp4" -or -name "*.mkv*" |grep -v "._"
