There are at least two ways of doing the same thing :) ls -l *ABC* and ls -l | grep ABC

But which one is more efficient? Are there other, more efficient ways?

  • Of course ls -l *ABC*
    – anubhava
    Commented Nov 2, 2017 at 15:23
  • In almost all cases a single process will be far more efficient than piping.
    – 123
    Commented Nov 2, 2017 at 15:25
  • What's the use case? If you want to pass a list of files to a different program, or iterate over them in your shell script, you shouldn't use ls at all -- not just for performance reasons, but for correctness reasons as well. See "Why you shouldn't parse the output of ls".
    Commented Nov 2, 2017 at 15:35
  • @CharlesDuffy sometimes people just want to list files...
    – 123
    Commented Nov 2, 2017 at 15:42
  • @123, sure, but it's useful to distinguish whether that's the situation here. And frankly, if someone "just want[s] to list files", it's surprising that performance matters (unless it's a really huge directory, in which case the biggest gains available come by way of telling ls not to sort).
    Commented Nov 2, 2017 at 15:43

2 Answers

4

They do two subtly different things.

ls -l *ABC* lists all entries in the current directory whose names contain ABC (and don't start with .).

ls -l | grep ABC lists all entries in the current directory whose names don't start with . and then filters out all lines that don't contain ABC.

ls -l lists user and group names as well as file names, so any file owned by a user or group whose name contains ABC will be listed regardless of its own name. Most user and group names don't contain uppercase letters, but that's not a firm requirement, and the same problem arises with other patterns such as abc. If the pattern happens to be a substring of the word total, it will also match the first line of the output of ls -l.
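
The "total" case is easy to reproduce. The following is a minimal sketch, assuming a scratch directory from mktemp and two made-up file names; LC_ALL=C is set so that the summary line is the English word total.

```shell
# Scratch directory with two hypothetical files (names are illustrative only).
tmp=$(mktemp -d)
: > "$tmp/total_report.txt"
: > "$tmp/notes.txt"

# Glob: the shell hands ls only the matching name, so one line comes back.
glob_count=$(cd "$tmp" && LC_ALL=C ls -l *total* | wc -l | tr -d '[:space:]')

# Pipe: grep also keeps the "total N" summary line that ls -l prints first.
grep_count=$(cd "$tmp" && LC_ALL=C ls -l | grep -c total)

echo "glob: $glob_count line(s), grep: $grep_count line(s)"
rm -rf "$tmp"
```

The extra line in the grep variant is the summary line, not a file.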

More obscurely, file names can legally contain any characters other than / and the null character -- including newlines. Using such a name is a really bad idea, but such a file's name will be listed across two or more lines, and grep operates on lines.
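
A small sketch of the newline problem, assuming ls writes file names unmodified when its output is a pipe rather than a terminal (GNU and BSD ls behave this way by default). The file name is made up for the demonstration.

```shell
tmp=$(mktemp -d)
nl='
'
# One file whose (hypothetical) name contains an embedded newline.
: > "$tmp/ABC${nl}DEF"

# Through the pipe, grep sees "DEF" on a line of its own,
# even though no file named DEF exists.
fragment_hits=$(cd "$tmp" && ls -l | grep -c '^DEF$')

# The glob delivers the name as a single, intact argument.
glob_args=$(cd "$tmp" && set -- *DEF* && echo "$#")

echo "lines matching ^DEF\$: $fragment_hits, glob arguments: $glob_args"
rm -rf "$tmp"
```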

The output of ls -l is intended to be human-readable. It's not really intended to be processed automatically.

ls -l *ABC* says what you mean more clearly and directly. Think about that before you consider performance. Unless your current directory is positively huge, any performance difference is likely to be swamped by the time it takes to print the output.

Having said all that, let's look at the likely performance issues.

In ls -l *ABC*, the *ABC* wildcard is handled by the shell; ls sees only a list of arguments. It requires the shell to scan the current directory and build a sorted list of file names matching the pattern. The ls command will then sort it again (and depending on your shell and locale settings, I'm not certain both sorts will yield the same order). Sorting might be a performance issue for very large directories. (Solution: Avoid making very large directories.) ls will be sorting fewer items than the shell does -- unless everything in the current directory matches *ABC*.
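
To see that the expansion happens before the command runs, a function can stand in for ls and report what it receives. This is only a sketch: count_args and the file names are hypothetical, and the unmatched-pattern behavior shown is the default in POSIX sh and bash (options such as nullglob change it).

```shell
tmp=$(mktemp -d)
for name in xABCy ABC.log other .ABChidden; do
  : > "$tmp/$name"
done

# Hypothetical stand-in for ls: prints how many arguments it was given.
count_args() { echo "$#"; }

# The shell expands *ABC* first; the dotfile is not matched.
argc=$(cd "$tmp" && count_args *ABC*)

# With no match, the pattern is passed through literally by default.
nomatch=$(cd "$tmp" && printf '%s' *NOPE*)

echo "arguments received: $argc, unmatched pattern: $nomatch"
rm -rf "$tmp"
```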

In ls -l | grep ABC, the ls command has to scan the current directory, sort it all, fetch metadata for everything, and then print it all, to have (probably) most of it filtered out by the separate grep process.

I don't know which is going to be faster. It likely depends on the contents of your current directory. But unless you're either working with huge directories or performing this operation many many times, the performance difference probably doesn't matter. If it does matter, measure it; that's the only way to know the difference in your environment.
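
One way to set up such a measurement is sketched below, assuming a scratch directory of 2000 empty files of which every tenth has ABC in its name. The file names and the run_glob and run_pipe helpers are made up for illustration, and real directories will behave differently.

```shell
tmp=$(mktemp -d)
i=0
while [ "$i" -lt 2000 ]; do
  if [ $((i % 10)) -eq 0 ]; then
    : > "$tmp/file_ABC_$i"
  else
    : > "$tmp/file_$i"
  fi
  i=$((i + 1))
done

# Hypothetical helpers wrapping the two variants being compared.
run_glob() { (cd "$tmp" && ls -l *ABC*); }
run_pipe() { (cd "$tmp" && ls -l | grep ABC); }

# Sanity check: both variants should report the same number of entries.
glob_n=$(run_glob | wc -l | tr -d '[:space:]')
pipe_n=$(run_pipe | wc -l | tr -d '[:space:]')
echo "glob: $glob_n lines, pipe: $pipe_n lines"

# Remove the scratch directory when finished: rm -rf "$tmp"
```

In bash, where time is a shell keyword, `time run_glob >/dev/null` and `time run_pipe >/dev/null` then give the timings. Repeat each several times, since the first run also pays for filling the filesystem cache.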

1

Glob vs Grep.

In ls -l *ABC*, the shell expands the wildcard pattern (a glob, not a regular expression) into the list of matching file names and passes those names to ls as arguments. ls then simply lists those files in long (-l) format.
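
Shell wildcards and the regular expressions used by grep follow different rules, so the same pattern text can give different answers. A minimal illustration with a made-up name:

```shell
name='C_only'

# Glob A*C: the whole name must start with A and end with C.
case $name in
  A*C) glob_match=yes ;;
  *)   glob_match=no ;;
esac

# Regex A*C: zero or more A characters followed by C, anywhere in the line.
if printf '%s\n' "$name" | grep -q 'A*C'; then
  regex_match=yes
else
  regex_match=no
fi

echo "glob: $glob_match, regex: $regex_match"
```

For the simple pattern in the question the two happen to agree: the glob *ABC* and the regex ABC both mean "contains ABC".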

ls -l | grep ABC uses a pipe, which connects the standard output (stdout) of the command on the left to the standard input (stdin) of the command on the right. So ls -l first produces a long-format listing of every file in the directory, and grep then discards the lines that don't match its regular expression. If the directory holds a million files, sending the whole listing through the pipe is unnecessary work when the glob could have selected the matching names up front.

Thus ls -l | grep ABC would likely be slower than ls -l *ABC*, most noticeably in very large directories.

  • It's important to distinguish that in the *ABC* case, the list of filenames is generated by the shell before ls is even started. This implies some limits not present in the ls | grep case -- the local platform's maximum command-line length is pertinent.
    Commented Nov 2, 2017 at 15:36
  • (Contrast with printf '%s\0' *ABC* | xargs -0 ls -l, which avoids that limit by splitting into multiple ls invocations if there are more names than just one can handle).
    Commented Nov 2, 2017 at 15:44
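
A rough sketch of the xargs approach from the comment above, assuming an xargs that supports -0 (GNU and BSD do; it is not in POSIX) and a shell where printf is a builtin, so the expanded list never goes through exec. The file names are made up, and -n 2 is used only to force the batching that would otherwise happen near the ARG_MAX limit.

```shell
tmp=$(mktemp -d)
for name in 1ABC 2ABC 3ABC other; do
  : > "$tmp/$name"
done

# Upper bound on the combined size of arguments and environment for exec.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX on this system: $arg_max bytes"

# NUL-delimited names go to xargs, which starts ls as many times as needed.
listed=$(cd "$tmp" && printf '%s\0' *ABC* | xargs -0 ls -l | wc -l | tr -d '[:space:]')

# Forcing at most 2 names per invocation makes the splitting visible:
# echo runs twice for three names.
batches=$(cd "$tmp" && printf '%s\0' *ABC* | xargs -0 -n 2 echo | wc -l | tr -d '[:space:]')

echo "listed $listed entries in total; -n 2 produced $batches invocations"
rm -rf "$tmp"
```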
