59

I need to get a listing of a directory that contains about 2 million files, but when I run ls on it nothing comes back. I've waited 3 hours. I've tried ls | tee directory.txt, but that seems to hang forever.

I assume the server is doing a lot of inode sorting. Is there any way to speed up the ls command to just get a directory listing of filenames? I don't care about sizes, dates, permissions, or the like at this time.

1
  • In a restricted shell (zOS), doing a bare ls on a million files actually fails with an out-of-memory error. The only way around it is find or ls -1f.
    – Stavr00
    Commented Feb 5 at 16:57

19 Answers

54
ls -U

will do the ls without sorting.

Another source of slowness is --color. On some Linux machines there is a convenience alias that adds --color=auto to the ls call, making it look up file attributes for each file found (slow) in order to color the display. This can be avoided with ls -U --color=never or \ls -U.
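
To check whether such an alias is in effect and to bypass it, something like this works (a sketch; the alias, if any, varies by distribution, and --color=never is a GNU ls option):

$ alias ls                       # prints the alias if one is defined, e.g. ls='ls --color=auto'
$ \ls -U --color=never | head    # bypass the alias, skip sorting and coloring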

4
  • 1
    Do you know if ls -U|sort is faster than ls?
    – User1
    Commented Oct 20, 2010 at 14:21
  • 1
    I don't know. I doubt it, because sort can't complete until it's seen all the records, whether it's done in a separate program or in ls. But the only way to find out is to test it. Commented Oct 20, 2010 at 14:35
  • 4
    Note: on some systems, ls -f is equivalent to ls -aU; i.e., include all files (even those whose names begin with ‘.’) and don’t sort. And on some systems, -f is the option to suppress sorting, and -U does something else (or nothing). Commented Mar 5, 2013 at 15:43
  • 1
    Does not work on BSD. On BSD -U sorts by file creation time.
    – rustyx
    Commented Jul 20, 2018 at 13:37
22

I have a directory with 4 million files in it and the only way I got ls to spit out files immediately without a lot of churning first was

ls -1U
4
  • 3
    Saved my day! Thanks!
    – dschu
    Commented Nov 4, 2016 at 14:58
  • Absolutely crucial for a huge folder on a network-mounted drive (such as Google Drive on google-drive-ocamlfuse)
    – masterxilo
    Commented Jan 22, 2021 at 20:43
  • 1
    ls -1f seems a lot better than ls -1U for me. They are both similar in output speed, but ls -1U seems uninterruptible.
    – Ben Farmer
    Commented Dec 6, 2021 at 23:24
  • 1
    Uninterruptible? It's writing output to a terminal; any attempt to cancel (Ctrl-C, etc.) would have more to do with your terminal than with ls.
    – stu
    Commented Dec 8, 2021 at 1:01
15

Try using:

find . -maxdepth 1 -type f

This will list only the files in the directory; leave out the -type f argument if you want to list both files and directories.
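
If all you need is a count rather than the names themselves, you can pipe the output to wc (a sketch using the same options):

find . -maxdepth 1 -type f | wc -l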

0
9

This question seems to be interesting and I was going through multiple answers that were posted. To understand the efficiency of the answers posted, I have executed them on 2 million files and found the results as below.

$ time tar cvf /dev/null . &> /tmp/file-count

real    37m16.553s
user    0m11.525s
sys     0m41.291s

------------------------------------------------------

$ time echo ./* &> /tmp/file-count

real    0m50.808s
user    0m49.291s
sys     0m1.404s

------------------------------------------------------

$ time ls &> /tmp/file-count

real    0m42.167s
user    0m40.323s
sys     0m1.648s

------------------------------------------------------

$ time find . &> /tmp/file-count

real    0m2.738s
user    0m1.044s
sys     0m1.684s

------------------------------------------------------

$ time ls -U &> /tmp/file-count

real    0m2.494s
user    0m0.848s
sys     0m1.452s


------------------------------------------------------

$ time ls -f &> /tmp/file-count

real    0m2.313s
user    0m0.856s
sys     0m1.448s

------------------------------------------------------

To summarize the results:

  1. The ls -f command ran slightly faster than ls -U; disabling color may account for the difference.
  2. The find command came in third, averaging 2.738 seconds.
  3. Running plain ls took 42.16 seconds; on my system ls is an alias for ls --color=auto.
  4. Using shell expansion with echo ./* took 50.80 seconds.
  5. The tar-based solution took about 37 minutes.

All tests were run separately while the system was idle.

One important thing to note is that the file lists were not printed to the terminal; they were redirected to a file, and the file count was calculated afterwards with the wc command. The commands ran much more slowly when their output was printed to the screen.
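
To see the difference yourself, you can time the same command with its output discarded, written to a file, and rendered by the terminal (a sketch; absolute numbers will vary with your system):

$ time ls -U > /dev/null          # output discarded
$ time ls -U > /tmp/file-count    # output written to a file (goes to the page cache)
$ time ls -U                      # output rendered by the terminal (slowest)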

Any ideas why this happens?

1
  • The terminal is slow: it has to scroll and do formatting. File writes go to block devices, and in reality they go to the page cache first, so you're really just writing to memory, which is quicker than a terminal.
    – stu
    Commented Aug 8, 2021 at 13:13
7

This would be the fastest option AFAIK: ls -1 -f.

  • -1 (No columns)
  • -f (No sorting)
1
  • This works for both macOS (BSD) and Linux
    – TiLogic
    Commented Jun 3, 2021 at 13:55
6

Using

ls -1 -f 

is about 10 times faster, and it is easy to do (I tested with 1 million files, but my original problem had 6 800 000 000 files).

But in my case I needed to check whether a specific directory contains more than 10,000 files. If it has more than 10,000 files, I am no longer interested in exactly how many there are, so the program quits early rather than reading the rest one by one, which makes it faster. If there are fewer than 10,000, it prints the exact count. The speed of my program is quite similar to ls -1 -f if you specify a value for the parameter that is larger than the number of files.

You can use my program find_if_more.pl in current directory by typing:

find_if_more.pl 999999999

If you are only interested in whether there are more than n files, the script will finish faster than ls -1 -f when there is a very large number of files.

#!/usr/bin/perl
use strict;
use warnings;

# Count entries in the current directory, but stop as soon as the count
# exceeds the limit given on the command line.
my ($maxcount) = @ARGV;
my $dir = '.';
my $filecount = 0;

die "Need maxcount\n" if not defined $maxcount;

opendir(my $dh, $dir) or die $!;
# defined() guards against a file literally named "0" ending the loop early.
while (defined(my $file = readdir($dh))) {
    $filecount++;                     # note: the count includes '.' and '..'
    last if $filecount > $maxcount;
}
closedir($dh);
print "$filecount\n";
exit 0;
5

You can redirect output and run the ls process in the background.

ls > myls.txt &

This would allow you to go on about your business while it's running. It wouldn't lock up your shell.
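
While it runs you can check on its progress from another shell, for example (assuming the myls.txt redirection above):

$ wc -l myls.txt      # number of names written so far
$ tail -f myls.txt    # follow new names as they are written (Ctrl-C to stop watching)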

I'm not sure what options there are for running ls and getting less data back; you could always run man ls to check.

4

This is probably not a helpful answer, but if you don't have find you may be able to make do with tar

$ tar cvf /dev/null .

I am told by people older than me that, "back in the day", single-user and recovery environments were a lot more limited than they are nowadays. That's where this trick comes from.

3

I'm assuming you are using GNU ls? Try

\ls

It will bypass the usual alias (ls --color=auto).
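
Prefixing it with command has the same effect in most shells; command skips alias and function lookup but still runs the same ls binary (a small sketch):

command ls -U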

1
  • True, coloring is the usual culprit for me: when coloring, ls tries to determine the type and mode of each directory entry, resulting in lots of stat(2) calls and thus loads of disk activity.
    – Ruslan
    Commented May 27, 2017 at 5:38
2

If a process "doesn't come back", I recommend strace to analyze how a process is interacting with the operating system.

In case of ls:

$ strace ls

you would see that it reads all directory entries (getdents(2)) before it actually outputs anything (because of the sorting already mentioned here).
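
For a quick overview you can also let strace summarize the syscalls, or filter for the directory reads (a sketch; on modern 64-bit Linux the calls usually appear as getdents64):

$ strace -c ls > /dev/null                     # per-syscall counts instead of a full trace
$ strace -e trace=getdents64 ls > /dev/null    # show only the directory-reading calls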

1

Things to try:

Check ls isn't aliased?

alias ls

Perhaps try find instead?

find . \( -type d -name . -prune \) -o \( -type f -print \)

Hope this helps.

1

Some followup: You don't mention what OS you're running on, which would help indicate which version of ls you're using. This probably isn't a 'bash' question as much as an ls question. My guess is that you're using GNU ls, which has some features that are useful in some contexts, but kill you on big directories.

GNU ls tries to arrange its output into prettier columns. To do a smart arrangement of all the filenames it has to read the whole directory first, and in a huge directory that takes time and memory.

To 'fix' this, you can try:

ls -1 # no columns at all

Find BSD ls someplace (http://www.freebsd.org/cgi/cvsweb.cgi/src/bin/ls/) and use that on your big directories.

Use other tools, such as find

1

There are several ways to get a list of files:

Use this command to get a list without sorting:

ls -U

or send the list of files to a file by using:

ls /Folder/path > ~/Desktop/List.txt
1

What filesystem are you using?

With millions of small files in one directory, it might be a good idea to use JFS or ReiserFS, which have better performance with many small files.
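
If you are not sure which filesystem the directory lives on, GNU df can tell you (a sketch):

$ df -T .    # prints the filesystem type (ext4, xfs, reiserfs, ...) for the current directory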

1

How about find ./ -type f (which will find all files in the current directory)? Take off the -type f to find everything.

1
  • This would find files in the current directory, and also in any subdirectories.
    – mwfearnley
    Commented Oct 11, 2017 at 15:46
0

You should provide information about what operating system and the type of filesystem you are using. On certain flavours of UNIX and certain filesystems you might be able to use the commands ff and ncheck as alternatives.

0

I had a directory with timestamps in the file names. I wanted to check the date of the latest file and found find . -type f -maxdepth 1 | sort | tail -n 1 to be about twice as fast as ls -alh.

-1

Lots of other good solutions here, but in the interest of completeness:

echo *
1
  • 3
    With 2 million files, that is likely to return only a "command line too long" error.
    – richq
    Commented Sep 8, 2008 at 7:35
-2

You can also make use of xargs. Just pipe the output of ls through xargs.

ls | xargs

If that doesn't work, and the find examples above aren't working either, try piping them through xargs, as it can help with the memory usage that might be causing your problems.
