
I get the following error when I try to run ls *.txt | wc -l in a directory that contains many files:

-bash: /bin/ls: Argument list too long

Is the threshold of this "Argument list" dependent on the distro or on the computer's specs? Usually I'd pipe such a big result to some other command (wc -l, for example), so I'm not concerned with the limits of the terminal.

zahypeti

5 Answers


Your error message argument list too long comes from the * of ls *.txt.

This limit is a safety measure for both binary programs and your kernel. See ARG_MAX, maximum length of arguments for a new process for more information about it, and about how it is used and computed.

There is no such limit on pipe size. So you can simply issue this command:

find -type f -name '*.txt'  | wc -l

NB: On modern Linux, weird characters in file names (such as newlines) are escaped by tools like ls and find when they write to a terminal, but they still come through raw when expanded from *. If you are on an old Unix, you'll need this command:

find -type f -name '*.txt' -exec echo \;  | wc -l
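If your find is GNU find, you can get the same newline-safe count without spawning echo once per file; -printf is a GNU extension, so treat this as a sketch rather than a portable command:

find -type f -name '*.txt' -printf '\n' | wc -l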

NB2: I was wondering how one can create a file with a newline in its name. It's not that hard, once you know the trick:

touch "hello
world"
Coren

It depends mainly on your version of the Linux kernel.

You should be able to see the limit for your system by running

getconf ARG_MAX

which tells you the maximum number of bytes a command line can have after being expanded by the shell.

In Linux < 2.6.23, the limit is usually 128 KB.

In Linux >= 2.6.25, the limit is either 128 KB, or 1/4 of your stack size (see ulimit -s), whichever is larger.

See the execve(2) man page for all the details.
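If you want to check the stack-based rule on your own machine, here is a minimal sketch, assuming a kernel new enough to use that rule and a shell whose ulimit -s reports the stack size in KB (it breaks if the stack is unlimited):

stack_kb=$(ulimit -s)                            # soft stack limit, in KB
echo "ARG_MAX:       $(getconf ARG_MAX)"
echo "stack/4 bytes: $(( stack_kb * 1024 / 4 ))" # should match on Linux >= 2.6.25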


Unfortunately, piping ls *.txt isn't going to fix the problem, because the limit is in the operating system, not the shell.

The shell expands the *.txt, then tries to call

exec("ls", "a.txt", "b.txt", ...)

and you have so many files matching *.txt that you're exceeding the 128 KB limit.
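You can reproduce the failure harmlessly in a scratch directory. This is a rough sketch: the path is hypothetical, and the 300000 may need raising or lowering depending on your own ARG_MAX:

mkdir /tmp/argmax-demo && cd /tmp/argmax-demo
seq -f '%g.txt' 1 300000 | xargs touch   # xargs batches the names below the limit
ls *.txt | wc -l                         # -bash: /bin/ls: Argument list too long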

You'll have to do something like

find . -maxdepth 1 -name "*.txt" | wc -l

instead.

(And see Shawn J. Goff's comments below about file names that contain newlines.)

Mikel
  • Could you explain what the . and the -maxdepth 1 mean in the last line? Thanks! :D – Guilherme Salomé Jun 27 '17 at 15:41
  • 2
    @GuilhermeSalomé . means current directory, -maxdepth 1 means it doesn't look in subdirectories. This was intended to match the same files as *.txt. – Mikel Jun 28 '17 at 00:33
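A tiny illustration of the difference (hypothetical layout):

mkdir -p sub && touch a.txt sub/b.txt
find . -maxdepth 1 -name '*.txt' | wc -l   # 1: only a.txt, subdirectories skipped
find . -name '*.txt' | wc -l               # 2: recurses into sub/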

Another workaround:

ls | grep -c '\.txt$'

Even though ls produces more output than ls *.txt produces (or attempts to produce), it doesn't run into the "Argument list too long" problem, because you're not passing any arguments to ls. Note that grep takes a regular expression rather than a file matching pattern.
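That distinction is easy to trip over: in the regular expression the dot must be escaped, and $ anchors the match at the end of the name. Without the backslash, the pattern matches more than intended:

ls | grep -c '\.txt$'   # names ending in ".txt"
ls | grep -c '.txt$'    # also counts e.g. "notes-atxt": the unescaped . matches any character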

You might want to use:

ls -U | grep -c '\.txt$'

(assuming your version of ls supports this option). This tells ls not to sort its output, which could save both time and memory -- and in this case the order doesn't matter, since you're just counting files. The resources spent sorting the output are usually not significant, but in this case we already know you have a very large number of *.txt files.
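If you want to see whether sorting matters in your directory, timing both variants is easy (a sketch; the numbers will vary per system):

time ls | grep -c '\.txt$'      # sorted (the default)
time ls -U | grep -c '\.txt$'   # unsorted; can be faster on huge directories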

And you should consider reorganizing your files so you don't have so many in a single directory. This may or may not be feasible.


This might be dirty, but it works for my needs and is within my competency. I don't think it performs very quickly, but it allowed me to get on with my day.

ls | grep jpg | <something>

I was getting a list of 90,000 jpgs and piping them to avconv to generate a timelapse.

I was previously using ls *.jpg | avconv before I ran into this issue.
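For what it's worth, a limit-safe and newline-safe variant of that pipeline might look like the sketch below. It assumes GNU find, sort, and xargs, plus an avconv (or ffmpeg) build that supports the image2pipe demuxer; the frame rate and output name are made up, and some builds may also need the input codec spelled out (e.g. -c:v mjpeg):

find . -maxdepth 1 -name '*.jpg' -print0 | sort -z | xargs -0 cat \
    | avconv -f image2pipe -r 25 -i - timelapse.mkv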


MAX_ARG_PAGES appears to be a kernel parameter. Using find and xargs is a typical combination to address this limit, but I'm not sure it'll work for wc.

Redirecting the output of find . -name \*\.txt to a file and counting the lines in that file should serve as a workaround.
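A minimal sketch of that workaround (the list file name is hypothetical; note that it still miscounts names containing newlines):

find . -name '*.txt' > /tmp/txt-files.list   # nothing for the shell to expand
wc -l < /tmp/txt-files.list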

Bram
  • Whatever you do with ls's output, it will not solve this: as long as the *.txt wildcard expands past the limit, the shell fails before even starting ls and generating any output. – manatwork May 18 '12 at 16:05
  • True, I've updated my answer. – Bram May 18 '12 at 16:11
  • Better. But to make it a replacement for ls you should specify -maxdepth 1 to avoid recursively scanning the subdirectories. – manatwork May 18 '12 at 16:19