46

While trying to count the number of files in the current directory, I found ls -1 | wc -l, which means: send the list of files (with every filename printed on its own line) to the input of wc, where -l counts the number of lines of input. This makes sense.

I decided to try simply ls | wc -l and was very surprised that it also gives the correct number of files. I wonder why this happens, because the ls command with no options prints the filenames on a single line.

Braiam
Kirill

4 Answers

58

From info ls:

'-1'
'--format=single-column'

List one file per line. This is the default for 'ls' when standard output is not a terminal.

When you pipe the output of ls, you get one filename per line.
ls only outputs the files in columns when the output is destined for human eyes.


Here's where ls decides what to do:

  switch (ls_mode)
    {
    case LS_MULTI_COL:
      /* This is for the 'dir' program.  */
      format = many_per_line;
      set_quoting_style (NULL, escape_quoting_style);
      break;

    case LS_LONG_FORMAT:
      /* This is for the 'vdir' program.  */
      format = long_format;
      set_quoting_style (NULL, escape_quoting_style);
      break;

    case LS_LS:
      /* This is for the 'ls' program.  */
      if (isatty (STDOUT_FILENO))
        {
          format = many_per_line;
          /* See description of qmark_funny_chars, above.  */
          qmark_funny_chars = true;
        }
      else
        {
          format = one_per_line;
          qmark_funny_chars = false;
        }
      break;

    default:
      abort ();
    }

source: http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/ls.c
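
The isatty (STDOUT_FILENO) check above has a close shell counterpart in the test [ -t 1 ], which succeeds when file descriptor 1 (standard output) is a terminal. The following is only a minimal sketch of the same decision, not how ls itself is written; describe_stdout is a hypothetical helper name made up for this illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of ls): report where stdout is going,
# using the shell's -t test, which plays the role of isatty() here.
describe_stdout() {
  if [ -t 1 ]; then
    echo "terminal"        # ls would pick many_per_line here
  else
    echo "not a terminal"  # ls would pick one_per_line here
  fi
}

# Run from an interactive shell, this call sees the terminal.
describe_stdout

# Here stdout is a pipe to cat, so the other branch is taken.
describe_stdout | cat
```

Redirecting to a file (describe_stdout > out.txt) takes the same branch as the pipe, which matches ls writing one name per line into files as well.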

glenn jackman
  • "When you pipe the output of ls, you get one filename per line." Why does that happen, exactly? It seems your previous sentence is relevant, but does ls "know" in this case that it is writing standard output to something that is not a terminal? And if so, how? Can you elaborate? – Faheem Mitha Sep 24 '14 at 16:53
  • I don't know exactly. The shell spawns a process for ls and a process for wc and it connects the stdout of the first process to the stdin of the second. I assume that ls can determine where its stdout is going and act accordingly. (details left as an exercise, I'm not a C programmer) – glenn jackman Sep 24 '14 at 17:02
  • I see. Interesting, thanks. Do any C/Unix programmers care to elaborate? :-) – Faheem Mitha Sep 24 '14 at 17:08
  • You'll want to read about isatty() – glenn jackman Sep 24 '14 at 17:09
  • Thanks, Glenn. Another Unix/C function I've never heard of. man isatty is reasonably clear- "The isatty() function tests whether fd is an open file descriptor referring to a terminal." – Faheem Mitha Sep 24 '14 at 17:11
  • I just noticed you may get different output with ls -C and ls -C | cat -- I assume that if the output is not a tty, ls has to guess about the line width of the output device. – glenn jackman Sep 24 '14 at 17:11
  • You can write Bash scripts that have this capability. See help test in Bash. Examples: testing for input from a pipe: [[ -p /dev/stdin ]] && echo "stdin is from a pipe", testing for input from a terminal: [[ -t 0 ]] && echo "input from terminal" and testing for redirection: [[ ! -t 0 && ! -p /dev/stdin ]] && echo "input redirected". Put all of those in a script and run it like this: first: echo foo | ./script, then ./script < some_file, then ./script and type something and press Ctrl-D. – Dennis Williamson Sep 24 '14 at 23:23
  • You can test this ls behaviour by comparing the output of ls with the output of ls | cat. – abligh Sep 25 '14 at 06:15
15

Because the output of ls depends on where its standard output is going: it differs between a terminal and a pipe. Try:

/bin/ls | cat
jimmij
12

Historically, ls wrote its output one file per line, which is a convenient format for processing with other text-based Unix tools (like wc). However, on a 24-line terminal with no scrollback, large listings tended to scroll off the screen, making it hard to find what you were looking for. So, at some point, the BSD developers changed the behavior so that, when printing to a terminal, ls formats its output in multiple columns. The old behavior was retained when writing to a pipe or a file, both to avoid breaking existing shell scripts and because it is more useful when processing the output with a command like wc.

The decisions to incorporate multi-column output into ls and to make it the default on the terminal exercised Rob Pike quite a bit. Research Unix didn't pick up the new features until 8th Edition (which was based directly on BSD), and Plan 9 reverted to separate commands: ls for scripts and lc for interactive use, with lc a shell script calling ls and a command mc providing multi-column output.

The -1 and -C options to ls are a belated attempt to restore sanity, by at least allowing the user to force a specific output format regardless of the output destination.
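
To see the two options override the destination-based default, here is a small sketch that counts output lines in each mode. It assumes a scratch directory from mktemp -d and three made-up file names (a, b, c); COLUMNS=80 is set only so the width ls assumes in a pipe is predictable.

```shell
#!/bin/sh
# Scratch directory with three short, made-up file names.
dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

# Default in a pipe: one name per line.
default_lines=$(ls "$dir" | wc -l | tr -d ' ')

# -C forces multi-column output even though stdout is a pipe.
# COLUMNS=80 is an assumption to pin the line width ls uses.
forced_columns=$(COLUMNS=80 ls -C "$dir" | wc -l | tr -d ' ')

# -1 forces one name per line regardless of the destination.
forced_single=$(ls -1 "$dir" | wc -l | tr -d ' ')

echo "default: $default_lines, -C: $forced_columns, -1: $forced_single"

rm -rf "$dir"
```

With three short names, the -C listing fits on a single line, so its line count no longer matches the number of files.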

7

Why does “ls | wc -l” show the correct number of files in current directory?

Well, that's a false premise right there. It does not! Try this:

mkdir testdir
cd testdir
# the next two lines are one command; the newline is quoted, so it is part of the argument
echo text | tee "file
name"
ls -l
ls | wc -l

Output of that last line is 2.

Note how, when printing to the console in the ls -l command, ls does not print the newline as is, but prints ? instead. This is a deliberately implemented feature of ls: when it detects that the output is going to an actual terminal, it does this to keep funny file names from messing up the terminal. The same detection determines whether file names are printed one per line (in a pipe) or arranged according to the terminal width (which obviously only makes sense if there is a terminal with a width). You can fool ls with a command like ls | cat if you want the raw file names printed, separated by newlines.

wc -l just counts the number of lines, so if a file name happens to contain a newline, wc will count it as two lines.


ls also has a switch to force hiding control characters, -q/--hide-control-chars, so ls -q | wc -l should actually give an accurate count of the files listed by ls (which, without the -a switch, is usually not the same as the actual number of files in the directory), because then the only newlines in the ls output are those separating file names.
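
The difference between the two counts can be checked with a short sketch. It assumes a scratch directory from mktemp -d holding two made-up files, one of which has a newline in its name.

```shell
#!/bin/sh
# Scratch directory with two files; one name contains a newline.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/file
name"

# Raw names in a pipe: the embedded newline adds an extra line.
raw_count=$(ls "$dir" | wc -l | tr -d ' ')

# With -q the newline is printed as ?, so each file is one line.
safe_count=$(ls -q "$dir" | wc -l | tr -d ' ')

echo "ls | wc -l    -> $raw_count"
echo "ls -q | wc -l -> $safe_count"

rm -rf "$dir"
```

Only the -q count matches the two files actually present; the plain pipeline reports one line too many.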

hyde