
I'm sure there are many ways to do this: how can I count the number of lines in a text file?

$ <cmd> file.txt
1020 lines
tshepang

8 Answers


The standard way is with wc, which takes arguments to specify what it should count (bytes, chars, words, etc.); -l is for lines:

$ wc -l file.txt
1020 file.txt
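For the record, wc also accepts several files at once and prints a per-file count followed by a total, and redirecting stdin suppresses the file name. A quick sketch (file names and contents are made up; output spacing varies by implementation):

```shell
# Two small sample files (names and contents are illustrative)
printf 'a\nb\nc\n' > part1.txt
printf 'd\ne\n'    > part2.txt

# One count per file, then a combined "total" line
wc -l part1.txt part2.txt
#  3 part1.txt
#  2 part2.txt
#  5 total

# Redirecting stdin gives the bare number, no file name
wc -l < part1.txt   # 3
```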
Michael Mrozek
  • How do I count the lines in a file if I want to ignore comments? Specifically, I want to not count lines that begin with a +, some white space (could be no white space) and then a %, which is the way comment lines appear in a git diff of a MATLAB file. I tried doing this with grep, but couldn't figure out the correct regular expression. – Gdalya Jul 11 '13 at 01:36
  • @Gdalya I hope the following pipeline will do this (no tests were performed): cat matlab.git.diff | sed -e '/^\+[ ]*.*\%$/d' | wc -l. /regexp/d deletes a line if it matches regexp, and -e turns on an adequate (IMNSHO) syntax for regexp. – dbanet Nov 06 '13 at 21:29
  • Why not simply grep -v '^+ *%' matlab.git.diff | wc -l? – celtschk Jul 06 '14 at 19:51
  • @celtschk , as long as this is usual in comment lines: is it possible to modify your grep command in order to consider as comment cases like " + Hello" (note the space(s) before the +)? – Sopalajo de Arrierez Jan 18 '15 at 20:46
  • @SopalajodeArrierez: Of course it is possible: grep -v '^ *+' matlab.git.diff | wc -l (I'm assuming the quote signs were not actually meant to be part of the line; I also assume that both lines with and without spaces in front of the + are meant to be comments; if at least one space is mandatory, either replace the star * with \+, or just add another space in front of the star). Probably instead of matching only spaces, you'd want to match arbitrary whitespace; for this replace the space with [[:space:]]. Note that I've also removed matching the % since it's not in your example. – celtschk Feb 14 '15 at 15:02
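Putting the suggestions from this comment thread together, one possible sketch for Gdalya's original question (the diff fragment below is invented for illustration):

```shell
# Invented fragment of a git diff of a MATLAB file
cat > matlab.git.diff <<'EOF'
+x = 1;
+ % a comment line
+% another comment
+y = x + 1;
EOF

# Count added lines that are NOT comments: drop lines that start
# with '+', optional whitespace, then '%'
grep -vc '^+[[:space:]]*%' matlab.git.diff   # 2
```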

Steven D forgot GNU sed:

sed -n '$=' file.txt

Also, if you want the count without outputting the filename and you're using wc:

wc -l < file.txt

Just for the heck of it:

cat -n file.txt | tail -n 1 | cut -f1
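For a file that ends in a newline, all three approaches agree; a quick sanity check (the sample file is made up):

```shell
printf 'one\ntwo\nthree\n' > file.txt

sed -n '$=' file.txt                    # 3
wc -l < file.txt                        # 3
cat -n file.txt | tail -n 1 | cut -f1   # 3 (cat -n pads the number, so it may print with leading spaces)
```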

As Michael said, wc -l is the way to go. But, just in case you inexplicably have bash, perl, or awk but not wc, here are a few more solutions:

Bash-only

$ LINECT=0; while read -r LINE; do (( LINECT++ )); done < file.txt; echo $LINECT

Perl Solutions

$ perl -lne 'END { print $. }' file.txt

and the far less readable:

$ perl -lne '}{ print $.' file.txt

Awk Solution

$ awk 'END {print NR}' file.txt
Steven D

A word of warning when using

wc -l

Because wc -l works by counting \n characters, if the last line in your file doesn't end in a newline, the count will be one lower than you might expect (hence the old convention of ending every file with a newline).

Since I can never be sure whether a given file follows the convention of ending the last line with a newline, I recommend any of these alternative commands, all of which include a final unterminated line in the count:

sed -n '$=' filename
perl -lne 'END { print $. }' filename
awk 'END {print NR}' filename
grep -c '' filename
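To see the difference concretely, a small demonstration (the file name and contents are made up; counting the final unterminated line is the behavior of GNU and BSD grep):

```shell
# A file whose last line is NOT newline-terminated (illustrative)
printf 'first\nsecond\nthird' > no_final_newline.txt

wc -l < no_final_newline.txt                # 2 -- only \n characters are counted
grep -c '' no_final_newline.txt             # 3 -- the final unterminated line is included
awk 'END {print NR}' no_final_newline.txt   # 3
```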
  • Nice summary. And welcome to Unix & Linux. – Sebastian Sep 18 '14 at 15:29
  • Hmm, is the last piece really a line? – gena2x Sep 18 '14 at 19:48
  • I'm sure it depends on everyone's use case; for me the 'last piece' is usually a line of text that someone didn't cap off with a newline. The use case I most often encounter is a file with a single string of text that does not end in a newline. wc -l would count this as "0", when I would otherwise expect a count of "1". – pretzels1337 Sep 23 '14 at 16:22
  • It is not the count of wc that will be off by one, but your count: While in Windows (and DOS before it), the CR/LF sequence is a line separator, on Unix the LF character is a line terminator. That is, without a newline at the end, you don't have a line (and strictly speaking, not a valid text file). – celtschk Feb 07 '20 at 09:15

You can always use the command grep as follows:

grep -c "^" file.txt

It counts every line of file.txt, whether or not its last line ends with a LF (newline) character.

Paolo

In case you only have bash and absolutely no external tools available, you could also do the following:

count=0
while read
do
  ((count=$count+1))
done <file.txt
echo $count

Explanation: the loop reads standard input line by line (read; since we do nothing with the read input anyway, no variable is provided to store it in), and increases the variable count each time. Due to redirection (<file.txt after done), standard input for the loop is from file.txt.
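A more defensive sketch of the same idea: read -r stops a trailing backslash from joining two lines, and the || [ -n "$line" ] clause counts a final line that has no trailing newline (the sample file is invented for the demo):

```shell
# Sample input whose last line has no trailing newline (illustrative)
printf 'alpha\nbeta\ngamma' > demo.txt

count=0
# IFS= read -r: read each line verbatim (no backslash processing);
# the || [ -n "$line" ] clause runs the body one last time when read
# hits EOF mid-line but has still filled $line.
while IFS= read -r line || [ -n "$line" ]; do
  count=$((count+1))
done < demo.txt
echo "$count"   # 3
```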

celtschk
  • This is a very inefficient way to do it. Remember, bash reads are slow. – codeforester Feb 07 '20 at 01:51
  • @codeforester: That's true, but (a) it was a solution for when you have no other tool available, and (b) slow doesn't mean unawaitable. I just tried with a text file of 125MB (taking an actual text file and concatenating it a thousand times) and more than 2.6 million lines, and it took slightly less than 14 seconds. Not nothing — the tools do it in a fraction of a second — but certainly awaitable. – celtschk Feb 07 '20 at 08:58
  • This would miscount if any line ended with a backslash. – Kusalananda Apr 11 '21 at 17:54

If you're counting lines in smaller files, a simple wc -l file.txt will do the job.

Looking for an answer to this question myself while working with large files several gigabytes in size, I found the following tool:

https://github.com/crioux/turbo-linecount

Also, depending on your system configuration, if you're using an older version of wc you might be better off piping larger blocks through dd, like so:

dd if={file_path} bs=128M | wc -l
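A runnable sketch of the dd variant, scaled down to a tiny invented file (dd's transfer statistics go to stderr, so they're silenced here to keep the pipeline output clean):

```shell
# Tiny stand-in for a multi-gigabyte file (name and contents illustrative)
printf 'a\nb\nc\n' > big.txt

# Read in large blocks, then let wc count the newlines
dd if=big.txt bs=128M 2>/dev/null | wc -l   # 3
```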


grep -c $ is very simple and works great.

I even saved it as an alias since I use it a lot (lc stands for line count):

alias lc="grep -c $"

It can be used either this way:

lc myFile

Or that way:

cat myFile | lc

Note that this will not count the last line if it is empty. For my uses that is almost always OK though.

pitamer
  • If your grep implementation does not count the last line if it's empty, it's severely broken. On the contrary there are some grep implementations that count the extra bytes found after the last newline in non-text files as an extra line. For instance, printf foo | grep -c $ outputs 1 with GNU grep even though printf outputs no line. printf foo | wc -l correctly outputs 0. – Stéphane Chazelas Oct 30 '22 at 10:38
  • @StéphaneChazelas But wc -l will be off by 1 for files without a newline at the end of the file (in those cases it will not count the last line, even if it has text). Isn't that sort of more broken? For my personal use, I'd rather skip counting the newline at EOF than skip counting a line with actual stuff in it. But I guess everybody counts lines for different purposes, and wc -l might be better for some! :) – pitamer Oct 30 '22 at 13:10
  • That's the point. A line has to be delimited by a newline character. The characters after the last newline, if any, don't form part of a line. If there are such characters, the file is by definition not a text file. The output of printf foo does not form a text file. – Stéphane Chazelas Oct 30 '22 at 17:21