12

-L is a useful feature of wc, or so I thought. It prints the length of the longest line, but for some reason it expands a single-byte tab character to a length of 8.
Is there some way to make it not "expand" the tab? And what might be the rationale behind this expansion?

echo -n $'\t' | wc -L

outputs 8

wc (GNU coreutils) 7.4
GNU bash, version 4.1.5

Peter.O

3 Answers

11

I find no bug report related to this, and the following lines in the source file wc.c

    case '\t':
        linepos += 8 - (linepos % 8);

seem to deliberately choose this behaviour, probably to give a hint of the width needed to display the file on screen.
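The column arithmetic in that snippet can be reproduced in the shell. This is only a minimal sketch, assuming a 0-based column counter as in the quoted source; next_tab_stop is a hypothetical helper name, not something coreutils provides.

```shell
# Hypothetical helper: given a 0-based column, print the column reached
# after one tab, assuming tab stops every 8 columns (as in the wc.c lines).
next_tab_stop() {
    linepos=$1
    echo $(( linepos + 8 - (linepos % 8) ))
}

next_tab_stop 0    # 8: a lone tab is counted as width 8
next_tab_stop 3    # 8: "foo" followed by a tab also ends at column 8
next_tab_stop 8    # 16
```

So a tab never adds a fixed amount; it adds whatever is needed to reach the next multiple of 8.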

A quick alternative could be

echo -n $'\t' | tr '\t' ' ' | wc -L
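
If the goal is simply the longest line in characters, with each tab counted once, awk can do it without touching the tabs. A sketch, assuming a POSIX awk is available; max_line_chars is a hypothetical name chosen for illustration.

```shell
# Hypothetical helper: length of the longest input line,
# counting every character (tabs included) as 1.
max_line_chars() {
    awk '{ if (length($0) > max) max = length($0) } END { print max + 0 }'
}

printf '\t' | max_line_chars           # 1
printf 'foo\tbar\n' | max_line_chars   # 7
```

Unlike the tr approach, this leaves the input untouched and only changes how it is measured.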
enzotib
  • 3
    Thanks enzo. I've now found that although man wc makes no mention of this issue, it is stated in info coreutils 'wc invocation' (which 'man' refers to). Also, after trawling the google-sphere a bit more, I found this alternative: echo -n $'\t' | expand -t1 | wc -L, which is pretty much the same as yours, but I've thrown it in for good measure. And although the following link is a recompile-wc hack, it may be of interest to some: wc support for different tab widths – Peter.O Sep 12 '11 at 22:36
  • Counting tabs as 8 characters was added to GNU Coreutils in 1997 in the original patch that implemented the -L/--max-line-length feature. Before that they were counted as 1 character. –  Dec 08 '20 at 21:53
1

The wc -L description was ambiguous. It returns the widest display width. To control tab expansion you can filter through expand first.
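
A sketch of that approach, assuming GNU expand and a wc that supports -L; max_width is a hypothetical helper name, and its argument is the tab-stop spacing you want wc to assume.

```shell
# Hypothetical helper: widest line on stdin, with tab stops every $1 columns.
# expand turns tabs into spaces first, so wc -L has no tabs left to expand.
max_width() {
    expand -t "$1" | wc -L
}

printf 'a\tb\n' | max_width 8   # 9, same as plain wc -L
printf 'a\tb\n' | max_width 4   # 5
printf 'a\tb\n' | max_width 1   # 3, each tab becomes a single space
```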

0

Normally a tab advances to the next tab stop, i.e. a column of the form 8k+1 (1, 9, 17, 25, ...), so if you ask for a tab, you get that width.

Note, that the -n is irrelevant for the question, but the $ is not.

echo foo$'\t' | wc -L

will return 8 too, because

echo foo$'\t'bar 
foo     bar

You can omit the $, if you use -e for echo:

echo -e '\t' | wc -L
8

So if you want to count the '\t' as a single byte, just omit -e and $:

echo '\t' | wc -L
2
user unknown
  • Yes, expanding tabs is common enough for printed/displayed output, but I found it odd that a program which counts bytes and words would count 1 character as anything other than 1 character... btw echo '\t' does not output a tab-char (\x09). It outputs a line whose length is 2, i.e. a '\' and a 't'. A newline is not part of a line's length... (I had a -n in my example to check whether wc would properly process a file which has no trailing newline-char...) – Peter.O Sep 12 '11 at 22:12
  • wc --help says: -L, --max-line-length print the length of the longest line. It doesn't talk about bytes, but line lengths. – user unknown Sep 12 '11 at 23:05
  • 1
    Yes, it does say "print the length of the longest line"... but it does not say "We assume that you want tabs expanded (not the usual character count, like most other length functions). Oh, by the way, we will expand tabs to 8 spaces, regardless of what your specific tab stops are set to." ... That is the trap. It is not properly documented. – Peter.O Nov 01 '11 at 21:35
  • How do you set the tab width? In Bash? Furthermore: tabs are not expanded to 8 spaces, but to tab-stop positions; see echo -e foo'\t'bar | wc -L, which results in 11, not 14. – user unknown Nov 02 '11 at 11:17
  • In the above foo\tbar example, wc has assumed tab-stops at a nominal spacing of 8... The following example shows how wc ignores the currently active tab-stop settings. It outputs a line to the terminal which is 8 terminal-columns wide/long, yet wc reports it to be 11. This example sets tab-stops to every 6th column... tabs -6; echo 12345678; echo -e "foo\tbar"|tee >(wc -L) – Peter.O Nov 13 '11 at 19:54
  • I think it's a gross error from the devs to assume anyone using this switch will know upfront about \t adding +8 to line length. How hard is it to add an extra sentence in the manual clarifying this? – ata Nov 14 '11 at 03:49
  • @fered: So I conclude, that wc does not assume, that the size of the text should be measured in lengths, defined by the current terminal session. – user unknown Nov 14 '11 at 06:54
  • @Juaco: If you think it is important, you should try it. – user unknown Nov 14 '11 at 06:57
  • @user unknown: it's the maintainer's responsibility to not leave undocumented such a detail. Not user's. – ata Nov 14 '11 at 12:43
  • Well - maybe the developer is not reading your wishes here, but maybe he would respond to an email? Talking to me is rather pointless. Do you want to complain, or do you want to change the situation? – user unknown Nov 15 '11 at 01:23
  • @userunknown: I certainly don't want to change the situation. Do YOU want it? You send an email. I do want to just express what I have in my mind, "complain" is a different thing. – ata Nov 17 '11 at 00:34