2

In the following example, there are 4 spaces before inet.

wolf@linux:~$ ip address show eth0 | grep 'inet '
    inet 10.10.10.10/24 brd 10.10.10.255 scope global dynamic eth0
wolf@linux:~$ 

How do I count the number of spaces, as in this example?

This sample is easy as it only has 4 spaces.

What if it has more than that? Hundreds, thousands?

Is there an easy way to do this?

Wolf
  • 1,631

6 Answers

8

You can use tr to delete everything that’s not the character you’re interested in, then wc to count the remaining characters:

ip address show eth0 | grep 'inet ' | tr -d -c ' ' | wc -m

This scales well to large amounts of text, since tr is very efficient.

Note, however, that with some implementations of tr, including GNU tr, this only works properly for single-byte characters (such as the space character).
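As a rough cross-check of the tr pipeline above, the same count can be sketched in Python. The helper name count_spaces and the sample string are made up for illustration. Since str.count works on characters rather than bytes, the single-byte caveat does not apply to this sketch.

```python
def count_spaces(text: str, char: str = " ") -> int:
    """Count occurrences of `char` in `text` (hypothetical helper)."""
    return text.count(char)


# Sample line modelled on the question's output (assumed input).
sample = "    inet 10.10.10.10/24 brd 10.10.10.255 scope global dynamic eth0\n"
print(count_spaces(sample))
```

In practice the text would come from standard input (for example sys.stdin.read()) rather than a literal.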

If you only want to count leading spaces, you’ll need something a little more powerful than tr:

ip address show eth0 | grep 'inet ' | sed 's/[^ ].*$//' | tr -d '\n' | wc -m

This deletes every part of each line which is not leading space, then deletes newlines and counts.
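A minimal Python sketch of the same idea, assuming the text is already in a string: strip everything from the first non-space character onwards on each line, then add up what remains. The function name total_leading_spaces is hypothetical.

```python
import re


def total_leading_spaces(text: str) -> int:
    """Sum the leading spaces of every line in `text` (hypothetical helper)."""
    total = 0
    for line in text.split("\n"):
        # Same idea as the sed expression: drop everything from the
        # first non-space character to the end of the line.
        leading = re.sub(r"[^ ].*$", "", line)
        total += len(leading)
    return total


print(total_leading_spaces("    inet a\n  b\nc\n"))
```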

See How to count the number of a specific character in each line? if you’re interested in counts per line.

Stephen Kitt
  • 434,908
7

To count the number of space characters at the start of each line, you could do:

awk -F '[^ ].*' '{print length($1)}'

This prints the length (in number of characters) of the first field, where fields are separated by any sequence of characters starting with a non-space.
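The field-splitting trick can be illustrated in Python as well. This is only a sketch under the assumption that lines are split on newlines; the name leading_spaces_per_line is invented here.

```python
import re


def leading_spaces_per_line(text: str) -> list:
    """Return the number of leading spaces of each line (hypothetical helper)."""
    counts = []
    for line in text.splitlines():
        # Split on "a non-space followed by the rest of the line";
        # the first piece is the (possibly empty) run of leading spaces.
        first_field = re.split(r"[^ ].*", line, maxsplit=1)[0]
        counts.append(len(first_field))
    return counts


print(leading_spaces_per_line("    a\nb\n  c d\n"))
```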

To report the maximum amount of whitespace found at the start of any line of the input (the maximum indentation), with GNU wc:

sed 's/[^[:blank:]].*//' | wc -L

That reports the amount of whitespace in terms of display width on a display device where tab stops are 8 columns apart:

$ printf '\tfoo\n' | sed 's/[^[:blank:]].*//' | wc -L
8
$ printf '\u3000foo\n' | sed 's/[^[:blank:]].*//' | wc -L
2

The U+3000 character (the ideographic space character, classified as blank in my locale) is a double-width character encoded on 3 bytes in UTF-8.

If you'd rather have that maximum length reported in terms of number of characters:

sed 's/[^[:blank:]].*//;s/./x/g' | wc -L

(s/./x/g converts every character on each line to x which we know has a display width of 1).

Or in terms of number of bytes:

sed 's/[^[:blank:]].*//' |
  LC_ALL=C tr -c '\n' '[x*]' | # convert each byte other than newline to x
  wc -L
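To make the difference between the three units concrete, here is a hedged Python sketch that measures one line's leading blanks in characters, UTF-8 bytes and display columns. It assumes that "blank" means only space, tab and U+3000 (a real locale may classify more characters as blank), and it approximates display width with unicodedata.east_asian_width rather than the C library's wcwidth. The name leading_blank_sizes is hypothetical.

```python
import re
import unicodedata

# Assumption: only space, tab and U+3000 count as blank in this sketch.
LEADING_BLANKS = re.compile(r"^[ \t\u3000]*")


def leading_blank_sizes(line: str, tab_stop: int = 8) -> dict:
    """Measure a line's leading blanks in characters, UTF-8 bytes and columns."""
    blanks = LEADING_BLANKS.match(line).group()
    columns = 0
    for ch in blanks:
        if ch == "\t":
            # A tab advances to the next multiple of tab_stop.
            columns += tab_stop - (columns % tab_stop)
        elif unicodedata.east_asian_width(ch) in ("F", "W"):
            columns += 2  # full-width and wide characters take two columns
        else:
            columns += 1
    return {
        "chars": len(blanks),
        "bytes": len(blanks.encode("utf-8")),
        "columns": columns,
    }


print(leading_blank_sizes("\tfoo"))
print(leading_blank_sizes("\u3000foo"))
```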
  • It's cool to find out so many things can be accomplished with awk. Thanks @Stéphane Chazelas – Wolf Aug 27 '20 at 14:56
3
  • Print the number of leading spaces:

    awk '{print match($0,/[^ ]|$/)-1}' file
    

    match($0,/[^ ]|$/) matches the first non-space ([^ ]) or the end-of-line ($) and returns its position.

  • Print the number of spaces:

    awk -F '[ ]' '{print (NF?NF-1:0)}' file
    

    -F '[ ]' sets the field separator to space. NF is the number of fields. The ternary expression means: "If NF is not 0, print NF-1, else print 0". This is because NF is 0 if the line is empty.
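For comparison, both awk techniques translate fairly directly to Python. This is a sketch only, and the function names are invented for illustration. Two differences are worth noting: Python offsets are 0-based, so no "-1" is needed for the match position, and str.split returns one empty field for an empty line, so the NF special case disappears.

```python
import re


def leading_spaces(line: str) -> int:
    """Offset of the first non-space character, or the line length if none."""
    # Counterpart of awk's match($0, /[^ ]|$/) - 1.
    return re.search(r"[^ ]|$", line).start()


def total_spaces(line: str) -> int:
    """Count spaces by splitting on single spaces: number of fields minus one."""
    return len(line.split(" ")) - 1


print(leading_spaces("    inet 10"), total_spaces("    inet 10"))
```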

Quasímodo
  • 18,865
2

It reads like what you really want is to delete leading whitespace.

There are many ways to do that. Assuming you want to do it in bash, I found this at

https://www.cyberciti.biz/tips/delete-leading-spaces-from-front-of-each-word.html

echo "     This is a test"

To remove the leading whitespace from the output:

echo "     This is a test" | sed -e 's/^[ \t]*//'

So in your case you could do:

ip address show eth0 | grep 'inet ' | sed -e 's/^[ \t]*//'

Also check out How do I trim leading and trailing whitespace from each line of some output?

ron
  • 6,575
  • Thanks @ron. I appreciate the answer and tips given. – Wolf Aug 27 '20 at 14:55
  • 2
    In standard sed implementations, [ \t] matches on either SPC, backslash or t. GNU sed however only does it when POSIXLY_CORRECT is in the environment. sed 's/^[[:blank:]]*//' would be standard. It would remove spaces and tabs and all other characters classified as blank in the locale. – Stéphane Chazelas Aug 27 '20 at 14:59
0

I have taken the example below:

echo "      praveen" | grep -o "^ *" | awk '{print length($0)}'

Output:

6

Python

>>> a="      praveen"
>>> import re
>>> k=re.compile(r'^ *')
>>> m=re.search(k,a)
>>> print(len(m.group()))
6
>>> 
0
$ ip address show eth0 \
| grep -oP '^\h*(?=inet\h)' \
| wc -m;

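This uses GNU grep in PCRE mode and looks for any leading horizontal whitespace (roughly [[:blank:]]) followed by inet and another blank. We then feed the result to wc to get a character count. Note that grep -o terminates each match with a newline, which wc -m counts too, so the reported number is one more than the number of blanks.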
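The lookahead approach can also be sketched with Python's re module. Python's re has no \h, so this sketch assumes [ \t] as a stand-in, and the name indent_before_inet and the sample text are made up for illustration. Unlike grep -o, finditer also reports empty matches, so an unindented "inet " line would yield 0 here.

```python
import re

# Assumption: [ \t] stands in for PCRE's \h, which Python's re module lacks.
PATTERN = re.compile(r"^[ \t]*(?=inet[ \t])", re.MULTILINE)


def indent_before_inet(text: str) -> list:
    """Return the length of the leading blank run on each 'inet ' line."""
    return [len(m.group()) for m in PATTERN.finditer(text)]


# Assumed sample resembling `ip address show` output.
sample = (
    "2: eth0: <BROADCAST,MULTICAST,UP>\n"
    "    inet 10.10.10.10/24 brd 10.10.10.255 scope global dynamic eth0\n"
    "    inet6 fe80::1/64 scope link\n"
)
print(indent_before_inet(sample))
```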

Using GNU awk with the FPAT variable set to a run of blanks:

$ ... |
 gawk -v FPAT='\\s*' '$0=length($1)""'

Using a Python list comprehension, we can also feed it grep's output to get the count.

$ ... | python3 -c 'import sys;print(*[len(l)-len(l.lstrip()) for l in sys.stdin],sep="\n")'

We can also feed the grep output to perl.

$ ..... |
  perl -F'\H' -pE '$_=$F[0]=~y///c'

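Here we split the record on any character that is not horizontal whitespace, so the first field is the leading run of blanks. y///c, with an empty search list and the c (complement) flag, matches every character of that field and, in scalar context, returns the number of characters it processed, i.e. the field's length. We assign that to the record and let the -p option print it automatically.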