120

I have a file, f1.txt:

ID     Name
1      a
2         b
3   g
6            f

The number of spaces is not fixed. What is the best way to replace all the white spaces with one space using only tr?

This is what I have so far:

cat f1.txt | tr -d " "

But the output is:

IDName
1a
2b
3g
6f

But I want it to look like this:

ID Name
1 a
2 b
3 g
6 f

Please try and avoid sed.

cuonglm
  • 153,898
gkmohit
  • 3,309

6 Answers

170

With tr, use the squeeze repeat option:

$ tr -s " " < file
ID Name
1 a
2 b
3 g
6 f

Or you can use an awk solution:

$ awk '{$2=$2};1' file
ID Name
1 a
2 b
3 g
6 f

When you change a field in a record, awk rebuilds $0: it takes all the fields and concatenates them, separated by OFS, which is a space by default.

That will squeeze sequences of space and tabs (and possibly other blank characters depending on the locale and implementation of awk) into one space, but also remove the leading and trailing blanks off each line.
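A minimal sketch of the difference between the two commands, assuming a made-up input line with leading and trailing spaces (the variable names are illustrative, and the square brackets are printed only to make the edges of each result visible):

```shell
# Hypothetical sample line with leading, repeated, and trailing spaces.
line='  1      a  '

# tr -s squeezes each run of spaces down to one space, so a single
# leading and a single trailing space survive.
squeezed=$(printf '%s\n' "$line" | tr -s ' ')

# awk rebuilds the record from its fields joined by OFS, so the
# leading and trailing blanks are dropped as well.
rebuilt=$(printf '%s\n' "$line" | awk '{$2=$2};1')

printf '[%s]\n' "$squeezed"   # prints "[ 1 a ]"
printf '[%s]\n' "$rebuilt"    # prints "[1 a]"
```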

cuonglm
  • 153,898
  • 1
    This is a great solution too . . . I dont know which one to choose now :/ @Gnouc – gkmohit Jul 22 '14 at 19:11
  • Feel free to choose any solution that you like and that works for you. Note that my solution is different from @polym's answer. – cuonglm Jul 22 '14 at 19:13
  • I am going to choose @polym because he could use the reputation points :). Please remember both of your answers are equally good :) – gkmohit Jul 22 '14 at 19:15
  • @Unknown: Didn't read your answer carefully. I added a tr solution. – cuonglm Jul 22 '14 at 19:17
  • 1
    :)) yay! @Gnouc s answer is really dynamic, because he uses awk, he can do anything. You can also accept his solution. Just one thing: Gnouc can you possibly explain what the awk format in your command does? Also can you add tabs/spaces so that the output is conforming to Unknown's expected output? – polym Jul 22 '14 at 19:17
  • @polym Sorry my output file had a formatting error. replaced it. – gkmohit Jul 22 '14 at 19:19
  • 1
    @polym: With Unknown's last edit, he seems to want only one space, not output like column -t produces. Added an explanation for awk. – cuonglm Jul 22 '14 at 19:30
  • @Gnouc - you might want to get tabs in there too. The POSIX class [:blank:] could do it, but a \tab followed by a ' ' would not result in them being squeezed unless you first convert all of one to the other. – mikeserv Jul 24 '14 at 23:16
  • 4
    There is a small difference here. tr will replace two spaces at the end of a line with a single space. awk will remove all trailing spaces. – Anne van Rossum Sep 24 '15 at 10:40
  • @AnnevanRossum There's also a huge difference in readability. Goddamnit awk. Why you got to be so useful and yet so unreadable? – Vala Jul 26 '16 at 11:39
  • Only awk cleanly squeezed any type of spaces while using fortune | paste -s | .... – Pablo A Apr 13 '19 at 05:57
  • +1 for the squeeze bit. I finally have a mnemonic device for remembering that darn command line flag. – Brent Writes Code Jun 30 '20 at 18:08
  • One of the most useful tricks to parse text output at the terminal – Pablo Adames Feb 19 '24 at 00:59
32

Just use column:

column -t inputFile

Output:

ID  Name
1   a
2   b
3   g
6   f
Volker Siegel
  • 17,283
polym
  • 10,852
13

If you want to squeeze "white space" you will want to use tr's predefined character classes ":blank:" (horizontal whitespace: tab and space) or ":space:" (all whitespace, including vertical whitespace such as newlines):

/bin/echo -e  "val1\t\tval2   val3" | tr -s "[:blank:]"

Examples were run on Red Hat 5 (GNU tr).

In my case I wanted to normalize all whitespace to a single space so I could rely on the space as a delimiter.

As pointed out by dastrobu's second comment I missed the wording in the man page:

 -s uses the last specified SET, and occurs after translation or deletion.

This allows us to eliminate the first tr. Kudos to Scott for his patience in the face of my denseness.

Before, parsing port from Redis config. file:

grep "^port" $redisconf | tr "[:blank:]" " " | tr -s "[:blank:]"  | cut -d" " -f2

After, with SET2 being specified with the squeeze:

grep "^port" $redisconf | tr -s "[:blank:]" " " | cut -d" " -f2

Output:

6379

For more detail on the nuances of whitespace, the following demonstrates where squeeze alone fails when successive mixed characters from the [:blank:] character class are involved:

 /usr/bin/printf '%s \t %s' id myname | tr -s "[:blank:]"  | od -cb
0000000   i   d      \t       m   y   n   a   m   e
        151 144 040 011 040 155 171 156 141 155 145
0000013

Note: My two string fields in the printf format are separated by 1 space, 1 tab, 1 space. After the squeeze this sequence still exists. In the output of the octal dump it appears as the ASCII sequence 040 011 040.
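By contrast, here is a small sketch of the two-argument form discussed in the comments, assuming the same space, tab, space input. With a second set, tr first translates every [:blank:] character to a space and then squeezes the resulting run, so a single space remains. The variable names are illustrative only.

```shell
# Same input as above: two fields separated by space, tab, space.
input=$(printf '%s \t %s' id myname)

# One argument: only repeats of the same character are squeezed, so the
# mixed run (space, tab, space) is left as it is.
one_arg=$(printf '%s' "$input" | tr -s '[:blank:]')

# Two arguments: blanks are translated to spaces first, then squeezed.
two_args=$(printf '%s' "$input" | tr -s '[:blank:]' ' ')

# Inspect the result byte by byte.
printf '%s' "$two_args" | od -c
```

With the second form, cut -d" " -f2 sees exactly one delimiter between the fields.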

  • To just squeeze, tr is quite reasonable, but when you ALSO want to match a line to a pattern AND select one (horizontal) space delimited field, use awk '/^port/ {print $2}' $redisconf – dave_thompson_085 Jun 15 '16 at 20:39
  • @dave_thompson_85: Yes, the accepted answer already points this out. Awk is the superior tool. But, at the cost of a much higher learning curve. I was trying to answer the question as asked at the OP's perceived level of experience. My power tool, when I choose to wield it is Perl: perl -ane 'print $F[1] if /^port/' – user3183018 Jun 18 '16 at 13:50
  • 1
    Do you really need tr "[:blank:]" " " | tr -s "[:blank:]"? I guess the first part will suffice, i.e. tr "[:blank:]" " " since it normalizes whitespace and does the substitution already. From man page: "Squeeze multiple occurrences of the characters [...] This occurs after all deletion and translation is completed." – dastrobu Mar 24 '18 at 12:21
  • @dastrobu: I was trying to account for situations where others can edit the file and introduce a mix of characters which fall into the :blank: character class. E.g. Port6379 The first part will fail on the "cut" if more than one :blank: exists because field "2" will have shifted. Without the first part, the squeeze will leave one of each :blank: type which will again cause the cut field to miss the port. I can add an additional example if you think it helps? – user3183018 Mar 26 '18 at 17:44
  • 3
    So ´tr -s "[:blank:]" " "´ should do it: it first translates all blanks to spaces and then squeezes the spaces. No need for a second ´tr´. – dastrobu Mar 28 '18 at 13:49
  • @dastrobu: No. The squeeze will translate all successive characters which fall within the :blank: character class to a single version of the same. The space character is not the only type of :blank:. Reread my previous comment and let me know if that is unclear. – user3183018 Mar 28 '18 at 15:59
  • 2
    I tried printf 'ID \t Name\n' | tr -s "[:blank:]" " " | od -cb (as suggested by @dastrobu) and I got ID Name\n (with *one* space) as output.  Did you actually try it, @user3183018? – Scott - Слава Україні Apr 11 '19 at 17:03
  • @scott: yes I did. You incorrectly replicated the situation. I was accounting for successive mixed space characters. See my comment to dastrobu. In many cases, you don't control the input. I was simply offering a solution where this nuance was involved. – user3183018 Apr 12 '19 at 11:50
  • 2
    OK, let me try to say this again. I did printf 'ID␣\t␣Name\n' | tr -s "[:blank:]" "␣"  (as suggested by @dastrobu), where ␣ represents a space, and I got ID␣Name\n (with one space) as output.  This is exactly the same as your example of “Port6379” except I used the heading strings from the question.  I’m wondering whether you tried tr -s "[:blank:]" (without the final "␣" argument). – Scott - Слава Україні Apr 12 '19 at 19:29
  • @scott: sorry I must have missed the spaces in your comment. Is it possible we have different implementations of printf? I'm using /usr/bin/printf under Bash 4.4.19 -- I had updated my answer following your comment to show the squeeze failing. What does your printf show when piped directly to "od"? – user3183018 Apr 13 '19 at 09:53
  • 2
    When I do printf 'ID \t Name\n' | od -cb, it shows exactly what it’s supposed to: ID \t N a m e \n (i.e., ID 040 011 040 N a m e\n). Meanwhile, by your own evidence, you’re making exactly the error that I guessed that you were: you are running tr -s "[:blank:]" (i.e., tr with one option and one argument), instead of the command that @dastrobu and I have presented four times now: tr -s '[:blank:]' '␣' (i.e., tr with one option and *two arguments*). – Scott - Слава Україні Apr 13 '19 at 22:42
  • Ok, I see now. To be fair, dastrobu's first comment is not using the "-s". I then glossed over his second and thought he was repeating the same. I stand corrected. – user3183018 Apr 14 '19 at 03:55
6

Who needs a program (other than the shell)?

while read a b
do
    echo "$a $b"
done < f1.txt

If you want the values in the second column to line up, as in polym’s column answer, use printf instead of echo:

while read a b
do
    printf '%-2s %s\n' "$a" "$b"
done < f1.txt
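If the file may have more than two columns, one possible variation lets the shell split each line into positional parameters and joins them back with single spaces. This is only a sketch for a POSIX shell: the function name squeeze_fields and the variable line are hypothetical, and it relies on the default IFS (space, tab, newline).

```shell
# Hypothetical helper: print each input line with its fields
# separated by exactly one space.
squeeze_fields() {
    while IFS= read -r line; do
        set -f              # disable globbing so * and ? stay literal
        set -- $line        # unquoted on purpose: split on IFS
        set +f
        printf '%s\n' "$*"  # "$*" joins with the first character of IFS
    done
}

# Sample input with mixed spaces and tabs.
printf 'ID     Name\n1 \t  a   x\n' | squeeze_fields
```

For the file in the question the call would be squeeze_fields < f1.txt. Like the awk approach, this also drops leading and trailing blanks.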
  • 1
    In the first place, when compared with tr, this is a terribly weak suggestion efficiency-wise unless the input is just too small to outweigh the tiny cost of tr's invocation, not to mention how much more work it takes to write. Last, wouldn't you say that this post does not actually answer the question as asked? What is the best way to replace all the white spaces with one space using only tr? – mikeserv Jul 24 '14 at 23:22
  • 1
    And besides - couldn't you more easily just do something with $IFS? Maybe like: IFS=' <tab>' set -f ; echo $(cat <file)? – mikeserv Jul 24 '14 at 23:49
2

This is an old question that has been solved many times. Just for completeness: I had a similar issue, but wanted to pass lines via a pipe to another program. I used xargs.

-L max-lines
   Use at most max-lines nonblank input lines per command line.
   Trailing blanks cause an input line to be logically continued 
   on the next input line.  Implies -x.

so cat f1.txt | xargs -L1 seems to output exactly what you want.

  • Note that with this method, if a line has trailing spaces it will be merged with the following line (as stated in the quoted description), e.g. printf ' 1 a\n2 b \n3 d\n' | xargs -L1 outputs 1 a<newline>2 b 3 d<newline>. – tom Mar 06 '22 at 10:20
-1

The simplest solution of trimming the string of white spaces is to use the xargs command. Example:

[root@localhost ~]# echo "    bla   bla     truc "|xargs
bla bla truc

After that you can easily program the loop using the cut command to get the values and manipulate them. Here is the exact example:

[root@localhost ~]# cat <<EOF>f1.txt
ID     Name
1      a
2         b
3   g
6            f
EOF
[root@localhost ~]# cat testscr.sh

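#!/bin/bash
# Flatten the whole file into one line of single-space-separated words.
i=0
onelinevars=$(cat f1.txt | xargs)
# Walk the words; at each odd index print that word and the next one as a row.
while [[ "$i" -lt $(echo "$onelinevars" | wc -w) ]]; do
    if [[ $((i%2)) -eq 1 ]]; then
        echo -n "$(echo "$onelinevars" | cut -d ' ' -f "$i")"
        echo -n ' '
        echo "$(echo "$onelinevars" | cut -d ' ' -f "$((i+1))")"
    fi
    i=$((i+1))
done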

[root@localhost ~]# ./testscr.sh

ID Name
1 a
2 b
3 g
6 f
Stephen Kitt
  • 434,908