381

I would like to remove all leading and trailing spaces and tabs from each line in an output.

Is there a simple tool like trim I could pipe my output into?

Example file:

test space at back 
 test space at front
TAB at end  
    TAB at front
sequence of some    space in the middle
some empty lines with differing TABS and spaces:





 test space at both ends 
rubo77
  • 11
    To anyone looking here for a solution to remove newlines, that is a different problem. By definition a newline creates a new line of text. Therefore a line of text cannot contain a newline. The question you want to ask is how to remove a newline from the beginning or end of a string: https://stackoverflow.com/questions/369758, or how to remove blank lines or lines that are just whitespace: https://serverfault.com/questions/252921 – Tony Jun 25 '18 at 23:24

21 Answers

485
awk '{$1=$1;print}'

or shorter:

awk '{$1=$1};1'

Either form trims leading and trailing space or tab characters¹ and also squeezes sequences of tabs and spaces into a single space.

That works because when you assign something to any one field and then try to access the whole record ($0, the thing print prints by default), awk needs to rebuild that record by joining all fields ($1, ..., $NF) with OFS (space by default).
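
The rebuild is easier to see when OFS is changed. The following is only a sketch: show_rebuild is a hypothetical helper name, and the sample input and the "-" separator are chosen purely for illustration.

```shell
# show_rebuild is a hypothetical helper: it forces awk to rebuild $0 by
# assigning $1 to itself, then prints the record between brackets.
# The optional first argument is used as OFS (default: a single space).
show_rebuild() {
    awk -v OFS="${1:- }" '{ $1 = $1; print "[" $0 "]" }'
}

printf '  a \t b  \n' | show_rebuild      # fields joined with a space
printf '  a \t b  \n' | show_rebuild -    # fields joined with "-"
```

The brackets make it visible that the leading and trailing blanks are gone and that whatever separated the fields has been replaced by OFS.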

To also remove blank lines, change it to awk 'NF{$1=$1;print}' (where NF as a condition selects the records for which the Number of Fields is non-zero). Do not use awk '$1=$1', as is sometimes suggested, because that would also remove lines whose first field is any representation of 0 supported by awk (0, 00, -0e+12...).
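
A small sketch of that pitfall, assuming a made-up three-line sample (the sample function and its contents are hypothetical):

```shell
# Hypothetical sample: a line starting with 0, an empty line, a normal line.
sample() { printf ' 0 apples\n\n x pears \n'; }

# Pattern form: the value of the assignment is used as the condition, so
# the "0 apples" line is dropped together with the empty line.
sample | awk '$1=$1'

# NF form: only the empty line is dropped.
sample | awk 'NF{$1=$1;print}'
```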


¹ and possibly other blank characters depending on the locale and the awk implementation

115

The command can be condensed like so if you're using GNU sed:

$ sed 's/^[ \t]*//;s/[ \t]*$//' < file

Example

Here's the above command in action.

$ echo -e " \t   blahblah  \t  " | sed 's/^[ \t]*//;s/[ \t]*$//'
blahblah

You can use hexdump to confirm that the sed command is stripping the desired characters correctly.

$ echo -e " \t   blahblah  \t  " | sed 's/^[ \t]*//;s/[ \t]*$//' | hexdump -C
00000000  62 6c 61 68 62 6c 61 68  0a                       |blahblah.|
00000009

Character classes

You can also use character class names instead of literally listing the sets like this, [ \t]:

$ sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' < file

Example

$ echo -e " \t   blahblah  \t  " | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'

Most of the GNU tools that make use of regular expressions (regex) support these classes (here with their equivalent in the typical C locale of an ASCII-based system (and there only)).

 [[:alnum:]]  - [A-Za-z0-9]     Alphanumeric characters
 [[:alpha:]]  - [A-Za-z]        Alphabetic characters
 [[:blank:]]  - [ \t]           Space or tab characters only
 [[:cntrl:]]  - [\x00-\x1F\x7F] Control characters
 [[:digit:]]  - [0-9]           Numeric characters
 [[:graph:]]  - [!-~]           Printable and visible characters
 [[:lower:]]  - [a-z]           Lower-case alphabetic characters
 [[:print:]]  - [ -~]           Printable (non-Control) characters
 [[:punct:]]  - [!-/:-@[-`{-~]  Punctuation characters
 [[:space:]]  - [ \t\v\f\n\r]   All whitespace chars
 [[:upper:]]  - [A-Z]           Upper-case alphabetic characters
 [[:xdigit:]] - [0-9a-fA-F]     Hexadecimal digit characters

Using these instead of literal sets always seems like a waste of space, but if you're concerned with your code being portable, or having to deal with alternative character sets (think international), then you'll likely want to use the class names instead.


slm
  • Note that [[:space:]] is not equivalent to [ \t] in the general case (unicode, etc). [[:space:]] will probably be much slower (as there are many more types of whitespaces in unicode than just ' ' and '\t'). Same thing for all the others. – Olivier Dulac Nov 21 '13 at 12:44
  • 1
    sed 's/^[ \t]*//' is not portable. Actually POSIX even requires that to remove a sequence of space, backslash or t characters, and that's what GNU sed also does when POSIXLY_CORRECT is in the environment. – Stéphane Chazelas Aug 11 '16 at 14:56
  • What if I want to trim newlines characters? '\n \n text \n \n' – Eugene Biryukov Jun 01 '18 at 08:54
  • 2
    I like the sed solution because it lacks the side effects of the awk solution. The first variation did not work when I tried it in bash on OSX just now, but the character class version does work: sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' – Tony Jun 25 '18 at 23:13
  • @EugeneBiryukov see my comment on the original post – Tony Jun 25 '18 at 23:27
  • 1
    instead of [ \t]* why don't you use \s* to catch all white-spaces ? – Noam Manos Feb 19 '20 at 12:55
  • Sorry, it doesn't work if there is a tab at the beginning of a line. – Raymond Jan 21 '23 at 16:29
  • Can someone explain why s,^[[:space:]]*,, works, but s,^[[:space:]]+,, does not? – MichaelK Aug 30 '23 at 13:04
  • 1
    @MichaelK escaping that + works: s,^[[:space:]]\+,,. If you tell sed to use extended regexes with -r, it also works unescaped: sed -r 's,^[[:space:]]+,,'. BTW I typically pipe this to cat -A to see the output more clearly. – slm Sep 08 '23 at 03:11
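
For sed implementations that do not understand \t inside a bracket expression (the portability issue raised in the comments above), one possible workaround is to put a literal tab into the script. This is a sketch only; trim_blank is a hypothetical function name, and [[:blank:]] remains the simpler choice where it is supported.

```shell
# trim_blank (hypothetical name): strip leading and trailing spaces/tabs
# from every input line without relying on sed interpreting "\t".
trim_blank() {
    tab=$(printf '\t')                     # a literal tab character
    sed "s/^[ $tab]*//;s/[ $tab]*\$//"     # \$ keeps the anchor out of shell expansion
}

printf ' \t x  y \t \n' | trim_blank
```

Internal whitespace is left untouched; only the two ends of each line are edited.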
68

xargs without arguments does that.

Example:

trimmed_string=$(echo "no_trimmed_string" | xargs) 
  • 20
    This also contracts multiple spaces within a line, which was not requested in the question – Chris Davies Sep 09 '15 at 16:04
  • 5
    @roaima - true but the accepted answer also squeezes spaces (which was not requested in the question). I think the real problem here is that xargs will fail to deliver if the input contains backslashes and single quotes. – don_crissti Sep 09 '15 at 18:28
  • 1
    @don_crissti that doesn't mean the accepted answer correctly answers the question as asked, though. But in this case here it wasn't flagged as a caveat whereas in the accepted answer it was. I've hopefully highlighted the fact in case it's of relevance to a future reader. – Chris Davies Sep 09 '15 at 19:22
  • 2
    It also breaks on single quotes, double quotes, backslash characters. It also runs one or more echo invocations. Some echo implementations will also process options and/or backslashes... That also only works for single-line input. – Stéphane Chazelas May 21 '19 at 17:19
  • 1
    This is a clever (unorthodox) answer! I agree w/ @StéphaneChazelas, essentially invoking the default xargs /bin/echo command. – John Doe Aug 18 '21 at 13:40
  • I know this has shortcomings, but if you want something simple, with known-restricted input, this is beautifully succinct in 5 chars... Obtuse perhaps, but succinct! – spechter Apr 26 '22 at 05:53
  • 1
    for multi-line inputs you can <multi-line> | xargs -L1 – Nitsan Avni Feb 09 '23 at 21:55
  • This treats backslashes improperly and may break in shells like Bash, whose echo is extended with options etc. It also doesn't suit the OP's needs, as it works on a single line only. – Wilderness Ranger Nov 16 '23 at 13:01
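
The caveats from the comments above can be reproduced with a short sketch. The inputs are made up, and the exact behaviour on the quoted input may differ between xargs implementations, so the second command only reports whether xargs failed.

```shell
# Leading/trailing blanks are removed, but the inner run of spaces is
# squeezed as well, because xargs re-splits the line into arguments.
printf '  a   b  \n' | xargs

# An unbalanced quote is treated as xargs quoting syntax; many
# implementations reject it with an error instead of trimming.
printf "  it's  \n" | xargs 2>/dev/null || echo "xargs rejected the input"
```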
41

As suggested by Stéphane Chazelas in the accepted answer, you can now
create a script /usr/local/bin/trim:

#!/bin/bash
awk '{$1=$1};1'

and give that file executable rights:

chmod +x /usr/local/bin/trim

Now you can pipe any output to trim, for example:

cat file | trim

(For the comments below: I used this before: while read i; do echo "$i"; done
which also works fine, but is slower.)

rubo77
36

If you store lines as variables, you can use bash to do the job:

remove leading whitespace from a string:

shopt -s extglob
printf '%s\n' "${text##+([[:space:]])}"

remove trailing whitespace from a string:

shopt -s extglob
printf '%s\n' "${text%%+([[:space:]])}"

remove all whitespace from a string:

printf '%s\n' "${text//[[:space:]]}"
Łukasz Rajchel
  • 2
    Removing all white-space from a string is not same as removing both leading and trailing spaces (as in question). – catpnosis Mar 24 '18 at 16:04
  • 5
    Far the best solution - it requires only bash builtins and no external process forks. – peterh Jul 05 '18 at 13:56
  • 2
    Nice. Scripts run a LOT faster if they don't have to pull in outside programs (such as awk or sed). This works with "modern" (93u+) versions of ksh, as well. – user1683793 Jul 10 '18 at 22:54
  • 1
    Upon testing, your ltrim and rtrim solutions do not work because they are too greedy. Given a string such as \n\n\n\v\f\t\t\r\r Anthony Rutledge. \n\r\v\\f\n\n\n\t\t\t, your solutions will wipe away the entire string. – Anthony Rutledge Jul 26 '21 at 14:40
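
Both ends of a variable can also be trimmed in one go with standard parameter expansion, without extglob. This is a sketch rather than the answer's own method; trim_var is a hypothetical name, and it assumes the shell supports [[:space:]] in patterns.

```shell
# trim_var (hypothetical name): print its first argument with leading and
# trailing whitespace removed, using only parameter expansion.
trim_var() {
    s=$1
    # ${s%%[![:space:]]*} is the leading whitespace run; strip it as a prefix.
    s=${s#"${s%%[![:space:]]*}"}
    # ${s##*[![:space:]]} is the trailing whitespace run; strip it as a suffix.
    s=${s%"${s##*[![:space:]]}"}
    printf '%s\n' "$s"
}

trim_var "   one   two   "
```

Whitespace between words is preserved, and a string that consists only of whitespace becomes empty.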
32

To remove all leading and trailing spaces from a given line with a 'piped' tool, I can identify 3 different ways, which are not completely equivalent. They differ in how they treat the spaces between words of the input line. Depending on the behaviour you expect, you'll make your choice.

Examples

To explain the differences, let's consider this dummy input line:

"   \t  A   \tB\tC   \t  "

tr

$ echo -e "   \t  A   \tB\tC   \t  " | tr -d "[:blank:]"
ABC

tr is really a simple command. In this case, it deletes every space or tab character.

awk

$ echo -e "   \t  A   \tB\tC   \t  " | awk '{$1=$1};1'
A B C

awk deletes leading and trailing spaces and squeezes every run of spaces between words into a single space.

sed

$ echo -e "   \t  A   \tB\tC   \t  " | sed 's/^[ \t]*//;s/[ \t]*$//'
A       B   C

In this case, sed deletes leading and trailing spaces without touching any spaces between words.

Remark:

In the case of one word per line, tr does the job.

frozar
26
sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'

If you're reading a line into a shell variable, read does that already unless instructed otherwise.

  • 1
    +1 for read. So if you pipe to while read it works: cat file | while read i; do echo $i; done – rubo77 Nov 21 '13 at 03:36
  • 2
    @rubo except that in your example the unquoted variable is also reprocessed by the shell. Use echo "$i" to see the true effect of the read – Chris Davies Sep 09 '15 at 19:19
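
The effect of read described above can be sketched as follows, assuming the default IFS (space, tab, newline). The function names trim_read and keep_read are hypothetical.

```shell
# trim_read: with the default IFS, read strips leading and trailing blanks
# from each line; -r keeps backslashes literal. The "|| [ -n ... ]" part
# handles a last line without a trailing newline.
trim_read() {
    while read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done
}

# keep_read: an empty IFS for the read call disables that stripping.
keep_read() {
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
    done
}

printf '  a  b \t\n' | trim_read
```

Quoting "$line" in printf matters here: unquoted, the shell would split and rejoin the value and squeeze the inner spaces too.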
11

An answer you can understand at a glance:

#!/usr/bin/env python3
import sys
for line in sys.stdin: print(line.strip()) 

Bonus: pass a set of characters to str.strip([chars]) to trim arbitrary characters, or use .lstrip() or .rstrip() as needed.

Like rubo77's answer, save as script /usr/local/bin/trim and give permissions with chmod +x.

qwr
  • 3
    I'm not normally a fan of python in my scripts but this is by far one of the most legible scripts in comparison to all the other incantations in these answers. – Victor Mar 22 '22 at 19:10
9

sed is a great tool for that:

                        # substitute ("s/")
sed 's/^[[:blank:]]*//; # parts of lines that start ("^")  with a space/tab 
     s/[[:blank:]]*$//' # or end ("$") with a space/tab
                        # with nothing (/)

You can use it for your case by either piping in the text, e.g.

<file sed -e 's/^[[...

or by acting on it 'inline' if your sed is the GNU one:

sed -i 's/...' file

but changing the source this way is "dangerous" as it may be unrecoverable when it doesn't work right (or even when it does!), so back up first (or use -i.bak, which also has the benefit of being portable to some BSD seds)!
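
A minimal sketch of the backup variant, assuming a sed that accepts -i with an attached suffix. The function name trim_in_place and the temporary file are hypothetical.

```shell
# trim_in_place (hypothetical name): trim each line of the given file in
# place, keeping the original content in FILE.bak.
trim_in_place() {
    sed -i.bak 's/^[[:blank:]]*//;s/[[:blank:]]*$//' "$1"
}

demo=$(mktemp)                        # throwaway file for the demonstration
printf '  padded line \t\n' > "$demo"
trim_in_place "$demo"
cat "$demo"                           # trimmed content
cat "$demo.bak"                       # untouched original
rm -f "$demo" "$demo.bak"
```

If the result is not what was intended, the .bak copy can simply be moved back over the edited file.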

7

You will be adding this to your little Bash library. I can almost bet on it! This has the benefit of not adding a newline character to the end of your output, as will happen with echo throwing off your expected output. Moreover, these solutions are reusable, do not require modifying the shell options, can be called in-line with your pipelines, and are posix compliant. This is the best answer, by far. Modify to your liking.

Output tested with od -cb, something some of the other solutions might want to do with their output.

BTW: The correct quantifier is the +, not the *, as you want the replacement to be triggered upon 1 or more whitespace characters!

ltrim (that you can pipe input into)

function ltrim ()
{
    sed -E 's/^[[:space:]]+//'
}

rtrim (that you can pipe input into)

function rtrim ()
{
    sed -E 's/[[:space:]]+$//'
}

trim (the best of both worlds and yes, you can pipe to it)

function trim ()
{
    ltrim | rtrim
}

Update: I have improved this solution to use bash native constructs. The Stream Editor (sed) is not required. You can use shell expansions to achieve what you want, and it works better than the sed solution!

Bash Reference Manual -- Shell Expansions

4

If the string one is trying to trim is short and continuous/contiguous, one can simply pass it as a parameter to any bash function:

    trim(){
        echo $@
    }

    a="     some random string   "

    echo ">>`trim $a`<<"
Output
>>some random string<<
4

Using Raku (formerly known as Perl_6):

raku -ne '.trim.put;'

Or more simply:

raku -pe '.=trim;'

As a previous answer suggests (thanks, @Jeff_Clayton!), you can create a trim alias in your bash environment:

alias trim="raku -pe '.=trim;'"

Finally, to only remove leading/trailing whitespace (e.g. unwanted indentation), you can use the appropriate trim-leading or trim-trailing command instead.

https://raku.org/

jubilatious1
3

translate command would work

cat file | tr -d '[:blank:]'
Jeff Schaller
  • 9
    This command is not correct as it removes all spaces from the file, not just leading/trailing whitespace. – Brian Redbeard Sep 28 '18 at 16:41
  • 2
    @BrianRedbeard You are correct. This is still a useful answer for a monolithic string, without spaces. – Anthony Rutledge May 18 '19 at 23:37
  • 2
    You might call this stripper instead of some kind of trim. :-) – Anthony Rutledge Jul 26 '21 at 14:18
  • I completely agree w/ @AnthonyRutledge. Not everyone (perhaps no one) actually winds up here - 11 years after the OP - with the same exact problem as stated in the OP!! It's disappointing to me to see 4 downvotes to this answer as it has utility to many of us who wind up here via various "search terms" like I used: bash cut spaces from start of string. A pox on you plutonian supercilious downvoters - you're disgusting. – Seamus Mar 08 '24 at 02:40
  • @Seamus You can use shell expansions to achieve what you want, and it works better than a sed solution, or one that uses tr! https://www.gnu.org/software/bash/manual/bash.html#Shell-Expansions – Anthony Rutledge Mar 09 '24 at 03:39
3
trimpy () {
    python3 -c 'import sys
for line in sys.stdin: print(line.strip())'
}
trimsed () {
gsed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//'
}
trimzsh () {
   local out="$(</dev/stdin)"
   [[ "$out" =~ '^\s*(.*\S)\s*$' ]] && out="$match[1]"  || out=''
   print -nr -- "$out"
}
# example usage
echo " hi " | trimpy

Bonus: pass a set of characters to str.strip([chars]) to trim arbitrary characters, or use .lstrip() or .rstrip() as needed.

HappyFace
1

I wrote this shell function using awk

awkcliptor(){
    awk -e 'BEGIN{ RS="^$" } {gsub(/^[\n\t ]*|[\n\t ]*$/,"");print ;exit}' "$1" ; } 

BEGIN{ RS="^$" }:
before parsing starts, set the record separator to a
regexp that never matches inside the input, i.e. treat
the whole input as a single record (-e is a gawk option)

gsub(this,that):
substitute every match of the regexp this with the string that

/^[\n\t ]*|[\n\t ]*$/:
match any run of newlines, tabs and spaces at the very
start or at the very end of the record and replace it
with the empty string

print;exit: then print the result and exit

"$1":
pass the first argument of the function to awk
as the input file

How to use:
copy the code above, paste it into a shell, and press Enter to
define the function.
You can then use awkcliptor as a command, with the input file as its first argument.

sample usage:

echo '
 ggggg    

      ' > a_file
awkcliptor a_file

output:

ggggg

or

echo -e "\n ggggg    \n\n      "|awkcliptor 

output:

ggggg
1

For those of us without enough space in the brain to remember obscure sed syntax, just reverse the string, cut the 1st field with a delimiter of space, and reverse it back again.

cat file | rev | cut -d' ' -f1 | rev
Stewart
1

My favorite is using perl: perl -n -e'/[\s]*(.*)?[\s]*/ms && print $1'

Take for example:

MY_SPACED_STRING="\n\n   my\nmulti-line\nstring  \n\n"

echo $MY_SPACED_STRING

Would output:


my multi-line string

Then:

echo $MY_SPACED_STRING | perl -n -e'/[\s]*(.*)?[\s]*/ms && print $1'

Would output:

my
multi-line
string 
tin
0

For bash, for example:

alias trim="awk '{\$1=\$1};1'"

usage:

echo -e  "    hello\t\tkitty   " | trim | hexdump  -C

result:

00000000  68 65 6c 6c 6f 20 6b 69  74 74 79 0a              |hello kitty.|
0000000c
  • 1
    The awk '{$1=$1};1' answer was given long ago.  The idea of making an alias out of it was suggested in a comment almost as long ago.  Yes, you are allowed to take somebody else’s comment and turn it into an answer.  But, if you do, you should give credit to the people who posted the idea before you.  And this is such a trivial extension of the accepted answer that it’s not really worth the bother. – Scott - Слава Україні Sep 04 '20 at 04:08
  • The idea was to make an alias. I hadn't seen that answer before. – Marek Lisiecki Sep 05 '20 at 18:13
  • and second thing from stack:

    "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score."

    – Marek Lisiecki Sep 05 '20 at 18:25
0

Remove start space and tab and end space and tab:

alias strip='python3 -c "from sys import argv; print(argv[1].strip(\" \").strip(\"\t\"))"'

Remove every space and tab

alias strip='python3 -c "from sys import argv; print(argv[1].replace(\"\t\", \"\").replace(\" \", \"\"))"'

Pass the text as an argument to strip. Use sys.stdin.read() instead of argv to make it pipeable.

Machinexa
0

simple enough for my purposes was this:

_text_="    one    two       three         "

echo "$_text_" | { read __ ; echo ."$__". ; }

... giving ...

.one    two       three.

... if you want to squeeze the spaces then ...

echo .$( echo $_text_ ).

... gives ...

.one two three.
sol
0

Using the Rust sd command: sd '^\s*(.*)\s*' '$1'

walkman