Trim lines to a specific length

Question

I have a file with many lines, and I want to trim each line to be 80 characters in length. How could I do this?

I have already filtered out lines shorter than 80 characters, so now I'm left with a file that has lines 80+ characters in length and I want to trim each line so that all are exactly 80. In other words I want to preserve the first 80 characters in each line and remove the rest of the line.

@RuiFRibeiro You have made this a duplicate of an invalid (off-topic) question. Interesting! Yes, there are other questions that may match the concept of a duplicate a bit better, but not that one. – May 31 '18 at 04:11

jesse_b · Answer 1 · 2018-05-30T17:36:08.747

22

You can use the cut command:

cut -c -80 file

With grep:

grep -Eo '.{80}' file

edited May 30 '18 at 17:36

answered May 30 '18 at 17:29


I only want to cut each line in the file to 80, the file itself has thousands of lines, each line longer than 80 characters. – mah May 30 '18 at 17:31
@mah That is what this command does. – jesse_b May 30 '18 at 17:31
Well, not exactly, cut counts bytes (not characters). – May 30 '18 at 17:59
@Isaac: True for GNU cut – jesse_b May 30 '18 at 18:04
@Isaac: Also apparently GNU has fixed that in later versions of coreutils – jesse_b May 30 '18 at 18:07
Testing on cut (GNU coreutils) 8.28, it cuts on bytes, is there a newer version available? – May 30 '18 at 18:11
I don't think it was ever broken on BSD cut and it cuts by character for me with coreutils 8.28 – jesse_b May 30 '18 at 18:15
Trying echo áááááááááá\ éééééééééé{,,} | tee /dev/stderr | cut -c -80 prints áááááááááá éééééééééé áááááááááá éééééééééé áááááááááá éééééééééé áááááááááá éééééééé�, which looks broken to me. – May 30 '18 at 18:27
Are you using glibc? What OS? – May 30 '18 at 18:28
Alright, yeah, it's not working with GNU cut after all; BSD cut works fine though (macOS). – jesse_b May 30 '18 at 18:33
fwiw, that grep syntax just reformatted the output to 80 columns without cutting it. λ awk '{print substr($0,1,80)}' did what I was looking for. Thanks all. – IdusOrtus Apr 05 '19 at 16:27
Here is a fix for grep: grep -Eo '^.{80}' file – red_led Jun 05 '20 at 11:36
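
The byte-versus-character issue discussed in these comments can be seen on a single character. This is only a sketch: it writes é as its two UTF-8 bytes with octal escapes, forces the C locale so that every implementation of cut counts bytes, and uses variable names chosen for illustration.

```shell
#!/bin/sh
# \303\251 is the UTF-8 encoding of "é": one character, two bytes.
char_bytes=$(printf '\303\251' | wc -c | tr -d ' ')

# In the C locale one byte counts as one character, so asking cut for
# "one character" keeps only the first byte of é. The newline is removed
# before counting so that only the payload is measured.
kept_bytes=$(printf '\303\251\n' | LC_ALL=C cut -c -1 | tr -d '\n' | wc -c | tr -d ' ')

echo "bytes in é: $char_bytes"
echo "bytes kept by cut -c -1: $kept_bytes"
```

Keeping one byte of a two-byte character is what produces the � at the end of the output quoted above.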

score 10 · Answer 2 · answered May 30 '18 at 18:27

Using AWK:

awk '{print substr($0,1,80)}' file.txt

Using cut:

cut -c -80 file.txt

Using colrm:

colrm 81 file.txt

Using sed:

sed 's/^\(.\{80\}\).*$/\1/' file.txt

Using grep:

grep -Eo '.{80}' file.txt

Siva


awk worked for me. grep did not. I didn't try the others. Thanks for the solution I needed! – IdusOrtus Apr 05 '19 at 16:28
Could you have meant '^.{80}' in your grep RegExp? – AdminBee Sep 23 '20 at 10:10
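
The effect of the anchor is easier to see at a smaller width. This sketch uses a width of 5 in place of 80 and a made-up input line, and it assumes GNU grep, since -o is not part of POSIX.

```shell
#!/bin/sh
# Hypothetical sample input: one 12-character line.
line='abcdefghijkl'

# Without the anchor, -o prints every non-overlapping 5-character match,
# so the line is re-wrapped rather than truncated.
unanchored=$(printf '%s\n' "$line" | grep -Eo '.{5}')

# With ^, only the match at the start of the line is printed.
anchored=$(printf '%s\n' "$line" | grep -Eo '^.{5}')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
```

Either form prints nothing for a line shorter than the width, because grep only prints lines that match.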

score 5 · Answer 3 · answered May 30 '18 at 17:56

To cut (truncate) each line of the file and print the result to the terminal, use one of:

cut -c -80 infile               # GNU cut counts bytes, not characters (breaks multi-byte UTF-8)
grep -o '^.\{1,80\}' infile
sed 's/\(^.\{1,80\}\).*/\1/' infile

If what you want instead is to insert a newline after every 80th character, splitting each line longer than 80 characters into several lines, use:

fold -w 80 infile            # fold, like cut, counts bytes.

If you want to split only at spaces (whole words), use:

fold -sw 80 infile
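
To make the difference between truncating and folding concrete, here is a small sketch at width 4 instead of 80. The sample text and variable names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample input: one 10-character line.
sample='abcdefghij'

# cut keeps the first 4 characters and discards the rest of the line.
truncated=$(printf '%s\n' "$sample" | cut -c -4)

# fold keeps everything but breaks the line every 4 columns.
folded=$(printf '%s\n' "$sample" | fold -w 4)

printf 'cut:\n%s\n' "$truncated"
printf 'fold:\n%s\n' "$folded"
```

With cut the text after the fourth character is lost; with fold it moves onto the following lines.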

For all the solutions above, append a redirection such as >outfile to the command to store the result in outfile. Do not redirect to the input file itself: the shell truncates that file before the command gets to read it. Example:

fold -sw 80 infile > outfile
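
If the goal is to replace the original file, one common pattern is to write to a temporary file and rename it afterwards. The sketch below is one way to do that, not the only one. The helper name truncate_in_place is hypothetical, and it assumes mktemp is available, which is common but not required by POSIX.

```shell
#!/bin/sh
# truncate_in_place FILE WIDTH (hypothetical helper, for illustration only)
# Writes the truncated lines to a temporary file, then replaces FILE only
# if awk succeeded, so the input is never read and overwritten at once.
truncate_in_place() {
    file=$1
    width=$2
    tmp=$(mktemp) || return 1
    if awk -v w="$width" '{ print substr($0, 1, w) }' "$file" > "$tmp"; then
        mv "$tmp" "$file"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

It would be called as truncate_in_place infile 80. Note that the replaced file takes on the permissions of the temporary file, so restore them with chmod if that matters.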

score 1 · Answer 4 · answered May 30 '18 at 17:33

With sed:

sed 's/^\(.\{80\}\).*$/\1/' file

With cut:

cut -c -80 file

DopeGhoti


jubilatious1 · Answer 5 · 2020-09-25T03:16:38.793

Using Raku (née Perl6)

~$ raku -ne 'put ~$0 if m/ ^^(. ** 80) /;'

OUTPUT:

the of and to in a is that for it as was with be by on not he i this are or his
the of and to in a is that for it as was with be by on not he i this are or his
the of and to in a is that for it as was with be by on not he i this are or his
the of and to in a is that for it as was with be by on not he i this are or his
[TRUNCATED]

The code above returns the first 80 characters of each line (the ^^ zero-width assertion means "start of line"). If a line is shorter than 80 characters, nothing is returned for it. To return up to 80 characters, use the quantifier ** 1..80 instead.

Capture numbering starts at $0. To see how many characters were returned, add .chars to the ~$0 capture variable:

~$ raku -ne 'put ~$0.chars if m/ ^^(. ** 80) /;' ~/top50.txt
80
80
80
80
[TRUNCATED]

HTH.

https://raku.org
