191

I have multiple files that contain ascii text information in the first 5-10 lines, followed by well-tabulated matrix information. In a shell script, I want to remove these first few lines of text so that I can use the pure matrix information in another program. How can I use bash shell commands to do this?

If it's any help, I'm using Red Hat and Ubuntu Linux systems.

Paul
  • 9,423

7 Answers

284

You can use sed, tail, or awk. (The in-place sed -i variant is only safe as long as the file is not a symlink or hard link.) Examples below.

$ cat t.txt
12
34
56
78
90

sed

$ sed -e '1,3d' < t.txt
78
90

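You can also have sed edit the file in place, so you don't have to manage a temp file yourself: sed -i -e 1,3d yourfile. This won't echo anything; it just modifies the file. Note that sed still writes a temporary copy behind the scenes and renames it over the original. If you don't need to pipe the result to another command, this is easier.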

tail

$ tail -n +4 t.txt
78
90

awk

$ awk 'NR > 3 { print }' < t.txt
78
90
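In the examples above the number of removed lines is 3: sed's 1,3d deletes lines 1 through 3, tail's +4 starts output at line 4 (that is, n + 1), and awk's NR > 3 prints only lines numbered above 3. Here is a minimal sketch of the same three commands with the count held in a shell variable. The function names and the variable n are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers: each prints FILE without its first N lines.
# Usage: skip_sed N FILE   (likewise skip_tail and skip_awk)

skip_sed() {
    # "1,Nd" deletes lines 1 through N; double quotes let the shell expand $1.
    sed -e "1,${1}d" "$2"
}

skip_tail() {
    # "+K" means "start output at line K", so K must be N + 1.
    tail -n "+$(($1 + 1))" "$2"
}

skip_awk() {
    # NR is the current line number; print only lines numbered above n.
    awk -v n="$1" 'NR > n' "$2"
}

# Demo on the same five-line sample as above.
demo=$(mktemp)
printf '12\n34\n56\n78\n90\n' > "$demo"
n=3
skip_sed "$n" "$demo"
skip_tail "$n" "$demo"
skip_awk "$n" "$demo"
rm -f "$demo"
```

The sketch assumes n is at least 1. With n=0 the tail and awk forms print the whole file, but the sed address 1,0 still selects line 1, so the sed form would delete one line.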
  • 16
    You can also use sed in-place without a temp file: sed -i -e 1,3d yourfile. This won't echo anything, it will just modify the file in-place. If you don't need to pipe the result to another command, this is easier. – Yanick Girouard May 02 '12 at 23:46
  • As long as the file is not a symlink or hardlink. – jw013 May 03 '12 at 00:26
  • @jw013 What is the symlink limitation about? Sed? Awk? Tail? All of them? –  Jan 07 '14 at 22:19
  • 2
    @Svetlana sed -i specifically. Most implementations just delete the file and replace it with a new one, which doesn't work for links since you end up leaving the original at its other location. – jw013 Jan 07 '14 at 22:21
  • 14
    how about explaining what '1,3d', +4, et.c. means? The question was for n lines, but you didn't tell what n is (as apparently n is 2 in your examples, though it's not obvious for a noob what to change in order to change n) – Robin Manoli Feb 01 '15 at 10:29
  • @Ignacio Vazquez-Abrams what would the command look like if I need to remove a variable number of lines. e.g. sed -i -e "1,${n}d" yourfile. what is the correct syntax to remove n number of lines – Syed Moez Aug 24 '15 at 23:15
  • 3
    This uses a temp file so not very useful for a 100% util disk space. Would be interesting to have a solution that does this literally "in-place". – Shai Sep 02 '16 at 21:09
  • 1
    @Shai yes sed doesn't do it in-place at all, it creates a temp file. – jfa Nov 26 '16 at 18:24
  • You can also do a sed -e '1,3d' t.txt – Philippe Delteil Sep 08 '17 at 14:39
  • Don't try this on 'live' (log) files, as it will replace the actual file with a truncated file. – Makesh Nov 11 '19 at 09:37
  • For the tail solution, if you're trying to get the last 2 lines you can do tail -n -2 t.txt. – Joshua Pinter Jul 08 '21 at 15:01
54

sed -i '1,3d' file.txt

This deletes the first 3 lines from file.txt.
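As the comments on the answer above point out, sed -i usually writes a new file and renames it over the original, which detaches hard links. One alternative, sketched here with a hypothetical helper named strip_head, filters into a temporary file and then copies the result back with cat, so the original inode (and any hard link to it) is kept. It still needs temporary space for the filtered copy, and the file is briefly truncated while it is rewritten, so it is probably not suitable for files that another process is reading.

```shell
#!/bin/sh
# strip_head N FILE: remove the first N lines of FILE, keeping its inode.
# The function name and argument order are illustrative assumptions.
strip_head() {
    n=$1
    file=$2
    tmp=$(mktemp) || return 1
    # Write everything from line N+1 onward to the temp file, then overwrite
    # the original's contents (not its directory entry) with the result.
    tail -n "+$((n + 1))" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo: a hard link to the file sees the change.
dir=$(mktemp -d)
printf '12\n34\n56\n78\n90\n' > "$dir/t.txt"
ln "$dir/t.txt" "$dir/link.txt"     # second name for the same inode
strip_head 3 "$dir/t.txt"
cat "$dir/link.txt"
rm -rf "$dir"
```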

alhelal
  • 1,301
  • 2
    I need to remove the 50 first lines from a 10GB+ text file. Even if it's supposed to work "in-place", this command still takes a few minutes. Is there any really fast alternative ? – Sébastien Oct 25 '19 at 08:09
  • 1
    @Sébastien if you only have to remove the 50 first lines, open the file in a text editor, select the 50 first lines and delete them maybe? It took 17 seconds to remove the 8 473 386 first lines of a 7GB+ text files with this command, I have to admit I find it quite fast. – smonff Apr 06 '20 at 09:13
  • 1
    If you have a 10GB log file and only ~1GB left of space, this solution won't work as creates a tmp file – bk201 Oct 06 '21 at 22:00
9

If the tabulated lines are the ones that have a tab character:

grep '␉' <input_file >output_file

(␉ being a literal tab character) or equivalently

sed -n '/␉/p' <input_file >output_file

In a bash/ksh/zsh script, you can write $'\t' for a tab, e.g. grep $'\t' or sed -n $'/\t/p'.

If you want to eliminate 10 lines at the beginning of the file:

tail -n +11 <input_file >output_file

(note that it's +11 to eliminate 10 lines, because +11 means “start from line 11” and tail numbers lines from 1) or

sed '1,10d' <input_file >output_file

On Linux, you can take advantage of GNU sed's -i option to modify files in place:

sed -i -n '/\t/p' *.txt

Or you can use a shell loop and temporary files:

for x in *.txt; do
  tail -n +11 <"$x" >"$x.tmp"
  mv "$x.tmp" "$x"
done

Or if you don't want to modify the files in place, but instead give them a different name:

for x in *.txt; do
  tail -n +11 <"$x" >"${x%.txt}.data"
done
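If the header length varies from file to file (the question mentions 5-10 lines), a fixed count may not fit every file. The sketch below is one possible approach, under the assumption that every matrix row consists only of numeric fields and that no header line does. It skips lines until the first all-numeric row and prints everything from there on. The function name matrix_only is hypothetical.

```shell
#!/bin/sh
# matrix_only FILE: print FILE starting at the first line whose fields are
# all numeric. Assumes header lines contain at least one non-numeric field.
matrix_only() {
    awk '
        # Once the first numeric row has been seen, print every later line.
        started { print; next }
        NF {
            for (i = 1; i <= NF; i++)
                if ($i != $i + 0) next   # non-numeric field: still in the header
            started = 1
            print
        }
    ' "$1"
}

# Demo with a three-line header followed by a 2x3 matrix.
sample=$(mktemp)
printf 'Title: sample\nrows 2\n\n1\t2.5\t-3\n4\t5\t6\n' > "$sample"
matrix_only "$sample"
rm -f "$sample"
```

A header line that happens to be purely numeric, such as a line holding only the matrix dimensions, would be mistaken for the first row, so the fixed-count commands above remain the safer choice when the header length is known.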
6

You can use Vim in Ex mode:

ex -s -c '1d5|x' file
  1. 1 move to first line

  2. 5 select 5 lines

  3. d delete

  4. x save and close

Zombo
  • 1
  • 5
  • 44
  • 63
1

By percentage

Using bash, to clean up a file using a percentage number instead of an absolute number of lines:

sed -i -e "1,$(( $(wc -l < php_errors.log) * 75 / 100 ))d" php_errors.log

Be careful: this command is destructive. It deletes content in place without keeping a copy.

It deletes the first 75% of the lines in the named file.
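Because the one-liner keeps no copy, a more cautious variant may be worth having. The sketch below is not the command above: it swaps sed -i for cp and tail so that the original survives as FILE.bak, and it does nothing when the computed line count is zero. The function name drop_first_percent and the .bak suffix are assumptions made for illustration.

```shell
#!/bin/sh
# drop_first_percent FILE PCT: remove the first PCT percent of FILE's lines,
# keeping the untouched original as FILE.bak. Names are illustrative.
drop_first_percent() {
    file=$1
    pct=$2
    total=$(wc -l < "$file")
    n=$((total * pct / 100))        # integer division rounds down
    [ "$n" -gt 0 ] || return 0      # nothing to delete
    cp -p "$file" "$file.bak" || return 1
    # Rewrite the file from the backup, starting at line n + 1.
    tail -n "+$((n + 1))" "$file.bak" > "$file"
}

# Demo: 75% of 8 lines is 6, so lines 7 and 8 remain.
dir=$(mktemp -d)
printf '1\n2\n3\n4\n5\n6\n7\n8\n' > "$dir/php_errors.log"
drop_first_percent "$dir/php_errors.log" 75
cat "$dir/php_errors.log"
rm -rf "$dir"
```

The backup doubles the disk space needed, so this is not a fit for the nearly-full-disk cases raised in the comments above.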

Rui F Ribeiro
  • 56,709
  • 26
  • 150
  • 232
pgr
  • 151
  • 3
0
# deletes the first line (printf is used because bash's echo does not expand \n by default)
printf 'a\nb\n' | sed '1d'

# reads list.txt and writes list.csv without the first line
sed '1d' list.txt > list.csv

Other useful commands:

# finds lines that start with a pipe (|)
grep '^|'

# deletes all pipes
sed 's/|//g'

# deletes all spaces
sed 's/ //g'
smonff
  • 189
  • 8
-1

I created a small script that deletes everything from /var/log/messages except the last 4 lines.

# cat remove-range-of-lines.sh

#!/usr/bin/bash

# count the total number of lines
line_count=$(awk 'END{print NR}' /var/log/messages)

# number of lines to remove (everything except the last 4)
remove_line=$((line_count - 4))

# remove everything except the last 4 lines (skipped if the file has 4 lines or fewer)
[ "$remove_line" -gt 0 ] && sed -i "1,${remove_line}d" /var/log/messages

Using this script, we can always keep the latest entries in /var/log/messages, as many as we need.

If you want to keep the last 10000 entries in /var/log/messages, the modified script is below.

#!/usr/bin/bash

# count the total number of lines
line_count=$(awk 'END{print NR}' /var/log/messages)

# number of lines to remove (everything except the last 10000)
remove_line=$((line_count - 10000))

# remove everything except the last 10000 lines (skipped if the file has 10000 lines or fewer)
[ "$remove_line" -gt 0 ] && sed -i "1,${remove_line}d" /var/log/messages

  • 3
    This is not an answer to this question. The issue is to remove the first n lines, not to keep the last n lines. – Kusalananda Aug 12 '21 at 16:29