
I want to find the last line of text in a file and delete the comma at the end of it. I asked about this already, but after I got an answer I realized my question was not specific enough.

This sed command will go to the last line of a file and take action on it. In my case, I want to remove the trailing comma:

sed -i '$ s/",/"/g' file.txt

So this:

blah blah blah,
blah blah blah,
blah blah blah,

... becomes this:

blah blah blah,
blah blah blah,
blah blah blah

However, this won't work if there are blank lines after the last line of text in the file.

I've been searching for ways to get the last line of text but haven't come up with anything that I can understand and apply. I've also looked for ways to remove all trailing blank lines, and found this command:

sed -e :a -e '/^\n*$/{$d;N;ba' -e '}' *.txt

But it doesn't work for me (it just seems to output the contents of my files on the command line). In any case, it's inelegant. I'd rather not delete the trailing blank lines; it would be much better to identify the last line with text in it and act on that.

How do I remove the comma from the last line of text in multiple files in a directory?

Questioner

4 Answers


On large files, use Guru's answer; it's faster. On small files (fewer than 25 lines), however, I found this to be slightly faster (assuming you have GNU tac):

tac file | awk '!/^[[:blank:]]*$/{i++;if(i==1){sub(",$","")}}1' | tac
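
The pipeline above writes to standard output, while the question asks for changes across several files. A minimal sketch of one way to wrap it, assuming GNU tac and mktemp are available; the function name strip_last_comma is hypothetical and not part of the original answer:

```shell
#!/bin/sh
# Hypothetical helper: remove the trailing comma from the last non-blank
# line of one file, in place, using a tac | awk | tac pipeline.
strip_last_comma() {
    file=$1
    tmp=$(mktemp) || return 1
    # Reverse the file, edit only the first non-blank line seen, reverse back.
    if tac -- "$file" |
        awk '!/^[[:blank:]]*$/ && !seen { sub(/,$/, ""); seen = 1 } 1' |
        tac > "$tmp"
    then
        # Copy the contents back instead of using mv, so the original
        # file keeps its permissions and ownership.
        cat -- "$tmp" > "$file"
    fi
    rm -f -- "$tmp"
}

# Usage (run in the directory holding the files):
#   for f in *.txt; do strip_last_comma "$f"; done
```

Note that plain sh only checks the exit status of the final tac in the pipeline, so it is worth trying this on copies of the files first.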
Chris Down

sed can be put into multi-line search & replace mode so that it can handle the entire file contents as a single line in "pattern space".

(Note: FreeBSD sed requires the option sequence -n -i -e for correct in-place file editing.)

# delete the last , in the last non-empty line of a file
# cf. http://austinmatzko.com/2008/04/26/sed-multi-line-search-and-replace/

cat -n testfile

sed -n -i -e '
# if the first line copy the pattern to the hold buffer
1h
# if not the first line then append the pattern to the hold buffer
1!H
# if the last line then ...
$ {
# copy from the hold to the pattern buffer
g
# remove last , in last non-empty line of file 
#s/,\([^,[:cntrl:]]*\n*\)$/\1/
s/,\([^,]*\)$/\1/
p
}' testfile
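
One caveat: the active substitution in the script above removes the last comma anywhere in the file, even when other text follows it on the same line. A minimal sketch of a stricter variant, assuming GNU sed; the function name strip_last_comma_sed is hypothetical. It joins all lines into the pattern space and only removes a comma that is followed by nothing but newlines:

```shell
#!/bin/sh
# Hypothetical helper; assumes GNU sed. Extra arguments (for example -i and
# file names) are passed straight through to sed.
strip_last_comma_sed() {
    # :a ... N ... ba  appends every following line to the pattern space.
    # The final substitution drops a comma only when it is followed by
    # zero or more newlines and then the end of the pattern space.
    sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/,\(\n*\)$/\1/' "$@"
}

# Usage:
#   strip_last_comma_sed testfile        # print the result
#   strip_last_comma_sed -i *.txt        # GNU sed: edit each file in place
```

With GNU sed, -i treats each file separately, so the $ address refers to the last line of each file rather than of the whole input.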
marcos

If a Perl solution is ok for you:

 perl -00pe 's/(.*),/$1/s' file

To save the changes in the file itself:

perl -i -00pe 's/(.*),/$1/s' file

To apply this on multiple files:

perl -i -00pe 's/(.*),/$1/s' *.txt
Guru
  • Does this work on multiple files? Like perl -00pe 's/(.*),/$1/s' *.txt? I've been trying it, but all it seems to do is output the contents of the files to the terminal screen without making any changes. Sorry... I know nothing about perl. – Questioner Feb 21 '13 at 06:25
  • @DaveMG : updated the solution for your requirement... – Guru Feb 21 '13 at 06:39
  • So close... the command has removed the comma from the end of every line. Sorry if it's not clear in the question. I just want to remove the comma from only the last non-blank line. – Questioner Feb 21 '13 at 06:44

Answer

perl -0777 -p -i -e 's/,(\n*)\Z/\1/m' *.txt

removes the final ',' from every file whose name ends in .txt, provided that comma is followed only by zero or more newline characters and then the end of the file.

From your example:

reedm@www:~/tmp $ cat > test.txt
blah blah blah,
blah blah blah,
blah blah blah,


reedm@www:~/tmp $ perl -0777 -p -i -e 's/,(\n*)\Z/\1/m' *.txt
reedm@www:~/tmp $ cat test.txt
blah blah blah,
blah blah blah,
blah blah blah


reedm@www:~/tmp $ 

Wat?

Perl is an esoteric beast at the best of times, and perl one-liners can be particularly cryptic.

The -e flag allows us to pass a perl program on the command line. In this case, the 's/regex/replace/flags' is the program.

The -p flag causes perl to apply your supplied program in a loop over each "line" (see -0) for each filename provided.

The -i flag causes perl to replace the file with the output of the program, rather than printing the output to standard out.

The -0 flag changes what delimiter perl uses to break a file into "lines". 0777 is a special value, used by convention to make perl read the entire file into a single "line".

The regular expression is somewhat complicated by the use of a few perl-specific tricks:

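  • First, the m flag at the end makes ^ and $ match at the start and end of every line instead of only at the ends of the whole string. This pattern uses neither anchor, so the flag has no effect here and could be dropped.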
  • , is simple, and matches a single, literal comma.
  • (\n*) matches 0-or-more newlines in a row, and stores them as a subpattern (the ( and ) characters denote a subpattern). As this is the first subpattern, we can use \1 in the replacement section to mean "whatever this subpattern matched".
  • \Z is a Perl extension to classic regular expressions. It matches at the end of the string being worked with, or just before a single newline at the very end of it. In this case, the string is the entire file.
  • In the replacement part, we use \1 to replace the match with only the series of newlines, removing the comma.

For more information on perl regular expressions and perl command line flags, check out the man pages for perlre and perlrun respectively.

  • For anyone who needs to modify the above perl one-liner to append text to the end of the last non-empty line instead (in this case, appending an "x" after the trailing comma in the last line), try perl -0777 -p -i -e 's/,(\n*)\Z/,x\1/m' – Digger Mar 17 '16 at 04:21