I want to create a script that deletes the Nth word from the standard input, for a given N. For example, for this input:

One two three four, five
six seven eight, nine

If we ask to delete the 8th word, it should delete "eight," and produce the output below. For my purposes, a word is any sequence of non-space characters.

One two three four, five
six seven  nine

Is there some clever one-liner that can accomplish this using standard command line utilities? Currently I have a fairly long script to do this, but it feels like overkill.

hugomg

5 Answers

With perl:

$ perl -pe 's/\S+/++$c == 8 ? "" : $&/ge' <your-file
One two three four, five
six seven  nine
$ perl -pse 's/\S+/--$n ? $& : ""/ge' -- -n=8 <your-file
One two three four, five
six seven  nine

Or, optimising a bit by not performing the substitutions once the nth word has been found:

perl -pse 's/\S+/--$n ? $& : ""/ge if $n > 0' -- -n=8 <your-file

Using any awk and without reading all of the input into memory at once:

$ awk -v t=8 '{p=n; n+=NF} (n>=t) && !f++{$(t-p)=""} 1' file
One two three four, five
six seven  nine
Ed Morton
    Would be worth noting that it may affect the spacing in the line with the word to remove as sequences of one or more whitespace characters are turned into one space and the leading and trailing whitespace are removed. – Stéphane Chazelas Sep 02 '22 at 18:20
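
The spacing effect described in the comment above can be reproduced on a small made-up input. This is only a minimal sketch, assuming any POSIX awk; the sample line and the function name blank_field2 are hypothetical and chosen purely for illustration:

```shell
# Assigning to any field makes awk rebuild $0 from its fields joined by OFS
# (a single space by default), so the original spacing of that line is lost.
blank_field2() {
    awk '{ $2 = "" } 1'
}

# Input has leading blanks and runs of several spaces between words.
printf '  a   b    c\n' | blank_field2
```

The line comes back as "a  c": the leading blanks are gone and every run of spaces has become one, with two adjacent separators left where the second field used to be. Lines whose fields are never assigned keep their spacing.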

Using GNU sed

$ sed -Ez 's/(([^ \n]*( |\n)){7})[^ \n]*/\1/' input_file
One two three four, five
six seven  nine
sseLtaH
    Hmm... what does this do that the naive sed -z 's/[^[:space:]]\{1,\}//8' doesn't? – steeldriver Sep 02 '22 at 17:01
    Since -z for reading the whole file into memory requires GNU sed you could also take advantage of other GNU-isms and abbreviate that to sed -Ez 's/\S+//8' – Ed Morton Sep 02 '22 at 17:41

A simple solution with awk:

awk 'n>0 && n<=NF {$n=""} {n-=NF} 1' n=8 infile

If you also need to squeeze the whitespace (removing a field leaves two consecutive separators where it used to be):

$ awk 'n>0 && n<=NF {$n="";gsub(/[ \t]+/, " ")} {n-=NF} 1' n=8 infile

One two three four, five
six seven nine

This solution uses GNU sed's -z feature, which reads the whole file as a "single line":

sed -Ez 's/\S+//8'

Alternatively, to also remove the spaces following that word:

sed -Ez 's/\S+ *//8'

Credit goes to Ed Morton, who posted this as a comment to another answer.
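
To compare the two variants side by side, here is a small sketch run against the sample input from the question. It assumes GNU sed (the -z option and \S are GNU extensions and may not exist in other sed implementations); the variable name sample is just for illustration:

```shell
sample='One two three four, five
six seven eight, nine'

# Removes only the 8th word, leaving both neighbouring spaces in place.
printf '%s\n' "$sample" | sed -Ez 's/\S+//8'

# Removes the 8th word together with the spaces that follow it.
printf '%s\n' "$sample" | sed -Ez 's/\S+ *//8'
```

The first command leaves two spaces between "seven" and "nine"; the second leaves one. Note that " *" only matches spaces, so if the deleted word were the last on its line, the newline after it would be kept.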

hugomg