108

I need some help to figure out how to use the sed command to only show the first column and last column in a text file. Here is what I have so far for column 1:

cat logfile | sed 's/\|/ /'|awk '{print $1}'

My feeble attempt at getting the last column to show as well was:

cat logfile | sed 's/\|/ /'|awk '{print $1}{print $8}'

However, this takes the first column and the last column and merges them together in one list. Is there a way to print the first and last columns side by side with sed and awk commands?

Sample input:

foo|dog|cat|mouse|lion|ox|tiger|bar
kenorb
  • 20,988
user70573
  • 1,319

6 Answers

145

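Almost there. Put both column references in a single print statement, separated by a comma. Note also that sed needs the g flag to replace every | on the line; without it, only the first | becomes a space and $8 comes out empty.

cat logfile | sed 's/|/ /g' | awk '{print $1, $8}'

Also note that you don't need cat here.

sed 's/|/ /g' logfile | awk '{print $1, $8}'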

Also note you can tell awk that the column separator is |, instead of blanks, so you don't need sed either.

awk -F '|' '{print $1, $8}' logfile

As per suggestions by Caleb, if you want a solution that still outputs the last field, even if there are not exactly eight, you can use $NF.

awk -F '|' '{print $1, $NF}' logfile
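As a minimal sketch of how $NF behaves when lines have different numbers of fields (the function name first_last and the sample lines other than the first are made up for illustration):

```shell
# first_last: hypothetical helper that prints the first and last
# |-separated field of every line read from stdin or from the given files.
first_last() {
    awk -F '|' '{print $1, $NF}' "$@"
}

printf '%s\n' 'foo|dog|cat|mouse|lion|ox|tiger|bar' 'a|b|c' 'solo' | first_last
# foo bar
# a c
# solo solo
```

The last line shows the caveat with single-field input: $1 and $NF are then the same field, so it is printed twice.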

Also, if you want the output to retain the | separators, instead of using a space, you can specify the output field separator. Unfortunately, it's a bit more clumsy than just using the -F flag, but here are three approaches.

  • You can assign the input and output field separators in awk itself, in the BEGIN block.

    awk 'BEGIN {FS = OFS = "|"} {print $1, $8}' logfile
    
  • You can assign these variables when calling awk from the command line, via the -v flag.

    awk -v 'FS=|' -v 'OFS=|' '{print $1, $8}' logfile
    
  • or simply:

    awk -F '|' '{print $1 "|" $8}' logfile
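The separator assignment can also be combined with $NF, so that the | is kept in the output and the last field is found regardless of the field count. A small sketch, where the function name first_last_piped is hypothetical:

```shell
# first_last_piped: hypothetical helper; sets both the input (FS) and
# output (OFS) field separator to |, then prints the first and last field.
first_last_piped() {
    awk 'BEGIN {FS = OFS = "|"} {print $1, $NF}' "$@"
}

printf '%s\n' 'foo|dog|cat|mouse|lion|ox|tiger|bar' 'a|b|c' | first_last_piped
# foo|bar
# a|c
```

The comma in the print statement is what inserts OFS between the two fields; writing the fields next to each other without a comma would concatenate them with nothing in between.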
    
Sparhawk
  • 19,941
  • 6
    Good job breaking down how this problem can be simplified. You might add a note about how to use | as an output separator instead of the default space for string concatenation. Also you could explain to use $NF instead of hard coding $8 to get the last column. – Caleb Jun 13 '14 at 07:29
  • after that how to update the file? – pankaj prasad Aug 26 '20 at 06:22
  • @pankajprasad Write to a new file with > then overwrite the old one, or use sponge. This is really a new question though. – Sparhawk Aug 26 '20 at 06:26
  • @Sparhawk it works, but remaining content is erased. how to deal with that? – pankaj prasad Aug 26 '20 at 07:21
  • @pankajprasad You need to ask a new question. Click the big blue button up the top that says "Ask Question". – Sparhawk Aug 26 '20 at 09:43
31

You are using awk anyway:

awk '{ print $1, $NF }' file
jasonwryan
  • 73,126
  • 4
    Wouldn't you need to specify the input field separator (since in this case it seems to be | rather than space) with -F\| or similar? Also what if he wanted to use the same delimiter for output? – Caleb Jun 13 '14 at 07:22
  • 1
    @Caleb Probably: I was waiting for the OP to confirm what exactly the input looked like, rather than trying to guess based on the non-working examples... – jasonwryan Jun 13 '14 at 07:28
  • 1
    Note that that assumes the input contains at least 2 fields. – Stéphane Chazelas Jun 13 '14 at 07:56
  • @StéphaneChazelas OP clearly stated in code that it has eight fields, always. – michaelb958--GoFundMonica Jun 13 '14 at 07:58
  • 3
    @michaelb958 I think "clearly" is overstating the case, just a little :) – jasonwryan Jun 13 '14 at 07:59
  • @michaelb958, though I'd agree that will probably address the OP's specific requirements, I think it's still worth mentioning for anyone coming here wanting to retain the first and last field on the input. Leaving it as a comment (as I did) is probably enough. – Stéphane Chazelas Jun 13 '14 at 08:30
19

Just replace from the first to last | with a | (or space if you prefer):

sed 's/|.*|/|/'

Note that although | is not special in any sed implementation (as long as extended regular expressions are not enabled via -E or -r), \| itself is special in some, such as GNU sed, where it is the alternation operator. So you should not escape | if you intend it to match the | character.
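To illustrate the difference, here is a small sketch. The function name pipes_to_spaces is made up, and the escaped variant is shown without an expected result because its behaviour depends on the sed implementation:

```shell
line='foo|dog|cat|mouse|lion|ox|tiger|bar'

# pipes_to_spaces: hypothetical helper. In a basic regular expression an
# unescaped | is an ordinary character, so this replaces every literal |.
pipes_to_spaces() {
    sed 's/|/ /g' "$@"
}

printf '%s\n' "$line" | pipes_to_spaces
# foo dog cat mouse lion ox tiger bar

# The escaped form from the question. With GNU sed, \| is alternation, so
# this does not target the literal | character; other implementations may
# treat it differently.
printf '%s\n' "$line" | sed 's/\|/ /'
```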

If you are replacing with a space and the input may contain lines with only one |, you'll have to treat those specially, as |.*| won't match on them. That could be:

sed 's/|\(.*|\)\{0,1\}/ /'

(that is, make the .*| part optional), or:

sed 's/|.*|/ /;s/|/ /'

or:

sed 's/\([^|]*\).*|/\1 /'
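A quick way to check that the three variants agree is to run each of them over a line with a single | and a line with several. This is only a sketch; the function name try_variants and the sample lines are made up:

```shell
# try_variants: hypothetical helper that feeds the same two lines to each of
# the three sed expressions above, printing the results one after another.
try_variants() {
    for expr in 's/|\(.*|\)\{0,1\}/ /' 's/|.*|/ /;s/|/ /' 's/\([^|]*\).*|/\1 /'; do
        printf '%s\n' 'a|b' 'a|b|c|d' | sed "$expr"
    done
}

try_variants
# a b
# a d
# a b
# a d
# a b
# a d
```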

If you want the first and eighth fields regardless of the number of fields in the input, then it's just:

cut -d'|' -f1,8


(all those would work with any POSIX compliant utility assuming the input forms valid text (in particular, the sed ones will generally not work if the input has bytes or sequences of bytes that don't form valid characters in the current locale like for instance printf 'unix|St\351phane|Chazelas\n' | sed 's/|.*|/|/' in a UTF-8 locale)).
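For comparison, a short sketch of what the cut variant produces (the function name first_and_eighth and the second sample line are made up). Unlike the awk examples, cut keeps the input delimiter in its output, and it simply leaves out field 8 on lines that have fewer fields:

```shell
# first_and_eighth: hypothetical helper that selects fields 1 and 8 of
# |-delimited input, reading stdin or the given files.
first_and_eighth() {
    cut -d'|' -f1,8 "$@"
}

printf '%s\n' 'foo|dog|cat|mouse|lion|ox|tiger|bar' 'a|b|c' | first_and_eighth
# foo|bar
# a
```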

11

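If you find yourself awk- and sed-less, you can achieve the same thing with coreutils (plus rev, which ships with util-linux) in a shell that supports process substitution, such as bash: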

paste <(           cut -d'|' -f1  file) \
      <(rev file | cut -d'|' -f1 | rev)
Thor
  • 17,182
3

It seems like you are trying to get the first and last fields of text that is delimited by |.

I assume your log file contains text like the following:

foo|dog|cat|mouse|lion|ox|tiger|bar
bar|dog|cat|mouse|lion|ox|tiger|foo

And you want output like this:

foo bar
bar foo

If so, here is a command for you.

With GNU sed:

sed -r 's~^([^|]*).*\|(.*)$~\1 \2~' file

Example:

$ echo 'foo|dog|cat|mouse|lion|ox|tiger|bar' | sed -r 's~^([^|]*).*\|(.*)$~\1 \2~'
foo bar
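The same substitution can be sketched with -E instead of -r, assuming a sed that accepts -E (recent GNU sed and BSD sed do). The function name first_last_ere is made up. The first group captures everything before the first |, the greedy .* then runs up to the last |, and the second group captures what follows it:

```shell
# first_last_ere: hypothetical helper using extended regular expressions.
# \1 = text before the first |, \2 = text after the last |.
first_last_ere() {
    sed -E 's/^([^|]*).*\|(.*)$/\1 \2/' "$@"
}

printf '%s\n' 'foo|dog|cat|mouse|lion|ox|tiger|bar' 'solo' | first_last_ere
# foo bar
# solo
```

A line without any | does not match the pattern and is printed unchanged.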
Avinash Raj
  • 3,703
  • The columns are not delimited by a pipe | but they are in columns, I am interested in using sed but not using the awk command like you did in your command: sed -r 's~^([^|]*).*\|(.*)$~\1 \2~' file – user70573 Jun 16 '14 at 00:46
  • "The columns are not delimited by a pipe | but they are in columns", you mean columns are separated by spaces? – Avinash Raj Jun 16 '14 at 00:50
  • A sample input and an output would be better. – Avinash Raj Jun 16 '14 at 00:51
1

You should probably do it with sed - I would anyway - but, just because no one has written this one yet:

while IFS=\| read col1 cols
do  printf %10s%-s\\n "$col1 |" " ${cols##*|}"
done <<\INPUT
foo|dog|cat|mouse|lion|ox|tiger|bar
INPUT

OUTPUT

     foo | bar
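The same idea can be wrapped in a function that reads from stdin and prints plain space-separated output. This is a sketch; the function name first_last_sh is made up. The read builtin puts the first field in one variable and the rest of the line in another, and the ${rest##*|} expansion strips everything up to the last |:

```shell
# first_last_sh: hypothetical helper written in plain shell, no sed or awk.
first_last_sh() {
    while IFS='|' read -r first rest; do
        # ${rest##*|} removes the longest prefix ending in |,
        # leaving only the last field.
        printf '%s %s\n' "$first" "${rest##*|}"
    done
}

printf '%s\n' 'foo|dog|cat|mouse|lion|ox|tiger|bar' | first_last_sh
# foo bar
```

The -r option stops read from interpreting backslashes in the input, which is usually what you want for log lines.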
mikeserv
  • 58,310