
I want to remove lines that duplicate the previous line, using the command line in UNIX. I have the following data in a file:

 <xref id="gi_525506931_ref_NP_001266519.1__brain_aromatase"/>
                    <entry ac="IPR002401" desc="Cytochrome P450, E-class, group I" name="Cyt_P450_E_grp-I" type="FAMILY">
                    <entry ac="IPR001128" desc="Cytochrome P450" name="Cyt_P450" type="FAMILY">
                    <entry ac="IPR001128" desc="Cytochrome P450" name="Cyt_P450" type="FAMILY">
                    <entry ac="IPR001128" desc="Cytochrome P450" name="Cyt_P450" type="FAMILY">

So, it should look like this:

 <xref id="gi_525506931_ref_NP_001266519.1__brain_aromatase"/>
                    <entry ac="IPR002401" desc="Cytochrome P450, E-class, group I" name="Cyt_P450_E_grp-I" type="FAMILY">
                    <entry ac="IPR001128" desc="Cytochrome P450" name="Cyt_P450" type="FAMILY">
  • Welcome to U&L. We are (often) a pretty forgiving bunch; if you make mistakes, we'll tell you or correct them. You can review the changes that I made to your post, just click on the link above my (or any follow-up editor's) avatar. – Anthon Nov 05 '14 at 05:12
  • Hi Anthon, I am not able to accept two answers, although cuonglm's answer also gave me the required result. Also, I am not able to upvote the answers. – Prakki Rama Nov 05 '14 at 06:57
  • Upvoting comes with 15 reputations, wait for it ..., there you go, should have it – Anthon Nov 05 '14 at 06:58

2 Answers


If you can guarantee that the identical lines will be consecutive, you can use

uniq your_file

and cuonglm's answer will work even if they're not.
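
To see why the "consecutive" condition matters, here is a small sketch. The four-line input is made up for illustration and is not the asker's data: "b" repeats on adjacent lines, while "a" comes back later.

```shell
# Hypothetical input: "b" is repeated on adjacent lines, "a" reappears at the end.
uniq_out=$(printf 'a\nb\nb\na\n' | uniq)

# uniq only collapses adjacent duplicates, so the second "a" is kept.
printf '%s\n' "$uniq_out"
```

This should print a, b, a on separate lines. In the asker's sample the duplicate entries are adjacent, so plain uniq is enough there.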

Joseph R.

With awk:

awk '!a[$0]++' file

You can read an explanation of this idiom here.
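
As a rough sketch of what the one-liner does: a[$0] counts how many times the current line has been seen, !a[$0] is true only while that count is still zero, and the ++ bumps the count afterwards. A true pattern with no action makes awk print the line. The input below and the array name seen are made up for illustration.

```shell
# Hypothetical input with non-adjacent duplicates.
input=$(printf 'a\nb\na\nb\nc\n')

# The idiom from the answer: print a line only the first time it is seen.
awk_out=$(printf '%s\n' "$input" | awk '!a[$0]++')

# A longer spelling of the same logic, with an explicit test and print.
long_out=$(printf '%s\n' "$input" | awk '{ if (seen[$0] == 0) print; seen[$0]++ }')

printf '%s\n' "$awk_out"
```

Both forms should print a, b, c, keeping the first occurrence of each line in its original order. Note that awk holds every distinct line in memory, which may matter for very large files.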

cuonglm