
I have following content in a file.

$ cat file.txt
code-coverage-api 
jsch 
cloudbees-folder 
apache-httpcomponents-client-4-api 
apache-httpcomponents-client-4-api 
jsch 
apache-httpcomponents-client-4-api 
jsch 
apache-httpcomponents-client-4-api 
jackson2-api 
apache-httpcomponents-client-4-api 
workflow-api 
echarts-api 
workflow-api 
envinject-api 
workflow-durable-task-step 
apache-httpcomponents-client-4-api 

My expected output is:

code-coverage-api 
jsch 
cloudbees-folder 
apache-httpcomponents-client-4-api  
jackson2-api 
workflow-api 
echarts-api 
envinject-api 
workflow-durable-task-step 

At the moment, I sort the content as shown below and then remove the duplicate elements by hand, keeping one occurrence of each.

$ cat file.txt | sort

Is there a way to keep only one occurrence of each duplicated element and remove the remaining duplicates from the list? Also, keep in mind that some elements don't have any duplicates.

smc
2 Answers


You could add line numbers to the output with cat -n followed by a unique sort on the second field.
Then do a numeric sort on the first field to retain the original order and remove the line numbers with cut:

$ cat -n file.txt | sort -uk2,2 | sort -nk1,1 | cut -f2
code-coverage-api
jsch
cloudbees-folder
apache-httpcomponents-client-4-api
jackson2-api
workflow-api
echarts-api
envinject-api
workflow-durable-task-step
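
Alternatively, awk can do the same in a single pass, using an associative array to print each line only the first time it is seen (a minimal sketch; the array name seen is arbitrary):

$ awk '!seen[$0]++' file.txt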
Freddy

Try the following to get the unique elements of the file:

cat file.txt | sort | uniq
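
Equivalently, sort -u does the sorting and de-duplication in one step (either way, the output comes out sorted rather than in the original file order):

$ sort -u file.txt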

If you want to remove the duplicates from the file itself, do not redirect the output back to the same file: the shell truncates file.txt before cat reads it, leaving the file empty. Write to a temporary file and move it into place:

$ sort file.txt | uniq > file.txt.tmp && mv file.txt.tmp file.txt

or let sort write the output itself with -o, which is safe even when the output file is also the input:

$ sort -u -o file.txt file.txt

[NOTE: uniq only considers adjacent lines; that is why we must sort the input first.]
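
A quick demonstration of why the sort matters:

$ printf 'a\nb\na\n' | uniq
a
b
a
$ printf 'a\nb\na\n' | sort | uniq
a
b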