I have a large database of this format:
0500506 gi|561108321|ref|NC_018143.2| 71446 0
0500506 gi|561108321|ref|NC_018143.2| 71447 0
0500508 gi|561108321|ref|NC_018143.2| 71449 0
0500509 gi|561108321|ref|NC_018143.2| 71447 0
What I want to do is to create a new table with the number of occurrences of each of the third column, something like this
71446 1 0500506
71447 2 0500506,0500509
71449 1 0500508
Importantly, the whole lines won't match, so I want to make sure that I'm just matching the values in a single column. I'm assuming I can adapt one of the solutions from Count total number of occurrences using grep, but am not sure how.
Thanks very much, drea