I have a file with the following lines:

1 a
2 a
3 a
1 b
2 b
1 c
2 c
3 c
4 c
1 d

I want to get the result as:

a 1 2 3
b 1 2
c 1 2 3 4
d 1
Rui F Ribeiro
syed
  • 3
    Hello and welcome to SE. Have you tried anything on your own at all or are you just waiting for somebody to do your work? And why did you roll back all the edits on this badly formatted (and titled/tagged) question? – Panki Mar 18 '19 at 11:34
  • 1
    Please add more details to the question: Are the data lines unique, or are duplicate lines possible? If there is a duplicate line such as 2 a, what should the output look like? Are the input lines always sorted already (by the 2nd column, then by the 1st column)? In the output, should the numbers be sorted, or should they appear in the same order as in the input data? – Bodo Mar 18 '19 at 11:44
  • Yes, the same as mentioned in the question. – syed Mar 18 '19 at 11:48
  • 4
    Answering 4 specific questions with a simple 'Yes' isn't going to help anybody. – Panki Mar 18 '19 at 11:50
  • Syed. I'm looking back at your original question text. Is your input a single line containing 1 a 2 a 3 a 1 b 2 b 1 c 2 c 3 c 4 c 1 d? Or is it multiple lines 1 a, 2 a, etc.? – Chris Davies Mar 18 '19 at 11:57
  • It's multiple lines. I want the distinct values of column 2 (a, b, c, d), each followed by its column 1 values horizontally, i.e. a 1 2 3, then b 1 2, etc. – syed Mar 18 '19 at 12:01
  • OK. Please [edit] your question to include this requirement ("I want the distinct...") as well as clarifications that answer @Rui's four questions. – Chris Davies Mar 18 '19 at 12:57

3 Answers

2

Using awk:

awk '{ group[$2] = (group[$2] == "" ? $1 : group[$2] OFS $1 ) }
     END { for (group_name in group) print group_name, group[group_name] }' inputfile

This stores the groups in an array called group. This array is indexed on the group name (the second column in the input data) and for each line of input from inputfile, the value in the first column is appended to the correct group.

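The END block loops over all collected groups and outputs the group name and the entries of that group. Note that awk does not specify the order in which for (... in ...) visits the array indices, so the groups are not guaranteed to come out in the order they appeared in the input.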

This awk program with a nicer layout:

{
    group[$2] = (group[$2] == "" ? $1 : group[$2] OFS $1 )
}

END {
    for (group_name in group)
        print group_name, group[group_name]
}

Note that this is not what you'd want to do if you have massive amounts of data as the group array will actually store all input data read from the file.

For huge amounts of data, we assume that the input is sorted on the group names (the second column) and use

awk '$2 != group_name { if (group != "") print group_name, group; group = ""; group_name = $2 }
    { group = (group == "" ? $1 : group OFS $1) }
    END { if (group != "") print group_name, group }' inputfile

This keeps track of what the current group is, and collects the data for that group. Whenever the second column in the input switches to another value, it outputs the collected group data and starts collecting new data. This means that only the values of the current group are ever stored, rather than the whole input data set.

This last awk program with a nicer layout:

$2 != group_name {
    if (group != "")
        print group_name, group

    group = ""
    group_name = $2
}

{
    group = (group == "" ? $1 : group OFS $1)
}

END {
    # Output last group (only), if there was any data at all.
    if (group != "")
        print group_name, group
}
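If the input is not already sorted on the group name, it could be sorted before being passed to the same program. The following is a sketch under assumptions: the unsorted sample lines are made up for illustration, and the second sort key (-k1,1n) also orders the values numerically within each group, which may or may not be wanted.

```shell
# Hypothetical unsorted input, sorted on column 2 (then numerically on
# column 1) before it reaches the streaming awk program.
result=$(printf '%s\n' '2 b' '1 a' '3 a' '1 b' '2 a' |
    sort -k2,2 -k1,1n |
    awk '
        $2 != group_name {
            # Group changed: flush the previous group, if any.
            if (group != "") print group_name, group
            group = ""
            group_name = $2
        }
        { group = (group == "" ? $1 : group OFS $1) }
        END { if (group != "") print group_name, group }
    ')

printf '%s\n' "$result"
```

sort may need temporary disk space for very large files, but the awk program still only holds one group at a time.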
Kusalananda
0

Try this:

for i in $(awk '!a[$2]++ { print $2 }' file.txt)
do
        echo "$i $(awk -v z="$i" '$2 == z { print $1 }' file.txt | tr '\n' ' ')"
done
  • awk '!a[$2]++ { print $2 }' prints each distinct value of column 2 once, in order of first appearance.
  • $2 == z { print $1 } prints the first column of every line whose second column equals the variable z.
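The loop above leaves a trailing space on every output line, because tr also turns the final newline into a space. A possible variation, sketched here with paste joining the values instead, avoids that. The temporary file created with mktemp only stands in for file.txt so that the example is self-contained; mktemp is widely available but not required by POSIX.

```shell
# Temporary stand-in for file.txt, filled with the sample data.
f=$(mktemp)
printf '%s\n' '1 a' '2 a' '3 a' '1 b' '2 b' '1 c' '2 c' '3 c' '4 c' '1 d' > "$f"

result=$(
    # Outer awk: distinct group names; inner awk: the values of one group.
    for i in $(awk '!a[$2]++ { print $2 }' "$f")
    do
        # paste -s joins the lines with single spaces, with no trailing space.
        printf '%s %s\n' "$i" "$(awk -v z="$i" '$2 == z { print $1 }' "$f" | paste -s -d ' ' -)"
    done
)
rm -f "$f"

printf '%s\n' "$result"
```

This still reads the file once per group, so it is slower than a single awk pass on large inputs.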
Siva
-1

Command:

for i in a b c d; do echo $i;awk -v i="$i" '$2 == i{print $1}' filename| perl -pne "s/\n/ /g";echo " "| perl -pne "s/ /\n/g";done| sed '/^$/d'| sed "N;s/\n/ /g"

Output:

for i in a b c d; do echo $i;awk -v i="$i" '$2 == i{print $1}' l.txt | perl -pne "s/\n/ /g";echo " "| perl -pne "s/ /\n/g";done| sed '/^$/d'| sed "N;s/\n/ /g"

a 1 2 3 
b 1 2 
c 1 2 3 4 
d 1