-1

I have a text file containing below:

title1 A1
title3 A3
title4 A4
title5 A5

title1 B1
title2 B2
title5 B5

title1 C1
title2 C2
title4 C4
title5 C5

title1 D1
title2 D2
title3 D3

I would like to have an output like below:

title1      title2       title3        title4      title5  
  A1                        A3           A4          A5  
  B1           B2                                    B5  
  C1           C2                        C4          C5  
  D1           D2           D3                        

Could you please let me know how can I write a piece of code using AWK?

Thanks in advance!

3 Answers3

1

If you can change the data to use space separation only (or ":" only) then the following awk program could do it. You might need to adjust it for perfecting your layout.

BEGIN { i = 1; }
$1 != "" { C[$1] = $1; X[$1,i] = $2 ; next; }
{ i++; }
END {
    asort(C);
    for ( k in C ) printf "  %8s\t", C[k];
    printf "\n";
    for ( j = 1; j <= i; j++ ) {
        for ( k in C ) printf "%8s\t",X[C[k],j];
        printf "\n";
    }
}
  • Ralph, Thank you so much. "If you can change the data to use space separation only (or ":" only)" : Sorry I had a typo there. Corrected the question with only ' ' as a separator.

    Had a basic question though, what is the way I can execute the code? Save it as .awk first and then call it with the argument as my file with -f option?

    P.S. (This is my first day with awk code hence the basic query) :)

    – user157638 Feb 22 '16 at 05:50
  • Never mind, tried with the same mentioned as above and it worked !! Broke my head for the whole day but you solved it in minutes. :) Thanks Again. – user157638 Feb 22 '16 at 06:00
  • This assumes that the real data look very much like the sample data: (1) the columns (titles) are in alphabetical order (rather than title1, title2, title3, title4, and title5, I suspect that they might be something like name, address, phone, height, and weight), and (2) the data values are all just one word. – Scott - Слава Україні Feb 22 '16 at 06:11
0
awk -f  transpose_rows_to_cols.awk /tmp/1
title1  title2  title3  title4  title5
A1      A3  A4  A5
B1  B2          B5
C1  C2      C4  C5
D1  D2  D3      

PS. formatting is ok on the terminal.

script --

#!/usr/bin/awk -f
BEGIN {
    printf("title1\ttitle2\ttitle3\ttitle4\ttitle5\n");
    a["title1"] = a["title2"]= a["title3"]= a["title4"]= a["title5"] = ""
}
{
    if ($0 !~ /^$/) {
        if ($0 ~ /:/)  {FS=":"; $0=$0} else {FS=" "; $0=$0}
        a[$1]=$2
    } else {
        printf("%s\t%s\t%s\t%s\t%s\n", a["title1"], a["title2"], a["title3"], a["title4"], a["title5"])
        a["title1"] = a["title2"]= a["title3"]= a["title4"]= a["title5"] = ""
    }
}
END{
        printf("%s\t%s\t%s\t%s\t%s\n", a["title1"], a["title2"], a["title3"], a["title4"], a["title5"])
}
jai_s
  • 1,500
0

This answer builds on Ralph Rönnquist’s answer, but

  1. assumes that the columns should be output in user-specified order rather than alphabetical order by heading, and
  2. allows the data values to be multi-word.

Like Ralph’s answer, this assumes that the titles are separated from the data by a space and not a colon.

In order to implement goal #1, it requires that the input begin with a stanza containing all the titles (with no accompanying data) in the desired order.

BEGIN { i = 0; j= 0; }
$1 != "" { if (i==0) C[++j] = $1; else { label = $1; $1 = ""; X[label,i] = $0; } next; }
{ i++; }
END {
    for ( k in C ) printf "  %8s\t", C[k];
    printf "\n";
    for ( j = 1; j <= i; j++ ) {
        for ( k in C ) printf "%8s\t", X[C[k],j];
        printf "\n";
    }
}

Note that i is initialized to 0, and that the i==0 case is handled specially — only the title is captured (since that’s the only thing that should be there), and the titles are stored in the C array indexed by an increasing integer (j) rather than by the value of the title itself.  Otherwise, we capture data.  $1 = ""; erases the first field in the line and rebuilds $0 as the concatenation of all the other fields.  (This loses multiple spaces between fields; that can be corrected with a little bit of work.)  We save the $1 value in the label variable so we can use it as an index into the data array, X.

So, for example, this input:

name
address
phone
height
weight

name John Lennon
phone 123
height 6' 1"
weight 180

name Sir Paul M.
address Liverpool
weight 175

name George
address 42                 Main St.
height 71"
weight 185 lbs

name Ringo Starr
address Penny Lane
phone 456 789

yields this output:

      name         address           phone          height          weight
 John Lennon                         123           6' 1"             180
 Sir Paul M.     Liverpool                                           175
  George         42 Main St.                         71"         185 lbs
 Ringo Starr     Penny Lane      456 789

The columns will be off if any value is 15 characters long or longer.  This can also be improved.