
I have 2 large files (3000 columns, 15000 rows) of the following format

file1 (tab-separated):

1/0 0/0 0/0
0/0 1/1 0/0
1/1 0/1 0/0

file2 (tab-separated):

3 5 2
1 7 10
3 4 3

I'd like to combine the values from the first column of each file with a ":" separator, then move on to the second, third, etc. columns. Desired output (tab-separated):

1/0:3 0/0:5 0/0:2
0/0:1 1/1:7 0/0:10
1/1:3 0/1:4 0/0:3

Efficiency isn't critical, so any language is fine. I apologize if this has been asked before.

3 Answers


Something like this? Worked with your sample data:

paste file{1,2} | awk '{for (i=1;i<=NF/2; i++){printf "%s:%s\t",$i,$(NF/2+i)};printf "\n"}'
1/0:3   0/0:5   0/0:2
0/0:1   1/1:7   0/0:10
1/1:3   0/1:4   0/0:3
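
One caveat: the printf in that loop leaves a trailing tab at the end of every output line. If that matters for downstream tools, a small variant (a sketch, only checked against the sample data) emits the separator only between fields:

paste file{1,2} | awk '{n=NF/2; for (i=1;i<=n;i++) printf "%s:%s%s", $i, $(n+i), (i<n ? "\t" : "\n")}'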
– tink
awk '{
    getline f2 < "file2"    # read the corresponding line of file2
    split(f2, a)            # split it into array a on whitespace
    # pair each field of file1 with the matching field of file2
    for (i=1; i<=NF; i++)
        printf "%s:%s\t", $i, a[i]
    print ""
}' file1
– glenn jackman
  • Worked perfectly, although I prefer the simplicity of tink's response. – Jon Degner Jul 06 '16 at 03:02
  • 1
    @JonDegner then if that answer (or this one) solved your issue, please take a moment and accept it by clicking on the check mark to the left. That will mark the question as answered and is the way thanks are expressed on the Stack Exchange sites. – terdon Jul 06 '16 at 11:36

A slightly different approach:

paste -d: <(xargs -n1 <file1) <(xargs -n1 <file2) | xargs -n 3
  • I upvoted this, but just realised that the -n 3 part only works on the sample provided. Column count needs to be modified to accommodate the actual data. – tink Jul 06 '16 at 21:23
  • @tink Obviously, yes. You could calculate the column count with something like head -n1 | wc -w, however. – Michael Vehrs Jul 07 '16 at 08:53
  • Heh. That wasn't meant for you to respond, I'm well aware of how to work around it ... just an explanation that your answer should have one less upvote :} – tink Jul 07 '16 at 19:37
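
Picking up the suggestion from the comment above, a sketch (assuming both files share the same column count, taken here from the first row of file1) that derives the count instead of hard-coding -n 3:

cols=$(head -n1 file1 | wc -w)
paste -d: <(xargs -n1 <file1) <(xargs -n1 <file2) | xargs -n "$cols"

As with the original one-liner, the output ends up space-separated rather than tab-separated.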