I am trying to obtain the pairwise combinations of the strings available for each group of data.
The input file contains two columns: col1 is gene names and col2 is the names of various stressors.
gene1 FishKairomones
gene1 Microcystin
gene1 Calcium
gene2 Cadmium
gene2 Microcystis
gene2 FishKairomones
gene2 Phosphorous
gene3 FishKairomones
gene3 Microcystin
gene3 Phosphorous
gene3 Cadmium
From the table, gene1 is responsive to 3 stressors: FishKairomones, Microcystin and Calcium.
I would like to obtain a pairwise table like this:
gene1 FishKairomones gene1 Microcystin
gene1 FishKairomones gene1 Calcium
gene1 Microcystin gene1 Calcium
gene2 Cadmium gene2 Microcystis
gene2 Cadmium gene2 FishKairomones
gene2 Cadmium gene2 Phosphorous
gene2 Microcystis gene2 FishKairomones
gene2 Microcystis gene2 Phosphorous
gene2 FishKairomones gene2 Phosphorous
As you can see, gene1 FishKairomones is linked to gene1 Microcystin, gene1 FishKairomones is also linked to gene1 Calcium, and gene1 Microcystin is linked to gene1 Calcium. I would like to do the same for all genes.
A gene can have 3 stressors, sometimes 4, and so on.
I tried the code here: Command line tool to "cat" pairwise expansion of all rows in a file
This creates all pairwise combinations of the entire file, which is not what I want.
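For comparison, a per-gene version of the pairing might look like the sketch below. It is only an illustration in Python, not the linked command-line approach: the function name pairwise_by_gene is made up for this example, and it assumes whitespace-separated columns with no spaces inside a gene or stressor name.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_by_gene(lines):
    """Return one output line per pair of stressors that share a gene.

    pairwise_by_gene is a hypothetical helper name used for illustration.
    Assumes each input line is "<gene> <stressor>" separated by whitespace.
    """
    groups = defaultdict(list)  # gene -> stressors, in input order
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        gene, stressor = fields
        groups[gene].append(stressor)

    pairs = []
    for gene, stressors in groups.items():
        # combinations() only pairs stressors within the same gene,
        # so rows from different genes are never mixed.
        for first, second in combinations(stressors, 2):
            pairs.append(f"{gene} {first} {gene} {second}")
    return pairs


sample = [
    "gene1 FishKairomones",
    "gene1 Microcystin",
    "gene1 Calcium",
]
for row in pairwise_by_gene(sample):
    print(row)
```

With the three gene1 rows as input, this prints the same three gene1 lines shown in the desired table. Passing an open file handle for the input table instead of the sample list should work the same way, since the function only iterates over lines.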
cat dappu_gene_strees.tab ? – RomanPerekhrest Nov 07 '17 at 13:27