I am working with FASTA files containing lines such as:
\>97977-100;sample=Samp1
TAATGATGATTTGT
\>97978-60;sample=Samp2
AACATTCAACGCGGTCGGTGAGTA
\>97979-30;sample=Samp3
AACCGTAGGAGTTGATGTGCGGT
\>97980-20;sample=Samp4
ACTGTCTGTATGTGGTG
I would like to extract the characters between - and ; on each header line and append them to the end of that line as ;size=(value); so that I get:
\>97977-100;sample=Samp1;size=100;
TAATGATGATTTGT
\>97978-60;sample=Samp2;size=60;
AACATTCAACGCGGTCGGTGAGTA
\>97979-30;sample=Samp3;size=30;
AACCGTAGGAGTTGATGTGCGGT
\>97980-20;sample=Samp4;size=20;
ACTGTCTGTATGTGGTG
I have seen some help in another question on how to find the characters between two strings, and I can get them with something like:
sed -n 1~2p $file | sed -e 's/.*-\(.*\);.*/\1/'
And I know how to append to end of a line with:
sed "1~2s/$/;size=(I want this to be the output of the command above);/" $file
But I cannot get the two to work together. Running the first command inside the second sed does not work either, as it gives a "too large argument" error.
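For reference, the two steps described above could in principle be combined into a single sed substitution that captures the value and re-uses it in the replacement. This is only a sketch, under the assumptions that header lines in the actual file start with a plain > character, that the value sits between the first - and the next ; on that line, and that the sed in use supports -E (GNU and BSD sed both do). The function name add_size is hypothetical and only used for illustration.

```shell
# add_size: hypothetical helper name, not part of the original post.
# On header lines (those starting with '>'), capture the text between the
# first '-' and the next ';' as group 2, and the whole line as group 1,
# then rewrite the line as "<whole line>;size=<value>;".
# Sequence lines do not match /^>/ and are printed unchanged.
add_size() {
    sed -E '/^>/ s/^([^-]*-([^;]*);.*)$/\1;size=\2;/' "$@"
}
```

It could be used as add_size input.fasta > output.fasta (both file names are placeholders). Because the captured value is re-used through a back-reference inside the same substitution, nothing has to be passed on the command line, which avoids the argument size problem.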