This does what I believe you're asking for. Note: input.txt is your input file.
just sed
$ sed 's/\^[^`]*//g' input.txt
GO:0005634`GO:0003677`
GO:0005634`GO:0003677`
GO:0005634`GO:0003677`
GO:0005634`GO:0003677`
Explanation
sed removes every substring that begins with a caret (^) and continues through any characters other than a backtick. The match stops just before the next backtick, so the backtick itself is kept, and the matched text is replaced with nothing. The g flag repeats this for every match on the line, which removes all the ^.... strings.
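To see the substitution in isolation, here is a minimal sketch that runs the same sed expression on one made-up line. The sample line is only an assumption about what a line of input.txt looks like; the variable names sample and result are just for illustration.

```shell
# Hypothetical sample line; the real input.txt may look different.
sample='GO:0005634^cellular_component^nucleus`GO:0003677^molecular_function^DNA binding`'

# Delete each run that starts at a caret and ends just before the next backtick.
# The backticks themselves are not part of the match, so they survive.
result=$(printf '%s\n' "$sample" | sed 's/\^[^`]*//g')

printf '%s\n' "$result"
```

On this sample the carets and the text after them disappear, while each GO ID and its trailing backtick remain.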
grep + paste + sed
$ grep -o 'GO:[0-9]\+' input.txt | paste -d'`' - - | sed 's/$/`/'
GO:0005634`GO:0003677`
GO:0005634`GO:0003677`
GO:0005634`GO:0003677`
GO:0005634`GO:0003677`
Explanation
grep -o pulls out all the GO:XXXXX strings from input.txt, one per line; paste joins them into 2 columns, with a backtick between the 2 GO:XXXXX strings; and finally sed adds a backtick to the end of each line.
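The pipeline can also be run one stage at a time to watch each step. This is a sketch on a made-up two-line sample, which is only an assumption about the shape of input.txt; the variable names are illustrative.

```shell
# Hypothetical two-line sample; the real input.txt may look different.
sample='GO:0005634^cellular_component^nucleus`GO:0003677^molecular_function^DNA binding`
GO:0005737^cellular_component^cytoplasm`GO:0005515^molecular_function^protein binding`'

# Stage 1: print only the matching GO IDs, one per line.
ids=$(printf '%s\n' "$sample" | grep -o 'GO:[0-9]\+')

# Stage 2: join each consecutive pair of lines with a backtick.
pairs=$(printf '%s\n' "$ids" | paste -d'`' - -)

# Stage 3: append a backtick to the end of every line.
result=$(printf '%s\n' "$pairs" | sed 's/$/`/')

printf '%s\n' "$result"
```

Note that paste pairs lines purely by position, so this approach lines up with the input only if every input line contains exactly two GO IDs. The sed-only version does not have that restriction.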