How to select the first n characters of the first column, where n is the number in the second column

Question

During my workflow I have created this file:

AAGGAGGGAGCTGCATGGAACCTGTGGATATATACACACAAGGTTAACCTCTGTCCTGTAAA  8  
GGAGTTCAGATGTGTGCTCTTCCGATCTGGAGGTCTCTGCTGGGGCCACCCTGTCCTCTCAG  30     
GAGAGAGGAAAGGAAGCGATTGCAGAACTTTCCACAAGGCTTTAGATTCCCCTGTCACAGAG  15  
GGAGGAGAAAGAATCAACTTTATAGCATCAGCCCCTTGTTTATTTTAAGTTCAGGGTTTAAG  13  
GGGAGAACATTTCCCTCCTTGTCCTCTCCTATCTCACTTACTACATTCCCACTGGTCACTGT  7  
GGGACATTTGTGATTACATGGTTGCAGTATTCTTTTTGTTCTTAGTCAGACTGTATAATTGG  4

From each string in the first column, I would like to select the first N characters, where N is the number in the second column: the first 8 characters of the first row, the first 30 characters of the second row, and so on.

For the first two rows, for example, the output would be:

AAGGAGGG  
GGAGTTCAGATGTGTGCTCTTCCGATCTGG

Any idea would be really appreciated.

Satō Katsura · Accepted Answer · 2016-08-29T18:10:11.780

8

With awk:

awk '{ $0 = substr($1, 1, $2) } 1' file.txt
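A minimal sketch of the awk approach on made-up rows, in case it helps to try it without the real file. awk numbers string positions from 1, so the start argument here is 1; a start of 0 lies before the string and in some awk implementations yields one character fewer than requested. The function name first_n and the two sample rows are hypothetical, not part of the question.

```shell
# first_n: hypothetical helper; reads "sequence count" rows from the given
# files (or from stdin when no file is named) and prints the first <count>
# characters of each sequence.
first_n() {
    # substr(s, 1, n) returns the first n characters of s (awk is 1-indexed).
    awk '{ print substr($1, 1, $2) }' "$@"
}

# Made-up sample rows, with trailing blanks like the real file.
printf '%s\n' 'AAGGAGGGAGCTGCATGG  8  ' 'GGAGTTCAGATGTGTGCT  3' | first_n
# prints:
# AAGGAGGG
# GGA
```

Default field splitting ignores the trailing blanks, so $2 is the bare number.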

With GNU sed:

sed -r 's/.* ([0-9]+).*/s!^(.{\1}).*!\\1!/' file.txt | \
    cat -n | \
    sed -r -f - file.txt

(This needs GNU sed, because it can read the script file from stdin via -f -.)
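To see what the sed pipeline is doing, it may help to look at the script that the first sed generates. The sketch below runs only that first stage on two made-up rows; -E is used here as the equivalent spelling of -r in GNU sed, and the variable names sample and generated are hypothetical.

```shell
# Two made-up rows in the same "sequence count" layout as file.txt.
sample=$(printf '%s\n' 'AAGGAGGGAGCT 8' 'GGAGTTCAGATG 3')

# Stage 1: rewrite every input line into a substitution command whose
# repetition count is the number found at the end of that line.
generated=$(printf '%s\n' "$sample" |
    sed -E 's/.* ([0-9]+).*/s!^(.{\1}).*!\\1!/')

printf '%s\n' "$generated"
# prints:
# s!^(.{8}).*!\1!
# s!^(.{3}).*!\1!
```

In the full pipeline, cat -n then prefixes each generated command with its line number, which sed reads as an address, so command k is applied only to line k of file.txt.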

With perl:

perl -lpe 's/.*?([ACTG]+)\s+(\d+).*/ substr($1, 0, $2)/e' file.txt

Another way with perl:

perl -lape '$_ = substr($F[0], 0, $F[1])' file.txt

edited Aug 29 '16 at 18:10

answered Aug 29 '16 at 17:58

Satō Katsura



awk '{ print substr($1, 1, $2) }'. No side effect on $0 or cryptic 1 pattern with implicit print action. – Kaz Sep 01 '16 at 02:49
Also, this: awk '$0 = substr($1, 1, $2)'. – Kaz Sep 01 '16 at 02:53
@kaz: Re: cryptic 1: you must be new here... – Satō Katsura Sep 01 '16 at 05:08

score 1 · Answer 2 · answered Aug 29 '16 at 17:24


Without sed:

while read -r d n;do echo ${d:0:$n};done < file.txt

Ipor Sircer



Downvoted for using echo of unquoted variables in a shell loop to process text. – Wildcard Nov 05 '16 at 02:36
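One way to address the quoting concern while keeping the same loop is sketched below. It assumes bash, since the ${d:0:n} substring expansion is not part of POSIX sh; the function name trim_by_count and the sample rows are hypothetical.

```shell
# trim_by_count: hypothetical helper; reads "sequence count" rows from stdin
# and prints the first <count> characters of each sequence. Assumes bash.
trim_by_count() {
    # The "|| [ -n "$d" ]" part keeps a final line that lacks a newline.
    while read -r d n || [ -n "$d" ]; do
        # Quote the expansion and use printf so the data is printed as-is.
        printf '%s\n' "${d:0:n}"
    done
}

printf '%s\n' 'AAGGAGGGAGCT 8' 'GGAGTTCAGATG 3' | trim_by_count
# prints:
# AAGGAGGG
# GGA
```

For large files the awk or perl one-liners are likely to be much faster than a shell loop.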
