0

How to capture the string that comes after a specific word in a CSV line

For example, this is the CSV line from which we want to extract the strings that come right after /data/:

status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log

Example of expected results:

sdb
sdc
sdd
sde
sdf
Jeff Schaller
yael

6 Answers

4

Use grep:

With PCRE (the -P option, as in GNU grep):

grep -Po '/data/\K[^/]*'

If that is not available:

grep -o '/data/[^/]*' | cut -d'/' -f3
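To see the second pipeline at work, here is a minimal sketch that feeds it the sample line from the question. The function name extract_disks and the variables line and result are only illustrative names, not part of the original commands.

```shell
# Sample line copied from the question.
line='status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log'

# extract_disks (hypothetical helper name): reads lines on stdin and prints
# the path component that follows each /data/, one per output line.
extract_disks() {
    # grep -o emits every "/data/<name>" match on its own line;
    # cut then keeps the third "/"-separated field, i.e. <name>.
    grep -o '/data/[^/]*' | cut -d'/' -f3
}

result=$(printf '%s\n' "$line" | extract_disks)
printf '%s\n' "$result"
```

The same function can read a file instead, for example extract_disks < file.csv.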
pLumo
1

@pLumo absolutely has the right answer. If, for whatever reason, you wanted to use awk and bash's builtin parameter expansion, all the while being slightly convoluted...

while IFS= read -r line; do
    # Keep only the commas; the length of what is left is the separator count.
    COUNT_SEP="${line//[^,]}"
    # Start with an empty scratch file for every input line.
    : > /tmp/splitCSV
    # Columns run from 1 to (number of commas + 1).
    for col in $(seq 1 $((${#COUNT_SEP}+1))); do
        echo "${line}" | awk -v variable="${col}" -F, '{ print $variable }' >> /tmp/splitCSV
    done
    # From each column, take what follows /data/ up to the next slash.
    while IFS= read -r splitCol; do
        echo "${splitCol}" | awk -F'/data/' '{ print $2 }' | awk -F'/' '{ print $1 }'
    done < /tmp/splitCSV
done < test.csv
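For comparison, the same split-then-trim idea can probably be done with parameter expansion alone, without awk or a temporary file. This is only a sketch under the assumption that columns are separated by plain commas; the function name extract_with_expansion and the variable names are made up for illustration.

```shell
# Sample line copied from the question.
line='status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log'

# extract_with_expansion (hypothetical name): takes one CSV line as $1.
# The body runs in a subshell, so the IFS and globbing changes stay local.
extract_with_expansion() (
    set -f          # no filename expansion while splitting
    IFS=,           # split the unquoted $1 on commas
    for col in $1; do
        case $col in
            */data/*)
                rest=${col#*/data/}           # drop everything through the first /data/
                printf '%s\n' "${rest%%/*}"   # keep the text before the next slash
                ;;
        esac
    done
)

result=$(extract_with_expansion "$line")
printf '%s\n' "$result"
```

To process a whole file, the function could be called once per line from a while IFS= read -r line loop.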
1

Just to add an option: given that only one pattern in the input has exactly three characters between slashes, grep and sed also work:

grep -o "/.../" file | sed 's;/;;g'

Output:

sdb
sdc
sdd
sde
sdf
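The three-character pattern depends on no other path component being three characters long. As a hedged sketch of a slightly stricter variant, the pattern can be anchored on /data/. The sample line below is hypothetical (it adds a tmp directory to show the difference), and the variable names are illustrative only.

```shell
# Hypothetical input with an extra three-character directory (tmp).
line='status=true /data/sdb/x/tmp/log,/data/sdc/x/tmp/log'

# Unanchored: every three-character component between slashes is reported.
unanchored=$(printf '%s\n' "$line" | grep -o '/.../' | sed 's;/;;g')

# Anchored: only the three characters directly after /data/ are reported.
anchored=$(printf '%s\n' "$line" | grep -o '/data/.../' | sed -e 's|/data/||' -e 's|/||')

printf 'unanchored:\n%s\n' "$unanchored"
printf 'anchored:\n%s\n' "$anchored"
```

Both variants still assume that the wanted names are exactly three characters long.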
1

For the input above, the following command works:

perl -pe 's/,/\n/g' filename | awk -F '/data/' '{gsub("/.*","",$2);print $2}'

Output:

sdb
sdc
sdd
sde
sdf
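If perl is not at hand, the comma-to-newline step could presumably be done with tr instead. A minimal sketch, run on the sample line from the question; the function name split_and_extract is hypothetical, and the NF > 1 guard is an addition that skips lines without /data/.

```shell
# Sample line copied from the question.
line='status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log'

# split_and_extract (hypothetical name): reads CSV lines on stdin.
split_and_extract() {
    # tr puts each column on its own line; awk splits on /data/ and
    # trims the second field at its first slash.
    tr ',' '\n' | awk -F '/data/' 'NF > 1 { sub("/.*", "", $2); print $2 }'
}

result=$(printf '%s\n' "$line" | split_and_extract)
printf '%s\n' "$result"
```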
1

This works for me with awk:

awk -F'/' '{for(i=1;i<=NF;i++) if($i=="data") print $(i+1)}' file

1: -F sets the field separator to /

2: the for loop visits every field on each line

3: if a field equals "data", the next field is printed
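As a quick check of the three steps, here is a small sketch that runs the same loop on the sample line from the question. The only change is the loop bound i < NF, an assumption on my side so that a trailing "data" field does not print an empty line; the variable names are illustrative.

```shell
# Sample line copied from the question.
line='status=true /data/sdb/hadoop/hdfs/log,/data/sdc/hadoop/hdfs/log,/data/sdd/hadoop/hdfs/log,/data/sde/hadoop/hdfs/log,/data/sdf/hadoop/hdfs/log'

# Split on "/", walk the fields, and print the field after each "data".
result=$(printf '%s\n' "$line" |
    awk -F'/' '{ for (i = 1; i < NF; i++) if ($i == "data") print $(i+1) }')

printf '%s\n' "$result"
```

Note that any directory literally named data triggers a match, not only a leading /data/.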

Clement
1

We can choose from any of the following:

awk -F/ '
    BEGIN { OFS = RS }
    {
        N = split($0, a, /\//)
        $0 = ""
        for ( i=j=1; i<N; i++ )
            if ( a[i] == "data" )
                $(j++) = a[++i]
    }
    N>1' file.csv


perl -F/ -lane '
    shift(@F) eq q(data) and print(shift(@F))
        while(@F && m{/data/});
' file.csv


perl -lne 'print for m{/data/([^/,]+)}g' file.csv


sed -re '
    /\n/{P;D;}
    s:/data/([^/,]+):\n\1\n:
    D
' file.csv