I am reading file names from a directory, extracting part of each name, and adding the results to an array. Several files in the directory yield the same extracted name, so I end up extracting duplicates as well. The original file names in the directory are:

100_abc strategy-42005_04May2020_0000-04May2020_first_file.csv   
100_abc strategy-42005_04May2020_0000-04May2020_second_file.csv   
101_xyz statistics strategy_04May2020_first_file.csv

Script used:


#!/bin/bash

c=0

for filename in /home/vikrant_singh_rana/testing/*; do
    # stripping a file name
    GroupName=$(basename "$filename" ".csv" | awk -F "_" '{print $2}' | awk -F "-" '{print $1}')
    echo "$GroupName"

    var["$c"]="$GroupName"
    c=$(($c+1))

done
echo "print my array"
echo "${var[*]}"

The names it extracts from the file names contain spaces, for example:

abc strategy
abc strategy
xyz statistics strategy

So when I print my array, the output looks like this:

abc strategy abc strategy xyz statistics strategy

The code above adds the same name to the array again each time it encounters it while reading.

So I added an if statement to prevent that, but it is not working as expected. I expected the array to contain each name only once.

for filename in /home/vikrant_singh_rana/testing/*; do
    GroupName=$(basename "$filename" ".csv" | awk -F "_" '{print $2}' | awk -F "-" '{print $1}')
    if [[ "${var[@]}" =~ "$GroupName" ]]; then
        echo "I am here "
        c=$(($c+1))
        var["$c"]="$GroupName"
    fi
done

1 Answer

It might be easier to sort in a pipeline:

readarray -t var < <(
    cd "$HOME/testing"
    printf "%s\n" * | cut -d"_" -f2 | cut -d"-" -f1 | sort -u
)

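readarray reads the lines of stdin into the array, one element per line, and the -t option strips the trailing newline from each element. Because each line becomes a single element, names containing spaces stay intact.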
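To try the pipeline without touching real data, here is a minimal sketch that recreates the three file names from the question in a scratch directory. The `demo_dir` variable and the use of `mktemp -d` are assumptions made for illustration, and `readarray` needs bash 4 or newer.

```bash
#!/bin/bash
# Hypothetical scratch directory standing in for the real one.
demo_dir=$(mktemp -d)
touch "$demo_dir/100_abc strategy-42005_04May2020_0000-04May2020_first_file.csv" \
      "$demo_dir/100_abc strategy-42005_04May2020_0000-04May2020_second_file.csv" \
      "$demo_dir/101_xyz statistics strategy_04May2020_first_file.csv"

# One name per line goes in; sort -u drops the duplicates;
# readarray -t stores each remaining line as one array element.
readarray -t var < <(
    cd "$demo_dir" || exit
    printf "%s\n" * | cut -d"_" -f2 | cut -d"-" -f1 | sort -u
)

# Brackets make the element boundaries visible despite embedded spaces.
printf '[%s]\n' "${var[@]}"

rm -r "$demo_dir"
```

With these three files the array ends up with two elements, "abc strategy" and "xyz statistics strategy", in sorted order rather than in directory order.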

You can inspect the array's contents with declare -p var.
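If you would rather keep the loop from the question, note that its test `[[ "${var[@]}" =~ "$GroupName" ]]` succeeds only when the name is already somewhere in the joined array contents, so nothing is ever added to an empty array, and it matches substrings rather than whole elements. One alternative, which is a different technique from the pipeline above, is to track seen names in an associative array. This is a rough sketch assuming bash 4 or newer; the `seen` and `names` variables are hypothetical, and `names` stands in for the directory listing.

```bash
#!/bin/bash
declare -A seen=()   # keys are the group names already stored (bash 4+)
var=()

# Sample names from the question stand in for the directory contents.
names=(
    "100_abc strategy-42005_04May2020_0000-04May2020_first_file.csv"
    "100_abc strategy-42005_04May2020_0000-04May2020_second_file.csv"
    "101_xyz statistics strategy_04May2020_first_file.csv"
)

for filename in "${names[@]}"; do
    GroupName=$(basename "$filename" ".csv" | awk -F "_" '{print $2}' | awk -F "-" '{print $1}')
    # An empty name cannot be used as an associative array key, so skip it.
    [[ -n $GroupName ]] || continue
    # Exact-key lookup: append only when this name has not been stored yet.
    if [[ -z ${seen["$GroupName"]} ]]; then
        seen["$GroupName"]=1
        var+=("$GroupName")
    fi
done

declare -p var
```

Unlike the `sort -u` pipeline, this keeps the names in the order they were first encountered.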

glenn jackman