12

I have written a sample script to split the string, but it is not working as expected:

#!/bin/bash
IN="One-XX-X-17.0.0"
IFS='-' read -r -a ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
    echo "Element:$i"
done
#split 17.0.0 into NUM
IFS='.' read -a array <<<${ADDR[3]};
for element in "${array[@]}"
do
    echo "Num:$element"
done
  • Actual output
    One
    XX
    X
    17.0.0
    17 0 0
    
  • but I expected the output to be:
    One
    XX
    X
    17.0.0
    17
    0
    0
    
AdminBee
  • 1
    By the way, if one of the answers below solved your issue, please take a moment and accept it by clicking on the check mark to the left. That will mark the question as answered and is the way thanks are expressed on the Stack Exchange sites. – terdon Oct 11 '17 at 11:23

5 Answers

12

In old versions of bash you had to quote variables after <<<. That was fixed in 4.4. In older versions, the variable would be split on IFS and the resulting words joined on space before being stored in the temporary file that makes up that <<< redirection.

In 4.2 and before, when the redirection applied to a builtin such as read or command, that splitting would even use the IFS assigned for that builtin alone (4.3 fixed that):

$ bash-4.2 -c 'a=a.b.c.d; IFS=. read x <<< $a; echo  "$x"'
a b c d
$ bash-4.2 -c 'a=a.b.c.d; IFS=. cat <<< $a'
a.b.c.d
$ bash-4.2 -c 'a=a.b.c.d; IFS=. command cat <<< $a'
a b c d

That one was fixed in 4.3:

$ bash-4.3 -c 'a=a.b.c.d; IFS=. read x <<< $a; echo  "$x"'
a.b.c.d

But $a is still subject to word splitting there:

$ bash-4.3 -c 'a=a.b.c.d; IFS=.; read x <<< $a; echo  "$x"'
a b c d

In 4.4:

$ bash-4.4 -c 'a=a.b.c.d; IFS=.; read x <<< $a; echo  "$x"'
a.b.c.d

For portability to older versions, quote your variable (or use zsh, where <<< comes from in the first place and which doesn't have that issue):

$ bash-any-version -c 'a=a.b.c.d; IFS=.; read x <<< "$a"; echo "$x"'
a.b.c.d
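Applied to the script in the question, the quoting advice could look like the sketch below. It reuses the question's IN and ADDR variables; the array name NUM is only an illustrative choice.

```shell
#!/bin/bash
IN="One-XX-X-17.0.0"

# Split on "-" first; IFS is changed only for this one read command.
IFS='-' read -r -a ADDR <<< "$IN"

# Quote the expansion after <<< so that older bash versions do not
# split it on IFS and rejoin it with spaces before read sees it.
IFS='.' read -r -a NUM <<< "${ADDR[3]}"

printf 'Num:%s\n' "${NUM[@]}"
```

This should print Num:17, Num:0 and Num:0 on separate lines.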

Note that this approach to splitting a string only works for strings that don't contain newline characters. Also note that a..b.c. would be split into "a", "", "b", "c" (no empty last element).
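Both limitations are easy to reproduce. A small sketch, where the array names parts and first_line are illustrative only:

```shell
#!/bin/bash
# Trailing delimiter: read does not produce an empty last element.
IFS=. read -r -a parts <<< "a..b.c."
echo "${#parts[@]} elements"
printf '<%s>\n' "${parts[@]}"

# Embedded newline: read stops at the first newline, so everything
# after it is silently dropped.
multi=$'a.new\nline.b'
IFS=. read -r -a first_line <<< "$multi"
printf '<%s>\n' "${first_line[@]}"
```

The first read yields four elements (the second one empty); the second read only sees "a" and "new".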

To split arbitrary strings you can use the split+glob operator instead (which would make it standard and avoid storing the content of a variable in a temp file as <<< does):

var='a.new
line..b.c.'
set -o noglob # disable glob
IFS=.
set -- $var'' # split+glob
for i do
  printf 'item: <%s>\n' "$i"
done

or:

array=($var'') # in shells with array support

The '' is to preserve a trailing empty element if any. That would also split an empty $var into one empty element.
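One way to package the split+glob approach in bash is a small function, so that the IFS change stays local and noglob is put back the way it was. This is only a sketch: split_on and the global array fields are hypothetical names, and the separator is assumed to be a single non-whitespace character.

```shell
#!/bin/bash
# split_on SEP STRING: hypothetical helper, stores the result in the
# global array "fields".
split_on() {
  local IFS=$1            # IFS change is confined to this function
  local had_noglob=
  case $- in (*f*) had_noglob=1 ;; esac
  set -o noglob           # no pathname expansion on the split words
  fields=($2'')           # split+glob; '' keeps a trailing empty element
  if [ -z "$had_noglob" ]; then set +o noglob; fi
}

split_on . $'a.new\nline..b.c.'
printf 'item: <%s>\n' "${fields[@]}"
```

With that input the array should hold six elements, including the one with the embedded newline and the trailing empty one.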

Or use a shell with a proper splitting operator:

  • zsh:

    array=(${(s:.:)var}) # removes empty elements
    array=("${(@s:.:)var}") # preserves empty elements
    
  • rc:

    array = ``(.){printf %s $var} # removes empty elements
    
  • fish

    set array (string split . -- $var) # not for multiline $var
    
4

A fix (see also S. Chazelas' answer for background), with sensible output:

#!/bin/bash
IN="One-XX-X-17.0.0"
IFS='-' read -r -a ADDR <<< "$IN"
for i in "${ADDR[@]}"; do
    if [ "$i" = "${i//.}" ] ; then 
        echo "Element:$i" 
        continue
    fi
    # split 17.0.0 into NUM
    IFS='.' read -a array <<< "$i"
    for element in "${array[@]}" ; do
        echo "Num:$element"
    done
done

Output:

Element:One
Element:XX
Element:X
Num:17
Num:0
Num:0

Notes:

  • It's better to put the conditional 2nd loop in the 1st loop.

  • bash pattern substitution ("${i//.}") checks if there's a . in an element. (A case statement might be simpler, albeit less similar to the OP's code.)

  • Reading into array from <<< "${ADDR[3]}" is less general than reading from <<< "$i"; the latter avoids needing to know which element has the .s.

  • The code assumes that printing "Element:17.0.0" is unintentional. If that behavior is intended, replace the main loop with:

    for i in "${ADDR[@]}"; do
       echo "Element:$i" 
       if [ "$i" != "${i//.}" ] ; then 
       # split 17.0.0 into NUM
           IFS='.' read -a array <<< "$i"
           for element in "${array[@]}" ; do
               echo "Num:$element"
           done
       fi
    done
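For comparison, the case-based check mentioned in the notes might look like this. It is a sketch only: the classify wrapper function is a hypothetical name added so the loop can be called on any string.

```shell
#!/bin/bash
# classify STRING: hypothetical wrapper around the loop, for illustration.
classify() {
    local ADDR array i element
    IFS='-' read -r -a ADDR <<< "$1"
    for i in "${ADDR[@]}"; do
        case $i in
            (*.*)   # contains a dot: split it further
                IFS='.' read -r -a array <<< "$i"
                for element in "${array[@]}"; do
                    echo "Num:$element"
                done
                ;;
            (*)     # no dot: print as is
                echo "Element:$i"
                ;;
        esac
    done
}

classify "One-XX-X-17.0.0"
```

The output should match the first version of the loop (no "Element:17.0.0" line).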
    
agc
  • 1
    case $i in (*.*) ... would be a more canonical way to check that $i contains . (and also portable to sh). If you're into kshisms, see also: [[ $i = *.* ]] – Stéphane Chazelas Oct 10 '17 at 11:50
  • @StéphaneChazelas, Already mentioned case in notes at end, but we agree. (Since the OP uses both <<< and arrays, this isn't much of a sh question.) – agc Oct 10 '17 at 11:57
1

With awk it would cost you one line:

IN="One-XX-X-17.0.0"

awk -F'[-.]' '{ for(i=1;i<=NF;i++) printf "%s : %s\n",($i~/^[0-9]+$/?"Num":"Element"),$i }' <<<"$IN"
  • -F'[-.]' - field separator based on multiple characters, in our case - and .

The output:

Element : One
Element : XX
Element : X
Num : 17
Num : 0
Num : 0
  • The same could be done with IFS=-. read -r -a array <<< "$IN" – Stéphane Chazelas Oct 10 '17 at 11:49
  • @StéphaneChazelas, it's different. You are showing just the step of converting a string into an array, but my one-liner is meant to cover everything: splitting into fields, processing and outputting. I'm not competing with your answer; they are just different. – RomanPerekhrest Oct 10 '17 at 11:52
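To make the comparison concrete, a pure-bash sketch that splits on both separators at once and labels each field the way the awk one-liner does could look like this. The names fields and labelled are illustrative, and "Num" is assumed to mean a non-empty string of digits.

```shell
#!/bin/bash
IN="One-XX-X-17.0.0"

# Split on "-" and "." in one go; IFS applies to this read only.
IFS=-. read -r -a fields <<< "$IN"

labelled=()
for f in "${fields[@]}"; do
    case $f in
        (''|*[!0-9]*) labelled+=("Element : $f") ;;  # empty or has a non-digit
        (*)           labelled+=("Num : $f") ;;      # digits only
    esac
done

printf '%s\n' "${labelled[@]}"
```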
1

Often it is possible to put the string splitting into a subshell. If so, that solves the trouble of properly restoring noglob or IFS settings.

a="world-thing-hello"
t="$(set -o noglob; IFS=-; set -- $a; echo $2)"
echo "This is the $t."

This is the thing.
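The same idea can be wrapped in a function whose body is itself a subshell. A minimal sketch, in which nth_field is a hypothetical helper name and N is assumed to lie between 1 and the number of fields:

```shell
#!/bin/bash
# nth_field SEP N STRING: hypothetical helper. The body is wrapped in ( )
# rather than { }, so it runs in a subshell and the noglob and IFS changes
# never reach the caller.
nth_field() (
  sep=$1 n=$2 str=$3
  set -o noglob
  IFS=$sep
  set -- $str            # split+glob, with glob disabled
  shift "$((n - 1))"     # assumes 1 <= n <= number of fields
  printf '%s\n' "$1"
)

a="world-thing-hello"
t=$(nth_field - 2 "$a")
echo "This is the $t."
```

The price is one fork per call, which may matter in a tight loop.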

0

Here is my way:

OIFS=$IFS
IFS='-'
IN="One-XX-X-17.0.0"
ADDR=($IN)
for i in "${ADDR[@]}"; do
 echo "Element:$i"
done
IFS='.'
array=(${ADDR[3]})
for element in "${array[@]}"
do
  echo "Num:$element"
done
IFS=$OIFS # restore the IFS saved at the top

The second loop prints, as expected:

Num:17
Num:0
Num:0
tonioc
  • That $IN is invoking the split+glob operator. Here, you don't want the glob part (try on IN=*-*-/*-17.0.0 for instance), so you'd want to do set -o noglob before invoking it. See my answer for details. – Stéphane Chazelas Oct 10 '17 at 11:31
  • 1
    In general, try to avoid "saving" IFS and setting it globally. You really only want to change the value of IFS for when $IN is expanded, and you also don't want pathname expansion performed on the expansion. Further, OIFS=$IFS doesn't distinguish between the cases when IFS was set to an empty string, and when IFS was completely unset. – chepner Oct 10 '17 at 15:42
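The unset-versus-empty point can be demonstrated directly. In the sketch below, the variable names other than IFS and OIFS are illustrative; it counts how many words an unquoted expansion yields after each style of restore.

```shell
#!/bin/bash
words="a b"

# Naive save/restore: an unset IFS comes back as an empty IFS,
# which disables word splitting altogether.
unset IFS
OIFS=$IFS
IFS=-
IFS=$OIFS
set -- $words
naive_count=$#

# Save/restore that remembers whether IFS was set at all.
unset IFS
if [ -n "${IFS+set}" ]; then saved_ifs=$IFS; ifs_was_set=1; else ifs_was_set=; fi
IFS=-
if [ -n "$ifs_was_set" ]; then IFS=$saved_ifs; else unset IFS; fi
set -- $words
careful_count=$#

echo "naive: $naive_count field(s), careful: $careful_count field(s)"
```

After the naive restore the expansion stays a single field; after the careful one, default splitting is back and it yields two.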