Using Raku (formerly known as Perl_6):
~$ OLDIFS="$IFS"
~$ IFS=" "
~$ string=11111001
~$ read -a array <<< "$(raku -e lines.comb.print <<<"$string")"
~$ declare -p array
declare -a array='([0]="1" [1]="1" [2]="1" [3]="1" [4]="1" [5]="0" [6]="0" [7]="1")'
~$ IFS="$OLDIFS"
~$ echo -n "$IFS" | raku -e 'dd($*IN.slurp);'
" \t\n"
Unicode in Raku:
According to the docs, "Raku applies normalization by default to all input and output except for file names, which are read and written as UTF8-C8; graphemes, which are user-visible forms of the characters, will use a normalized representation." So the code/characters below give the following results:
~$ OLDIFS="$IFS"
~$ IFS=" "
~$ string1="palmarés,Würdigung,Témoignages d'honneur"
~$ read -a array1a <<< "$(raku -e lines.subst\(/"\s"/,「_」\).split\(「,」\).print <<<"$string1")"
~$ echo "${array1a[@]}"
palmarés Würdigung Témoignages_d'honneur
~$ declare -p array1a
declare -a array1a='([0]="palmarés" [1]="Würdigung" [2]="Témoignages_d'\''honneur")'
~$ read -a array1b <<< "$(raku -e lines.comb.print <<<"${array1a[2]}")"
~$ echo "${array1b[@]}"
T é m o i g n a g e s _ d ' h o n n e u r
~$ declare -p array1b
declare -a array1b='([0]="T" [1]="é" [2]="m" [3]="o" [4]="i" [5]="g" [6]="n" [7]="a" [8]="g" [9]="e" [10]="s" [11]="_" [12]="d" [13]="'\''" [14]="h" [15]="o" [16]="n" [17]="n" [18]="e" [19]="u" [20]="r")'
~$ IFS="$OLDIFS"
~$ echo -n "$IFS" | raku -e 'dd($*IN.slurp);'
" \t\n"
https://docs.raku.org/language/unicode#Normalization
https://github.com/MoarVM/MoarVM/blob/master/docs/strings.asciidoc
[code tested on: GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin14)]
EDIT_1: The 'right' way to handle strings with embedded newlines appears to be slurping the whole string instead of reading it line by line with Raku's -ne command-line flag, i.e. raku -e slurp.comb.print instead of raku -ne .comb.print. Then $IFS can be tuned to create an array using (or ignoring) newlines.
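A minimal bash-only sketch of that $IFS tuning follows. It does not call Raku: the variable combed is a hypothetical stand-in that assumes the Raku step emits the characters separated by spaces, with the embedded newline preserved. The variable names are illustrative, not part of the answer above.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the output of the Raku step on a two-line string:
# characters separated by spaces, with the embedded newline kept as a character.
combed=$'a b \n c d'

# Plain `read -a` stops at the first newline, so only the first line is captured.
IFS=" " read -r -a first_line <<< "$combed"

# Ignoring newlines: read up to EOF (-d '') and split on both space and newline.
# read returns non-zero at EOF, hence the `|| true`.
IFS=$' \n' read -r -d '' -a no_newlines <<< "$combed" || true

# Keeping newlines: split on spaces only, so the newline becomes its own element.
# printf '%s' avoids the trailing newline that a here-string would append.
IFS=" " read -r -d '' -a with_newlines < <(printf '%s' "$combed") || true

declare -p first_line no_newlines with_newlines
```

Prefixing IFS to the read command limits the change to that single command, so no save/restore via OLDIFS is needed in this sketch.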
EDIT_2: As noted by @StephaneChazelas and @roaima, asterisks (*s) are problematic due to file-globbing. Here's code showing that the quoting here (and above) is correct:
~$ string_star="*11111001"
~$ echo "$string_star"
*11111001
~$ read -a array_star <<< "$(raku -e slurp.comb.print <<<"$string_star")"
~$ echo "${array_star[@]}"
* 1 1 1 1 1 0 0 1
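To see why the quoting matters, here is a small bash-only sketch (no Raku involved) contrasting read -a, which only word-splits, with an unquoted expansion in an array assignment, which is also subject to pathname expansion. The scratch directory, the file name some_file, and the variable combed are assumptions made for the illustration.

```shell
#!/usr/bin/env bash
# Work in a scratch directory holding exactly one file, so an unquoted *
# has something to match.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch some_file

# Hypothetical stand-in for the space-separated output of the Raku step.
combed="* 1 1 0"

# read -a performs word splitting only; the * stays a literal asterisk.
IFS=" " read -r -a safe <<< "$combed"

# An unquoted expansion is split AND globbed; the * turns into some_file.
globbed=( $combed )

declare -p safe globbed

# Clean up the scratch directory.
cd - >/dev/null && rm -rf "$tmpdir"
```

If an unquoted expansion cannot be avoided, set -f disables pathname expansion for the current shell.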
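Double-quoting is essential (above). However, as an extra measure Raku can be used to delete * characters by adding a call such as .subst(...) (here substituting with nothing). Note that .subst replaces only the first match unless the :g (global) adverb is given, which is sufficient for the single leading * in this example. Work-in-progress code below (consider applying the same approach to delete other characters that are special in bash, such as \, [, and ?):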
~$ read string_nostar <<< "$(raku -e slurp.subst\(「*」\).print <<<"$string_star")"
~$ read -a array_nostar <<< "$(raku -e slurp.comb.print <<<"$string_nostar")"
~$ echo "${array_nostar[@]}"
1 1 1 1 1 0 0 1
arr[1]=0 with a string of 11.... – Jeff Schaller Jun 15 '21 at 15:55