1

I have a function (not created by me) that outputs a series of strings inside of quotes:

command <args>

“Foo” “FooBar” “Foo Bar” “FooBar/Foo Bar”

When I try to assign it to an array (Bash; BSD/Mac), instead of 4 elements, I get 7. For example, for ${array[2]} I should get “Foo Bar”, but instead, I get ”Foo with the next element being Bar”. Any element without the space works correctly (i.e. ${array[0]} = “Foo”)

How can I assign each of these quoted strings, including any embedded spaces, to its own array element, when the strings are themselves separated by spaces(?)?

Right now, I am thinking of using sed/awk to “strip” out the quotes, but I think there should be a better and more efficient way.

Currently, I am assigning the output of the command (looks exactly like the output above including the quotes) to a temporary variable then assigning it to an array.

_tempvar="$(command <args>)"

declare -a _array=(${_tempvar})

Allan
  • Are those really U+201C LEFT DOUBLE QUOTATION MARK and U+201D RIGHT DOUBLE QUOTATION MARK quote characters like in your question or are they in reality the simple ASCII U+0022 QUOTATION MARK? – Stéphane Chazelas Jun 15 '23 at 18:51
  • @StéphaneChazelas - What I typed were the simple quotation marks, but to be completely accurate as to what gets output, I’ll have to get back to you on that (on an iPad at the moment and can’t check.) – Allan Jun 15 '23 at 18:54
  • Are those quoted strings one per line? Because that would mean you could just focus on the lines, and not the quotes. Which is much simpler. – ilkkachu Jun 15 '23 at 18:58
  • If you complain that your array assignment does not work as expected then you really should show to us how you did the assignment. – Hauke Laging Jun 15 '23 at 19:01
  • @HaukeLaging, see edit at bottom – Allan Jun 15 '23 at 19:08
  • Those 'style quotes' are the default behaviour on iOS devices; there's an option to change them to normal. I used to have the same issue. Since OP typed this on an iPad, it's likely the reason. – Nickotine Jun 23 '23 at 02:33

3 Answers

4

In bash, you'd use readarray to read the lines of a file or of the output of some command into an array; however, that was only added in version 4.0, released in 2009, and macOS still comes with bash 3.2.

macOS comes with zsh though, which is a much better shell.

To get the non-empty lines of the output of a command, you'd split it with the f parameter expansion flag (which splits on linefeed), and to delete the " (U+0022), “ (U+201C) and ” (U+201D) characters, use the ${var//pattern[/replacement]} operator, for instance:

#! /bin/zsh -
array=( ${(f)${"$(cmd)"//['"“”']}} ) || exit

If those are strings quoted with the U+0022 ASCII character and the quoting is compatible with the way quotes work in the zsh language, you can also use its z/Z flag (to tokenise text the same way as the language parser does) and Q flag (to remove quotes) instead of splitting by line (which would assume quoted strings can't span several lines).

#! /bin/zsh -
array=( ${(Q)${(Z[n])"$(cmd)"}} ) || exit

Your

declare -a array=(${tempvar})

in bash uses the split+glob operator, which is invoked when an expansion is left (usually unintentionally) unquoted. It splits the output on characters of the special $IFS parameter (which by default in bash contains space, tab and newline) using a complex algorithm, and the resulting words are then subject to globbing, aka filename generation (which is hardly ever desirable).
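As a rough illustration of both halves of split+glob, the sketch below uses a hypothetical sample_cmd function as a stand-in for the real command (assuming it prints one quoted string per line) and a scratch directory created with mktemp. The names sample_cmd, split_default, globbed and workdir are made up for this example.

```shell
# Hypothetical stand-in for the real command (assumption: one string per line).
sample_cmd() {
  printf '%s\n' '"Foo"' '"FooBar"' '"Foo Bar"' '"FooBar/Foo Bar"'
}

IFS=$' \t\n'                 # bash's default value, set explicitly for the demo
tempvar="$(sample_cmd)"

# Split part: the unquoted expansion is cut at every space, tab and newline,
# so "Foo Bar" ends up in two separate elements.
split_default=( ${tempvar} )
echo "elements after default splitting: ${#split_default[@]}"

# Glob part: a word containing * is replaced by the matching file names.
workdir=$(mktemp -d)
touch "$workdir/one" "$workdir/two"
pattern="$workdir/*"
globbed=( ${pattern} )
echo "elements after globbing: ${#globbed[@]}"

rm -rf "$workdir"
```

The second half is why setting $IFS alone is not enough when the command output may contain wildcard characters.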

Here, split+glob could be used to get you the non-empty lines of the output of your command, but you'd need to tune it first:

IFS=$'\n' # split on newline only:
set -o noglob # disable the glob part which we don't want
array=( $(cmd) ) # split+glob

Then you can remove the "“” characters with ${var//pattern[/replacement]} as well, but in bash that has to be done subsequently, as it can't chain parameter expansion operators, and the syntax (inherited from ksh93) is a bit more awkward:

array=( "${array[@]//['"“”']}" )

Note that contrary to the zsh approach, that won't handle things like "foo \"bar\" and \\backslash".
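The bash steps above could be wrapped in a function so that the $IFS and noglob changes do not leak into the rest of the script. This is only a sketch: read_quoted_lines, sample_cmd and the global array lines are hypothetical names, the command is assumed to print one quoted string per line, and the quote characters are removed with one substitution each rather than a bracket expression, which is a variation on the snippet above that avoids depending on the locale.

```shell
# Hypothetical stand-in for the real command (assumption: one string per line).
sample_cmd() {
  printf '%s\n' '"Foo"' '"FooBar"' '"Foo Bar"' '"/*/*/"'
}

# Runs the given command and stores its non-empty output lines, with the
# quote characters removed, in the global array "lines".
read_quoted_lines() {
  local IFS=$'\n'              # split on newline only; restored on return
  local dq='"' ldq='“' rdq='”'
  local had_noglob=no
  case $- in *f*) had_noglob=yes ;; esac

  set -o noglob                # disable the glob part of split+glob
  lines=( $("$@") )
  [ "$had_noglob" = yes ] || set +o noglob

  # One substitution per quote character, applied to every element.
  lines=( "${lines[@]//$dq/}" )
  lines=( "${lines[@]//$ldq/}" )
  lines=( "${lines[@]//$rdq/}" )
}

read_quoted_lines sample_cmd
printf '<%s>\n' "${lines[@]}"
```

A global array is used for the result because bash 3.2 has no namerefs. Like the snippet above, this does not interpret backslash-escaped quotes inside the strings.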

1

You get 7 elements because word splitting is occurring, caused by the spaces.

Set IFS=$'\n' before adding the strings to the array; then you'll get 4 elements, but with the double quotes still included.

Example:

IFS=$'\n'

arr=($(command <args>))

If you want 4 elements without quotes do this:

IFS=$'\n'

arr=($(command <args> | sed s'#"##'g))

Full example:

IFS=$'\n'

tst.txt has your strings:

arr=($(cat tst.txt | sed s'#"##'g))

declare -p arr

Output:

declare -a arr=([0]="Foo" [1]="FooBar" [2]="Foo Bar" [3]="FooBar/Foo Bar")
Nickotine
  • Setting $IFS is not enough. You also need to disable globbing if you're to use split+glob. See my answer – Stéphane Chazelas Jun 23 '23 at 06:07
  • He wanted 4 elements right? Or what did I miss? check my array output – Nickotine Jun 23 '23 at 15:00
  • curious on why you need to disable globbing, is it because on array creation op uses a variable instead of directly using the cmd? – Nickotine Jun 23 '23 at 15:23
  • globbing is done upon unquoted expansions in shells such as bash. See for instance: Security implications of forgetting to quote a variable in bash/POSIX shells (see the What about when you do need the split+glob operator? section there in particular). Here for instance, try with a command that outputs "/*/*/" – Stéphane Chazelas Jun 23 '23 at 19:16
  • alright I see, but couldn't the whole globbing problem be avoided by not doing this: _tempvar="$(command <args>)" followed by declare -a _array=(${_tempvar}), and instead just doing declare -a _array=("$(command <args>)")? Setting the variable like that seems redundant since it's used once. – Nickotine Jun 23 '23 at 20:15
  • I appreciate an answer that is direct, to the point and doesn’t go off on a tangent into “better” or “different” shells. I am on both macOS (some older than Catalina) and FreeBSD and switching shells or installing new ones isn’t an option - therefore I like to keep things simple. – Allan Jun 24 '23 at 18:38
  • thanks, I noticed you didn't want to use sed you could use the bash substitution that @StéphaneChazelas suggested instead, also @alan I set IFS like in the answer in my bash_profile then you don't have to quote anything and I've never had an issue with it, the security implications are actually special cases one example being using bash for api stuff – Nickotine Jun 24 '23 at 20:20
  • the only time you have to worry about quoting when you have IFS set like in the answer is when you have a wildcard like * that you want to be interpreted literally, again a special case and usually if I use * I'd want it to expand, perhaps if you're using regexes then you'd want to quote so it gets interpreted literally – Nickotine Jun 24 '23 at 20:44
  • @StéphaneChazelas have you heard of set noglob? – Nickotine Nov 08 '23 at 05:33
  • set noglob is the csh syntax to disable globbing. In bash it's set -o noglob like in any POSIX-like shell as shown in my answer or set -f like in the Bourne shell. – Stéphane Chazelas Feb 04 '24 at 07:55
  • Thank you for correcting yourself @StéphaneChazelas that is very much appreciated, as it was you who have suggested the set noglob option on my bash scripts many times when I'd set IFS to my most suited value. Nice to see you bowed your head down and corrected your erroneous information, that is very commendable, takes a real man to do that, especially since this was an error made by you countless times last year. – Nickotine Feb 05 '24 at 22:01
-2
readarray -t array <<< "$(echo $'"a a"\n"b   b"\n"c   c"')"
declare -p array
declare -a array=([0]="\"a a\"" [1]="\"b   b\"" [2]="\"c   c\"")
readarray -t array <<< "$(command <args>)"
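Since readarray needs bash 4.0 or later, a read loop is one possible equivalent on bash 3.2. The sketch below assumes one quoted string per line and strips a single leading and trailing ASCII quote from each. sample_cmd and items are hypothetical names standing in for the real command and the target array.

```shell
# Hypothetical stand-in for the real command (assumption: one string per line).
sample_cmd() {
  printf '%s\n' '"a a"' '"b   b"' '"c   c"'
}

items=()
# IFS= and -r keep leading/trailing blanks and backslashes intact; the
# "|| [ -n ... ]" part also keeps a last line that lacks a trailing newline.
while IFS= read -r line || [ -n "$line" ]; do
  line=${line#\"}            # drop one leading double quote, if present
  line=${line%\"}            # drop one trailing double quote, if present
  items+=( "$line" )
done < <(sample_cmd)

declare -p items
```

No unquoted expansion is involved here, so neither $IFS nor the noglob option needs to be changed.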
Hauke Laging