#!/usr/bin/env bash
awk '
  BEGIN {
    arr[A]=1;
    arr[B]=1;
    arr[C]=1;
    arr[E]=1;
    arr[J]=8;
    arr[Q]=10;
    print arr[J]
  }'

The above command outputs the most recently assigned value for any arr[subscript]: in this case 10, the value assigned to arr[Q] just before the print, rather than 8, the value of arr[J].

Also, I don't want to assign the shared value 1 to arr['A'], arr['B'], arr['C'] and arr['E'] one line at a time, as the script above does. I would rather pass an array of subscripts as one parameter and the common value as another to a function that handles the assignment.

Kusalananda

1 Answer


Array indexes are either integers or quoted strings in awk. What you are doing here is using variables (A, B, J, Q and so on) that have not yet been initialised. Their values are therefore empty.

You get the latest value assigned to the array because each assignment is overwriting the previous value. Using print arr[""] would also give you 10 back.
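
A quick way to see this for yourself is to count the subscripts that actually end up in the array. This is only a minimal sketch; the counter n and the loop variable k are names introduced here, not part of the question's script.

```shell
#!/bin/sh
# J and Q are uninitialised awk variables, so both subscripts are the empty string.
result=$(awk 'BEGIN {
    arr[J] = 8      # same as arr[""] = 8
    arr[Q] = 10     # overwrites arr[""] with 10
    n = 0
    for (k in arr) n++          # count the distinct subscripts
    print n ":" arr[""]
}')
printf '%s\n' "$result"
```

This should print 1:10, i.e. a single element, stored under the empty subscript, holding the last value assigned.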

Instead, use strings, as in arr["A"]=1.
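
With the subscripts quoted, the script from the question would look roughly like this (the same assignments as in the question, only the quoting changed):

```shell
#!/bin/sh
# Each subscript is now a distinct string constant instead of an empty variable.
result=$(awk 'BEGIN {
    arr["A"] = 1
    arr["B"] = 1
    arr["C"] = 1
    arr["E"] = 1
    arr["J"] = 8
    arr["Q"] = 10
    print arr["J"]
}')
printf '%s\n' "$result"
```

This prints 8, the value stored under "J".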

For your last issue: There is no real facility for initialising an awk array from the command line, but you may pass an "encoded" value that you "decode" in your BEGIN block (for example) to extract the keys and values for an array.

Example which passes a specially delimited list as a single string and parses it to extract the indexes and values to use:

awk -v vals="A=1:B=1:C=1:E=1:J=8:Q=10" '
    BEGIN {
        n = split(vals, v, ":")
        for (i = 1; i <= n; ++i) {
            split(v[i], a, "=")
            arr[a[1]] = a[2]
        }

        print arr["J"]
    }'

Using separate keys and values:

awk -v keys="A:B:C:E:J:Q" -v vals="1:1:1:1:8:10" '
    BEGIN {
        nk = split(keys, k, ":")
        nv = split(vals, v, ":")

        if (nk != nv) exit 1

        for (i = 1; i <= nk; ++i)
            arr[k[i]] = v[i]

        print arr["J"]
    }'
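
To come closer to the function asked for in the question, the same split() idea can be wrapped in a user-defined function. This is only a sketch: set_all and its parameter names are made up for illustration, and the subscripts are still passed as one delimited string, since awk has no array literals.

```shell
#!/bin/sh
result=$(awk '
    # set_all is a hypothetical helper: it assigns value to every
    # subscript listed in the sep-delimited string keys.
    # parts, n and i are extra parameters used as local variables.
    function set_all(arr, keys, sep, value,    parts, n, i) {
        n = split(keys, parts, sep)
        for (i = 1; i <= n; i++)
            arr[parts[i]] = value
    }

    BEGIN {
        set_all(arr, "A:B:C:E", ":", 1)   # common value for four subscripts
        set_all(arr, "J", ":", 8)
        set_all(arr, "Q", ":", 10)

        print arr["A"], arr["E"], arr["J"], arr["Q"]
    }')
printf '%s\n' "$result"
```

Arrays are passed to awk functions by reference, so the assignments made inside set_all are visible in the caller. The output here should be 1 1 8 10.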

This is quite a limited way of passing an "array" into awk, but it works for simple values that one has complete control over. The examples would break for any data that embeds colons (and equal signs for the 1st example) in the actual data.

Passing data with -v also means that backslashes in the data have to be treated specially, because awk processes escape sequences in the assigned value (\n becomes a newline, so to pass the two-character string \n you would have to use "\\\n" or '\\n').
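
The difference can be checked by looking at the length of the value awk ends up with. The sketch below also shows passing the value through the environment and reading it from ENVIRON, which is not escape-processed; that alternative is an addition here, not something the examples above use, and the variable name s is arbitrary.

```shell
#!/bin/sh
# -v processes escape sequences: \n becomes a single newline character.
len_escaped=$(awk -v s='\n' 'BEGIN { print length(s) }')

# Doubling the backslash keeps the two characters \ and n.
len_literal=$(awk -v s='\\n' 'BEGIN { print length(s) }')

# Values read from ENVIRON are taken as-is, with no escape processing.
len_environ=$(s='\n' awk 'BEGIN { print length(ENVIRON["s"]) }')

printf '%s %s %s\n' "$len_escaped" "$len_literal" "$len_environ"
```

This should print 1 2 2.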


As an aside, you can write a "pure awk script" like this:

#!/usr/bin/awk -f

BEGIN { 
   # some initialisations
}

some_expression { some code }

END {
    # more here
}
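
A rough sketch of how such a script could be created and run follows. The temporary file, the /x/ pattern and the count variable are placeholders chosen for this illustration.

```shell
#!/bin/sh
# Write a small stand-alone awk script to a temporary file.
script=$(mktemp)
cat > "$script" <<'EOF'
#!/usr/bin/awk -f
BEGIN { count = 0 }
/x/   { count++ }
END   { print count }
EOF

# Once executable, the script can be run directly as "$script",
# provided awk really lives at /usr/bin/awk on the system.
chmod +x "$script"

# Running it through "awk -f" works wherever awk is on PATH.
result=$(printf 'x\ny\nxx\n' | awk -f "$script")
printf '%s\n' "$result"

rm -f "$script"
```

Two of the three input lines contain an x, so this should print 2.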
Kusalananda