I have a script whose main purpose is to gather some information and output it in a table. The primary part is an awk script:

awk '
    { 
      # do some stuff, including calculating dwt
      printf(format, a, b, c, d)
    }
    END {
      # pass on dwt
    }
' inputfile

The main purpose of the awk is to create and show the table. But there is a side value dwt that it also calculates that I need elsewhere in the main script, and I am trying to figure out the best way to pass it out without disrupting the table output.

There are two ways I know I can do this:

  1. Save the value to a temporary file with END { print dwt > "tempfile" }, then read it back in the shell with read dwt <tempfile; rm -f tempfile. But even with more care taken to avoid clobbering existing files than shown here, I would prefer to avoid this. If nothing else, I'd rather not leave temp files lying around just because a job got interrupted at the wrong time.
  2. Send the value to stdout as well, but flagged. Pipe stdout into a following routine that catches and directs the flagged output appropriately, but sends the rest on:
    awk '
       ...
        END { 
           print "dwt:" dwt 
        }
     ' inputfile | while IFS= read -r line; do
        if [[ $line = dwt:* ]]; then
           dwt="${line#dwt:}"
        else
           printf '%s\n' "$line"
        fi
     done

But that seems contrived and inelegant.

I am wondering if anyone knows a better method. I've experimented with using a different file descriptor, but so far have not managed to get that to work. I have not figured out how to get the information out of the file descriptor and into the dwt environment variable without disrupting stdout as well.
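
A minimal sketch of the file-descriptor route, assuming a POSIX-style shell such as bash or ksh and using the numbers 1 to 5 as stand-in input in place of inputfile. The table stays on the script's stdout, awk sends dwt to descriptor 3 through a cat child (the print ... | "cat >&N" idiom), and the command substitution captures only that descriptor. The descriptor numbers 3 and 4 are arbitrary choices.

```shell
# fd 4 is a copy of the script's real stdout for the duration of the group.
# Inside the command substitution, fd 1 is the capture pipe, so:
#   3>&1  makes fd 3 the capture pipe
#   1>&4  points awk's stdout back at the real stdout
#   4>&-  closes the spare copy
{
  dwt=$(
    printf '%s\n' 1 2 3 4 5 |
    awk '
      { print "table", $0; dwt += $1 }   # table rows go to awk stdout
      END { print dwt | "cat >&3" }      # side value goes to fd 3
    ' 3>&1 1>&4 4>&-
  )
} 4>&1

echo "dwt = $dwt"
```

Because the braces form a group rather than a subshell, the assignment to dwt is still visible after the redirection ends, and the table is streamed to stdout as awk produces it.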

  • How about stderr? – Jeff Schaller Oct 24 '19 at 16:12
  • @JeffSchaller - I do not want the table output sent to stderr, and for the value of dwt, that falls under "I've experimented with using a different file descriptor, but so far have not managed to get that to work." If you know how I can read dwt with a file descriptor without disrupting stdout, I would appreciate knowing it. – Paul Sinclair Oct 24 '19 at 16:15
  • https://unix.stackexchange.com/a/321682/117549 was what I was looking at... – Jeff Schaller Oct 24 '19 at 16:16
  • https://unix.stackexchange.com/a/289650/117549 will get the results to stderr (print ... | "cat >&2"), but I'm not yet sure how you'd integrate that into the overall script. – Jeff Schaller Oct 24 '19 at 16:31
  • Getting the value out to stderr is not an issue. It is reading it from stderr or another file descriptor into the dwt environment variable that is the issue I'm having. – Paul Sinclair Oct 24 '19 at 16:35
  • You might update your post with your desired destination for the dwt value (assuming it's an env variable). – Jeff Schaller Oct 24 '19 at 16:36
  • I edited my final comment to include that the problem I'm having with file descriptors was getting the value into an environment variable. The two sample solutions I am trying to avoid both already showed placing the value in an environment variable. – Paul Sinclair Oct 24 '19 at 16:40
  • What happens to the stdout of the awk process? Is it piped into some other program? – glenn jackman Oct 24 '19 at 16:46
  • @glennjackman - it is sent out of the script. The user may choose to read it, or do something else, which is why I want it to remain on stdout. – Paul Sinclair Oct 24 '19 at 16:47
  • One extremely ugly but simple way of doing it is to mis-use the exit code (e.g. END {exit dwt}). This will only "work" if dwt is an integer with a value from 0 to 255. It also abuses the concept of exit codes, and will cause the script to terminate if run with set -e and the ec is not immediately captured (e.g. with if or || or &&). – cas Oct 25 '19 at 03:30
  • @cas - Thank you, but the value could potentially be anywhere from 1 to over 1,000,000. Even its common range is into the thousands. – Paul Sinclair Oct 25 '19 at 03:34
  • No problem, it's an extremely ugly method and I don't recommend it. The only reason I mentioned it is because it is a side-channel way of exporting a value from awk. Using a tempfile is probably the easiest and least-hassle way to do it. If you have mktemp or similar available (to avoid the kind of race conditions made possible by tempfiles), I'd strongly recommend using that and passing the tempfile name to awk, rather than hard-coding it. – cas Oct 25 '19 at 03:40
  • btw, tempfiles can be cleaned up with a trap. – cas Oct 25 '19 at 03:41
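
The mktemp and trap suggestions in the comments above can be combined roughly as follows. This is only a sketch: it assumes mktemp is available (it is not required by POSIX), uses the numbers 1 to 5 as stand-in input, and the names tmp and out are illustrative.

```shell
# Create a uniquely named temp file and make sure it is removed on exit.
tmp=$(mktemp) || exit 1
trap 'rm -f "$tmp"' EXIT
# Turn common signals into a normal exit so the EXIT trap also runs
# in shells that do not fire it on signal delivery.
trap 'exit 1' HUP INT TERM

printf '%s\n' 1 2 3 4 5 |
awk -v out="$tmp" '
  { print "table", $0; dwt += $1 }   # table rows go to stdout as usual
  END { print dwt > out }            # side value goes to the temp file
'

read -r dwt < "$tmp"
echo "dwt = $dwt"
```

Passing the file name in with -v avoids hard-coding it inside the awk program, and the trap covers the interrupted-job case raised in the question.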

2 Answers

Here's one technique:

  • print dwt on stdout in the END
  • capture the awk output into an array
  • extract the last element of the array into a variable in the shell process
  • print the rest of the array
$ seq 5 > inputfile
$ readarray -t output < <(
    awk '
        { print "table", $0; dwt += $1 }
        END {print dwt}
    ' inputfile
)
$ dwt=${output[-1]}
$ echo "dwt = $dwt"
dwt = 15
$ unset 'output[-1]'
$ printf "%s\n" "${output[@]}"
table 1
table 2
table 3
table 4
table 5

OK, ksh without readarray: your shell script can look like:

awk '
    { 
      # do some stuff, including calculating dwt
      printf(format, a, b, c, d)
    }
    END {
      # pass on dwt
      print dwt
    }
' inputfile  |&
# ...........^^

typeset -a output
while IFS= read -r -p line; do output+=( "$line" ); done
# .................^^

dwt=${output[-1]}
unset 'output[-1]'
printf "%s\n" "${output[@]}"

# do stuff with $dwt ...

From my ksh93 man page:

The symbol |& causes asynchronous execution of the preceding pipeline with a two-way pipe established to the parent shell; the standard input and output of the spawned pipeline can be written to and read from by the parent shell [...] by using -p option of the built-in commands read and print described later.

glenn jackman

A bit clumsy, but you could print shell commands in awk and evaluate them in ksh:

eval $( echo -e 1\\nA\\n2\\n3\\n4 |
  awk  'NR == 1 {printf"%s%s%s;","export var1=",$0, "\n"}
        NR > 2 {printf"%s%s%s;","echo -e ", $0 , "\n" }'
)

Output:

2
3
4

$ echo $var1
1

Line-by-line shell evaluation might not be the simplest approach, though, and for some strings the echo can give funny results, so be aware of this danger. Hard quotes should avoid it:

eval $( echo -e 1\\nA\\n2\\n3\\n4 |
  awk  'NR == 1 {printf"%s%s%s;","export var1=",$0, "\n"}
        NR > 2 {printf"%s%s%s;","echo -e \x27", $0 , "\x27\n" }'
)
FelixJN
  • +1 Nice idea. A here-document with cat might be a better way to handle the output than echo statements. But in the end, it is just collecting all lines in one big string, then printing it. – Paul Sinclair Oct 25 '19 at 16:17
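
The here-document variant suggested in the comment above might look like the sketch below. It assumes dwt is always a plain number (so that it is safe to eval), that no table row consists solely of the delimiter EOF_TABLE (a name chosen here purely for illustration), and it uses the numbers 1 to 5 as stand-in input. \047 is the octal escape for a single quote; quoting the delimiter keeps the here-document body literal.

```shell
eval "$(
  printf '%s\n' 1 2 3 4 5 |
  awk '
    BEGIN { print "cat <<\047EOF_TABLE\047" }   # open a quoted here-document
    { print "table", $0; dwt += $1 }            # table rows become its body
    END {
      print "EOF_TABLE"                         # close the here-document
      print "dwt=" dwt                          # then emit the assignment
    }
  '
)"

echo "dwt = $dwt"
```

As the comment notes, the whole table is still collected into one string before anything is printed, so nothing appears on stdout until awk has finished.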