2

I'm learning awk today, but I can't get even the simplest scripts to work.

#!/usr/bin/env -S awk -f
BEGIN { }
{ }
END { }

this outputs BEGIN: command not found

or even

#!/usr/bin/env -S awk -f
{}

this outputs {}: command not found

When I run $ /usr/bin/env -S awk -f directly, the awk executable is found and displays its default usage output.

$ awk --version reports awk version 5.0.1, on NixOS 19.09.

I need to use /usr/bin/env because NixOS does not follow the traditional FHS directory hierarchy.

I suspect I'm missing something obvious, but looking at awk tutorials and SO questions has not given me any clue so far.

EDIT: the command line I use to launch the script

ls -l | . testawk.sh
  • Does it work if you use a proper #!-line like #!/usr/bin/awk -f? Very few Unices can handle #!-lines with more than a single argument (yours has three). – Kusalananda Jan 21 '20 at 09:56
  • As explained, NixOS does not follow the FHS, and there is no /usr/bin/awk. /usr/bin/env awk will point to the correct place, which is /run/current-system/sw/bin/awk. So it cannot work. – Stephane Rolland Jan 21 '20 at 09:58
  • Then use #!/run/current-system/sw/bin/awk -f. I don't know why the env -S route does not work, or if it is supposed to work. – Kusalananda Jan 21 '20 at 09:59
  • Does the script actually work if you run it as awk -f scriptname on the command line? I see no reason why it fails to be honest. – Kusalananda Jan 21 '20 at 10:03
  • @Kusalananda following the Multiple arguments question you suggested, I tried the workaround at the bottom of the page: if I put the one-liner #!/run/current-system/sw/bin/awk BEGIN {} it works. Same for #!/usr/bin/env awk BEGIN {} . I have no idea why it does not work otherwise. That's puzzling. – Stephane Rolland Jan 21 '20 at 10:05
  • You're not writing the script on a Windows machine or with a text editor that produces DOS text files, are you? – Kusalananda Jan 21 '20 at 10:07
  • Just for the sake of the argument: Are you trying to run the script "standalone", directly from bash, as in ~$ ./scriptname.sh? – AdminBee Jan 21 '20 at 10:07
  • @AdminBee you got it. I was sourcing the script: . test.sh or even ls -l | . test.sh, and not running ./test.sh. – Stephane Rolland Jan 21 '20 at 10:09
  • Is it common to "execute" awk scripts relying on the shebang? I'd always explicitly invoke awk with the script as an argument. – Peter - Reinstate Monica Jan 21 '20 at 23:44
  • @Peter-ReinstateMonica I have edited my answer including your suggestion of calling awk directly with a script without the #! line. – Stephane Rolland Jan 22 '20 at 00:21
  • @Kusalananda nothing to do with it, the OP is using env -S (split 1st argument eg. "-S foo bar" on spaces). Whether that's a good idea in general it's a different problem (env -S is not portable), but it comes in handy sometimes. –  Jan 22 '20 at 15:37
  • @Peter-ReinstateMonica yes, it's quite common. –  Jan 22 '20 at 15:37
  • @mosvy Hadn't seen it before, and doesn't work on my system, so I learnt something. – Kusalananda Jan 22 '20 at 15:47
  • @Peter-ReinstateMonica no, it's not common and it's best avoided in favor of simply calling awk inside your shell script just like you'd call any other standard UNIX tool since calling awk from the shebang has no useful benefits while calling awk within the shell script has the benefits of being able to separate the shell script arguments into awk variables, awk file names, etc. before the call to awk. – Ed Morton Jan 24 '20 at 15:03

3 Answers

8

Sourcing is not the same as executing. Specifically, sourcing expects a list of commands that can be executed in the current shell. The following is from bash's help .:

.: . filename [arguments]

Execute commands from a file in the current shell.

Read and execute commands from FILENAME in the current shell. The entries in $PATH are used to find the directory containing FILENAME. If any ARGUMENTS are supplied, they become the positional parameters when FILENAME is executed.

So, when you run . file, your shell reads the file and executes each command it finds. This means that the shebang line is ignored and treated like a regular comment. Therefore your shell, and not awk, was attempting to execute BEGIN.

To avoid this, you should execute the script instead of sourcing it. If, for some reason, you just have to source it, write an awk command in the script:

awk '
BEGIN { }
{ }
END { }'

Then, you can do

ls | . ./a.awk 

Although I can't really think of why you would ever want to.
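The difference between the two can be seen with a throwaway script. The following is only a minimal sketch, assuming bash, an awk on the PATH, and an env that understands -S; hello.awk and the temporary directory are hypothetical names created just for the illustration:

```shell
#!/usr/bin/env bash
# Hypothetical demo file: hello.awk exists only for this sketch.
tmpdir=$(mktemp -d)
cat > "$tmpdir/hello.awk" <<'EOF'
#!/usr/bin/env -S awk -f
BEGIN { print "hello from awk" }
EOF
chmod +x "$tmpdir/hello.awk"

# Executing: the kernel reads the #! line and hands the file to awk.
executed=$("$tmpdir/hello.awk" </dev/null)

# Sourcing: the current shell reads the file itself. The #! line is just a
# comment, so the shell tries to run BEGIN as a command and fails.
sourced=$( . "$tmpdir/hello.awk" 2>&1 ) || true

printf 'executed: %s\n' "$executed"
printf 'sourced:  %s\n' "$sourced"
rm -rf "$tmpdir"
```

The first line printed should carry awk's output, while the second should carry the shell's "command not found" complaint about BEGIN, which is the error from the question.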


As an aside, you should be aware that . (or source, in bash) looks for file names in your $PATH by default. So, if you run . foo, and have a foo file in the current directory and a foo file in any directory in your $PATH, then the file that will be sourced is the one in your $PATH and not the one in your current directory. To avoid this, always give a path that contains a slash when sourcing: . ./foo.
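The lookup order described above can be checked with two files of the same name. This is a sketch under assumptions: the directory layout and the variable origin are hypothetical, and the behaviour shown is that of bash with its default settings:

```shell
#!/usr/bin/env bash
# Hypothetical layout: two throwaway files, both named foo.
work=$(mktemp -d)
mkdir "$work/bin" "$work/cwd"
echo 'origin=PATH' > "$work/bin/foo"
echo 'origin=cwd'  > "$work/cwd/foo"

cd "$work/cwd"
PATH="$work/bin:$PATH"

. foo                 # no slash: the name is searched for in $PATH first
from_bare=$origin
. ./foo               # contains a slash: the path is used exactly as written
from_slash=$origin

echo "bare: $from_bare, with slash: $from_slash"

cd - >/dev/null
rm -rf "$work"
```

Note that the file found through $PATH does not need to be executable to be sourced.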

terdon
4

Never use a shebang to call awk as that robs you of the ability to separate your script arguments into parts that should be done in shell vs parts that should be done in awk and to separate awk arguments from awk variable assignments. Just write your script as:

#!/usr/bin/env bash

/usr/bin/env awk '
BEGIN { }
{ }
END { }
' "$@"

There will be times when you want to modify it to do things like set awk variables:

#!/usr/bin/env bash

rs="$1"
fs="$2"
shift 2

/usr/bin/env awk -v RS="$rs" -F "$fs" '
BEGIN { }
{ }
END { }
' "$@"

Changes like this should be trivial, as they are above, but you can't make them if you're invoking awk with a shebang.
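As a concrete illustration of that split between shell arguments and awk arguments, here is a small sketch. The function name first_field and the sample input are hypothetical; the function body follows the same pattern as the scripts above, with the shell consuming the first argument and passing the rest to awk:

```shell
#!/usr/bin/env bash
# Hypothetical helper: $1 is the field separator, any remaining
# arguments are input files handed to awk untouched.
first_field() {
    fs=$1
    shift
    awk -F "$fs" '
    { print $1 }   # print the first field of every record
    ' "$@"
}

# With no file arguments, awk reads standard input.
result=$(printf 'a:b\nc:d\n' | first_field ':')
printf '%s\n' "$result"
```

The same body could be saved as a standalone shell script, with the separator given on the command line before the file names.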

Ed Morton
  • The awk tutorial I followed warns that a \ is needed for line breaks. I don't know if that applies to some systems or to some shells. Lots of examples in this tutorial were using csh. Wouldn't your solutions need a \ at the end of each line? – Stephane Rolland Jan 22 '20 at 16:26
  • You do not need \ at the end of a line in an awk script in any shell you should be writing scripts in. If you're writing scripts in CSH then there are all sorts of twists and loopholes you need to consider since that's not what CSH is designed for (see https://www.grymoire.com/Unix/CshTop10.txt or just google "why not csh") so putting backslashes at the end of awk or other scripts within a CSH script would be one of those twists. The fix isn't to use awk in the shebang though, it's to use a bourne-like shell in the shebang. – Ed Morton Jan 22 '20 at 16:32
3

As mentioned in the comments by AdminBee, the script was being called the wrong way (by sourcing its content).

The correct way is:

ls -l | ./test.awk

Or also, more simply, without relying on the #! line:

ls -l | awk -f ./test.awk
  • Plus it's misleading to suffix an awk script with ".sh". – Peter - Reinstate Monica Jan 21 '20 at 23:43
  • Good remark. I should use .awk . And I edited the answer this way. – Stephane Rolland Jan 22 '20 at 00:12
  • If you have a script to be called from the shell then it's a command and so should have no suffix, just like the standard UNIX commands (head, grep, awk, tr, sed, etc.) don't have a suffix. If, on the other hand, you have a file that needs to be interpreted by some command (e.g. an awk script like BEGIN{print "hello world"} stored in a file) THEN it should have a suffix to indicate which command to use to interpret that script. – Ed Morton Jan 22 '20 at 18:07
  • So if you're writing a shell script, name it foo, not foo.sh, and execute it as ./foo; if you're writing an awk script, name it bar.awk and execute it as awk -f bar.awk. If you have an awk script stored in a shell script, it's still a shell script and should be named as such so that, among other things, in future you can replace the awk script with a ruby or perl script without affecting any other scripts calling that shell script. – Ed Morton Jan 22 '20 at 18:08