The file contains

cat file
a
b
c
b
d

Trying to remove duplicate lines on SunOS via

awk '!x[$0]++' file

(as found in other postings) results in a syntax error:

awk: syntax error near line 1
awk: bailing out near line 1

What am I missing?

nickm

1 Answer

awk was first released in the late 70s in Unix V7.

Since then, it has undergone significant changes, some of them not backward compatible.

The GNU awk manual has a very informative section on the subject.

As with many other utilities, Solaris (contrary to most other Unices) has taken the stance of sticking with the older, obsolete implementation for its default awk utility, and making the newer versions available either under a different name (nawk) or at a different location (/usr/xpg4/bin/awk, which is not available by default in some stripped-down configurations of Solaris 11).

On Solaris, if you use the default environment, you generally get utilities that behave in an ancient/obsolete way. For instance, before Solaris 11, sh in the default environment would not be a standard shell, but a Bourne shell. A lot of other utilities (grep, sed, tail, df...) are not POSIX compliant, not even to the 1992 version of the standard.

Solaris is a POSIX (even Unix) certified system (at least in some configurations); however, POSIX/Unix only requires a system to be compliant in a given (documented) environment, which may not be the default one.

So, when you write code that needs to be portable to Solaris, you need either to write in a syntax from another age, or make sure you put yourself in a POSIX environment.

How to do so for a given version of those standards is documented in the standards(5) man page on Solaris.

So, for awk here, you can use:

awk 'x[$0]++ == 0'

That works in the 1978 awk from Unix V7 and in Solaris /bin/awk. In the original awk, a pattern could not be an arbitrary expression; it had to be a condition using a relational operator, like == here.

Or:

nawk '!x[$0]++'

Or:

/usr/xpg4/bin/awk '!x[$0]++'
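To put the three variants above side by side, here is a minimal sketch. The function names (dedupe_portable, pick_awk, dedupe_posix) are made up for illustration, and the helper assumes a POSIX sh (it relies on command -v and $(...), which the Solaris 10 /bin/sh Bourne shell lacks):

```shell
# Works with any awk, including the original one: the pattern is a
# plain relational expression, true only the first time a line is seen.
dedupe_portable() {
    awk 'x[$0]++ == 0' "$@"
}

# Hypothetical helper: name a newer-style awk if one can be found,
# falling back to plain "awk" (assumed to be POSIX-like off Solaris).
pick_awk() {
    if [ -x /usr/xpg4/bin/awk ]; then
        echo /usr/xpg4/bin/awk
    elif command -v nawk >/dev/null 2>&1; then
        echo nawk
    else
        echo awk
    fi
}

# Same filter, written with the negation form that old awk rejects.
dedupe_posix() {
    "$(pick_awk)" '!x[$0]++' "$@"
}
```

Either function can be called as dedupe_portable file or fed from a pipe; on the sample file both should print a, b, c, d, keeping the first occurrence of each line.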

Or more generally, to have saner (and more portable) versions of all the utilities (including awk):

PATH=`getconf PATH`:$PATH export PATH
: ^ false || exec sh "$0" ${1+"$@"} # rexec with POSIX sh if we're
                                    # a Bourne shell
awk '!x[$0]++'

Both /usr/bin/getconf PATH and /usr/xpg4/bin/getconf PATH will give you a $PATH like /usr/xpg4/bin:/usr/ccs/bin:/usr/bin:/opt/SUNWspro/bin, which will get you XPG4 (a superset of POSIX.1-1990, POSIX.2-1992, and POSIX.2a-1992) conformance. On Solaris 11, you also have a /usr/xpg6/bin/getconf, which will give you a PATH like /usr/xpg6/bin:/usr/xpg4/bin:/usr/ccs/bin:/usr/bin:/opt/SUNWspro/bin for XPG6 (SUSv3, a superset of POSIX 2001) conformance. Where XPG6 conflicts with XPG4, the differences are unlikely to affect you in practice.
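If you want to see what that buys you before changing PATH for a whole script, something along these lines may help. It is only a sketch: posix_awk is a name invented here, and it assumes a POSIX sh with command -v and a working getconf on the PATH.

```shell
# Hypothetical helper: report which awk would be found first if PATH
# were set to what "getconf PATH" documents for standard utilities.
# The subshell keeps the caller's own PATH untouched.
posix_awk() {
    (
        PATH=$(getconf PATH)
        command -v awk
    )
}
```

Running posix_awk on Solaris would be expected to print /usr/xpg4/bin/awk when that directory comes first in the getconf output; on other systems it simply prints wherever the standard awk lives.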

  • x[$0]++ == 0 worked with heirloom oawk; isn't that the original awk? – cuonglm Feb 08 '16 at 15:58
  • @cuonglm, that's a port of (Open)Solaris /bin/awk, so yes, not much different from the original awk like Solaris' /bin/awk. As I said, x[$0]++ == 0 should work with every awk including the original one on Unix V7 (tested on a PDP11 emulator with the V7 images from the Unix Heritage Society). – Stéphane Chazelas Feb 08 '16 at 16:40