12

I need to check that a directory (let's call it dir) contains exactly one of two files (let's call them filea and fileb): not neither, and not both.

The ideal solution would be to use a XOR operation between the predicates:

if [ -f dir/filea ] ^ [ -f dir/fileb ]
then
    echo Structure ok
    # do stuff
fi

However, the shell does not support ^ as an XOR operator between commands, and the [ command does not have an -X or --xor option the way it has -a and -o. Using a negated equality did not work either:

if ! [ -f dir/filea -eq -f dir/fileb ]
# or
if ! [ -f dir/filea = -f dir/fileb ]

Is there some way to achieve this, without resorting to a full-blown AND/OR expression like

if { [ -f dir/filea ] || [ -f dir/fileb ]; } && ! { [ -f dir/filea ] && [ -f dir/fileb ]; }

?

The last expression is becoming unreadable, and of course my actual paths are much longer than dir/fileX.

EDIT: I am targeting a POSIX-compliant version of sh, but I am open to extensions specific to other shells (out of curiosity mostly, but also because I use bash or ksh93 on other projects and this could be useful there).

joH1
  • 908

6 Answers

17

The exit code of a test ... or [ ... ] command is the test result. You can use variables to store the result of individual tests and compare them later.

[ -f dir/filea ]
testA=$?
[ -f dir/fileb ]
testB=$?
if [ "$testA" -ne "$testB" ]
then
   echo "exactly one file"
else
   echo "both files or none"
fi

It might be possible that [ results in different non-zero exit codes for the two tests.
According to the specification in https://pubs.opengroup.org/onlinepubs/007904875/utilities/test.html, an exit code >1 means "An error occurred." You have to define what should happen if [ reports an error.

To avoid this you can use conditional variable assignments similar to Kusalananda's answer ...

testA=0
testB=0
[ -f dir/filea ] || testA=1
[ -f dir/fileb ] || testB=1
if [ "$testA" -ne "$testB" ]
then
   echo "exactly one file"
else
   echo "both files or none"
fi

... or use negation (as mentioned in comments) to make sure the value is either 0 or 1.
(See "2.9.2 Pipelines" - "Exit Status" in https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_02)

! [ -f dir/filea ]
testA=$?
! [ -f dir/fileb ]
testB=$?
if [ "$testA" -ne "$testB" ]
then
   echo "exactly one file"
else
   echo "both files or none"
fi

Both variants handle an error the same as "file does not exist".
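If the check is needed in more than one place, the negation variant can be wrapped in a function. This is only a sketch: the name exactly_one_file and the variables first and second are made up for illustration, and, like the variants above, it treats an error from [ the same as "file does not exist".

```shell
# exactly_one_file is a hypothetical helper name, not part of the answer.
# It succeeds (returns 0) when exactly one of the two paths is a regular file.
exactly_one_file() {
    ! [ -f "$1" ]
    first=$?     # 1 if $1 is a regular file, 0 otherwise (including errors)
    ! [ -f "$2" ]
    second=$?    # same convention for $2
    [ "$first" -ne "$second" ]
}

if exactly_one_file dir/filea dir/fileb
then
   echo "exactly one file"
else
   echo "both files or none"
fi
```

Note that first and second are global here; strict POSIX sh has no local.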

ilkkachu
  • 138,973
Bodo
  • 6,068
  • Great idea, thanks. It has the advantage of allowing to reuse the variables as flags if needed. – joH1 May 31 '22 at 13:55
5

The following uses the arithmetic XOR operator ^ on the values of a and b. These variables are 1 if the corresponding file exists and is a regular file; otherwise, they are zero.

a=0
b=0

[ -f dir/filea ] && a=1
[ -f dir/fileb ] && b=1

[ "$(( a ^ b ))" -eq 1 ] && echo OK

Another approach is to count the number of times the test succeeds:

ok=0

[ -f dir/filea ] && ok=$(( ok + 1 ))
[ -f dir/fileb ] && ok=$(( ok + 1 ))

[ "$ok" -eq 1 ] && echo OK
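As a quick sanity check of the arithmetic approach, here is a minimal sketch that prints the truth table of ^ for flags restricted to 0 and 1. The function name xor_table is made up for this illustration.

```shell
# xor_table is a hypothetical name used only for this illustration.
# It prints a ^ b for every combination of the 0/1 flags a and b.
xor_table() {
    for a in 0 1; do
        for b in 0 1; do
            echo "a=$a b=$b xor=$(( a ^ b ))"
        done
    done
}

xor_table
```

Only the two rows where exactly one flag is set give 1. That is why the flags are normalised to 0 or 1 instead of storing raw exit codes: a ^ is bitwise, so values such as 0 and 2 would XOR to 2 and fail the -eq 1 comparison.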

Kusalananda
  • 333,661
  • We don't yet know whether @joH1 is looking for a strict POSIX sh answer, but if so, it should be noted that $(( a ^ b )) isn't POSIX compliant; you need $(( $a ^ $b )) instead - which is sad, because using $ inside $((...)) opens a can of precedence worms. – Martin Kealey Jun 01 '22 at 01:15
  • 2
    @MartinKealey, that's not true: $(( a ^ b )) is POSIX and generally preferred (things like $(($a-$b)) yield undefined behaviour if $b is negative while $((a-b)) is fine for instance). It's true that some (now very) old versions of some ash-based shells used not to support it, and it was not always very clear in the specification. – Stéphane Chazelas Jun 01 '22 at 05:23
  • @StéphaneChazelas I stand corrected, thank you. I was looking at an ancient version of POSIX, so I'm glad that has been remedied (as usual, by sensibly accepting common practice) – Martin Kealey Jun 01 '22 at 05:43
5

It's worth noting that compound commands have an exit status, just like any other command, so you can use if a; then b; else c; fi the same way as you'd use a ? b : c in most C-like languages.

Translating a ^ b, which is equivalent to a ? !b : b, into shell, we get:

if if [ -f fileA ]
   then ! [ -f fileB ]
   else   [ -f fileB ]
   fi
then
   echo ONE file present
else
   echo NO or BOTH files present
fi

(Note that if if is not a typo.)

I'll grant that having to repeat one of the tests is a bit verbose, so I've aligned them vertically to make it clear that they're identical bar the ! inversion.

On the plus side, this avoids messing with temporary variables holding $? or similar, and it's (marginally) the most performant.

Lastly, I'd note that the -a and -o options to the test command are problematic: they can lead to parsing ambiguities which lead to false results. Just use two test commands separated by && or || instead.
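Because the nested if is itself a command with an exit status, it can also serve as the whole body of a function. The following is a sketch under that assumption; the name one_of_two is hypothetical and chosen only for this example.

```shell
# one_of_two is a hypothetical name. The function's exit status is simply
# the exit status of the if command, i.e. of whichever branch was run.
one_of_two() {
   if   [ -f "$1" ]
   then ! [ -f "$2" ]
   else   [ -f "$2" ]
   fi
}

if one_of_two fileA fileB
then
   echo ONE file present
else
   echo NO or BOTH files present
fi
```

This keeps the repeated test in one place and still needs no temporary variables.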

4

Not that I would necessarily recommend that, but in zsh, you could do:

if ()(($# == 1)) dir/file[ab](N-.); then
  print "there is one regular file matching the file[ab] pattern"
fi

-. is to restrict the match to regular files after symlink resolution like your [ -f ... ] does. Use (foo|bar) for arbitrary file names.

  • () body args is an anonymous function, the body being the ksh-style (($# == 1)) arithmetic expression which evaluates to true if the number of arguments to the function is 1.
  • dir/file[ab](N-.) is a glob expansion with glob qualifiers, N for Nullglob¹ so that the glob expands to nothing if there's no match, - for the next qualifiers to apply after symlink resolution, and . to restrict to files of type regular (as seen above).

¹ bash has copied the nullglob option from zsh, but not the glob qualifiers yet; ksh93 has a ~(N) glob operator equivalent to zsh's N glob qualifier, but no glob qualifier nor anonymous function either.

  • Never used zsh, so the syntax is highly hermetic to me.. I guess that is why you wouldn't recommend it ^^ – joH1 May 31 '22 at 14:47
  • In Bash one could write: shopt -s nullglob extglob ... if ( set -- @(fileA|fileB) ; (($#==1)) ) ; then echo ONE file ; else echo MORE OR FEWER THAN ONE ; fi – Martin Kealey Jun 01 '22 at 01:05
2

Another option is to use a language that supports xor. e.g. with perl:

$ mkdir dir
$ touch dir/filea
$ perl -le 'print (((-f $ARGV[0]) xor (-f $ARGV[1])) ? 0 : 1)' dir/filea  dir/fileb
0

You can, of course, use command substitution to capture the output into a variable.

Or if you want the result as an exit code that you can use directly with if, &&, or ||:

$ perl -le 'exit (((-f $ARGV[0]) xor (-f $ARGV[1])) ? 0 : 1)'  dir/filea  dir/fileb
$ echo $?
0

Note: it's important to realise that perl's definition of true (non-zero/non-empty/defined) and false (0/empty string/undefined) is different to/the opposite of shell's definition of true (0) or false (non-zero). That's why the ternary operators above are written to return 0 when the xor evaluates as true and 1 when it evaluates as false.

That's not very important in a trivial script like this, but if you wanted to do more boolean calculations in the perl script then you'd need to keep working with perl's definition of true & false until the last moment when you return the true/false value shell expects.

BTW, if you think this is a bit odd, you're right. It is. It does make sense, though, and there's a good reason for it. But Shell is the odd one out. Most languages define true as non-zero and false as zero (many have specific boolean data types, or only allow integers to be treated as booleans, with conversion required for other types; perl is very flexible in what it will interpret as true/false).

Anyway, Shell does it the other way around because it mostly/often needs to know if a program exited successfully (0) or with an error code (from 1 up to 127, depending on the program, each indicating a different error condition). It's not really true vs false, it's success vs error-code. Any other program which needs to test the exit code returned from an external program has to do the same, regardless of the language's own internal definition of true/false.
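To make the success vs error-code convention concrete, here is a small sketch. The helper name report is made up for this illustration; it runs whatever command it is given and shows how if interprets the exit status.

```shell
# report is a hypothetical helper: it runs its arguments as a command and
# prints whether the shell treats the resulting exit status as success.
report() {
    if "$@"; then
        echo "success (exit status 0)"
    else
        echo "failure (exit status $?)"
    fi
}

report true              # a command that exits with status 0
report sh -c 'exit 3'    # a command that exits with status 3
```

Any non-zero status takes the else branch, whatever the specific code is.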

Note 2: it's not advisable to run perl (or any external program, really) repeatedly in a shell loop. The startup overhead might only be a few milliseconds on a fast, modern system, but that adds up with hundreds or thousands of loop iterations and makes it extremely, tediously slow. If you absolutely must do it in a shell loop, best to find another way. Otherwise, if you need to do the test in a loop and can't afford to fork external programs repeatedly, it's best to rewrite the whole script (or the time-sensitive parts of it) in perl or awk or python or anything-that-isn't-shell. See Why is using a shell loop to process text considered bad practice? for more on this topic.

cas
  • 78,579
  • BTW, not many languages have unary operators for file-related tests (e.g. -f, -x, -s, etc.; see perldoc -f -X, or just help test in bash. The tests available are very similar but not the same). There may be others, but perl and various implementations of shell (and the standalone test AKA [ binary, which exists mostly for legacy reasons as (almost?) all modern shells have test/[ as a built-in) are the only ones I can think of right now. In most other languages, you'd have to use stat() (see man 2 stat if you have dev manpages installed) and process the result yourself. – cas Jun 01 '22 at 09:25
  • Perl doesn't need an explicit exit as its exit status will correctly reflect the truthiness of the last command, having regard for what that means in both Perl and Shell. So it suffices to write: perl -e '1 == grep { -f $_ } @ARGV' fileA fileB. – Martin Kealey Jun 02 '22 at 06:49
1

To take an entirely different track from my other answer, this is the sort of problem where a shell function could make things clearer at the calling site:

if exactly_n_files 1 fileA fileB
then echo Exactly one file
else echo No or multiple files
fi

if exactly_n_files 23 someglobhere
then echo "Exactly 23 files match glob"
else echo "More or fewer than 23 files match glob"
fi

For this you would want exactly_n_files defined thus:

exactly_n_files() {
  countdown=$1
  shift
  for f do
    if [ -f "$f" ] ; then
      countdown=$(( countdown - 1 ))
      [ "$countdown" = -1 ] && return 1
    fi
  done
  [ "$countdown" = 0 ]
}

Obviously, if this isn't constrained to POSIX shell, we should use local to constrain the internal variables, and we could make other simplifications:

exactly_n_files() {
  local countdown=$1
  shift
  for f do
    [[ -f $f ]] &&
      (( --countdown < 0 )) &&
        return 1
  done
  (( countdown == 0 ))
}

If you want to count the files, rather than simply check that there's exactly one, then a more general approach would be:

count_files() {
  count=0
  for f do
    if [ -f "$f" ] ; then
      count=$(( count + 1 ))
    fi
  done
}

count_files someglobhere
case $count in
  0) echo NO FILES ;;
  1) echo ONE FILE ;;
  2) echo TWO FILES ;;
  *) echo MANY FILES
esac