Can getopts arguments be combined with other input?

Question

I am writing a script that would take a string as input along with other options the user can select by using arguments as indicators. So in other words, something like this:

./script "My input string" -pxz

or

./script -pxz "My input string"

But I run into the following issue. It seems the getopts command stops working after a non-getopts-style argument is entered. Check this out, for example:

#!/bin/bash
while getopts "ab" arg; do
    echo $arg received
done

When we run it we get:

$ ./example.sh -a -b -a string -b
a received
b received
a received
$

It stops at "string" and won't continue. The getopts command returns a nonzero exit status because "string" is not the kind of argument it expects to read, and the while loop ends. I've tried adding a second while getopts but that doesn't do anything. The "read head" of getopts is still stuck on that "string" argument and will simply exit with a nonzero exit status again.

What this means is that seemingly, I wouldn't be able to have the user enter their string first and then the options, because getopts would never be able to read past the string and get there. So I can't do this:

./script "My input string" -pxz

On the other hand, if I told the user to enter first their options and then the string I'd have the problem of figuring out how to retrieve the string. Normally, if there were no options, I'd do this:

string="$1"

But now since there are options and I don't know how many there are, I no longer know what position the string occupies. So how would I be able to retrieve it?

Now, ideally my script should actually be able to work in both of those ways or even a combination of the two. It should be able to handle input like:

./script -p "My input string" -xz

So how can I go about resolving this issue?

ilkkachu · Answer 1 · 2023-11-01T17:20:41.553

If you're on Linux, and you have the getopt command from util-linux (or Busybox), it can reorder the arguments much like GNU tools do. It also supports whitespace in arguments, option-arguments, optional option-arguments, and -- as the marker that ends option processing.

The standard behaviour is to stop option processing when the first non-option is seen, so other implementations probably don't support mixed options and non-options. Even the GNU ones refrain from supporting it if POSIXLY_CORRECT is set.

Anyway, with the script below:

$ ./opts.sh -a 123 "arg with space" more -bc args -- -also -args
option -a with arg '123'
option -b
option -c
remaining arguments (5):
<arg with space> <more> <args> <-also> <-args>

opts.sh:

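#!/bin/bash
# "getopt -T" exits with status 4 only on the enhanced (util-linux style) getopt
getopt -T
if [ "$?" -ne 4 ]; then
    echo "wrong version of 'getopt' installed, exiting..." >&2
    exit 1
fi
# getopt prints the options first, then "--", then the non-options, quoted for eval
params="$(getopt -o a:bc -- "$@")"
eval set -- "$params"
while [ "$#" -gt 0 ]; do
    case "$1" in
    -a)
        echo "option -a with arg '$2'"
        shift 2;;
    -b)
        echo "option -b"
        shift;;
    -c)
        echo "option -c"
        shift;;
    --)
        # end of options; everything left is a non-option argument
        shift
        break;;
    *)
        echo "something else: '$1'"
        shift;;
    esac
done
echo "remaining arguments ($#):"
printf "<%s> " "$@"
echo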

(Note that the above script only works with implementations of getopt compatible with the util-linux enhancements. Other, "traditional" implementations have issues with whitespace in arguments, among other things. The getopt -T test checks that the installed implementation is compatible.)

(Ok, I suppose you could install the util-linux getopt on a non-Linux system too, but I wouldn't expect to see it on one by default. I'm also not sure if it depends on the GNU libraries to do its thing.) — ilkkachu, Aug 10 '22 at 14:02

Isaac D. Cohen · Answer 2 · 2022-08-10T22:44:23.907

This page (Parsing script arguments after getopts) helped me a lot in my quest to understand this. But I'll attempt to explain it all here from the ground up.

When anything runs from the terminal, it receives an array containing all the arguments it was called with. The first argument, which occupies position 0 in this array, is always the name of the program or script itself (sometimes along with its path - however it was typed in the terminal). In bash scripts we can access this array through the variables numbered $0 - $n where n is the number of arguments we received. So for example, if our script is called like this:

./script --the -great "brown fox" --------jumped

then in our script:

$0 = ./script

$1 = --the

$2 = -great

$3 = brown fox

$4 = --------jumped

So if we do:

echo $3

we get:

brown fox

Now, there is an internal variable that the shell uses to keep track of where this array starts. All the number variables (except $0) are offsets from the start of this array. The shift command can be used to shift over the start of the array so that it begins one or more positions down from where it formerly did. This would make $1 reference what used to be $2. So in our example, if we did:

shift
echo $1
echo $2

we'd get

-great
brown fox

The only exception to this whole shifting thing is $0. $0 never moves. It will always contain the path you used to call the script.

The shift command can also be given a number. This would be the number of positions it shifts the array. So shift 2 is equivalent to using shift twice (i.e. the number is 1 by default).
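
To make the effect of shift with a number concrete, here is a small sketch. The function name demo_shift is made up for illustration; inside a function, $1, $2 and so on refer to the function's own arguments, but shift behaves the same way as at the top level of a script.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): shows the positional
# parameters before and after "shift 2".
demo_shift() {
    echo "before: \$1=$1 \$2=$2 count=$#"
    shift 2    # drop the first two parameters; the old $3 becomes $1
    echo "after:  \$1=$1 \$2=$2 count=$#"
}

demo_shift --the -great "brown fox" --------jumped
# before: $1=--the $2=-great count=4
# after:  $1=brown fox $2=--------jumped count=2
```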

The getopts command has its own variable called OPTIND that it uses to keep track of where it's up to. When getopts successfully fetches an argument (or finishes fetching the last letter option in an argument like -abc) it adds one to OPTIND to make it point to the next argument on the command line. The next time getopts is invoked it will begin parsing at that next argument.
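
If you want to watch OPTIND move, a sketch like the following may help. The function name trace_optind is hypothetical, and declaring OPTIND local is just a common idiom I'm assuming here so that the function starts from 1 every time it is called.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): prints OPTIND after
# every option that getopts fetches.
trace_optind() {
    local OPTIND=1 arg    # a local OPTIND restarts parsing on each call
    while getopts "abc" arg; do
        echo "got -$arg, OPTIND is now $OPTIND"
    done
    # getopts stopped at the first non-option; OPTIND is its index
    echo "loop ended, OPTIND=$OPTIND"
}

trace_optind -a -b string -c
# got -a, OPTIND is now 2
# got -b, OPTIND is now 3
# loop ended, OPTIND=3
```

Note that the trailing -c is never reached, which is exactly the behaviour described in the question.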

Now, here's the fun part: The number in OPTIND also refers to an offset from the start of the array. So the shift command can be used to affect getopts:

#!/bin/bash
getopts "abc" arg
echo received $arg
shift
getopts "abc" arg
echo received $arg

If we run this with:

./script -a -b -c

we get:

received a
received c

What happens is this: At the start of every script OPTIND holds the value 1. This means that when we first invoke getopts it begins to read from argument 1, that is, the argument at position 1 from the start of the array (basically it reads $1). After our first invocation of getopts the OPTIND variable holds the value 2, pointing to the -b argument. However, then we do our shift, which makes $2 now refer to what was formerly $3. So now, without having changed the value of OPTIND, it now points to -c.

How does this help us? Well, we can use the value in OPTIND (which is accessible to us) to shift over the array skipping over all the arguments that were already parsed and making $1 the first argument that was not yet parsed. Consider this:

#!/bin/bash
while getopts "abc" arg; do
    echo "received $arg"
done
# skip everything getopts has already parsed
shift $((OPTIND-1))
echo "$1"

If we run it:

./script -a -b -a -b string -a

the echo at the bottom will always output string no matter how many options we place before it. This is because after the while loop is done OPTIND will always point to that string. We then use an arithmetic expansion to pass to shift the number that is one less than the position of the string, meaning we shift $1 to its position. Thus, echoing $1 will always echo the string.
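
The same idea can be wrapped in a function, roughly like this. The name first_operand is invented for the example, and the local OPTIND is an assumption on my part to keep repeated calls independent of each other.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): parses leading -a/-b/-c
# options and prints the first argument that is not an option.
first_operand() {
    local OPTIND=1 arg
    while getopts "abc" arg; do
        :    # real option handling would go here
    done
    shift "$((OPTIND - 1))"    # drop everything getopts consumed
    if [ "$#" -gt 0 ]; then
        printf '%s\n' "$1"     # quoted, so whitespace survives
    else
        return 1               # only options were given
    fi
}

first_operand -a -b -a -b "My input string" -a
# My input string
```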

We can also continue reading options after encountering the string. The only thing to be aware of is that OPTIND still retains its value even after the shift. So if OPTIND is up to 5 and we shift so that $1 is the string and then we immediately go back to getopts it will start reading at 5 arguments after the string, rather than at the argument right after, as we want it to. To take care of this we can simply set OPTIND to 2:

OPTIND=2

or shift and then set it to 1 (so it will read what is now $1 - the argument right after the string):

shift
OPTIND=1

Putting it all together, here is some code that will read all options and make an array of all other inputs:

#!/bin/bash
while [[ $# -gt 0 ]]; do
    while getopts "abc" arg; do
        echo "received $arg"
    done

    shift $((OPTIND-1))

    # when we get here we know we have either hit the end of all
    # arguments, or we have come to one that is an
    # input string (not an option)
    # to see which it is, we test whether $1 is set
    if [[ ${1+set} = set ]]; then
        INPUTS+=("$1")
        shift
    fi

    # restart getopts at what is now $1
    OPTIND=1
done
echo "Here is the array:"
echo "${INPUTS[@]}"
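
For reuse, the loop above could be packaged as a function along these lines. This is only a sketch: parse_mixed, OPTS and INPUTS are names I picked for illustration, and it records the option letters in an array instead of echoing them.

```shell
#!/bin/bash
# Hypothetical function (a repackaging of the loop above, not a standard
# tool): collects option letters in OPTS and everything else in INPUTS,
# wherever they appear on the command line.
parse_mixed() {
    local OPTIND=1 arg
    OPTS=()
    INPUTS=()
    while [ "$#" -gt 0 ]; do
        while getopts "abc" arg; do
            OPTS+=("$arg")
        done
        shift "$((OPTIND - 1))"   # drop what getopts consumed
        if [ "$#" -gt 0 ]; then   # the next argument is a non-option
            INPUTS+=("$1")
            shift
        fi
        OPTIND=1                  # restart getopts at the new $1
    done
}

parse_mixed -a "My input string" -bc "second input"
declare -p OPTS INPUTS
```

After this call OPTS holds a, b and c, and INPUTS holds the two strings with their whitespace intact. One caveat, as far as I can tell: because getopts is restarted after every non-option, a -- only protects the single argument that follows it, not the rest of the command line.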

The way you use the unquoted $* and $1 in that last script, any arguments with whitespace (or glob characters) would break. When is double-quoting necessary? and http://mywiki.wooledge.org/WordSplitting. Also the loop of for i in $* repeats once for each original argument, even if the first call to getopts eats all arguments. This creates spurious empty elements in the INPUTS array. — ilkkachu, Aug 09 '22 at 21:36
You'll need to print the array with something other than just echo to see those effects, though. Try e.g. declare -p INPUTS or something like if [[ ${#INPUTS[@]} -gt 0 ]]; then printf "<%s>\n" "${INPUTS[@]}"; fi (printf by itself would print the <...> at least once without the guard) — ilkkachu, Aug 09 '22 at 21:37
Now, to fix those issues, I think you could change the for to while [[ $# -gt 0 ]]; do, and add a guard around the array extension, e.g. if [[ ${1+set} = set ]]; then INPUTS+=("$1"); shift; fi; OPTIND=1 — ilkkachu, Aug 09 '22 at 21:38
$# is the number of positional parameters, or command line arguments, it becomes zero after all the shifts and the loop finishes. The shift $((OPTIND-1)) you have there removes all the arguments processed by getopts, and then you can manually shift off the one you put in INPUTS. (I added that above, but didn't explicitly mention it.) — ilkkachu, Aug 10 '22 at 08:31
