8

When the status code is useless, is there any way to construct a pipeline based on output from stdout?

I'd prefer the answer not address the use-case but the question in the scope of shell scripting. What I'm trying to do is find the most-specific package available in the repository by guessing the name based on country and language codes.

Take for instance this,

  • PACKAGE1=hunspell-en-zz
  • PACKAGE2=hunspell-en

The first guess is more appropriate but it may not exist. In this case, I want to return hunspell-en ($PACKAGE2) because the first option hunspell-en-zz ($PACKAGE1) does not exist.

pipelines of apt-cache

The command apt-cache returns success (which the shell defines as exit code zero) whenever the command is able to run. From the docs of apt-cache:

apt-cache returns zero on normal operation, decimal 100 on error.

That makes using the command in a pipeline more difficult. Normally, I expect the package-search equivalent of a 404 to result in an error (as would happen with curl or wget). I want to search to see if a package exists, and if not fall back to another package if it exists.

This prints nothing, because the first command returns success, so the right-hand side of the || never runs:

apt-cache search hunspell-en-zz || apt-cache search hunspell-en

apt-cache search with two arguments

This returns nothing, as apt-cache ANDs its arguments,

apt-cache search hunspell-en-zz hunspell-en

From the docs of apt-cache

Separate arguments can be used to specify multiple search patterns that are and'ed together.

So as one of those arguments clearly doesn't exist, this returns nothing.

The question

What is the shell idiom to handle conventions like those found in apt-cache, where the return code is useless for the task and success is determined only by the presence of output on STDOUT?

Similar to

  • make find fail when nothing was found

    They both stem from the same problem. The chosen answer there mentions grep -z, which sadly isn't an applicable solution here and is use-case specific. There is no mention of an idiom or of constructing a pipeline without using null-termination (not an option on apt-cache).

Evan Carroll
  • Are you sure that hunspell-en exists? Anyway, you can use apt-cache policy and grep for ^$PACKAGENAME:. – AlexP Dec 26 '17 at 23:02
  • @AlexP these are only examples hunspell-en does not exist because they package with country names, hunspell-ar does exist and there are no country-name packages. I need to find the most accurate package for a given country and language. – Evan Carroll Dec 26 '17 at 23:03
  • apt-cache policy 'hunspell-en*' | grep '^\S'. – AlexP Dec 26 '17 at 23:05
  • "fall back to another package if it exists". So if user do search foo bar but foo doesn't exist, it should print bar? – nxnev Dec 26 '17 at 23:15
  • @nxnev I updated it to be more explicit. – Evan Carroll Dec 26 '17 at 23:25
  • 2
    find is just like apt-cache in this respect - useless return code, success is based on output. – muru Dec 27 '17 at 03:00
  • 1
    Yes, I agree they're both stemming from the same problem. The chosen answer there mentions -z, which sadly isn't a solution here, so the use-case-specific answer isn't applicable. And there is no mention of an idiom or constructing a pipeline without using null-termination (not an option on apt-cache) – Evan Carroll Dec 27 '17 at 03:21
  • 1
    @EvanCarroll the null termination is entirely optional. I only used it because it's the safest way to deal with filenames, so one would expect find to be used with -print0 and so grep with -z. Since apt-cache isn't giving null-terminated output, you don't need -z. – muru Dec 27 '17 at 03:55
  • @muru I did not get that from the answer there at all, but I've clarified here too. That was the trick I was looking for anyway. – Evan Carroll Dec 27 '17 at 04:01

6 Answers

5

Create a function that takes a command and returns true iff it has some output.

r() { local x=$("$@"); [ -n "$x" ] && echo "$x"; }

( ( r echo -n ) || echo 'nada' ) | cat      # Prints 'nada'
( ( r echo -n foo ) || echo 'nada' ) | cat  # Prints 'foo'

So for this use case it'll work like this:

r apt-cache search hunspell-en-zz || r apt-cache search hunspell-en
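
The same idea can be sketched for a list of candidates rather than a fixed pair. This is only an illustration: first_with_output and fake_search are hypothetical names, and fake_search stands in for apt-cache search (it always exits 0 and knows a single made-up entry) so that the sketch runs on any system.

```shell
# Stand-in for `apt-cache search` (hypothetical): always exits 0 and prints
# a line only for the one made-up package it knows about.
fake_search() {
  case $1 in
    hunspell-en) echo 'hunspell-en - English dictionary (made-up entry)' ;;
  esac
  return 0
}

# Hypothetical helper: run "$cmd candidate" for each candidate in order and
# print the output of the first one that produces any.
first_with_output() {
  local cmd out candidate
  cmd=$1
  shift
  for candidate in "$@"; do
    out=$("$cmd" "$candidate")
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  return 1   # no candidate produced output
}

# Most specific guess first, fallback second.
first_with_output fake_search hunspell-en-zz hunspell-en
```

Since the helper takes a single command name, using it with the real tool would presumably mean passing a small wrapper function around apt-cache search.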
Chris Davies
  • Note that r printf '\n\n\n' would return false. With shells other than zsh, r printf '\0\0\0' would also return false. So would r printf '\0a\0b\0c' with some shells. – Stéphane Chazelas Jan 04 '18 at 21:59
3

As far as I know, there is no standard way to deal with those cases where the success of a command is determined by the presence of output. You can write workarounds, though.

For example, you can save the output of the command in a variable and then check if that variable is empty or not:

output="$(command)"

if [[ -n "${output}" ]]; then
  # Code to execute if command succeeded
  :  # no-op placeholder; an empty branch is a syntax error
else
  # Code to execute if command failed
  :
fi
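
One caveat with this kind of check: command substitution strips trailing newlines, so output made up only of newlines is treated as empty. A small sketch, with printf as a stand-in for the real command:

```shell
# Output consisting only of newlines looks empty after $(...).
output=$(printf '\n\n\n')
if [ -n "$output" ]; then
  echo 'has output'
else
  echo 'treated as empty'
fi

# Any other character is enough for the test to succeed.
output=$(printf 'foo\n')
[ -n "$output" ] && echo "got: $output"
```

For apt-cache search this should not matter in practice, since each result is a non-empty line, but it is worth keeping in mind when applying the pattern to other commands.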

I think this answers the question in a general way, but for apt-cache search specifically, some solutions come to mind.

I have a script which makes package management easier. Some of its functions are these:

search() {
  local 'package' 'packages'
  packages="$( apt-cache search '.*' | cut -d ' ' -f '1' | sort )"
  for package; do
    grep -F -i -e "${package}" <<< "${packages}"
  done
}


search_all() {
  local 'package'
  for package; do
    apt-cache search "${package}" | sort
  done
}


search_description() {
  local 'package' 'packages'
  packages="$( apt-cache search '.*' | sort )"
  for package; do
    grep -F -i -e "${package}" <<< "${packages}"
  done
}


search_names_only() {
  local 'package'
  for package; do
    apt-cache search --names-only "${package}" | sort
  done
}

These let you do multiple searches in a single command. For example:

$ search hunspell-en-zz hunspell-en
hunspell-en-au
hunspell-en-ca
hunspell-en-gb
hunspell-en-med
hunspell-en-us
hunspell-en-za

Each function searches the database in a different way, so the results may vary depending on which function you use:

$ search gnome | wc -l
538
$ search_all gnome | wc -l
1322
$ search_description gnome | wc -l
822
$ search_names_only gnome | wc -l
550
nxnev
2

I wouldn't call this elegant but I think it might do the job:

search_packages () {
    local packages=("$@")   # quoted so each argument stays intact
    local results=() package
    for package in "${packages[@]}"; do
        # One array element per output line, i.e. per matching package
        mapfile -t results < <(apt-cache -n search "$package")
        if [[ "${#results[@]}" -eq 0 ]]; then
            echo "$package not found."
        elif [[ "${#results[@]}" -eq 1 ]]; then
            :  # placeholder: do stuff with "$package"
        else
            echo "Warning! Found multiple packages for ${package}:"
            printf '\t-> %s\n' "${results[@]}"
        fi
    done
}

I don't have a Debian machine to test on, unfortunately. I've included the -n ("names-only") option of apt-cache to try to limit the search results, as it looks like you are mostly sure of what you are searching for.

Can be run like:

$ search_packages hunspell-en-zz hunspell-en
$ my_packages=('hunspell-en-zz' 'hunspell-en')
$ search_packages "${my_packages[@]}"
jesse_b
  • 1
    This is exactly what I was thinking of doing, however I was looking for something as little bit more elegant, so let's see if anyone has anything else clever (like a more abstract solution away from the use-case) if not I'll mark it as chosen. – Evan Carroll Dec 26 '17 at 23:44
  • 1
    Ideally, apt-cache would just return something less stupid. – Evan Carroll Dec 26 '17 at 23:45
  • 1
    @EvanCarroll, Have you tried messing with the -q quiet option? The man page isn't very verbose on it but maybe it changes the return values? – jesse_b Dec 26 '17 at 23:46
  • 1
    still returns 0. =( – Evan Carroll Dec 26 '17 at 23:48
2

Muru clarified this in the comments: grep returns a status of 1 if no line matches, which includes the case of no input at all. So you can add grep . to the pipeline, and if there is no input to match the pattern ., it'll change the status code:

( ( echo -n | grep . ) || echo 'nada' ) | cat      # prints 'nada'
( ( echo -n foo | grep . ) || echo 'nada' ) | cat  # prints 'foo'

For the use case, it looks like this. In the example below, there is no -pl-pl, so it falls back and returns hunspell-pl:

apt-cache search hunspell-pl-pl | grep . || apt-cache search hunspell-pl

Or,

apt-cache search hunspell-en-US | grep . || apt-cache search hunspell-en

There is an -en-US so it returns hunspell-en-us.
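
To see the mechanism without a Debian system, here is a sketch with a stand-in function (fake_search is hypothetical and replaces apt-cache search). The exit status of a pipeline is that of its last command, so grep . decides whether the right-hand side of || runs. The last two lines illustrate the difference between grep . and grep '^' on an empty line.

```shell
# Stand-in for `apt-cache search` (hypothetical): always exits 0 and prints
# a line only for the one made-up package it knows about.
fake_search() {
  case $1 in
    hunspell-pl) echo 'hunspell-pl - Polish dictionary (made-up entry)' ;;
  esac
  return 0
}

# No output from the first search, so grep exits 1 and the fallback runs.
fake_search hunspell-pl-pl | grep . || fake_search hunspell-pl

# grep . needs a line with at least one character; grep '^' accepts an
# empty line as well.
printf '\n' | grep . >/dev/null || echo "grep . : no match"
printf '\n' | grep '^' >/dev/null && echo "grep '^': match"
```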


Evan Carroll
  • grep . returns true if the input contains at least one (fully delimited with some implementations) line that contains at least one (well formed with most implementations) character and will otherwise remove the empty lines. grep '^' would work better at checking that there is some output, though with some implementation could still return false if the input is one non-delimited line (and could remove that line, or with other implementations, return true but add the missing newline). Some grep implementations also choke on the NUL character. – Stéphane Chazelas Jan 04 '18 at 21:48
2

You could define a function:

has_output() {
  LC_ALL=C awk '1;END{exit!NR}'
}

And then:

if cmd | has_output; then
  echo cmd did produce some output
fi

Some awk implementations may choke on NUL characters in the input.

Contrary to grep '^', the above would be guaranteed to work on an input that doesn't end in a newline character, but would add the missing newline.
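
As a rough illustration of how the awk version behaves in the || form from the question (printf and echo stand in for the real commands):

```shell
has_output() {
  LC_ALL=C awk '1;END{exit!NR}'
}

# Empty input: has_output exits 1, so the right-hand side runs.
printf '' | has_output || echo 'nada'

# Non-empty input is passed through and the fallback is skipped.
echo foo | has_output || echo 'nada'

# Input without a final newline gains one on the way through,
# so wc -l counts one complete line.
printf 'bar' | has_output | wc -l
```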

To avoid that and to be portable to systems where awk chokes on NUL, you could use perl instead:

has_output() {
  perl -pe '}{exit!$.'
}

With perl, you could also define a variant that handles arbitrary files more gracefully:

has_output() {
  PERLIO=:unix perl -pe 'BEGIN{$/=\65536} END{exit!$.}'
}

That bounds memory usage, for instance with files that contain no newline characters, such as big sparse files.

You could also create variants like:

has_at_least_one_non_empty_line() {
  LC_ALL=C awk '$0 != "" {n++};1; END{exit!n}'
}

or:

has_at_least_one_non_blank_line() {
  awk 'NF {n++};1; END{exit!n}'
}

(beware the definition of blank varies between awk implementations, some where it's limited to space and tab, some where it also includes ASCII vertical spacing characters like CR or FF, some where it considers the locale's blanks)
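
To make the difference between these variants concrete, here is a small comparison sketch. The check helper is hypothetical and exists only for this demonstration; it feeds a printf format to one of the functions and reports whether that function considered it output.

```shell
has_output() { LC_ALL=C awk '1;END{exit!NR}'; }
has_at_least_one_non_empty_line() { LC_ALL=C awk '$0 != "" {n++};1; END{exit!n}'; }
has_at_least_one_non_blank_line() { awk 'NF {n++};1; END{exit!n}'; }

# Hypothetical helper: $1 is a printf format producing the input,
# $2 is the name of the function to test.
check() {
  if printf "$1" | "$2" >/dev/null; then
    echo "$2: yes"
  else
    echo "$2: no"
  fi
}

# An empty line, a line holding one space, and a line holding text.
for input in '\n' ' \n' 'x\n'; do
  printf 'input: %s\n' "$input"
  for f in has_output has_at_least_one_non_empty_line has_at_least_one_non_blank_line; do
    check "$input" "$f"
  done
done
```

With awk's default field splitting, an empty line counts as output for has_output only, and a line holding a single space also passes the non-empty check but not the non-blank one.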

Ideally, on Linux, you'd want to use the splice() system call to maximize performance. I don't know of a command that would expose it but you could always use python's ctypes:

has_output() {
  python -c 'if 1:
    from ctypes import *
    import sys
    l = CDLL("libc.so.6")
    ret = 1
    while l.splice(0,0,1,0,65536,0) > 0:
      ret = 0
    sys.exit(ret)'
}

(note that either has_output's stdin or stdout (or both) has to be a pipe for splice() to work).

0

I would suggest using very basic builtin features of the shell:

ck_command() { [ -n "$("$@")" ] ; }

Here is the simplest test case:

ck_command echo 1 ; echo $?

ck_command echo ; echo $?

Then you could easily use it with the || construct you are used to:

ck_command command_1 || ck_command command_2

This simple function will work as you would like with your apt-cache behaviour, whatever the number of arguments.

dan
  • Except this loses STDOUT in the process, ck_command echo 'asdf' | cat outputs nothing. – Evan Carroll Dec 27 '17 at 22:13
  • 2
    → EvanCarroll: this wasn't in your § "The question". To also achieve this output conservation, look at the very elegant and simple answer from @roaima: https://unix.stackexchange.com/a/413344/31707 . – dan Dec 28 '17 at 16:12