Is there a standard tool that can filter a text stream based on the result of executing a command?

Consider grep, for example: it filters a text stream by regex. The more general problem is extending the filtering condition. For example, I may want to select all files matching some condition (find has plenty of checks available, but they cover only a subset of what is possible), or just use another program to filter my data. Consider this pipeline:

produce_data | xargs -L1 bash -c '[ -f "$0" ] && ping -c1 -w1 "$1" >/dev/null && echo "$0" "$@"'

This pipeline is contrived, but it shows the general approach: I can use any bash one-liner to test each line. In this example I want the lines that consist of an existing file and a reachable host. I would like a standard tool that can do it like this:

produce_data | super_filter -- bash -c '[ -f "$0" ] && ping -c1 -w1 "$1"'

It could easily be used with find:

find here | super_filter -- test -r

Note how it lets you filter files with general-purpose tools instead of find-specific flags, which I always forget.

A more realistic example where such a tool would help is finding object files that contain specific symbols.

So super_filter would allow any condition checker to operate in stream mode. The syntax could be like that of xargs or parallel.

Mikhail

2 Answers

Wouldn't GNU Parallel work if you add && echo?

... | parallel 'test -r {} && echo {}'
Ole Tange

while read approach (pure bash)

while read is a common idiom for processing input line by line (see "How to loop over the lines of a file?"). Run whatever checks you need and echo the original input line only if they succeed.

... | while IFS= read -r line; do test -r "$line" && echo "$line"; done
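The loop can be wrapped in a small function to get something close to the super_filter from the question. This is only a sketch: the name super_filter comes from the question and is not an existing tool, and the function assumes one item per line, passed as the last argument of the checking command. The command's own output is discarded so that only the matching lines reach stdout.

```shell
# Hypothetical helper (not a standard tool): print each input line for which
# the given command, called with the line as its last argument, exits with 0.
super_filter() {
    # Accept an optional leading "--", as in the question's proposed syntax.
    if [ "${1-}" = "--" ]; then shift; fi
    local line
    while IFS= read -r line; do
        # </dev/null keeps the command from consuming the rest of the input;
        # >/dev/null hides its output (ping, for instance, is chatty).
        if "$@" "$line" </dev/null >/dev/null; then
            printf '%s\n' "$line"
        fi
    done
}

# Usage: keep only the readable paths (the second path is made up).
printf '%s\n' / /nonexistent-path-for-demo | super_filter -- test -r
# prints: /
```

Like the plain loop, this runs the check once per line, so it is no faster than xargs or parallel when the check is an external program. It only saves the extra bash -c layer and its quoting.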

Special case: find

Why is looping over find's output bad practice?

find has an -exec action that runs arbitrary commands and checks on the files it finds:

From man find:

-exec command ;
   Execute  command;  true  if 0 status is returned.  All following arguments to
   find are taken to be arguments to the command until an argument consisting of
   `;' is encountered.  The string `{}' is replaced by the current file name be‐
   ing processed everywhere it occurs in the arguments to the command, not  just
   in  arguments  where it is alone, as in some versions of find.  Both of these
   constructions might need to be escaped (with a `\') or quoted to protect them
   from  expansion  by  the shell.  See the EXAMPLES section for examples of the
   use of the -exec option.  The specified command is run once for each  matched
   file.  The command is executed in the starting directory.  There are unavoid‐
   able security problems surrounding use of the -exec action;  you  should  use
   the -execdir option instead.

Example:

find here -exec test -r {} \; -print

Because -exec is itself an action, it suppresses the default -print, so -print must be specified explicitly. This behavior is described in the EXPRESSION section of the find(1) man page. If you are going to apply further processing, use -print0 with xargs --null instead of -print.
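As a rough illustration of the -print0 variant, the pipeline below passes the filtered names NUL-delimited, so names containing spaces or newlines survive intact. The temporary directory and file names are invented for the demo, wc -c merely stands in for the further processing, and -0 is the short form of GNU xargs' --null.

```shell
# Demo directory with an awkward file name (made-up names, for illustration only).
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
touch "$demo_dir/plain.txt" "$demo_dir/with space.txt"

# Keep only the readable regular files and hand them to xargs NUL-delimited.
find "$demo_dir" -type f -exec test -r {} \; -print0 | xargs -0 wc -c
```

With plain -print and a default xargs, "with space.txt" would be split into two arguments.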

belkka
  • Thanks for pointing out a solution that saves one process launch (with while read). I find it a little harder to read than xargs or parallel, but the latter spawn separate processes, which is sometimes unwanted. – Mikhail Mar 27 '20 at 16:58
  • Note that solutions which accept a bash script as a string argument (xargs bash -c and parallel) may require some extra work on quoting. E.g., echo '$0 =' "$0" should be written as bash -c "echo '\$0 =' \"\$0\"". With the while read approach, no extra level of quoting is required. – belkka Mar 28 '20 at 09:46