Suppose you tried something like this:

$ paste ../data/file-{A,B,C}.dat

and then realized that you want to sort each file (numerically, let's suppose) before pasting. With process substitution, you would need to write something like this:

$ paste <(sort -n ../data/file-A.dat) \
        <(sort -n ../data/file-B.dat) \
        <(sort -n ../data/file-C.dat)

This involves a lot of duplication, which is not a good thing. Because each process substitution is isolated from the others, you cannot use a brace expansion or pathname expansion (wildcards) that spans multiple process substitutions.

Is there a tool that allows you to write this in a compact way (e.g. by giving sort -n and ../data/file-{A,B,C}.dat separately) and composes the entire command line for you?

musiphil

2 Answers

2

You could do:

eval paste '<(sort -n ../data/file-'{A,B,C}'.dat)'
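To see what eval is handed here, the same words can be given to echo first. This is only a sketch, assuming bash; the variable name composed is made up for illustration. Brace expansion turns the quoted pieces into three separate words, and the quotes keep each <( ... ) as plain text until eval parses the result as a fresh command line.

```bash
# Brace expansion yields one word per letter; the single quotes keep '<(' literal,
# so nothing is executed at this point.
composed=$(echo paste '<(sort -n ../data/file-'{A,B,C}'.dat)')
printf '%s\n' "$composed"
```

The printed line is the long command from the question, which is exactly what eval goes on to run.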

Or, to automate it as a function:

sort_paste() {
  local n i cmd
  n=1 cmd=paste
  for i do
    cmd="$cmd <(sort -n -- \"\${$n}\")"
    n=$(($n + 1))
  done
  eval "$cmd"
}
sort_paste  ../data/file-{A,B,C}.dat

(In some ksh implementations, you need to replace local with typeset.)
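The reason this use of eval stays safe is that the file names never end up in the string being evaluated; only references to the positional parameters do. A small sketch to make that visible, using a hypothetical variant named sort_paste_show that prints the composed string instead of evaluating it:

```bash
# Hypothetical variant of sort_paste: same loop, but prints the string that
# would be passed to eval instead of running it.
sort_paste_show() {
  local n i cmd
  n=1 cmd=paste
  for i do
    # Append a reference such as "${1}", not the file name itself.
    cmd="$cmd <(sort -n -- \"\${$n}\")"
    n=$((n + 1))
  done
  printf '%s\n' "$cmd"
}

# Awkward names (a space, a command substitution) never reach the string.
sort_paste_show 'a b.dat' '$(date).dat'
```

This prints paste <(sort -n -- "${1}") <(sort -n -- "${2}"), so whatever the file names contain is expanded only inside double quotes when eval runs.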

To adapt this to any arbitrary command (and to prove that eval can be safe when used properly), you could do:

xproc() {
  local n i cmd xcmd stage stage1 stage2 stage3
  cmd= xcmd= stage=1 n=1
  stage1='cmd="$cmd \"\${$n}\""'
  stage2='xcmd="$xcmd \"\${$n}\""'
  stage3='cmd="$cmd <($xcmd \"\${$n}\")"'
  for i do
    if [ -z "$i" ] && [ "$stage" -le 3 ]; then
      stage=$(($stage + 1))
    else
      eval 'eval "$stage'"$stage\""
    fi
    n=$(($n + 1))
  done
  eval "$cmd"
}

xproc paste '' sort -n -- '' ../data/file-{A,B,C}.dat
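The empty arguments act as separators between three stages: the outer command, the per-file command, and the file list. As a rough illustration, the hypothetical variant below (xproc_show, not part of the answer) composes the string the same way but prints it instead of evaluating it; A.dat and B.dat are placeholder names.

```bash
# Hypothetical variant of xproc: identical composition, but the final
# eval is replaced by printf so the generated command line can be inspected.
xproc_show() {
  local n i cmd xcmd stage stage1 stage2 stage3
  cmd= xcmd= stage=1 n=1
  stage1='cmd="$cmd \"\${$n}\""'            # words of the outer command
  stage2='xcmd="$xcmd \"\${$n}\""'          # words of the per-file command
  stage3='cmd="$cmd <($xcmd \"\${$n}\")"'   # one process substitution per file
  for i do
    if [ -z "$i" ] && [ "$stage" -le 3 ]; then
      stage=$((stage + 1))                  # an empty argument starts the next stage
    else
      eval 'eval "$stage'"$stage\""
    fi
    n=$((n + 1))
  done
  printf '%s\n' "$cmd"
}

xproc_show paste '' sort -n -- '' A.dat B.dat
```

The output (with a leading space) is "${1}" <( "${3}" "${4}" "${5}" "${7}") <( "${3}" "${4}" "${5}" "${8}"): every word, including the command names, is referenced by position rather than pasted into the string.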
  • 1
    I don't like eval, but I see the arguments are securely handled above. The only concern I see is the handling of $cmd: it will go through the usual expansions. This is fine when cmd is fixed to paste, but it can be problematic when you want to generalize this and accept cmd from the command line, as in local i n=1 cmd="$1"; shift; .... – musiphil Apr 23 '13 at 19:02
  • +1 for proper eval. Your first version is completely fine (and the cleanest and easiest to maintain), since in this scenario you have full control over the string passed to eval. – Clayton Stanley Apr 24 '13 at 07:27
1

Please see here for why eval can be dangerous to use. As you'll notice, it is a very powerful tool, but it can also cause a lot of damage.

The following script will do what you want - safely.

sort_ps () 
{ 
    local cmd="$1" p=()
    shift;
    for f in "$@"; do
        p+=(<(sort -n "$f"));
    done
    "$cmd" "${p[@]}"
}

EDIT: Mr. Chazelas is right. I fixed my solution, so you can now use sort_ps paste file1.txt file2.txt file3.txt ... fileN.txt instead. Thank you, Stéphane, for reviewing my answer.

Sample output:

rany$ sort_ps sprunge foo1.txt foo.txt 
http://sprunge.us/EBZf?/dev/fd/62
http://sprunge.us/TQGC?/dev/fd/62
  • 3
    eval is dangerous when passed uncontrolled data, which is not the case in the solution I gave. You forgot a "--" and a ";", and you're not reporting errors of paste as the exit status. You're not accounting for a $temp_dir potentially present in the environment. It will fail if any argument contains slashes (as in the OP's question), you're not pasting in the order the files were given, and you're not using process substitution as the OP requested: it uses extra temp space, sorts run sequentially instead of concurrently, and there is no cleanup upon SIGKILL. – Stéphane Chazelas Apr 23 '13 at 15:36
  • 2
    I didn't expect the process substitutions to survive until the end where "$cmd" is actually executed, not right after each statement. When sort_ps is implemented as a shell function as above, sort_ps echo ../data/file-{A,B,C}.dat gives /dev/fd/63 /dev/fd/62 /dev/fd/61, indicating they survive until the end indeed. However, when sort_ps is implemented as a shell script, it gives /dev/fd/63 /dev/fd/63 /dev/fd/63, which means each process substitution is terminated right after it is mentioned. How strange! I cannot find any official documentation about this. – musiphil Apr 23 '13 at 19:48
  • 1
    Strange, but it seems to be a feature introduced in bash-2.04 – Stéphane Chazelas Apr 24 '13 at 07:13
  • You're still missing a "--". Try running it on a file called passwd in a directory called "-o/etc" for instance. – Stéphane Chazelas Apr 24 '13 at 10:11
  • On bash-4.3.11(1)-release on x86_64-pc-linux-gnu, even the shell function method gives /dev/fd/63 /dev/fd/63 /dev/fd/63, meaning that the process substitutions don't survive until the last command. So storing process substitutions in an array doesn't work. – musiphil Dec 15 '17 at 20:58
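The missing "--" raised in the comments above can be reproduced without touching any real system file. A minimal sketch, assuming a throwaway directory from mktemp and a file literally named -r (both chosen only for this demonstration):

```bash
tmp=$(mktemp -d)
printf '%s\n' 3 1 2 > "$tmp/-r"        # a file whose name looks like an option

# Without "--", sort treats "-r" as its reverse flag and reads (empty) stdin.
without=$(cd "$tmp" && sort -n "-r" < /dev/null)

# With "--", "-r" is taken as a file name and its contents are sorted.
with=$(cd "$tmp" && sort -n -- "-r")

printf 'without --: [%s]\n' "$without"
printf 'with --:\n%s\n' "$with"
rm -rf "$tmp"
```

The first call produces no output because the file is never opened; the second prints the three numbers in ascending order.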