
I find myself using basically the same line over and over again:

cat file | command1 | command2 | command3 > file

Is there a way I can put all these pipes into one script, so I can just run

automatic.sh file

and accomplish the same thing?

Lucas Phillips
  • With > file, even before any of cat, command1, command2 or command3 is started, the shell will have truncated file. So cat will see an empty file. – Stéphane Chazelas Dec 05 '13 at 20:30
  • @StephaneChazelas is right, you will destroy your data because file will be emptied! – Totor Dec 06 '13 at 21:25

3 Answers

4

Create a file with this content:

#!/bin/sh
command1 | command2 | command3

Make it executable:

chmod +x that-file

And call it as:

/path/to/that-file < file.in > file.out

Add /path/to to your $PATH variable in order to be able to do:

that-file < file.in > file.out
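
For example, assuming the script lives in /path/to (the placeholder used above), you could append this line to your ~/.profile or ~/.bashrc:

export PATH="$PATH:/path/to"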
3

automatic.sh:

#!/bin/bash
# write to a temp file first so "$1" isn't truncated before it is read
cat "$1" | command1 | command2 | command3 > .automatic.sh.temp
mv .automatic.sh.temp "$1"

Then call it like:

automatic.sh file

To use a real example:

arthur@a:~$ cat automatic.sh 
#!/bin/bash
cat "$1" | grep foo | sed -e 's/foo/bar/g' | sort > .automatic.sh.temp
mv .automatic.sh.temp "$1"

arthur@a:~$ cat <<END > foo
> 1 foo
> 2 bar
> 3 foo
> 4 bat
> 5 foo
> END
arthur@a:~$ chmod +x automatic.sh 
arthur@a:~$ ./automatic.sh foo
arthur@a:~$ cat foo
1 bar
3 bar
5 bar

And to be clear, writing the output to a temporary file and only moving it over the original at the end isn't mere pedantry: redirecting the pipeline straight to "$1" would truncate the file before it had been read, which is exactly the problem raised in the comments on the question.

1

You can make a script or function that contains this command. Use "$1" to refer to the first argument passed to the script or function.

There's a major bug in your code snippet: depending on the timing, > file may truncate the file before the first command in the pipeline starts reading the file, or shortly after it starts reading. Your snippet may occasionally work with small files, but most of the time it won't work.
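
You can see the truncation for yourself with a throwaway file (the name f and the tr command here are arbitrary):

$ echo hello >f
$ cat f | tr a-z A-Z >f
$ cat f
$

On most runs the second cat prints nothing: the shell truncated f when setting up >f, before the first cat got a chance to read it.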

The recommended way to modify a file is to write to a new temporary file, and once this is finished, move it to replace the old version. This way, if something bad happens to interrupt the processing (such as an error, a power failure, etc.), the old file remains in place.

Here is a function that operates on this principle. Thanks to the && after the pipeline, it only moves the output file into place if command3 returns a success status (note that the return status of other commands in the pipeline is ignored). I rely on the common mktemp utility to create the temporary file (it ensures that the name of the temporary file won't collide with any other instance of the script or any other program).

my_pipeline () {
  # Create the temp file in the same directory as the target, so the
  # final mv is a rename within one filesystem (cheap, and atomic on
  # most systems) rather than a copy across filesystems.
  out=$(TMPDIR=$(dirname -- "$1") mktemp)
  <"$1" command1 | command2 | command3 >"$out" &&
  mv -f "$out" "$1"
}
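
If you want the mv to be skipped when any command in the pipeline fails, not just command3, bash (also ksh93 and zsh) offers the pipefail option. Note that set -o pipefail inside the function would change the option for your whole shell session, so it is more at home in a standalone script:

set -o pipefail    # a pipeline now fails if any of its commands fails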

Put this function in your .bashrc; it will be available the next time you start bash. You can also copy-paste the definition on the command line to have it take effect in that shell. To use it, type the name of the function and then the name of the file to act on:

my_pipeline my_file

You can make the function act on all of its arguments in turn by wrapping the same commands in a loop over the positional parameters:

my_pipeline () {
  for file; do
    out=$(TMPDIR=$(dirname -- "$file") mktemp)
    <"$file" command1 | command2 | command3 >"$out" &&
    mv -f "$out" "$file"
  done
}

Usage:

my_pipeline file1 file2 file3

If you want to make a script instead, put the code in a file starting with a shebang line to indicate that it's a shell script.

#!/bin/sh
for file; do
  out=$(TMPDIR=$(dirname -- "$file") mktemp)
  <"$file" command1 | command2 | command3 >"$out" &&
  mv -f "$out" "$file"
done

Put the file in your command search path and make it executable (see How can I make a program executable from everywhere).
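
For example, assuming you saved the script as my_pipeline and ~/bin is in your $PATH:

chmod +x my_pipeline
mv my_pipeline ~/bin/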

Another way to solve the truncate-before-use problem is the sponge utility, but it isn't available everywhere (it's part of the moreutils collection).
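
sponge soaks up all of its standard input before opening the output file, so the pipeline can read file safely. A sketch with the placeholder commands from the question, assuming moreutils is installed:

<file command1 | command2 | command3 | sponge file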