5

I want to avoid having temporary files lying around if my program crashes.

UNIX is wonderful in that you can keep a file open - even after you delete it.

So if you open the file, immediately delete it, and then do the slow processing, chances are high that even if your program crashes, the user will not have to clean up the file.

In shell I often see something similar to:

generate-the-file -o the-file
[...loads of other stuff that may use stdout or not...]
do_slow_processing < the-file
rm the-file

But if the program crashes before rm, the user will have to clean up the-file.

In Perl you can do:

open(my $filehandle, "<", "the-file") || die;
unlink("the-file");
while(<$filehandle>) {
  # Do slow processing stuff here
  print;
}
close $filehandle;

Then the file is removed as soon as it is opened.

Is there a similar construct in shell?

Ole Tange
  • 1
    You can keep a file open even after you remove it. It is deleted when the reference counters reach zero. You can't delete it (you can truncate it). – ctrl-alt-delor Apr 12 '20 at 09:30

3 Answers

6

This is tested in csh, tcsh, sh, ksh, zsh, bash, ash, sash:

echo foo > the-file
(rm the-file; cat) < the-file | do_slow_processing
do_other_stuff

or if you prefer:

(rm the-file; do_slow_processing) < the-file
do_other_stuff
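
A minimal sketch of the second form, under some assumptions: mktemp (widely available, but not part of POSIX) supplies the file name, and an inline existence check plus cat stand in for do_slow_processing. It is only meant to show that the name is already gone while the data is still being read.

```sh
#!/bin/sh
# Sketch only: "$tmp" plays the role of the-file, and the commands inside
# the parentheses are a hypothetical stand-in for do_slow_processing.
tmp=$(mktemp) || exit 1
echo foo > "$tmp"

# The redirection opens the file first; the subshell then removes the name
# and keeps reading from the already-open stdin.
result=$(
  (
    rm -- "$tmp"
    if [ -e "$tmp" ]; then echo "still there"; else echo "gone"; fi
    cat
  ) < "$tmp"
)
printf '%s\n' "$result"
```

This should print "gone" followed by "foo": the directory entry is removed before any processing starts, yet the contents remain readable through the open descriptor.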

Interestingly, it also works for fifos:

mkfifo the-fifo
(rm the-fifo; cat) < the-fifo | do_slow_processing &
echo foo > the-fifo

This is because opening a fifo for reading blocks until a writer opens it, so rm does not run before echo has opened the fifo.
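
The fifo variant can be sketched end to end as follows. The scratch directory from mktemp -d, the out file, and the variable names are assumptions made for illustration; cat again stands in for do_slow_processing.

```sh
#!/bin/sh
# Sketch only: a private scratch directory holds the fifo and the output.
dir=$(mktemp -d) || exit 1
fifo=$dir/the-fifo
mkfifo "$fifo" || exit 1

# Reader: its open blocks until a writer appears; only then is the name removed.
(rm -- "$fifo"; cat) < "$fifo" > "$dir/out" &

echo foo > "$fifo"   # writer: this open unblocks the reader
wait                 # let the background reader finish

result=$(cat "$dir/out")
if [ -e "$fifo" ]; then fifo_left=yes; else fifo_left=no; fi
rm -rf -- "$dir"
printf '%s (fifo left behind: %s)\n' "$result" "$fifo_left"
```

The writer resolves the fifo by name before the reader gets to run rm, which is why removing the name inside the subshell does not lose the data.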

Ole Tange
  • 1
    Or just f=$(mktemp); exec 3<"$f" 4>"$f"; rm "$f"; ... use >&3 and <&4 instead of >"$f" or <"$f" ... –  Apr 12 '20 at 08:52
  • For fifos you can use f=$(mktemp -u); mkfifo "$f"; exec 3<>"$f" 3<"$f" 4>"$f"; rm "$f"; ... which doesn't block, and doesn't require background commands. –  Apr 12 '20 at 08:57
  • Instead of hardcoding 3 and 4 can you make those variables, so you just get an unused file descriptor? Otherwise you would have to remember which number is used for what. – Ole Tange Apr 12 '20 at 09:28
  • Do they work in csh and sh? – Ole Tange Apr 12 '20 at 09:29
  • 1
    They work in any Bourne shell (except for the $(...) which you would have to replace with `...` in pre-POSIX shells). You could use variables, but only in ksh-like shells (bash, zsh, etc). exec {f}<file; echo >&"$f", etc. –  Apr 12 '20 at 09:39
  • Where'd you get csh to test it? – S.S. Anne Apr 12 '20 at 17:27
  • 1
    sorry @S.S.Anne , I was facetious. there's a link at https://en.wikipedia.org/wiki/C_shell – Grump Apr 12 '20 at 19:38
  • This looks like a UUOC (useless use of cat). do_slow_processing < the-fifo & rm the-fifo would be simple but I'm not 100% sure it's safe. But (do_slow_processing &) < the-fifo; rm the-fifo would be; the redirect has to happen before forking a subshell and moving on to the rm. Or maybe with { do_slow_processing & } < the-fifo. Or I think exec < the-fifo would work if you don't mind replacing stdin. – Peter Cordes Apr 13 '20 at 00:54
1
generate-the-file > the-file
exec 5< the-file
rm the-file
...
do_slow_processing 0<&5

Notes:

  • You need to run exec with no executable, so it affects the descriptors of the shell itself
  • Portably, only file descriptors up to 9 are available: POSIX requires shells to support at least 0 through 9 in redirections, and some shells accept higher numbers.
  • You could use /proc/self/fd/X if you need a filename. This interface is not portable between UNIX flavors (it probably works for you, though).
  • Reading the fd a second time (e.g. two calls to cat 0<&5) yields nothing, because the file offset is already at EOF. You would need to rewind it, or work around that by reading via /proc/self/fd/X
  • In most cases like the one above you could do without an actual file, though, and simply run generate-the-file | do_slow_processing
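
As a rough sketch of the notes above, assuming mktemp is available (it is not part of POSIX) and using printf and cat as hypothetical stand-ins for generate-the-file and do_slow_processing:

```sh
#!/bin/sh
# Sketch only: "$tmp" plays the role of the-file.
tmp=$(mktemp) || exit 1
printf 'line 1\nline 2\n' > "$tmp"

exec 5< "$tmp"   # exec without a command: fd 5 now belongs to the shell itself
rm -- "$tmp"     # the name is gone, the data stays reachable through fd 5

first=$(cat 0<&5)    # reads the whole file
second=$(cat 0<&5)   # same open file description, offset already at EOF
exec 5<&-            # close fd 5; the kernel can now free the data

printf 'first read:  %s\n' "$first"
printf 'second read: %s\n' "$second"
```

The second read comes back empty because both cat calls share the file offset of fd 5, which is the EOF behaviour described in the notes.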

Update:

The OP mentions that generate-the-file may not write its output to stdout. There are a few idioms for this:

  • Specify an output file of -. It is customary to accept an output filename of - to mean stdout. This is acknowledged by POSIX.1-2017:

    Guideline 13: For utilities that use operands to represent files to be opened for either reading or writing, the '-' operand should be used to mean only standard input (or standard output when it is clear from context that an output file is being specified) or a file named -.

(for utilities other than those where they explicitly define that, it is implementation-defined, but there is a good chance that it is supported by your generate-the-file tool)

This won't work on all shells and requires OS support for fd filenames.

Ángel
  • Often generate-the-file does not send data to stdout, so the simple pipe will often not be a possibility. – Ole Tange Apr 12 '20 at 21:09
  • The question asked about a case of generate-the-file > the-file. I assumed there was a good reason for that, such as the-file needing to be seekable, but did not want to skip the basic answer of generate-the-file | do_slow_processing since it is what most readers should be using. I have expanded the answer with some options for when generate-the-file does not send its output on stdout, but lets you specify a filename. – Ángel Apr 12 '20 at 21:43
1

In the bash shell, the way to handle clean-up is by using the EXIT trap builtin (from within a bash shell, type help trap):

trap 'rm temp-file' EXIT

This feature also exists in the dash shell, which is frequently installed as sh in modern Linux distributions.
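
A small sketch of the trap approach, under stated assumptions: mktemp supplies the file name, the job runs inside a command substitution so that its EXIT trap can be observed from the outside, and the explicit exit 1 simulates a failure. All variable names are hypothetical.

```sh
#!/bin/sh
# Sketch only: the command substitution is a subshell, so the EXIT trap set
# inside it fires when that subshell exits, even with a non-zero status.
status=0
job_tmp=$(
  tmp=$(mktemp) || exit 1
  trap 'rm -f -- "$tmp"' EXIT   # clean up however the job ends
  echo foo > "$tmp"
  printf '%s\n' "$tmp"          # report the name to the caller
  exit 1                        # simulate a failure part-way through
) || status=$?

if [ -e "$job_tmp" ]; then leftover=yes; else leftover=no; fi
printf 'exit status %s, leftover: %s\n' "$status" "$leftover"
```

Unlike the open-then-delete idiom, this depends on the shell getting a chance to run the trap, which cannot happen if the shell is killed with SIGKILL.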

user1404316