18

Let's say a program exists that takes two arguments: an input file and an output file.

What if I don't want to save this output file to disk, but rather pass it straight to the stdin of another program? Is there a way to achieve this?

A lot of commands I come across on Linux provide an option to pass '-' as the output file argument, which does what I've specified above. Is this because passing the stdin of a program as an argument is not possible? If it is, how do we do it?

An example of how I would imagine using this is:

pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" stdin(echo)

The shell I'm using is bash.

jimmij
  • 47,140
Dziugas
  • 293
  • 1
    cat <file | cmd /dev/fd/0 works on most unices. – mikeserv Jul 20 '15 at 03:00
  • Not working for me. Tried it with: cat < README.txt | cp /dev/fd/0. It said cp: missing destination file operand after ‘/dev/fd/0’ Try 'cp --help' for more information. – Dziugas Jul 20 '15 at 03:08
  • 1
    program input-file /dev/stdout | another-program? Also note that echo reads nothing from stdin. – yaegashi Jul 20 '15 at 03:26
  • Yes, that was a botched example I came up with to illustrate my curiosity. Good point. – Dziugas Jul 20 '15 at 03:32
  • 1
    @Dziugas - of course not - you can't cp a file nowhere. echo 1 2 3| cp /dev/fd/0 /dev/tty will print 1 2 3. And by the way, /dev/fd/[num] is more likely to work than /dev/std(in|out|err) in most cases. See Portability of File-Descriptor Links about what you can expect to work where. – mikeserv Jul 20 '15 at 07:57
  • 1
    A good UNIX program would write to standard output leaving it up to the user to decide whether they wish to redirect to a file or pipe to another command. – j-- Nov 09 '15 at 06:39

4 Answers

13

If the program supports writing to any file descriptor even if it can't seek, you can use /dev/stdout as the output file. This is a symlink to /proc/self/fd/1 on my system. File descriptor 1 is stdout.
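As a minimal sketch of the idea, here is cp standing in for a program that insists on an input-file and an output-file argument (cp and the temporary file names are only illustrative assumptions, not the program from the question). It assumes a system such as Linux where /dev/stdout exists:

```shell
# Stand-in for a program that takes "input-file output-file" arguments.
# cp is used purely for illustration.
tmpdir=$(mktemp -d)
printf 'hello\nworld\n' > "$tmpdir/input.txt"

# Pass /dev/stdout as the output file so the data flows into the pipe
# instead of landing in a file on disk.
line_count=$(cp "$tmpdir/input.txt" /dev/stdout | wc -l)
echo "lines: $line_count"

rm -rf "$tmpdir"
```

This only works when the program is content to write sequentially; a program that needs to seek in its output will fail on a pipe.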

psmears
  • 465
  • 3
  • 8
Tiky
  • 286
  • This solved my query. So is there no way to do it when the program needs to seek? – Dziugas Jul 20 '15 at 03:34
  • 3
    If you're trying to prevent disk access, you can write the file in /dev/shm/. However, if you don't want any file on the filesystem, then as far as I know there is no way to seek on a pipe. Seeking forward means it would have to buffer everything in memory until it reached that point, and seeking backward implies having buffered everything in memory. – Tiky Jul 20 '15 at 03:39
  • pdftotext, like many (but not all) other utilities, supports - for that as well (which would work even on systems that don't support /dev/stdout, or where /dev/stdout doesn't work as expected, like on Linux when stdout is not a pipe). pdftotext file.pdf - | wc -c – Stéphane Chazelas Jul 20 '15 at 16:37
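The /dev/shm suggestion from the comments could look roughly like this. It is only a sketch: cp stands in for a program that needs a real, seekable output file, the file names are made up, and the fallback to the ordinary temp directory is an assumption for systems without /dev/shm:

```shell
# Prefer the RAM-backed /dev/shm; fall back to the usual temp dir if unavailable.
base=/dev/shm
[ -d "$base" ] && [ -w "$base" ] || base=${TMPDIR:-/tmp}
workdir=$(mktemp -d "$base/seekdemo.XXXXXX")
printf 'one\ntwo\nthree\n' > "$workdir/input.txt"

# cp stands in for a program that must write to a regular file.
cp "$workdir/input.txt" "$workdir/output.txt"

# Feed the intermediate file to the next program's stdin, then clean up.
word_count=$(wc -w < "$workdir/output.txt")
echo "words: $word_count"
rm -rf "$workdir"
```

The intermediate file still exists for a moment, but on a tmpfs it lives in memory rather than on disk.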
11

From the pdftotext man page:

If text-file is '-', the text is sent to stdout.

So in this case all you need is:

pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" -

Or if you want to pipe this to STDIN of another program:

pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" - | another_prog

Using - as a substitute for a filename is a convention many utilities follow (including pdftotext) when we want input from STDIN or output to STDOUT. However, not all utilities follow this convention. In that case the idiomatic way to do this in bash is to use a process substitution:

my_utility "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" >( cat )

Here the >( ) behaves largely like a file passed to my_utility, but instead of being a real file, the stream is piped into the stdin of the contained process, i.e. cat. So here, the text should ultimately be output as required.
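To try this without a utility like my_utility at hand, here is a small sketch that uses cp as a stand-in for a program that demands an output file argument (cp and the temporary file are assumptions for illustration only). It needs bash, zsh or ksh, since process substitution is not part of POSIX sh:

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
printf 'hello\n' > "$tmpdir/input.txt"

# >( cat ) expands to a path name that cp opens for writing;
# whatever cp writes there arrives on cat's stdin, and cat prints it.
copied=$(cp "$tmpdir/input.txt" >( cat ))
echo "$copied"

rm -rf "$tmpdir"
```

Capturing the result with $( ) works here because cat inherits the capturing pipe as its stdout, so the shell reads until cat has finished.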

Use of cat almost always sets off UUOC alarm bells on forums like this. I contend that if the utility does not support -, then this is a useful use of cat, though if there are any ways to do this process substitution without the cat, then I'm all ears ;-).

However, if (as the question states) the ultimate destination of the stream is STDIN of another program, then the cat can be eliminated:

my_utility "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf" >( another_prog )
  • 2
    And let me backpedal once more: if prog2 writes to stdout, prog1 input_file >( cat ) | prog2 is better than prog1 input_file >( prog2 ), because the cat form waits for prog2 to complete (i.e., before the shell issues the next prompt or goes on to the next command (e.g., after ; or &&)), while the cat-less form waits only for prog1 to complete.  Also, after the cat form, $? is the exit status from prog2, whereas, in the other, $? is the exit status from prog1.  (You pays your money and you takes your choice.) – Scott - Слава Україні Jul 20 '15 at 16:57
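The exit-status difference described in the comment above can be checked with a rough sketch like the following. Here cp stands in for prog1 and the shell function consumer is a hypothetical prog2 that reads its stdin and then exits with status 3; both are assumptions made up for the demonstration:

```shell
#!/usr/bin/env bash
tmpdir=$(mktemp -d)
printf 'data\n' > "$tmpdir/input.txt"

# Hypothetical stand-in for prog2: consumes stdin, then exits with status 3.
consumer() { cat > /dev/null; return 3; }

# cat form: the pipeline's status is that of its last command (the consumer).
status_cat_form=0
cp "$tmpdir/input.txt" >( cat ) | consumer || status_cat_form=$?

# cat-less form: the status is that of cp; the consumer's status is not reported.
status_direct_form=0
cp "$tmpdir/input.txt" >( consumer ) || status_direct_form=$?

echo "cat form: $status_cat_form, direct form: $status_direct_form"
rm -rf "$tmpdir"
```

Which form is preferable depends on whether the script needs to know that the second program succeeded and has finished.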
5

If your shell supports them, the simplest way of doing such manipulations would be to use process substitution: <(…) and >(…). This works in bash, zsh and ksh and possibly other shells. For example:

$ sort <(printf "b\nc\na\n")
a
b
c
$ ls
foo
$ cp <(find . -name foo) bar
$ ls
bar  foo

However, this won't help in the example you state since pdftotext will save in a text file. While your best choice (apart from the obvious one of using -) is to use /dev/stdout as suggested by @TiCPU, you could also use another shell feature. The construct !:N refers to the Nth argument of the previous command. Therefore, you could do:

$ pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf"  out.txt
$ cat !:2
terdon
  • 242,166
  • When is cat <( command ) ever useful?  That looks like a UUOC.  I think TiCPU's answer is correct (although not spelled out clearly): pdftotext "C BY K&R.pdf" /dev/stdout.  (I guess Digital Trauma's answer would work, although it's also a UUOC.) – Scott - Слава Україні Jul 20 '15 at 11:51
  • @Scott yes, it is a UUOC but neither of my other two examples are. I very often use <() for things like diff <(sort foo) <(sort bar). As for cat <(command) specifically, I can't think of a case at the moment that couldn't be replaced by other tools but there may well be one. In any case, cat was just the example chosen by the OP. – terdon Jul 20 '15 at 15:25
  • I don't see where the OP chose cat.  Somebody posted a semi-answer featuring a UUOC in a comment, and the OP (who didn't understand quite how to apply it) replied that it didn't work for him.  (And, of course, I realize that commands that *don't even include cat* cannot be UUOCs.) – Scott - Слава Україні Jul 20 '15 at 15:31
  • @Scott whoops, true, it was the echo(stdin) which I translated to cat. That's just the only way I could think of to twist the OP's example into something workable. – terdon Jul 20 '15 at 15:51
  • 1
    While I agree that cat <() can be useful in some situations, in this scenario however it is not working at all. The problem (very poorly described by OP, I must admit) is that pdftotext takes two arguments: input file and output file. If second argument is missing then it produces nothing, so cat <(pdftotext "file.pdf") would also return nothing. One can cheat pdftotext command by giving >(cat) as a second argument like Digital Trauma answered, but cat <() is pointless here. Obviously in pdftotext case it is best just to use - as the output file name. – jimmij Jul 20 '15 at 16:04
  • @jimmij ah, I see. In that case, TiCPU's answer is probably the way to go. – terdon Jul 20 '15 at 16:16
  • 1
    @Scott How is my answer a UUOC? How would you do this process substitution without cat? >( ) will effectively pipe the stream to whatever process is inside - so we actually do need a cat here to output that stream. Normally we should be able to do something like pdftotext input.pdf -, but apparently pdftotext doesn't support the - parameter to output directly to stdout instead of a file - try it. – Digital Trauma Jul 20 '15 at 16:21
  • 1
    @DigitalTrauma it is not a UUOC. I believe cat is the fastest you can get in case of just printing, but in fact you can use another command, such as >(grep something), to be more useful. BTW, my pdftotext 3.04 does support - as an output file, so I'm a little surprised by the whole discussion. – jimmij Jul 20 '15 at 16:27
  • @jimmij yes - I did not notice that pdftotext actually supports -. In this case I think it is a UUOC ;-) But I think the construct is still useful for utilities that don't support -. – Digital Trauma Jul 20 '15 at 16:31
  • @DigitalTrauma: Yeah, sorry; I was just in the process of typing a retraction.  But first: The question seems (to me) to be asking how to handle some (hypothetical) program (nominally, pdftotext) that insists on doing open(argv[1], O_RDONLY) and open(argv[2], O_CREAT|O_WRONLY), and doesn’t default to reading stdin or writing stdout (not even if given an arg of -).  And TiCPU and Digital Trauma both wrote decent answers to that question.  … (Cont’d) – Scott - Слава Україні Jul 20 '15 at 16:32
  • (Cont’d) …  And I want to retract what I said in my first comment: if TiCPU’s answer doesn’t work (e.g., because /dev/stdout doesn’t exist), Digital Trauma’s answer may be the only (or at least the best) answer, and calling it a UUOC, while arguably, technically true, was a little harsh, because, while prog1 input_file >(  cat  ) | prog2 can be abbreviated to prog1 input_file >( prog2 ), the cat form is (again, arguably) clearer. – Scott - Слава Україні Jul 20 '15 at 16:34
  • Of course, if you just want to display the output of prog1, you can use prog1 input_file /dev/tty, or jas's idea, prog1 input_file $(tty). – Scott - Слава Україні Jul 20 '15 at 16:40
  • @Scott Thanks - These comments were useful - I've significantly revised my answer - I hope it's much more complete now. – Digital Trauma Jul 20 '15 at 16:43
  • 1
    @terdon I hate to be a stickler, but this doesn't seem to work. Specifically, it is no different than running pdftotext "C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.pdf", which puts the output in a file called C BY BRIAN W KERNIGHAN & DENNIS M RITCHIE.txt, but none of the text is output to STDOUT for piping to another program. – Digital Trauma Jul 20 '15 at 22:59
  • 1
    @DigitalTrauma that's not being a stickler! That's me being an idiot. Thanks for pointing it out and please never apologize when pointing out mistakes. I would much rather have my mistake pointed out to me and so learn something than leave it there in all its dubious glory. – terdon Jul 21 '15 at 13:10
-2
cmd tty

tty prints the name of the terminal connected to stdin.

Kevdog777
  • 3,224
jas
  • 7
  • I'm not sure how this answers the question, which is about combining commands; perhaps you could expand with an example of how you would achieve that. – dhag Jul 20 '15 at 14:39
  • I guess you are saying to check with tty the name of the terminal, and then use that file as an output, for example pdftotext file.pdf /dev/pts/2. In that case, I agree. – jimmij Jul 20 '15 at 16:18
  • That can be abbreviated/automated to prog1  input_file $(tty); which is generally going to be equivalent to prog1  input_file /dev/tty.  But this approach assumes that the goal is to display the output of prog1 (i.e., in the terminal), and that is not what the question is asking (see the comments on terdon's answer for some clarification on the meaning of the question). – Scott - Слава Україні Jul 20 '15 at 17:10