How does xargs know when a stdin input ends, so that it can start processing it?

Question

According to Stephen Kitt's reply, xargs waits until it has received its stdin input before processing any of it, such as splitting it into arguments.

Is -E used for specifying the end of a stdin input?

Without it, how does xargs know when the input ends? Is there some timeout?

How does any program know when stdin ends? How does cat know to exit when it's reached the tail of the file it's reading, for example? Why is the answer to how xargs knows when it's reached the end of its stdin any different? — Charles Duffy, Nov 25 '18 at 01:33
That actually helps to clarify the question quite a lot -- that you're not looking for an xargs-specific answer but a generic UNIX-file-operations answer. Stephen is correct -- read() returning 0 indicates EOF. From the read(2) man page, section RETURN VALUES: If successful, the number of bytes actually read is returned. Upon reading end-of-file, zero is returned. Otherwise, a -1 is returned and the global variable errno is set to indicate the error. — Charles Duffy, Nov 25 '18 at 01:39
...so, read() will either actually read some bytes (and return a positive number with the number of bytes read), or fail to read (and return -1, with errno indicating how/why it failed), or hit end-of-file (and return 0). — Charles Duffy, Nov 25 '18 at 01:41

Stephen Kitt · Answer 1 · 2018-11-25T06:30:18.700

To read its input, xargs uses the read function (see also the corresponding Linux manpage).

read reads the next available data from a file descriptor into an area of memory specified by the calling program. As used by xargs, read waits until either data is available, or an error occurs, or it determines that no more data will ever be available. It returns respectively a positive integer, -1, or 0 in each of these cases.
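
As a rough illustration of the 0-return case, here is a sketch that assumes a POSIX shell and an xargs on the PATH (the variable name joined is made up for this example). The writer pauses mid-stream, yet xargs does not give up, because there is no timeout: read just blocks until more data arrives, and returns 0 only once the writing end of the pipe has been closed.

```shell
# The writer sends one chunk, sleeps, then sends another and exits.
# xargs keeps blocking in read() during the pause; it only sees
# end-of-file (read() returning 0) when the writer closes the pipe.
joined=$({ printf 'one two\n'; sleep 1; printf 'three\n'; } | xargs echo)

# Both chunks end up in a single echo invocation.
printf '%s\n' "$joined"   # one two three
```

If there were a timeout, the pause would have split or truncated the input. Instead all three words arrive together.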

To determine when it’s finished reading its input (its standard input, or the input specified by the -a option), xargs looks for the specified end-of-file marker (if the -E option was used), or a 0 return value from read.

You can see this in action by running the following (the {1..1024} brace expansion needs bash or a similar shell):

printf '%s ' {1..1024} | strace -e read xargs -s 2048 -x
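
The -E case can be tried in a similar way. This is only a sketch, assuming a POSIX shell and an xargs that supports -E. STOP is an arbitrary marker picked for this example, and before_marker is a made-up variable name.

```shell
# Feed five items, one per line; the third is the logical end-of-file marker.
# xargs stops consuming input at STOP, so c and d are never passed to echo.
before_marker=$(printf '%s\n' a b STOP c d | xargs -E STOP echo)

printf '%s\n' "$before_marker"   # a b
```

Here xargs stops before the pipe is closed, so the marker acts as an early end-of-input rather than replacing the usual 0 return from read.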

score 1 · Answer 2 · answered Nov 25 '18 at 15:09

How does xargs know when a stdin input ends, so that it can start processing it?

xargs does not wait for the end of stdin before starting to process it:

$ while date +%H:%S; do sleep 1; done | xargs -n2 echo
16:51 16:52
16:53 16:54
16:55 16:56
16:57 16:58
^C

xargs knows that its stdin has ended just like any other program does, e.g. cat(1) or tee(1): by checking the return value of read(2), calling feof(3), etc. xargs also treats a read error on stdin the same way as an end-of-file.
