POSIX alternative to GNU find's -print0

Question

GNU find has a -print0 option to terminate filenames with null characters. However, this option is not available in POSIX find.

In the GNU man page for find, under the -print flag, it says:

If you are piping the output of find into another program and there is the faintest possibility that the files which you are searching for might contain a newline, then you should seriously consider using the -print0 option instead of -print.

This suggests to me that -print0 was introduced by GNU to specifically handle file paths with newline characters.

What alternative is available in POSIX for GNU's -print0 option, using either just POSIX find or piping to a second POSIX command?

Generating that output POSIXly is not a problem. Rather, making use of it would be the problem, given that those NULs make the output non-text, so it can't be processed by text utilities POSIXly. — Stéphane Chazelas, Feb 05 '21 at 20:49
If you need to find a POSIX alternative to -print0, then I assume you will also need to find POSIX alternatives to handling that output? Why not just use -exec to process the pathnames directly? — Kusalananda, Feb 05 '21 at 20:54
Does this answer your question? How do I use find when the filename contains spaces? — Thomas Dickey, Feb 05 '21 at 21:10
@ThomasDickey Perhaps that does answer my question. I was mostly looking to see if POSIX offered any way to do the same thing -print0 does, but if -print0 was designed specifically for the purpose of piping the output to xargs -0 (which is also non-POSIX), then I guess there's no reason to try to find an alternative to -print0 in POSIX. — Shane Bishop, Feb 05 '21 at 22:40
To add to my last comment, from reading further, it seems like GNU might have introduced -print0 to handle newline characters in paths (see my quote in my question). This (to me at least) makes it seem less likely that my question is a duplicate of How do I use find when the filename contains spaces?. Even if the answer to that question answers my question, the two questions IMO are different. — Shane Bishop, Feb 06 '21 at 15:55

score 2 · Accepted Answer · 2021-02-05T20:40:04.080

find ... -exec sh -c 'printf "%s\0" "$@"' - {} +

Simply find ... -exec printf '%s\0' {} + may work too, though that will obviously use the standalone printf executable instead of the shell's builtin. I'm not sure if that may have other implications.
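As a rough illustration of the difference this makes, the sketch below builds a small scratch directory (the name `demo_dir` and the file names are made up for the example), emits NUL-terminated paths with the `-exec sh -c ... {} +` form, and compares the record count with a naive line count of `-print` output. It passes `sh` rather than `-` as the `$0` placeholder, as suggested in the comments; either should work for this purpose.

```shell
# Hypothetical scratch directory, for illustration only.
demo_dir="${TMPDIR:-/tmp}/print0-demo.$$"
mkdir -p "$demo_dir"

# Two files, one of which has a newline in its name.
: > "$demo_dir/plain.txt"
: > "$demo_dir/with
newline.txt"

# Emit each path followed by a NUL, then count the NULs.
# tr -cd '\000' deletes every byte except NUL; wc -c counts what is left.
# tr -d ' ' strips the padding some wc implementations add.
nul_count=$(find "$demo_dir" -type f -exec sh -c 'printf "%s\0" "$@"' sh {} + |
    tr -cd '\000' | wc -c | tr -d ' ')

# Naive line count of -print output, for comparison.
line_count=$(find "$demo_dir" -type f -print | wc -l | tr -d ' ')

echo "NUL-terminated records: $nul_count"
echo "lines from -print:      $line_count"

rm -rf "$demo_dir"
```

With two files, the NUL count matches the number of files, while the line count is inflated by the newline embedded in one name.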

edited Feb 05 '21 at 20:40

answered Feb 05 '21 at 19:26

Given that the GNU "extension" -print0 is usually used together with the non-portable and non-POSIX xargs -0, there is no need to emulate -print0. But your idea is correct, -exec + is the solution to avoid the GNU xargs feature. – schily Feb 05 '21 at 19:31
Why is there a - before the {}? – Shane Bishop Feb 05 '21 at 19:35
@ShaneBishop it's for the $0 variable -- you can set it to anything you want. – Feb 05 '21 at 19:37
@user414777 The $0 variable to which command? To find? To sh? To printf? – Shane Bishop Feb 05 '21 at 19:38
@user414777 In general, usually code only answers aren't helpful to readers who are trying to learn. In my case, I already understand sh -c, -exec ... {} +, $@, but others may not. It would be helpful for them if you provide some further explanation in your answer. – Shane Bishop Feb 05 '21 at 19:40
@ShaneBishop the $0 variable inside the shell run with sh -c 'commands ...'. Check with sh -c 'printf "%s\n" "$@"' - 1 2 3 vs sh -c 'printf "%s\n" "$@"' 1 2 3. With all due respect, you don't seem to already understand sh -c ;-). I'm sorry if you don't find my answer helpful, but there's no harm in it either. – Feb 05 '21 at 19:43
Oh I see now - the - will prevent the first item going to sh from being treated (in a C sense) as argv[0], which means all of the output of find will go to $@ instead of the first one being lost. – Shane Bishop Feb 05 '21 at 19:51
Using things like - or _ is bad practice though, as what goes in there is also used by most shells when reporting error messages, for instance. It's better to use sh, so you get an error message like sh: line 0: printf: write error: Bad file descriptor instead of -: line 0: printf: write error: Bad file descriptor – Stéphane Chazelas Feb 05 '21 at 20:46
I can't think of what benefit using that extra sh would bring. Note that some sh implementations (ksh88-based ones and some pdksh-based ones) still don't have printf builtin. – Stéphane Chazelas Feb 05 '21 at 20:51
If your sh is yash, that would choke on file paths not made of valid text in the locale for instance. – Stéphane Chazelas Feb 05 '21 at 20:54
@StéphaneChazelas I disagree that that is a "bad practice". For me it's very obvious, and it kind of mimics the errors of perl -e. – Feb 05 '21 at 20:54
If you get a Died at - line 1. in perl, that tells you the - (stdin) script died (as in perl <<< 'open "" or die'). Not obvious, but makes sense. A -: line 0: printf: write error: Bad file descriptor error here at best misleads you into thinking a script fed on stdin failed. – Stéphane Chazelas Feb 05 '21 at 20:57
@StéphaneChazelas I was thinking of syntax error at -e line 1. Using sh -c '...' -c args ... would better mimic that, though that would be even more perplexing: "Why the two -c, are you trying to, etc". But feel free to change the answer as you see fit, I don't care about promoting my style here -- I'll make it community wiki. – Feb 05 '21 at 21:03
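
The `$0` behaviour discussed in these comments can be checked directly. The following is a minimal sketch (the variable names are only for illustration) showing that the first operand after the `sh -c` script fills `$0`, so it never appears in `"$@"`:

```shell
# With a placeholder, the first operand after the script becomes $0,
# and all of 1 2 3 end up in "$@".
with_placeholder=$(sh -c 'printf "%s\n" "$@"' sh 1 2 3)

# Without one, "1" is consumed as $0 and drops out of "$@".
without_placeholder=$(sh -c 'printf "%s\n" "$@"' 1 2 3)

# $0 itself is whatever was passed in that slot.
zeroth=$(sh -c 'printf "%s\n" "$0"' sh 1 2 3)

printf 'with placeholder:\n%s\n' "$with_placeholder"
printf 'without placeholder:\n%s\n' "$without_placeholder"
printf '$0 is: %s\n' "$zeroth"
```

This is why the answer's command needs some word before `{}`: without it, the first path found would be taken as `$0` and silently omitted from the output.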
