48

I want to pipe file names to other programs, but they all choke when the names contain spaces.

Let's say I have a file called:

foo bar

How can I get find to return the correct name?

Obviously I want:

foo\ bar

or:

"foo bar"

EDIT: I don't want to go through xargs, I want to get a correctly formatted string out of find so that I can pipe the string of file names directly to another program.

erch
  • 5,030
bug
  • 2,518
  • 7
What are you piping it to? Are you aware of find's -exec flag? You could potentially avoid this error and make your command more efficient by using -exec instead of piping to other commands. Just my $.02 – h3rrmiller Jul 01 '13 at 15:56
  • The receiving end of the pipe should not matter. I just want the file names to be formatted correctly. – bug Jul 01 '13 at 16:05
  • 8
    @bug: find formats the file names just fine; they are written one name per line. (Of course, this is ambiguous if a filename contains a newline character.) So the problem is the receiving end "choking" when it gets a space, which means you have to tell us what the receiving end is if you want a meaningful answer. – rici Jul 01 '13 at 16:40
  • 3
    What you call "properly formatted" is really "escaped for consumption by the shell". Most utilities which can read a bunch of file names would choke on a shell-escaped name, but it would in fact make sense for (say) find to offer an option to output file names in a format suitable for the shell. In general, though, the -print0 GNU find extension works fine for many other scenarios (too), and you should learn to use it in any event. – tripleee Jul 01 '13 at 16:55
  • @triplee: Does that mean that there is no way to format the file names to either escape the space or to add quotation marks? – bug Jul 01 '13 at 17:26
  • @rici: I want to pipe the file names to the stdin of other programs, so I guess the receiving end is the shell. – bug Jul 01 '13 at 17:29
  • @bug: are you using read? -- most shell utilities don't accept filenames through stdin, but the ones which do (such as xargs) do not require quoting. – rici Jul 01 '13 at 18:02
  • 2
@bug: By the way, ls $(command...) does not feed the list through stdin. It puts the output of $(command...) directly into the command line. In that case, it is the shell which is reading the output of the command substitution, and it will use the current value of $IFS to decide how to word-split it. In general, you're better off using xargs. You won't notice a performance hit. – rici Jul 01 '13 at 18:06
  • You want find to produce a correctly-formatted string for your program. Fine. So what input format does the program expect? You need to tell us! Separated by newlines and doesn't support newlines in file names? Separated by null bytes? Base-64-encoded on separate lines? … – Gilles 'SO- stop being evil' Jul 01 '13 at 21:52
  • 2
    find -printf '"%p"\n' will add double quotes around each found name, but will not properly quote any double quotes in a file name. If your file names do not have any embedded double quotes, you can ignore the problem: or pipe through sed 's/"/&&/g;s/^""/"/;s/""$/"/'. If your file names end up being handled by the shell, you should probably use single quotes instead of double quotes, though (otherwise sweet$HOME will become something like sheet/home/you). And this is still not very robust against file names with newlines in them. How do you want to handle those? – tripleee Jul 02 '13 at 09:23
  • @tripleee you should make that an answer, since I think it's what the questioner really wants. – evilsoup Jul 02 '13 at 16:32

11 Answers

30

POSIXLY:

find . -type f -exec sh -c '
  for f do
    : command "$f"
  done
' sh {} +
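As a quick sanity check of the pattern above (the /tmp/findspaces path and the file name are invented for this sketch), each found name reaches the inner shell as a single, unsplit argument:

```shell
# Demo of the POSIX find -exec sh -c pattern with a space-containing name.
# /tmp/findspaces and its contents are made-up names for this sketch.
mkdir -p /tmp/findspaces
touch '/tmp/findspaces/foo bar'

# Every found name is passed as its own positional parameter,
# so "$f" keeps its spaces intact.
find /tmp/findspaces -type f -exec sh -c '
  for f do
    printf "got: %s\n" "$f"
  done
' sh {} +
```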

If your find supports -print0 and your xargs supports -0:

find . -type f -print0 | xargs -0 <command>

The -0 option tells xargs to use the ASCII NUL character instead of whitespace to terminate (separate) the filenames.

Example:

find . -maxdepth 1 -type f -print0 | xargs -0 ls -l
cuonglm
  • 153,898
  • Doesn't work. When I run ls $(find . -maxdepth 1 -type f -print0 | xargs -0) I get ls: cannot access ./foo: No such file or directory ls: cannot access bar: No such file or directory – bug Jul 01 '13 at 16:01
  • 2
    Have you tried it the way Gnouc actually wrote it? If you insist on doing it your way, try enclosing the $(..) in double-quotes "$(..)" – evilsoup Jul 01 '13 at 16:03
  • 3
@bug: your command is wrong. Try exactly what I wrote, and read the man pages of find and xargs. – cuonglm Jul 01 '13 at 16:12
  • I see, then again I want to get a formatted string which I could pipe directly. – bug Jul 01 '13 at 16:18
  • 1
    @bug: Just use xargs -0 – cuonglm Jul 01 '13 at 16:41
  • Based on counglm's answer, I threw together a bash script which does what the OP's asking for here: https://gist.github.com/mellertson/490742488d12d16cb2c779c592827029 – MikeyE Jan 05 '20 at 05:32
12

Using -print0 is one option, but not all programs support using nullbyte-delimited data streams, so you'll have to use xargs with the -0 option for some things, as Gnouc's answer noted.

An alternative would be to use find's -exec or -execdir options. The first of the following will feed the filenames to somecommand one at a time, while the second will expand to a list of files:

find . -type f -exec somecommand '{}' \;
find . -type f -exec somecommand '{}' +

You may find that you are better off using globbing in many cases. If you have a modern shell (bash 4+, zsh, ksh), you can get recursive globbing with globstar (**). In bash, you have to set this:

shopt -s globstar
somecommand ./**/*.txt ## feeds all *.txt files to somecommand, recursively

I have a line saying shopt -s globstar extglob in my .bashrc, so this is always enabled for me (and so are extended globs, which are also useful).

If you don't want recursiveness, obviously just use ./*.txt instead, to use every *.txt in the working directory. find has some very useful fine-grained searching capabilities, and is mandatory for tens of thousands of files (at which point you'll run into the shell's maximum number of arguments), but for day-to-day usage it is often unnecessary.
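A small check of the globstar approach (the /tmp/globdemo path and file name are invented for this sketch; requires bash 4+):

```shell
# Demo: a recursive glob matching a file with spaces in its name.
# /tmp/globdemo is a made-up path for this sketch. Run under bash.
shopt -s globstar
mkdir -p /tmp/globdemo/sub
touch '/tmp/globdemo/sub/foo bar.txt'

# Each glob match expands to exactly one argument, spaces and all.
printf 'match: %s\n' /tmp/globdemo/**/*.txt
```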

evilsoup
  • 6,807
  • 3
  • 34
  • 40
11

If you don't want to use xargs (and thus probably not parallel either), find's output can be read and processed line by line like this:

find . -type f | while IFS= read -r x; do
  # do something with "$x"
done
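If your find has -print0 and your shell is bash, a null-delimited variant of the same loop is safer still, since it also survives names with embedded newlines (read -d '' is a bashism, not POSIX):

```shell
# Null-delimited loop (bash): IFS= and -r keep leading whitespace and
# backslashes intact; -d '' makes read stop at NUL bytes instead of newlines.
find . -type f -print0 | while IFS= read -r -d '' x; do
  printf 'processing: %s\n' "$x"
done
```

Note that the pipe runs the loop in a subshell; if the loop needs to set variables visible afterwards, feed it via process substitution (done < <(find ...)) instead.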
mykhal
  • 3,231
6
find ./  | grep " "

will get you the files and directories whose names contain spaces

find ./ -type f  | grep " " 

will get you the files whose names contain spaces

find ./ -type d | grep " "

will get you the directories whose names contain spaces

4

If you set the Internal Field Separator (IFS) to a newline only, the shell will no longer split on spaces:

IFS=$'\n' eval 'for i in `find . -type f -name "*"`;do echo $i;done'
4

Personally, I'd use the -exec find action to solve this sort of problem. Or, if necessary, xargs, which allows for parallel execution.

However, there is a way to get find to produce a bash-readable list of filenames. Unsurprisingly, it uses -exec and bash, in particular an extension to the printf command:

find ... -exec bash -c 'printf "%q " "$@"' printf {} ';'

However, while that will print out correctly shell-escaped words, it will not be usable with $(...), because $(...) does not interpret quotes or escapes. (The result of $(...) is subject to word splitting and pathname expansion, unless surrounded by quotes.) So the following will not do what you want:

ls $(find ... -exec bash -c 'printf "%q " "$@"' printf {} +)

What you would have to do is:

eval "ls $(find ... -exec bash -c 'printf "%q " "$@"' printf {} +)"

(Note that I have made no real attempt to test the above monstrosity.)
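For what it's worth, a quick test of the %q/eval round-trip (the /tmp/qtest path and file name are invented for this sketch) does behave as described:

```shell
# Round-trip check: printf %q shell-escapes each name, eval re-parses it.
# /tmp/qtest and its file are made-up names for this sketch.
mkdir -p /tmp/qtest
touch '/tmp/qtest/foo bar'

quoted=$(find /tmp/qtest -type f -exec bash -c 'printf "%q " "$@"' printf {} +)
eval "ls -- $quoted"    # lists the file despite the space in its name
```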

But then you might as well do:

find ... -exec ls {} +
rici
  • 9,770
  • I don't think the ls scenario adequately captures the OP's use case, but this is only speculation, since we have not been shown what (s)he is actually trying to accomplish. This solution actually works very nicely; I get the output I (vaguely) expected for all the funny file names I tried, including touch "$(tr a-z '\001-\026' <<<'the quick brown fox jumped over the lazy dogs')" – tripleee Jul 03 '13 at 11:17
  • @triplee: I have no idea what OP wants to do either. The only real advantage of constructing the quoted string to pass to eval is that you don't have to pass it to eval yet; you could save it in a parameter and use it later, perhaps several times with different commands. However, OP gives no indication that that is the use case (and if it were, it might be better to put the filenames into an array, although that's tricky too.) – rici Jul 03 '13 at 14:33
1

There are many different answers, depending on how exactly you want to use the output, as well as what assumptions you are making about which odd characters aren't in the filenames. The find command doesn't have an option to escape special characters, and if it did, its choice of what to escape might not match the exact needs of your program. Considering that the only illegal characters in filenames are '/' and NUL, there are a lot of edge cases.

In my case, I wanted to process the file names as elements in an array in Bash, so I wanted something like:

FILES=( $(find . -type f) )

That doesn't work with spaces (or tabs, for that matter). That also kills the newlines from the find command, making them useless as separators. You can set the field separator in Bash to something different. Ideally, you would set it to null and use -print0 in find, but null is not allowed as a field separator in Bash. My solution is to pick a character that we assume is not in any filenames, like 0x01 (ctrl-a), and use that:

IFS=$'\x01'
FILES=( $(find . -type f | sed -e 's/$/\x01/') )
unset IFS
for F in "${FILES[@]}"; do
    useful_command "$F"
done

Note the need to unset IFS to restore it to the default. That won't work with filenames with newlines in them, but should work with most other filenames.

If you're really paranoid, then you'll need to pipe find's output to hexdump, split out the results to get all the byte values, and look for one that isn't in the results. Then use that value. I'm sure Johnny Drop Tables has files with every byte value in the file names. If you're paranoid, create file and directory names using all 254 legal bytes and test. Probably the only solutions that would pass that test are ones using find -print0 piped to xargs -0, or a custom C program.

0

If all you want is to escape the spaces, you can do this:

find (...) | sed -e's/ /\\ /g' | whatever...

But as far as I know only xargs requires such treatment. For example:

find . -type f -name '* *' | sed -e's/ /\\ /g' | xargs ls -l

It might work for you. It doesn't escape anything but the spaces, but covers the most usual case. Quotes in file names might be a problem still.
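A quick check of that pipeline (the /tmp/sedspaces path and file name are invented for this sketch): xargs in its default mode honors backslash escapes, so the escaped spaces come back out as single arguments:

```shell
# Demo: backslash-escaping spaces for xargs' default (non -0) parser.
# /tmp/sedspaces and its file are made-up names for this sketch.
mkdir -p /tmp/sedspaces
touch '/tmp/sedspaces/foo bar'

find /tmp/sedspaces -type f -name '* *' | sed -e 's/ /\\ /g' | xargs ls -l
```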

Florian F
  • 101
0

If you are going to use the filenames in a script you might want to combine BASH arrays with find. Something like this:

readarray -d '' ALL_JSON_FILES < <(find "my_src_dir" -name "*.json" -print0)
for TEAM in "${ALL_JSON_FILES[@]}"; do
    # process "$TEAM" here
done
Inspired from SO questions:
https://stackoverflow.com/a/54561526/671282
https://stackoverflow.com/a/8880633/671282
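A minimal check that the readarray approach preserves spaces (the /tmp/ra-demo path and file name are invented for this sketch; here -t is added to strip the trailing delimiter, and readarray -d needs bash 4.4+):

```shell
# readarray -d '' splits find's NUL-delimited output into array elements,
# so names with spaces (or even newlines) stay whole.
# /tmp/ra-demo and its file are made-up names for this sketch. Run under bash.
mkdir -p /tmp/ra-demo
touch '/tmp/ra-demo/team one.json'

readarray -d '' -t files < <(find /tmp/ra-demo -name '*.json' -print0)
printf 'count=%s first=%s\n' "${#files[@]}" "${files[0]}"
```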

0

I'm quite surprised no one has mentioned just using find's own -printf flag (a GNU extension) like this:

find "$MY_DIR" -printf '"%p"\n'

This will wrap each resulting path in double quotes, which is friendlier for the shell to deal with (note that it does not escape any double quotes embedded in the names themselves). It will produce something like this:

"foo and bar"
"foo"
"caz with spaces here"
...

If %p is not what you want, there are still more directives available to you; just check man find.

-2
    find . -type f -name \*\  | sed -e 's/ /<thisisspace>/g'