10

Example:

% touch -- safe-name -name-with-dash-prefix "name with space" \
    'name-with-double-quote"' "name-with-single-quote'" \
    'name-with-backslash\'

xargs can't seem to handle double quotes:

% ls | xargs ls -l 
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option
ls: invalid option -- 'e'
Try 'ls --help' for more information.

If we use the -0 option, it has trouble with a name that has a dash prefix:

% ls -- * | xargs -0 -- ls -l --
ls: invalid option -- 'e'
Try 'ls --help' for more information.

And this is before even trying other potentially problematic characters such as newlines and control characters.

  • Are you asking why ls doesn't work here? That's normal behavior, and that's why you never parse the output of ls. Is that your question or is there something else? – terdon Aug 11 '17 at 12:20
  • @terdon the first ls is just to supply the names, one line per entry. This is not about parsing the output of ls (because in this case the first ls just gives the filenames, without any other modification except adding newline per entry). The second ls is to demonstrate how xargs feeds the names as arguments to the command. In this case, xargs somehow fails to pass the second -- to ls. – Gerry Lufwansa Aug 11 '17 at 12:26
  • Not sure how you got that error message from ls -- * | xargs -0 -- ls -l --. Since the input doesn't contain NULs, I'd expect ls to be passed just one big argument (possibly starting with a -, but that wouldn't be a problem with --), and the error to be about that file, with its long name and some newline characters, not being found. What system and version of ls/xargs is that? – Stéphane Chazelas Aug 11 '17 at 14:42
  • Like ls: cannot access '-name-with-dash-prefix'$'\n''name with space'$'\n''name-with-backslash\'$'\n''name-with-double-quote"'$'\n''name-with-single-quote'\'''$'\n''safe-name'$'\n': No such file or directory – Stéphane Chazelas Aug 11 '17 at 14:44
  • @StéphaneChazelas I'm on Debian 8.0 Jessie, xargs 4.4.2, ls 8.23. – Gerry Lufwansa Aug 11 '17 at 14:51
  • Can you reproduce it? Is it the first or second ls that outputs that error? What do you see if you run ls -- * | strace -fe execve xargs -0 -- ls -l -- instead? – Stéphane Chazelas Aug 11 '17 at 14:54
  • I can't. Apparently I copy-pasted the wrong error message. It should be something like what you have (except with literal newlines and no surrounding single quotes). Sorry about that. – Gerry Lufwansa Aug 11 '17 at 15:23

4 Answers

9

The POSIX specification does give you an example for that:

ls | sed -e 's/"/"\\""/g' -e 's/.*/"&"/' | xargs -E '' printf '<%s>\n'

(Filenames are arbitrary sequences of bytes other than / and NUL, whereas sed and xargs expect text, so to make this reliable you would also need to set the locale to C, where every non-NUL byte is a valid character. Even then, it can fail with xargs implementations that have a very low limit on the maximum length of an argument.)

The -E '' is needed for some xargs implementations that, without it, would treat a _ argument as the end of input (where, for instance, echo a _ b | xargs outputs only a).
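
As a rough illustration of the quoting step above, here is a minimal sketch that runs the same sed and xargs pipeline against the question's sample names in a scratch directory. The scratch directory and the `tmp` and `out` variable names are assumptions made for the demonstration, and none of the names contain a newline.

```shell
# Scratch directory holding the question's awkward names (none contain a newline).
tmp=$(mktemp -d)
(cd "$tmp" && touch -- safe-name -name-with-dash-prefix "name with space" \
    'name-with-double-quote"' "name-with-single-quote'" 'name-with-backslash\')

# First sed expression: replace each embedded " with "\"" (close the quote,
# emit an escaped quote, reopen the quote). Second: wrap the whole line in "...".
# LC_ALL=C makes every non-NUL byte a valid character for ls, sed and xargs.
out=$(cd "$tmp" && LC_ALL=C ls |
  LC_ALL=C sed -e 's/"/"\\""/g' -e 's/.*/"&"/' |
  LC_ALL=C xargs -E '' printf '<%s>\n')

printf '%s\n' "$out"
rm -rf "$tmp"
```

Each name should come back as one argument, with its quote or backslash intact.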

With GNU xargs, you can use:

ls | xargs -rd '\n' printf '<%s>\n'

(also adding -r, another GNU extension, so that the command is not run if the input is empty).

GNU xargs also has a -0 that has been copied by a few other implementations, so:

ls | tr '\n' '\0' | xargs -0 printf '<%s>\n'

is slightly more portable.
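
A small sketch of the tr variant, assuming an xargs that supports -0 and a scratch directory created only for the demonstration (`tmp` and `out` are illustrative names):

```shell
tmp=$(mktemp -d)
(cd "$tmp" && touch -- 'name with space' 'name-with-double-quote"' "name-with-single-quote'")

# tr turns each newline that ls emits into a NUL; xargs -0 then splits on NUL
# only and treats quotes, backslashes and blanks as ordinary characters.
out=$(cd "$tmp" && ls | tr '\n' '\0' | xargs -0 printf '<%s>\n')

printf '%s\n' "$out"
rm -rf "$tmp"
```

This still relies on no name containing a newline, since the newline is the only thing tr has to go on.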

All of those assume the file names don't contain newline characters. If there may be filenames with newline characters, the output of ls is simply not post-processable. If you get:

a
b

That can be either two files called a and b, or one file called a<newline>b; there's no way to tell.
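
The ambiguity can be reproduced with a sketch like the following. It assumes ls writes names unmodified when its output is not a terminal and that no QUOTING_STYLE environment variable is set; the directory names are made up for the demonstration.

```shell
# A variable holding a literal newline, to build the awkward filename portably.
nl='
'
tmp=$(mktemp -d)
mkdir "$tmp/two-files" "$tmp/one-file"
touch -- "$tmp/two-files/a" "$tmp/two-files/b"   # two files: a and b
touch -- "$tmp/one-file/a${nl}b"                 # one file: a<newline>b

# Capture what ls writes to a pipe for each directory.
two=$(ls "$tmp/two-files")
one=$(ls "$tmp/one-file")

if [ "$one" = "$two" ]; then
  verdict='identical output'
else
  verdict='different output'
fi
printf '%s\n' "$verdict"
rm -rf "$tmp"
```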

GNU ls has a --quoting-style=shell-always option that makes its output unambiguous and post-processable, but that quoting is not compatible with the quoting xargs expects. xargs recognises the "...", \x and '...' forms of quoting. For xargs, both "..." and '...' are strong quotes, and neither can contain newline characters (only \ can escape a newline for xargs). In sh, by contrast, only '...' are strong quotes (and they can contain newline characters), while \<newline> is a line continuation (it is removed) rather than an escaped newline.
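
To make the three quoting forms concrete, here is a minimal sketch that feeds xargs one line using each of them. The sample words and the `input` and `out` variable names are made up for the illustration.

```shell
# One input line using each quoting form that xargs recognises:
#   "double quoted"   'single quoted'   back\ slashed
input="\"double quoted\" 'single quoted' back\\ slashed"

# xargs strips the quoting and hands printf three arguments.
out=$(printf '%s\n' "$input" | xargs printf '<%s>\n')
printf '%s\n' "$out"
```

All three forms yield a single argument with an embedded space, which is why names containing those characters need escaping unless -0 is used.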

You can use the shell to parse that output and then output it in a format expected by xargs:

eval "files=($(ls --quoting-style=shell-always))"
[ "${#files[@]}" -eq 0 ] || printf '%s\0' "${files[@]}" |
  xargs -0 printf '<%s>\n'

Or you can have the shell get the list of files and pass it NUL-delimited to xargs. For instance:

  • with zsh:

    print -rNC1 -- *(N) | xargs -r0 printf '<%s>\n'
    
  • with ksh93:

    (set -- ~(N)*; (($# == 0)) || printf '%s\0' "$@") |
      xargs -r0 printf '<%s>\n'
    
  • with fish:

    begin set -l files *; string join0 -- $files; end |
      xargs -r0 printf '<%s>\n'
    
  • with bash:

    (
      shopt -s nullglob
      set -- *
      (($# == 0)) || printf '%s\0' "$@"
    ) | xargs -r0 printf '<%s>\n'
    

2023 Edit. Since version 9.0 of GNU coreutils (September 2021), GNU ls now has a --zero option that can be used in conjunction with xargs -r0:

ls --zero | xargs -r0 printf '<%s>\n'
  • So to recap, it's quite simple actually (I should've read the manpage once in a while :-) ): xargs interprets quotes (and backslashes), except when given -0, in which case it only interprets NUL as the record separator. Thanks for the explanation about the different flavors of xargs. – Gerry Lufwansa Aug 12 '17 at 00:51
  • Besides / and NUL, the link you quote also names <new-line> and {} as failures. –  Aug 14 '17 at 01:41
  • @Arrow, the example there is using -I {} and $1, $2 as arguments to xargs, so if {} is found in $1 and $2 in that case, it's a problem. But not here as we're neither using -I nor passing variable arguments to xargs. Newline is covered in my answer. I should add -E '' though – Stéphane Chazelas Aug 14 '17 at 06:30
  • The sed escape command misses single quotes as this fails: echo "'" | sed -e 's/"/"\\""/g' -e 's/.*/"&"/' | xargs -I {} sh -c "echo '{}'" – mgutt Feb 25 '22 at 08:28
  • 1
    @mgutt, never embed {} in the code argument of some interpreter as that results in command injection vulnerabilities. Here, you'd need an extra level of escaping for it to be used as shell code. Instead, use xargs sh -c 'for arg do echo "$arg"; done' sh which runs fewer sh invocations and passes the arguments as arguments to the inline script instead of embedding them in its code. – Stéphane Chazelas Feb 25 '22 at 08:33
  • @StéphaneChazelas Thank you. Found a good example for injection here: https://stackoverflow.com/a/11003457/318765 I can't use a loop, but it's safe now: https://superuser.com/a/1706699/129262 – mgutt Feb 25 '22 at 11:00
  • 1
    @mgutt, technically, the + argument of date is still some code in some language. That language is very limited, so you won't run commands with that, but an input like %9999999999s would still cause problems, making it a DoS vulnerability if not an ACE. I would not use xargs for that. To timestamp input, you can use perl or gawk instead. – Stéphane Chazelas Feb 25 '22 at 11:10
  • @StéphaneChazelas Thanks again. I updated my answer (it's off-topic here, so please comment there if you want to). – mgutt Feb 26 '22 at 08:44
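
The advice in the comments above, passing items as arguments to an inline script instead of embedding {} in its code, can be sketched as follows. The sample items are hypothetical, and the second one is deliberately shaped like a command substitution.

```shell
# NUL-delimited sample input; the second item looks like shell code.
out=$(printf '%s\0' 'plain' '$(echo injected)' "it's" |
  xargs -0 sh -c 'for arg do printf "<%s>\n" "$arg"; done' sh)

# The items arrive as positional parameters ($1, $2, ...) of the inline
# script, so they are data and are never parsed as shell code.
printf '%s\n' "$out"
```

The trailing sh fills $0 of the inline script, so the first item is not swallowed as the script name.
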
4

For xargs to understand the -0 null-delimited input option, the sending party must also apply the null delimiter to the data that they are sending over.

Else there's no synchronization between the two.

One option is the GNU find command which can place such delimiters:

find . -maxdepth 1 ! -name . -print0 | xargs -0 ls -ld
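
A quick way to check that the two sides agree on the delimiter is to count the arguments that arrive, as in this sketch. It assumes a find that supports -maxdepth and -print0; the scratch directory and file names are made up.

```shell
nl='
'
tmp=$(mktemp -d)
touch -- "$tmp/plain" "$tmp/with space" "$tmp/with${nl}newline"

# find terminates each path with NUL; xargs -0 splits on NUL only, so the
# inline sh script receives one argument per file and reports how many it got.
count=$(cd "$tmp" && find . -maxdepth 1 ! -name . -print0 |
  xargs -0 sh -c 'echo "$#"' sh)
printf '%s\n' "$count"
rm -rf "$tmp"
```

Three files give a count of three, even though one name contains a newline.
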
3

As you said, xargs doesn't like unmatched double quotes unless you use -0 but -0 only makes sense if you feed it null-terminated data. So, this fails:

$ echo * | xargs
xargs: unmatched double quote; by default quotes are special to xargs unless you use the -0 option
name-with-backslash -name-with-dash-prefix

But this works:

$ printf '%s\0' -- * | xargs -0
-- name-with-backslash\ -name-with-dash-prefix name-with-double-quote" name-with-single-quote' name with space safe-name
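
Note that the -- placed after the printf format is emitted as an item of its own, as the output above shows. A variant that leaves it out and puts the -- on the receiving command instead might look like this sketch (scratch directory and variable names are assumptions):

```shell
tmp=$(mktemp -d)
(cd "$tmp" && touch -- -name-with-dash-prefix 'name with space')

# No "--" after the printf format (it would be emitted as an item); the "--"
# goes on the ls side so that a leading dash is not parsed as an option.
if (cd "$tmp" && printf '%s\0' * | xargs -0 ls -ld -- >/dev/null); then
  status=ok
else
  status=failed
fi
printf '%s\n' "$status"
rm -rf "$tmp"
```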

In any case, your basic approach is not really the best way to do this. Instead of fiddling about with xargs and ls and whatnot, just use shell globs instead:

$ for f in *; do ls -l -- "$f"; done
-rw-r--r-- 1 terdon terdon 4142 Aug 11 16:03 a
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 'name-with-backslash\'
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 -name-with-dash-prefix
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 'name-with-double-quote"'
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 "name-with-single-quote'"
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 'name with space'
-rw-r--r-- 1 terdon terdon 0 Aug 11 15:34 safe-name
terdon
  • "In any case, your basic approach is wrong to begin with. Instead of fiddling about with xargs and ls and whatnot, just use shell globs instead". I should've added that I specifically asked about a solution that involves xargs, because the names do not necessarily come from filenames, and I want to pass multiple arguments to a single command, not one argument to one command. – Gerry Lufwansa Aug 11 '17 at 14:00
  • "And there's no need to use the -- with ls when run via xargs, xargs can deal with that properly itself:" it seems that -- is actually needed, e.g.: ls -- *dash* | xargs ls result in error while ls -- *dash* | xargs ls -- is okay. xargs does not automatically add an extra -- argument and the second ls needs it to separate the name from options. – Gerry Lufwansa Aug 11 '17 at 14:03
  • @GerryLufwansa ah! That's a very different proposition then, yes. You can probably still use the shell loop (personally, I think I never use xargs and always write such loops), but we might be able to give you a better solution for your use case if you clarify what you'll be doing. – terdon Aug 11 '17 at 14:03
  • @GerryLufwansa huh. You're absolutely right. That's odd, I guess xargs only does that when run with -0. Thanks for the correction. – terdon Aug 11 '17 at 14:04
0

It is extremely silly to parse the output of ls, a command that is not designed to be parsed, in order to feed a command that is not designed to deal with several characters (for example, newlines and {}), when the shell can do that by itself:

set -- *; for f; do echo "<$f>"; done

set    -- *
for    f
do     ls -- "$f"
done

Or, in one command line:

$ set -- *; for f; do echo "<$f>"; done
<name-with-backslash\>
<-name-with-dash-prefix>
<name-with-double-quote">
<name-with-single-quote'>
<name with space>
<safe-name>
<with_a
newline>

Note that this handles newlines perfectly well (the last filename is an example).

Or, if the number of files makes the shell slow, use find:

$ find ./ -type f -exec echo '<{}>' \;
<./safe-name>
<./with_a
newline>
<./name-with-double-quote">
<./-name-with-dash-prefix>
<./name with space>
<./name-with-single-quote'>
<./name-with-backslash\>

Just mind that find processes dot-files and sub-directories differently than the shell does.
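
As a rough sketch of that difference, the following restricts find so that it behaves more like a bare * glob. It assumes a find that supports -maxdepth, and the file names are made up for the demonstration.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/sub"
touch -- "$tmp/visible" "$tmp/.hidden" "$tmp/sub/inner"

# -maxdepth 1 stops find from descending into sub/, and ! -name '.*' drops
# dot-files (and "." itself), which is roughly what a bare * glob matches,
# except that -type f also leaves out directories such as sub.
out=$(cd "$tmp" && find . -maxdepth 1 ! -name '.*' -type f -exec printf '<%s>\n' {} +)
printf '%s\n' "$out"
rm -rf "$tmp"
```

Without those two tests, find would also report ./.hidden and ./sub/inner.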