2

I run an ls command like this to list multiple directories. Note the space in the folder name X y.

ls -1d 2021*/"X y"/foobar

Output is

'20211121-161518-801/X y/foobar'
'20211128-151513-585/X y/foobar'
'20211130-170001-724/X y/foobar'

It looks like ls still uses quotation marks here. But when I use this output in combination with xargs, the quotation marks seem to be lost:

ls: cannot access '20211121-161518-801/X': No such file or directory
ls: cannot access 'y/foobar': No such file or directory

The xargs command is like this

ls -1d 2021*/"X y"/foobar | xargs ls

In the end I will use xargs rm -rf; the xargs ls is just a test.

buhtz

2 Answers

4

While Stéphane Chazelas gave you a generally good deep dive into "modern" ls, not all ls implementations are created equal. Even once GNU ls has --zero, it will take decades until that trickles down everywhere, and it still will not work on all systems, especially non-GNU ones.

The general consensus about ls is that its output is meant for human consumption and should not really be trusted for anything else.

Although this is fairly common knowledge among experienced admins, newcomers and novice admins are, for many understandable (and some inexplicable) reasons, still drawn exclusively to ls. As Linux spreads beyond its niche into more and more mainstream and nationalized environments, this love of ls is becoming a serious problem (as your own situation shows). That is also why GNU ls is "quickly" growing the options that any ls should have had for decades.

However, if you want to solve this problem reliably and safely right now, the right tool for the job is the find command, used either as a zero-separated list generator or as a direct command executor.

Using find as a zero-separated list generator:

find topdir -mindepth 1 [-maxdepth 2] -type d -print0 | xargs -0 -I@ stat @

Using find as a direct executor:

find topdir -mindepth 1 [-maxdepth 2] -type d -exec stat '{}' \;

With -mindepth/-maxdepth you can control how deep find descends, and with -type you control which kinds of objects it processes. You can also match on the object name or even the object path. Because find does not mangle or escape the object names in any "smart" way by default, you can then pass these names around raw and without fear, either through a pipe (generator) or as part of the argv vector (executor).

Just remember that when shuffling lists through pipes this way, -print0 always has to be used, so that names are zero-separated and other characters such as spaces and newlines pass through literally without confusing the reading side.
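Applied to the layout from the question, both modes can be sketched as follows. This is only an illustration, assuming a find and xargs that support -print0 and -0 (GNU and the BSDs do); the temporary tree under $demo is a hypothetical stand-in for the real directories, and sh -c 'echo "$#"' merely counts the arguments it receives in place of a real payload.

```shell
#!/bin/sh
# Hypothetical demo tree; the directory names mimic the question's layout.
demo=$(mktemp -d)
mkdir -p "$demo/20211121-161518-801/X y/foobar" \
         "$demo/20211128-151513-585/X y/foobar"

# Generator mode: find emits NUL-separated names, xargs -0 reads them back.
count_pipe=$(find "$demo" -type d -path '*/2021*/X y/foobar' -print0 |
  xargs -0 sh -c 'echo "$#"' sh)

# Executor mode: find hands the names to the command itself, no pipe needed.
count_exec=$(find "$demo" -type d -path '*/2021*/X y/foobar' \
  -exec sh -c 'echo "$#"' sh {} +)

# Each directory arrives as exactly one argument despite the space.
echo "pipe: $count_pipe, exec: $count_exec"

rm -rf "$demo"
```

Both counts come out as 2, one argument per directory. Replacing the sh -c payload with ls -d (and later rm -rf) gives the command the question is after.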

etosan
  • 1,054
  • 1
    Note that -mindepth, -maxdepth, -print0, -0, are all non-standard GNU extensions. stat is a non-standard command for which there exist several incompatible implementations. GNU find has a -printf predicate that renders stat redundant. Also note that sorting file lists reliably even with GNU find is particularly painful. So while using find, when you're on a GNU system, can get you a long way, it will be nowhere as convenient or straightforward as when using zsh or a proper programming language like perl/python... – Stéphane Chazelas Dec 19 '21 at 19:41
  • Indeed @StéphaneChazelas, all the points you make are valid, but:
    • I used stat here solely as an example "payload" for @buhtz's command (it could have been echo; I hope this is not lost on advanced readers)
    • a missing -print0 can often be emulated by -printf "xxx\0"
    • you will find -mindepth and -maxdepth on many more OSes than GNU ls features, same with -0; all three are quite common even on many BSDs
    • you can avoid all "separator" spelunking when using executor mode
    • and yes, sorting is the last remaining "issue", but is inherent to unix
    – etosan Dec 19 '21 at 23:44
  • Should I edit my answer? – etosan Dec 19 '21 at 23:45
  • Naturally, as already said, a proper shell (like zsh) or programming language makes you trade all the issues mentioned so far for another set of "new" programming issues. In my experience, as mentioned, find is usually the "last" stop before bringing up heavyweight tools like lua, perl or python etc., and the tail is quite long on this one. I have used it to process hundreds of thousands of files and terabytes of data with weird names without a glitch. – etosan Dec 19 '21 at 23:51
  • Note that while -print0 is now pretty common, -printf is still GNU specific. -exec printf '%s\0' {} + would be the standard equivalent. But generally, to handle NUL delimited lists you need GNU implementations of standard utilities (sed -z, sort -z, head -z, ls --zero...). – Stéphane Chazelas Dec 20 '21 at 08:25
2

The GNU implementation of ls has a --quoting-style option which lets you specify a quoting style, but none of the styles is compatible with the quoting style expected by xargs.

The quoting style of xargs (introduced by PWB Unix in the late 70s) is similar to that of the Mashey shell (also from PWB Unix), a predecessor of the Bourne shell.

ls --quoting-style=shell-always quotes with single quotes as per the Bourne/POSIX shell style, with single quotes themselves entered as \'. xargs supports both '...' and "..." in addition to backslash but quoted strings cannot include newline characters which can only be escaped with backslash.
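A small sketch of how xargs tokenizes its input may make this concrete. It relies on the fact that GNU ls only adds quotes by default when writing to a terminal, so through a pipe xargs sees the bare name and splits it at the space, which matches the error shown in the question. The sample strings are illustrative, and printf '<%s>\n' is used only to make the argument boundaries visible.

```shell
#!/bin/sh
# Unquoted, as ls prints the name into a pipe: xargs splits at the space.
split=$(printf '%s\n' '20211121-161518-801/X y/foobar' |
  xargs printf '<%s>\n')

# The three forms xargs does understand: '...', "..." and backslash.
kept=$(printf '%s\n' "'X y'" '"a b"' 'c\ d' |
  xargs printf '<%s>\n')

printf '%s\n' "$split" "$kept"
```

The first pipeline yields two arguments, <20211121-161518-801/X> and <y/foobar>, while each line of the second stays a single argument.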

The POSIX xargs specification also expects xargs to parse the input as text in the current locale and also does not guarantee it will work for arguments larger than 256 bytes, so in general can't be used to process arbitrary data.

The only really reliable way to use xargs is with the -0 option found in some implementations such as GNU xargs (you also generally want to use the -r option, which is also non-standard).

The next version of GNU ls will have a --zero option to output files NUL-delimited, so then you'll be able to do:

ls -t --zero | xargs -r0 printf ' * %s\n'

For instance.

For now, what you can do is use the --quoting-style=shell-always mode to pass that list to a shell, and have that shell convert it to a NUL-delimited list:

(
   echo 'files=('
   ls --quoting-style=shell-always -t
   echo '); print -rNC1 -- $files'
) | zsh | xargs -r0 some command

But if you're already using zsh, you don't really need ls nor xargs as zsh can sort its globs by other criteria like ls and has a stat builtin to retrieve file metadata (the only reasons you may want to use ls) and its own zargs command.

In any case, in:

ls -d -- *.txt

It's the shell that expands *.txt to the list of matching files. ls is pretty useless there as it just ends up printing them.
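For the case in the question this means neither ls nor xargs is strictly needed: the glob can be handed straight to the final command. A minimal POSIX sh sketch, assuming a hypothetical temporary tree under $demo that mimics the question's layout; the -e guard is there because, without a nullglob option, an unmatched pattern is passed on literally.

```shell
#!/bin/sh
# Hypothetical demo tree standing in for the real 2021* directories.
demo=$(mktemp -d)
mkdir -p "$demo/20211121-161518-801/X y/foobar" \
         "$demo/20211128-151513-585/X y/foobar"
cd "$demo" || exit 1

# The shell expands the glob; each match becomes one positional parameter,
# with the space in "X y" left intact.
set -- 2021*/"X y"/foobar
matched=$#

# Only act if the pattern matched something real.
if [ -e "$1" ]; then
  rm -rf -- "$@"
fi

# Count what is left (tr strips the padding some wc implementations add).
remaining=$(find . -type d -name foobar | wc -l | tr -d ' ')
echo "matched: $matched, remaining: $remaining"

cd / && rm -rf "$demo"
```

Here matched is 2 and remaining is 0. If the number of matches could exceed the system's argument size limit, a NUL-delimited list fed to xargs -0 is the safer route.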

You might as well pass the glob expansion to something that can print the list NUL-delimited, such as the print builtin in zsh:

print -rNC1 -- *.txt(N)

(here also using the Nullglob qualifier, so it prints nothing if the glob doesn't match any file).

Or in the bash shell (sometimes still used on GNU systems as that's the GNU shell):

print0() {
  [ "$#" -eq 0 ] || printf '%s\0' "$@"
}
shopt -s nullglob
print0 *.txt