2

ls seems to have a number of limitations that seem odd to me that are not included in its switches (such as --max-depth= as other tools have). I like to maintain common standards (so ls and ll follow what most normal distros have), but my additional aliases follow some kind of easy to remember syntax (lls is 'long listing, security' etc). I could split this up into a few different questions, but since it all relates to this attempt to find common ways of manipulating ls, a listing of my working to describe what I am talking about felt more appropriate as they are all related. Some specific questions:

  • I've often heard it said that you should never use ls in for loops etc. That makes sense in terms of how ls will look inside subdirectories by default etc, but is there a simple way to clip ls to never look inside subdirectory? I see nothing like --max-depth= in the man page, but it seems to me that if we clip ls to not go into subdirectories then it should be reliable to use in for loops or other constructs. Is there a reliable way to clip ls to only output for one directory and then use that in a for loop?

  • I have used what I feel are quite clunky constructs for lld (long listing with directories) and llf (long listing with files). Is there a better way to say "I just want to see files?" or "I just want to see directories?"; again, there is nothing in the man page that I can see. In particular, I can only do this listing in -l format as otherwise I could not grep out the items that I do not want to display. In general, I think using grep in this way is probably a bad idea (as locks into -l format), so is there a better way to achieve just picking directory items or just picking files, instead of using grep?

  • If any other approaches in the below are malformed, I would appreciate knowing better ways?

Attempt to have a standard set of ls outputs (as are often setup differently on each distribution).

The [] character set wildcards. e.g. ls name[03][17].c, would match name01.c, name07.c, name31.c, name37.c, and [] also allows ranges: ls name[07][1-9].c
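The bracket patterns above are expanded by the shell before ls (or any other command) runs, which can be checked without ls at all. A minimal sketch, assuming a scratch directory and made-up file names:

```shell
# Sketch: the shell, not ls, expands bracket patterns before any command runs.
# demo_dir and the file names are made up for this illustration.
demo_dir=$(mktemp -d)
touch "$demo_dir/name01.c" "$demo_dir/name07.c" "$demo_dir/name31.c" \
      "$demo_dir/name37.c" "$demo_dir/name02.c"

# printf receives the already-expanded names, one per argument;
# name02.c is skipped because its second digit is neither 1 nor 7.
bracket_matches=$(cd "$demo_dir" && printf '%s\n' name[03][17].c)
printf '%s\n' "$bracket_matches"

rm -rf "$demo_dir"
```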

Note the use of \ls to run the bare command, ignoring an alias.

The -F flag appends an indicator (one of */=>@|) to entries. Note also the difference between --color=always and --color=auto.

ls is quite awkward with respect to recursion, e.g. ls * will look into every subfolder even without the -R flag, and ls c* would look through every folder starting c.

Never use the output of ls in a loop such as for i in $(ls), as that can be unpredictable.

Don't want to do anything obscure with aliases, follow widely used settings for ls, ll, l, then have additional aliases for other tasks.

  • [[ $(type ls) == *"aliased"* ]] && unalias ls: Not needed here, but it shows in general how to test whether a name is an alias before acting on it

  • alias ls='\ls --color=always --group-directories-first': ls will print the normal ls in most distros (i.e. will not show .* files!). Note that \ before a command reverts it to its non-aliased form.

  • alias l='ls -AFh --color=always --group-directories-first': Using -A (almost all, ignores ./ and ../, but show all .* files, putting this to ls as almost always want to see .* files)

  • alias la='ls -Ah': long format (-AFh and also -l): Note that the above l alias uses \ls to run ls bare without flags (to remove -A mainly)

  • alias ll='ls -lAh': long format (-AFh and also -l): Note that the above l alias uses \ls to run ls bare without flags (to remove -A mainly)

alias l.='ls -d .*'    # Explicitly list just .* files, so ./ and ../ are shown, overriding the A flag
alias ls.='ls -d .*'   # Explicitly list just .* files, so ./ and ../ are shown, overriding the A flag, long format
alias ll.='ls -dl .*'  # Explicitly list just .* files, so ./ and ../ are shown, overriding the A flag, long format
alias lld='ls -FlA | grep :*/'      # Only directories
alias llf='ls -FlA | grep -v "/"'   # Only files (broken as will show symlinks etc, if have to use `/`, then `/$` would be better, but using grep at all is probably not optimal)
alias ldot="ls -ld .??*"            # Dotfiles only
alias lx="ls -FlA | grep *"         # Executable files only, below 'lxext' is just trying to find 'executable-like' files by their extension
alias lnox="ls -FlA | grep -v *"    # Everything except executable files
alias lxext='ll *.sh *.csh *.ksh *.c *.cpp *.py *.jar *.exe *.bat *.cmd *.com *.js *.vbs *.wsh *.ahk *.ps1 *.psm1 2> /dev/null'  # List possible executables and scripts by extensions, discarding error output (as will generate for every type that is not there)
alias lext='ls -Fla | egrep "\."'         # Files with extensions only   ".|/"
alias lnoext='ls -Fla | egrep -v "\."'    # Files without extensions only
alias lsp='find . -maxdepth 1 -perm -111 -type f'        # List executable by permissions.   ls -lsa | grep -E "[d\-](([rw\-]{2})x){1,3}"   https://stackoverflow.com/q/7812324
alias lsum="ls -Fla \$1 \$2 \$3 \$4 \$5  | awk '{ print; x=x+\$5 } END { print \"total bytes = \",x }'"   # ls on required info then awk will sum the sizes
alias lll='ls --human-readable --size -1 -S --classify'  # Long-list with just size and name and total size summary line
alias lm='ls -Am'; alias lcsv='lm'         # comma separated view (-m), -A almost all, except '.' and '..'
alias lsz='ls -lAshSr'; alias lsize='lsz'  # -s size, -h human readable, -S by size, -r reverse so largest are easily visible at end
alias lt='ls -lAth'  ; alias ltime='lt'; alias ldate='lt'; alias lst='lt'   # sort by -t time/date, human readable
# replicate 'ls', but using 'stat' to show both normal 'ls' permission flags *and* octal.
lsec() { if [ $# -eq 0 ]; then set -- . .*; fi; stat --printf="%A\t%a\t%h\t%U\t%G\t%s\t%.19y\t%n\n" "$@"; };   alias lstat='lsec';   # No arguments: default to . and .* (testing $# avoids the error [ -z "$@" ] gives with several arguments)
lperm() { if [ $# -eq 0 ]; then set -- . .*; fi; stat --printf="%A  %a  %n\n" "$@"; };   # Just permissions, %A (ls format), %a (Octal format) and names
sanitize() { chmod -R u=rwX,g=rX,o= "$@" ;}   # Make directory and file access rights the same
alias 000='echo "---------- (Owner -, Group -, and Other -)"; chmod 000'   # Remove permissions: append with file/directory to apply to
alias 644='echo "-rw-r--r-- (Owner rw, Group r, and Other r)"; chmod 644'  # Owner rw, everyone else read-only
alias 755='echo "-rwxr-xr-x (Owner rwx, Group r-x, and Other r-x)"; chmod 755'  # Make executable, but only Owner has write permissions
alias mx='chmod a+x'   # Make Executable
alias lls='lss'        # Since the 'stat' output is in long format 'll', also use 'lls' for 'long listing with security'
alias sl='ls'          # Common typo, also just overwrite the 'Steam Locomotive' toy if present, as that gets boring
alias lg='exa -lG'     # 'ls grid', exa is an interesting potential successor to 'ls', in Ubuntu 20.10 repo by default, colours each permission item and -lG is a useful 2x column long view.
if grep -qEi "(Microsoft|WSL)" /proc/version &> /dev/null; then
    for d in /mnt/[a-z]; do [ -d /mnt/$(basename ${d}) ] && alias "$(basename ${d}):"="cd $d"; done         #  "d:" => cd to /mnt/d
    for d in /mnt/[a-z]; do [ -d /mnt/$(basename ${d}) ] && alias "l$(basename ${d}):"="cd $d && ll"; done   # "ld:" => cd to and list d:""
fi
YorSubs
  • 621
  • 3
    not using ls in a shell loop is more to do with the output being potentially ambiguous, in particular with filenames containing newlines. And the fact that something like $(ls *.txt) could just be replaced with *.txt, since it's the shell that expands the glob, and the ls there would just repeat what it got on the command line. A bit like if you did $(echo *.txt). – ilkkachu Oct 12 '21 at 11:42
  • 1
    as for the list of aliases... I'm not sure what you're asking exactly. And there's a lot of those aliases. Though I'm pretty sure e.g. ls -FlA | grep * does not do what you think it does, again because the shell expands the *... – ilkkachu Oct 12 '21 at 11:45
  • Right, I've heard this about 'newlines in filenames' just recently, which sounds like a very bad design in linux. I think you are right, some of these do not do what I think they do, and that's kind of the point: I'm trying to work out plain questions, such as "how would I just show directories?" and while this should be a simple question, ls, and possibly the way that wildcards interact with the filesystem, often make these quite difficult to achieve. In a way, my collection of aliases here is also a means for me to have an effective library of techniques that will answer these questions. – YorSubs Oct 12 '21 at 11:56
  • 1
  • 3
    @YorSubs the only characters not allowed in file names are / since that is part of path definitions and \0 (null). Everything else is fair game and has been since long before Linux was developed. – terdon Oct 12 '21 at 12:02
  • Sure, all the way back to unix variants in the 70's. I think there is some usefulness in having more restricted characters (particularly new lines just seem to be asking for problems), but it is what it is and we have to work with that. – YorSubs Oct 12 '21 at 12:15
  • 1
  • also https://dwheeler.com/essays/fixing-unix-linux-filenames.html and https://dwheeler.com/essays/filenames-in-shell.html – ilkkachu Oct 12 '21 at 14:44
  • 1
    @YorSubs, of course in practice you can probably do something like ls | grep without too many ill effects if you know your files don't have newlines, and you only e.g. clone git trees from sensible people. I mean, part of the reason people do it is that they can get away with it. But of course unix.SE has a tendency of being a bit more strict about things like that, if only for the possibility of encountering such abominations. (Or highlighting the suckiness of most POSIX-like shells and other tools...) – ilkkachu Oct 12 '21 at 14:46

1 Answer

12

The common advice against using ls isn't about for loops, it is about any sort of parsing of the output of ls. This includes things like ls | grep which are pretty much guaranteed to fail with strange file names (those that contain newline or glob characters, for example). Please read through https://mywiki.wooledge.org/ParsingLs and Why *not* parse `ls` (and what to do instead)?.
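The failure mode described above can be reproduced in a scratch directory. A minimal sketch, with made-up file names, comparing a line count of ls output against a plain glob:

```shell
# Sketch: why line-based parsing of ls miscounts (file names are made up).
scratch=$(mktemp -d)
nl='
'
touch "$scratch/two${nl}lines.txt" "$scratch/plain.txt"

# ls writes the embedded newline as-is into the pipe, so wc sees an extra line.
ls_count=$(ls "$scratch" | wc -l)

# The glob hands each name to the shell as one intact word.
set -- "$scratch"/*
glob_count=$#

echo "ls | wc -l counts: $ls_count"
echo "glob counts:       $glob_count"

rm -rf "$scratch"
```

With GNU ls the first count comes out as 3 for two files, while the glob reports 2. Anything downstream of ls that works line by line (grep, a read loop, word splitting) inherits the same miscount.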

The main point is that the reason you don't want to use ls in for loops or any other parsing has nothing to do with subdirectories and everything to do with how ls displays its results. That said, there is a flag that causes ls (at least GNU ls, the default on Linux) not to descend into subdirectories:

       -d, --directory
              list directories themselves, not their contents

For example:

$ tree
.
├── dir1
│   ├── file1
│   ├── file2
│   └── file3
└── dir2
    ├── file1
    ├── file2
    └── file3

2 directories, 6 files

Now compare:

$ ls *
dir1:
file1  file2  file3

dir2:
file1  file2  file3

and

$ ls -d *
dir1  dir2

You're already using it in some of your aliases.

As for the lld and llf aliases, yes, they are clunky as you say, and they will also fail for weird file names. The "standard" way of listing only directories would be:

ls -d */

I don't know how to get only files with basic ls. If you really need that for some reason, you can use find instead:

find . -maxdepth 1 -type f

In general, ls is designed to be read by humans and not scripts. So if you need to process file listings in a script, there are always better tools, usually find or stat.
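Where find is not wanted, the shell's own globbing plus a file-type test gives the same files-only or directories-only selection without reading any ls output. A minimal sketch; only_files and only_dirs are hypothetical helper names, not standard commands:

```shell
# only_files and only_dirs are hypothetical helper names, not standard commands.
# Each takes an optional directory (default: current) and prints one name per
# line for reading on screen; scripts should act inside the loop instead.

only_files() (
    dir=${1:-.}
    # The three patterns cover visible names, dotfiles, and names starting with "..".
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # An unmatched pattern stays literal and simply fails the -f test.
        if [ -f "$entry" ] && [ ! -L "$entry" ]; then
            printf '%s\n' "${entry##*/}"
        fi
    done
)

only_dirs() (
    dir=${1:-.}
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        if [ -d "$entry" ] && [ ! -L "$entry" ]; then
            printf '%s\n' "${entry##*/}"
        fi
    done
)
```

Symbolic links are skipped here by choice, so a link to a file or directory is not counted as one; dropping the -L test would include them. The function bodies use ( ) rather than { } so that dir and entry stay local to each call.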


Some of those aliases don't do what you think at all. These, for example:

alias lx="ls -FlA | grep *"         # Executable files only, below 'lxext' is just trying to find 'executable-like' files by their extension
alias lnox="ls -FlA | grep -v *"    # Everything except executable files

Since the alias produces an unquoted *, it will be expanded by the shell before grep ever sees it. So if you have a directory with two files:

$ ls
file1  file2

Then, lx will actually run

ls  -FlA | grep file1 file2

Because the * is expanded to the contents of the directory (file1 file2), grep receives file1 as its pattern and file2 as a file to search. Given a file argument, grep never reads the output of ls from the pipe and instead searches for the string file1 in the file file2. A working alternative for these two would be:

alias lx="find . -maxdepth 1 -executable -type f"
alias lnox="find . -maxdepth 1 -type f ! -executable"
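If the ls long format is still wanted for these results, find can hand the matching names to ls itself, so nothing ever has to parse ls output. A minimal sketch, assuming GNU find (for -maxdepth and -executable); lx_long and lnox_long are names chosen here for illustration:

```shell
# lx_long / lnox_long are hypothetical names; -executable assumes GNU find.
# Any arguments are passed to ls as extra options, e.g. "lx_long -h".

# Long listing of executable regular files in the current directory.
lx_long() {
    find . -maxdepth 1 -type f -executable -exec ls -ld "$@" {} +
}

# Long listing of regular files that are not executable.
lnox_long() {
    find . -maxdepth 1 -type f ! -executable -exec ls -ld "$@" {} +
}
```

These are functions rather than aliases because an alias cannot place its arguments in the middle of a command. When nothing matches, -exec ... + does not run ls at all, so no stray listing of the current directory appears.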

Going through a list of 28 aliases is way beyond the scope of a simple Q&A but this should serve as a starting point: forget parsing ls, it can't be done safely and you should use aliases with find or stat instead.

ilkkachu
  • 138,973
terdon
  • 242,166
  • This is a very useful answer. I'm curious with respect to things like lx / lnox in the above, I like consistency, and while find does return the executables / non-executables perfectly, it would be nice to be able to have that in ls format and be able to use standard ls flags with that result (nice to have I guess, but maybe this is impossible). However, I do see your point that ls is maybe meant more for human readability - it seems a shame then that they did not give ls a few extra options to more easily filter on what someone is trying to find. – YorSubs Oct 12 '21 at 16:13
  • Also, should I take it then that using the results of a find as for i in $(find . -maxdepth 1 -executable -type f); do echo $i; done would be safe (including from files with newlines in them)? That being the case, I think I have to start using find a lot more if I want to loop through a set of files. I think I need to split up my ideas into a set of aliases and functions just for console readability use, and another set of find snippets that can help me get the files that I need within a script. – YorSubs Oct 12 '21 at 16:13
  • 1
    @YorSubs no! for i in $(anyCommand) AKA Bash pitfall #1 is never a good idea. Always prefer anyCommand | while read i; do ...; done or, better, anyCommand | while IFS= read -r i; do ...; done. To make find safe for arbitrary file names (assuming GNU find) you can use -print0 to print filenames as a null-delimited list and then use tools that can handle such data (e.g. sort -z or grep -z). The wiki I linked you to has some great advice. Have a look at https://mywiki.wooledge.org/BashPitfalls and https://mywiki.wooledge.org/BashFAQ. – terdon Oct 12 '21 at 16:31
  • A lot to process here, thanks. Various of the things that I used above, I have seen hundreds of times (as some of these things seem like they should be natural). I don't want to get into those bad habits, so thanks for putting me on the right path. I'm a bit disappointed that we can't do for i in $(anyCommand) as it feels more intuitive than IFS= read -r i (and I see so much more of the former). A practical example: you say 'anyCommand', so is ls ok again? i.e. to open all *.sh and *.md in vs code, would this be ok? ls *.sh *.md | while IFS= read -r i; do x+="$x $i"; done; code $x – YorSubs Oct 12 '21 at 17:57
  • 1
    Any particular reason why the two find commands in the last example have their predicates in opposite order to each other? It makes it less obvious how they are related (and any decent find will reorder predicates as necessary to optimise execution). – Toby Speight Oct 12 '21 at 20:22
  • 1
    An alternative to find's -print0 is to use find's -exec option. e.g. find . -maxdepth 1 -type f -exec ls -l {} + (note that some versions of find, e.g. GNU, have a -ls predicate which has similar output to ls -ld. e.g. find . -maxdepth 1 -type f -ls ). Or -exec with an embedded shell script: find . -maxdepth 1 -type f -exec sh -c 'for f; do echo "$f"; done' find-sh {} +. – cas Oct 12 '21 at 23:23
  • 1
    In general, find is the right tool for many of the things you want to do with ls but can't. Also, many of the things you're trying to do with aliases should probably be done with functions instead - aliases are very limited in what they can do (they're just a simple substitution of the beginning of the line/command), while a function can do anything a shell script can (including parsing options, and using arguments in exactly the place(s) they're needed, and much more). – cas Oct 12 '21 at 23:24
  • 1
    @YorSubs well, "never use for i in $(command)" is a bit of an exaggeration. You can use it if you know that the output of command will never have any whitespace or globbing characters. It's just that, usually, you cannot be sure of that and you certainly cannot be sure of it with an arbitrary, unknown command, so it's best to avoid it. However, something like for i in $(seq 1 10); do ... ; done would be fine. – terdon Oct 13 '21 at 13:16
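The null-delimited approach from the comments above can be sketched as follows. This is a minimal illustration, assuming bash and a find that supports -maxdepth and -print0 (GNU and BSD find do); collect_files and the found array are names made up here:

```shell
#!/usr/bin/env bash
# Sketch: collect find results safely, even for names containing
# spaces or newlines. collect_files and found are hypothetical names.

collect_files() {
    local dir=${1:-.} path
    found=()    # global array filled by this function
    # read -d '' splits on NUL bytes, the one character a path cannot contain.
    # Process substitution (not a pipe) keeps the loop in the current shell,
    # so the array is still populated after the loop ends; with
    # "find ... | while ..." the loop runs in a subshell and its
    # variables are lost.
    while IFS= read -r -d '' path; do
        found+=("$path")
    done < <(find "$dir" -maxdepth 1 -type f -print0)
}
```

After collect_files some_dir, the names can be passed on intact with "${found[@]}", for example printf '%s\n' "${found[@]}". For the simpler case of opening all *.sh and *.md files, no loop is needed: as noted in the comments on the question, the shell expands the glob itself, so code *.sh *.md already receives each file name as a separate argument.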