I create directories with mkdir site_{1..44}. I want them sorted like this:

site_1
site_2
site_3
site_4
...
site_44

I run ls | sort -h and get:

site_1
site_10
site_11
site_12
site_13
site_14
site_15
site_16
site_17
site_18
site_19
site_2
site_20
site_21
site_22
site_23
site_24
site_25
site_26
site_27
site_28
site_29
site_3
site_30
site_31
site_32
site_33
site_34
site_35
site_36
site_37
site_38
site_39
site_4
site_40
site_41
site_42
site_43
site_44
site_5
site_6
site_7
site_8
site_9

Where am I going wrong?

Kusalananda
Thommen
  • With GNU sort, -h compares things like 1G with 100M etc. whereas what I think you want is -V. To get a list of the directories that you just created, the easiest would be to use the same brace expansion though: printf '%s\n' site_{1..44}. I'm uncertain why you feel you need to parse the output of ls, so I'm not turning this into an answer. See also ls -v on a GNU system. – Kusalananda Dec 02 '21 at 14:06

1 Answer

With GNU ls:

ls -1 -v
ls --format=single-column --sort=version
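
As a quick check of the version sort, the following sketch creates a few directories in a scratch location and lists them with ls -v. It assumes GNU ls and mktemp are available; the tmpdir and listing variables and the sample names are illustrative only.

```shell
# Assumes GNU ls (for -v) and mktemp; the names below are made up for the demo.
tmpdir=$(mktemp -d) || exit 1
for n in 1 2 3 10 20; do
    mkdir "$tmpdir/site_$n"
done

# -1 forces one entry per line; -v compares digit runs inside names as numbers.
listing=$(ls -1 -v "$tmpdir")
printf '%s\n' "$listing"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

With GNU ls this should list site_1, site_2, site_3, site_10, site_20 in that order.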

With GNU sort (and assuming file names don't contain newline characters):

ls | sort -V
ls | sort --version-sort
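
The same ordering can be tried without touching the file system. This is a minimal sketch assuming GNU sort; the names and sorted variables and the sample values are hypothetical.

```shell
# Hypothetical sample names in plain lexical order (no files are created).
names=$(printf 'site_%s\n' 1 10 2 20 3)

# GNU sort -V (version sort) compares embedded digit runs as numbers.
sorted=$(printf '%s\n' "$names" | sort -V)
printf '%s\n' "$sorted"
```

Here site_2 and site_3 should come out ahead of site_10 and site_20.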

With zsh, you can also get globs to sort numerically with the n glob qualifier:

print -rC1 -- *(Nn)

(Here the N (nullglob) qualifier is also used, so that it prints nothing instead of reporting an error when there is no matching file, that is, no non-hidden file with * as the pattern.)

If passing to ls, you'd need to tell ls to disable its own sorting (-U / --sort=none with GNU ls):

ls -ldU -- *(n)

(Here N is not used because, when passed no arguments, ls would list . instead of listing nothing.)

You can also make numeric sorting the default for globs in zsh with set -o numericglobsort.

But it may be better here to zero-pad the numbers in your file names so that they sort the same numerically and lexically:

mkdir site_{01..44}    # zsh, bash, yash -o braceexpand
mkdir site_{1..44%02d} # ksh93
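
To illustrate why zero-padding helps, this sketch generates padded names with printf and runs them through a plain lexical sort, with no -V or -n involved. The sample numbers and variable names are arbitrary, and LC_ALL=C is set only to make the byte-wise ordering explicit.

```shell
# Zero-pad each number to two digits, in deliberately shuffled order.
padded=$(printf 'site_%02d\n' 10 2 44 9 1)

# A plain lexical sort now agrees with the numeric order.
sorted_padded=$(printf '%s\n' "$padded" | LC_ALL=C sort)
printf '%s\n' "$sorted_padded"
```

The result should be site_01, site_02, site_09, site_10, site_44.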

GNU sort's -h option is meant for sorting human-readable sizes as displayed by du -h or ls -sh, where 123456789 for instance is represented as 118M, so it's of no use here.

In your case, as those site_1... names do not start with what looks like a human-readable size, they are all treated as 0 and compare equal. You do however get a lexical sort thanks to the last-resort comparison, which breaks ties by comparing full lines lexically.
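
One way to see the last-resort comparison at work is to switch it off with GNU sort's -s (--stable) option. The sketch below assumes GNU sort; the variable names and sample values are illustrative, and LC_ALL=C is used so the tie-breaking order is plain byte order.

```shell
# Three hypothetical names, given in an order that is neither lexical nor numeric.
input=$(printf 'site_%s\n' 2 10 1)

# With -h alone, every key counts as 0, so the last-resort comparison
# orders the lines lexically.
plain=$(printf '%s\n' "$input" | LC_ALL=C sort -h)

# With -s the last-resort comparison is disabled, so equal keys keep
# their input order.
stable=$(printf '%s\n' "$input" | LC_ALL=C sort -s -h)

printf 'sort -h:\n%s\n' "$plain"
printf 'sort -s -h:\n%s\n' "$stable"
```

The first listing should come out as site_1, site_10, site_2, while the second should simply repeat the input order, which suggests that -h contributes nothing to the ordering of these names.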

Even if your directories were named site_314, site_1.1K, site_0.9M and you wanted to sort them based on the size found after the _, you'd need to tell sort where the human-readable size is, as in sort -t_ -k1,1 -k2h. That sorts first on the part before the first _ lexically, and then on the part after the _, interpreting it as a human-readable size thanks to the h flag in the key specification.
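
As a small sketch of that keyed sort, using the three names from the paragraph above and assuming GNU sort (the variable names are made up, and LC_ALL=C ensures . is taken as the decimal point):

```shell
# The three example names, in shuffled order.
sizes=$(printf '%s\n' site_0.9M site_314 site_1.1K)

# -t_ splits fields on "_"; -k1,1 sorts on the first field lexically,
# then -k2h sorts on the rest of the line as a human-readable size.
by_size=$(printf '%s\n' "$sizes" | LC_ALL=C sort -t_ -k1,1 -k2h)
printf '%s\n' "$by_size"
```

This should print site_314, then site_1.1K, then site_0.9M, since 314 < 1.1K < 0.9M.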