137

I need to alphabetically sort the output of find before piping it to a command. Inserting | sort | between the two commands didn't work, so what can I do?

find folder1 folder2 -name "*.txt" -print0 | xargs -0 myCommand
Flimm
Industrial

6 Answers

104

Use find as usual and delimit your lines with NUL. GNU sort can handle these with the -z switch:

find . -print0 | sort -z | xargs -r0 yourcommand
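A minimal sketch of why the NUL delimiter matters, assuming GNU find, sort and xargs. The scratch directory and file names are made up for illustration. The order sort produces depends on the locale, so the sketch pins LC_ALL=C to get plain byte order:

```shell
# Scratch directory with awkward names (hypothetical, for illustration only).
demo_dir=$(mktemp -d)
touch "$demo_dir/b.txt" "$demo_dir/a b.txt" "$demo_dir/c
d.txt"

# NUL-delimited all the way through: every name reaches the command intact,
# including the one with a space and the one with an embedded newline.
# LC_ALL=C forces byte-order collation so the result is reproducible.
sorted_names=$(cd "$demo_dir" && find . -type f -print0 |
    LC_ALL=C sort -z |
    xargs -r0 printf '[%s]\n')

printf '%s\n' "$sorted_names"
rm -rf "$demo_dir"
```

Each bracketed item is one argument as the command received it, so the name with the newline shows up as a single item spanning two lines.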
Oli
  • 1
    It does not seem to work with find . -name '*.dat' -type f -printf '%f\n' | sort -z | xargs -r0 > output.txt. Is my line wrong due to the printf? – bomben Nov 24 '20 at 17:47
  • @Ben you're not using -print0 and are introducing newlines instead of NULLs. – ychaouche May 24 '21 at 13:29
  • or together with formatting the output: find . -printf "%y %p \n\0" | sort -z – BMWW Jul 09 '21 at 15:38
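As the replies point out, -printf '%f\n' emits newline-terminated names, which sort -z then treats as one single record. A hedged sketch of a variant that prints the bare file name followed by a NUL instead, assuming GNU find (whose -printf understands the \0 escape). The directory and file names are invented for the demo:

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/b.dat" "$demo_dir/sub/a.dat" "$demo_dir/c.txt"

# %f prints the name without leading directories; \0 terminates each record.
# xargs gets an explicit printf here, because its default command (echo)
# would join all names on one line.
dat_names=$(find "$demo_dir" -name '*.dat' -type f -printf '%f\0' |
    LC_ALL=C sort -z |
    xargs -r0 printf '%s\n')

printf '%s\n' "$dat_names"
rm -rf "$demo_dir"
```

Redirecting the final printf to a file such as output.txt works the same way as in the original command.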
68

Some versions of sort have a -z option, which allows for null-terminated records.

find folder1 folder2 -name "*.txt" -print0 | sort -z | xargs -r0 myCommand

Alternatively, you could do the sorting in a high-level scripting language:

find folder1 folder2 -name "*.txt" -print0 | python3 -c 'import sys; names = [n for n in sys.stdin.buffer.read().split(b"\0") if n]; sys.stdout.buffer.write(b"".join(n + b"\0" for n in sorted(names)))' | xargs -r0 myCommand

The -r option (--no-run-if-empty in GNU xargs) ensures that myCommand is not run at all when find produces no matches.
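A small sketch of what -r changes, assuming GNU xargs. The "ran" marker is just a placeholder standing in for myCommand:

```shell
# Empty input, no -r: GNU xargs still runs the command once, with no
# file arguments appended.
without_r=$(printf '' | xargs -0 echo "ran")

# Empty input with -r: the command is not run at all.
with_r=$(printf '' | xargs -r0 echo "ran")

printf 'without -r: "%s"\nwith -r:    "%s"\n' "$without_r" "$with_r"
```

Some other xargs implementations skip the command on empty input by default, so the first line may differ outside GNU systems.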

Arcege
  • Good one (two?)... Interestingly, though, the two methods handle . differently... With sort it winds up at the end of the list... with python it sorts to the top. (maybe python sorts with LC_COLLATE=C) – Peter.O Mar 16 '12 at 14:45
  • There is also the -t \0 option for sort (which is a -z synonym) – Javier Aug 10 '15 at 18:44
  • 1
    The problem with all these |sort solutions is that you cannot use -exec any longer. OK, although it is possible to rewrite your statement given to -exec so that it works with xargs, the question is, what about "mini-scripts"? (sh -c ...) I wouldn't call that trivial to transform a 'sh -c' mini-script with multiple commands so that it can work with xargs (if possible at all, that is) – syntaxerror Nov 20 '15 at 19:57
  • @syntaxerror: What problem do you have using sh -c with xargs? printf %s\\n a b c d e | xargs -n3 sh -c 'printf %s, "$@"; printf \\n' x – None Aug 24 '16 at 18:11
  • -t \0 is not the same as -z. -t is for field separator, not for line delimiter. – graywolf Jan 22 '20 at 15:05
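On the question of replacing -exec sh -c mini-scripts: a sketch of one way to keep a multi-command script while still sorting, by handing the sorted, NUL-delimited names to sh -c through xargs. It assumes GNU sort and xargs; the directory, the file names and the script body are placeholders, not anything from the thread:

```shell
demo_dir=$(mktemp -d)
printf 'one\n' > "$demo_dir/b.txt"
printf 'two\nthree\n' > "$demo_dir/a.txt"

# The trailing "sh" fills $0 of the inline script; the sorted file names
# arrive as "$@", and "for f do" iterates over them.
report=$(find "$demo_dir" -name '*.txt' -print0 |
    LC_ALL=C sort -z |
    xargs -r0 sh -c '
        for f do
            lines=$(wc -l < "$f")
            printf "%s has %d line(s)\n" "${f##*/}" "$lines"
        done' sh)

printf '%s\n' "$report"
rm -rf "$demo_dir"
```

Any number of commands can go inside the loop body, which covers most of what an -exec sh -c script would do.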
10

I think you need the -n flag for sort

According to man sort:

-n, --numeric-sort
    compare according to string numerical value

edit

The -print0 may have something to do with this; I just tested it. Take the -print0 out. You can null-terminate the strings in sort using the -z flag.

whoami
  • Well, that print0 appears to be space-separating the filenames which is what I need to pass to my command, unfortunately – Industrial Mar 16 '12 at 10:46
5

If you have GNU Parallel http://www.gnu.org/software/parallel/ installed you can do this:

find folder1 folder2 -name "*.txt" -print | 
  sort |
  parallel myCommand

You can install GNU Parallel simply by:

wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
cp parallel sem

Watch the intro videos for GNU Parallel to learn more: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Ole Tange
  • 3
    What is the justification for using GNU Parallel? To speed it up? – Peter Mortensen Sep 28 '14 at 00:18
  • That and you do not need to mess with \0 separated records. – Ole Tange Sep 28 '14 at 16:46
  • 1
    I don't understand that last statement. I create a file with a line break in the file name and execute your command: cd /tmp && touch $'a\nz' && ls && find -maxdepth 1 -print | sort | parallel echo. Total false output. I know GNU Parallel now, but that answer misses the original question, doesn't it? – uav Jun 11 '20 at 14:49
  • 1
    I know that it is bad practice to use crazy characters in file names - I am already including the blank space. I just see that parallel has a -0 parameter. Nice. No downvote. find -maxdepth 1 -print0 | sort -z | parallel -0 echo. – uav Jun 11 '20 at 15:01
  • 1
    @uav In my 25 years of sysadmin I have never seen a user making a file with \n. I have seen plenty of files with ' space and ". So unless you have evil users or a filesystem with error, I will reckon you will not meet a file with \n that was not made by a fellow sysadm. – Ole Tange Jun 11 '20 at 20:46
  • The original question was about print0. print0 used as separator \0 instead of line breaks. Why does print0 exist? I think in order to have a safe separator and thus be able to handle all the crazy characters. I know you know that. \n was just an example. You answer with print. Kinda missed the point. The main thing is to advertise. By the way: echo 'will cite' | parallel --citation 1>/dev/null 2>/dev/null. To get rid of that annoying citation message. – uav Jun 12 '20 at 09:38
  • @OleTange You will find \n being used by those trying to achieve command injection to defeat security measures. – Graham Leggett Jan 22 '24 at 11:46
5

Some implementations of find support ordered traversal directly via the -s option:

$ find -s . -name '*.json'

From the FreeBSD find man page:

-s       Cause find to traverse the file hierarchies in lexicographical
         order, i.e., alphabetical order within each directory.  Note:
         `find -s' and `find | sort' may give different results.
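
The note about differing results comes down to what gets compared: find -s orders names within each directory, whereas sort compares whole path strings, where a character such as "-" sorts before "/". A sketch showing the effect, plus one way to approximate per-directory order with GNU tools. The \001 substitution is an illustrative workaround, not something taken from the man page, and it assumes no file name contains that byte:

```shell
demo_dir=$(mktemp -d)
mkdir "$demo_dir/a"
touch "$demo_dir/a/b" "$demo_dir/a-b"

# Whole-path comparison: "-" (0x2d) sorts before "/" (0x2f) in the C locale,
# so ./a-b lands between ./a and ./a/b.
path_order=$(cd "$demo_dir" && find . -print0 | LC_ALL=C sort -z | tr '\0' '\n')

# Approximation of per-directory order: temporarily turn "/" into byte 0x01,
# which sorts before any other byte a file name can contain, then turn it back.
dir_order=$(cd "$demo_dir" && find . -print0 | tr '/' '\001' |
    LC_ALL=C sort -z | tr '\001\0' '/\n')

printf 'find | sort:\n%s\n\nper-directory:\n%s\n' "$path_order" "$dir_order"
rm -rf "$demo_dir"
```

In the second listing the contents of directory a stay together directly after a itself, which is the behaviour the man page describes for -s.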
raychi
2

Some solutions here don't work correctly because the sort command sorts on the full path string rather than on the file name alone.

This is a fairly complicated but working example that sorts the results of the "find" command naturally, by file name:

find every_minute -type f -name "*.sh" -printf '%f\t%p\n' | sort -V -k1 | cut -d$'\t' -f2 | tr '\n' '\0' | xargs -r0 -I {} echo 'Found: "{}"'

Result:

Found: "every_minute/api/1_build_synonyms.sh"
Found: "every_minute/search_module/2_rotate_index.sh"
Found: "every_minute/api/3_check_synonyms.sh"
Found: "every_minute/api/4_run_schedule.sh"
Found: "every_minute/search_module/10_test.sh"
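
The pipeline above switches to NUL delimiters only at the end, so a path containing a newline would still be split. A hedged sketch of a variant that stays NUL-delimited throughout, assuming GNU find, GNU sort and a GNU cut recent enough to support -z. It recreates part of the directory layout from the example in a scratch directory, and it still assumes that no file name contains a TAB:

```shell
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/every_minute/api" "$demo_dir/every_minute/search_module"
touch "$demo_dir/every_minute/api/1_build_synonyms.sh" \
      "$demo_dir/every_minute/search_module/2_rotate_index.sh" \
      "$demo_dir/every_minute/api/3_check_synonyms.sh" \
      "$demo_dir/every_minute/search_module/10_test.sh"

tab=$(printf '\t')
# Record layout: "<file name><TAB><path><NUL>". sort -z keeps the records
# NUL-delimited, -k1,1V version-sorts on the file name only, and cut -z -f2-
# strips the sort key again (TAB is cut's default delimiter).
natural_order=$(cd "$demo_dir" &&
    find every_minute -type f -name '*.sh' -printf '%f\t%p\0' |
    LC_ALL=C sort -z -t "$tab" -k1,1V |
    cut -z -f2- |
    xargs -r0 printf 'Found: "%s"\n')

printf '%s\n' "$natural_order"
rm -rf "$demo_dir"
```

Restricting the key with -t and -k1,1V keeps the directory part of the path from influencing the order except as a tie-breaker.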

For comparison, the result of find every_minute -type f -name "*.sh" | sort -z | xargs -r0 echo, which does not produce the desired order:

every_minute/api/1_build_synonyms.sh
every_minute/api/3_check_synonyms.sh
every_minute/api/4_run_schedule.sh
every_minute/search_module/10_test.sh
every_minute/search_module/2_rotate_index.sh

Based on this answer.