I have a large folder containing many sub-directories each holding many .txt
files. I want to concatenate all of these files into one .txt
file. I am able to do it for each of the sub-directories with cat *.txt > merged.txt, but I am trying to do it for all of the files in the large folder. How do I do this?

3 Answers
Try:
find /path/to/source -type f -name '*.txt' -exec cat {} + >mergedfile
This finds all '*.txt' files under /path/to/source, recursing into sub-directories, and concatenates them all into one mergedfile.
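A minimal sketch of the command above on a throwaway tree, assuming a POSIX sh with mktemp available. The directory and file names (a, b/c, 1.txt and so on) are hypothetical, and the merged file is written outside the searched tree so that find can never pick it up as input.

```shell
# Build a small throwaway tree (all names are made up for this demo).
src=$(mktemp -d)
out=$(mktemp)
mkdir -p "$src/a" "$src/b/c"
printf 'one\n'   > "$src/a/1.txt"
printf 'two\n'   > "$src/b/2.txt"
printf 'three\n' > "$src/b/c/3.txt"
printf 'skip\n'  > "$src/b/c/notes.log"   # not a .txt file, so it is ignored

# Concatenate every regular *.txt file below $src into one file.
# The redirection applies to find as a whole, so a single > is enough.
find "$src" -type f -name '*.txt' -exec cat {} + > "$out"

# Show how many lines ended up in the merged file.
wc -l < "$out"
```

The order in which find visits the files is not specified, so this form suits cases where the order of the merged content does not matter.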
To concatenate each sub-directory's files within its own directory, do:
find . -mindepth 1 -type d -execdir sh -c 'cat "$1"/*.txt >> "$1"/mergedfile' _ {} \;
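The per-directory merge can be tried on a scratch tree as sketched below. This is a variant, not the exact command above: it uses plain -exec instead of -execdir (so the working directory stays at the top of the tree), quotes "$1" so that directory names containing spaces survive, and writes with > so that re-running it does not append twice. The directory names are hypothetical, and -mindepth is assumed to be available (GNU and BSD find both provide it).

```shell
# Scratch tree with two sub-directories, one with a space in its name.
top=$(mktemp -d)
mkdir -p "$top/a" "$top/b c"
printf 'a1\n'  > "$top/a/1.txt"
printf 'a2\n'  > "$top/a/2.txt"
printf 'bc1\n' > "$top/b c/1.txt"

cd "$top" || exit 1

# For every sub-directory, merge its own *.txt files into <dir>/mergedfile.
# "_" fills $0 of the inline script; the directory name arrives as $1.
find . -mindepth 1 -type d -exec sh -c 'cat "$1"/*.txt > "$1"/mergedfile' _ {} \;

cat "$top/a/mergedfile"
```

A directory without any .txt file would make cat complain about the unmatched pattern, so a real script may want to test for matches first.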

If you are using Bash and the number of text files is limited (i.e. it does not exceed the maximum argument limit, which is very large but not infinite), you can easily achieve this with the globstar feature:
$ shopt -s globstar
$ cat **/*.txt > merged.txt
A more general, although less elegant, approach is to use find as the driver and make it call cat on each file, appending the output:
$ find . -name '*.txt' -exec sh -c 'cat "$1" >> merged.out' _ {} \;
Calling sh is needed here because you want to append the result of each cat. Make sure the output file has a different extension or lies outside of the tree you're merging, or find may try to concatenate the output with itself.
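If the merged file has to live inside the tree and carry the same extension, one way around the self-inclusion problem is to exclude it by name. The sketch below assumes the output is called merged.txt (a hypothetical name) and uses the portable `! -name` test; note that it skips every file with that name, at any depth.

```shell
# Scratch tree; the output file will sit at its top level.
tree=$(mktemp -d)
mkdir -p "$tree/sub"
printf 'alpha\n' > "$tree/sub/a.txt"
printf 'beta\n'  > "$tree/b.txt"

cd "$tree" || exit 1

# merged.txt matches *.txt and is created by the shell before find starts,
# so it is filtered out explicitly to keep it from being read as input.
find . -type f -name '*.txt' ! -name 'merged.txt' -exec cat {} + > merged.txt

wc -l < merged.txt
```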

If you have to do the concatenation in a particular order, then the below will concatenate the files in lexicographical order (sorted by pathname) in bash:
shopt -s globstar
for name in **/*.txt; do
    [ -f "$name" ] && cat <"$name"
done >merged.out
This is similar to the find command
find . -type f -name '*.txt' -exec cat {} ';' >merged.out
except that the ordering may be different, symbolic links to regular files would be included (add a && [ ! -L "$name" ] if you don't want them) and hidden files (and files in hidden directories) would be excluded (use shopt -s dotglob to add them back).
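The two adjustments mentioned above (skipping symbolic links and including hidden files) can be combined as in this sketch. It assumes bash 4 or later is installed as `bash`; the tree layout and file names are hypothetical. The loop is run through `bash -c` only so that the example also works when pasted into a non-bash shell.

```shell
# Scratch tree with a visible file, a hidden directory and a symlink.
demo=$(mktemp -d)
mkdir -p "$demo/tree/.hidden"
printf 'visible\n' > "$demo/tree/v.txt"
printf 'hidden\n'  > "$demo/tree/.hidden/h.txt"
ln -s v.txt "$demo/tree/link.txt"

cd "$demo/tree" || exit 1

# globstar enables **, dotglob lets patterns match names starting with a dot.
# [ ! -L ] drops link.txt, so v.txt is not merged twice.
bash -c '
shopt -s globstar dotglob
for name in **/*.txt; do
    [ -f "$name" ] && [ ! -L "$name" ] && cat <"$name"
done
' > "$demo/merged.out"

cat "$demo/merged.out"
```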

@αғsнιη Absolutely nothing now when you've changed your answer. I will modify that part. Thanks for letting me know. – Kusalananda Jun 04 '18 at 06:26
Does bash guarantee that **/*.txt sorts the pathnames in lexicographical order? – Derek Mahar Mar 29 '23 at 22:27
@DerekMahar Yes, the list resulting from expanding a globbing pattern is guaranteed to be lexicographically sorted. From the POSIX standard: "If the pattern matches any existing filenames or pathnames, the pattern shall be replaced with those filenames and pathnames, sorted according to the collating sequence in effect in the current locale." – Kusalananda Mar 29 '23 at 22:36
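Since the order follows the collating sequence of the current locale, the same pattern can expand in a different order on different systems. A small sketch, assuming bash 4 or later and two hypothetical file names, that pins the order by forcing the C locale for the expansion:

```shell
# Two empty files whose relative order depends on the collation rules.
demo=$(mktemp -d)
touch "$demo/B.txt" "$demo/a.txt"

cd "$demo" || exit 1

# In the C locale names are compared byte by byte, so the uppercase
# B.txt sorts before the lowercase a.txt.
order=$(LC_ALL=C bash -c 'shopt -s globstar; printf "%s\n" **/*.txt')
printf '%s\n' "$order"
```

Setting LC_ALL=C (or LC_COLLATE=C) for the merge is one way to get the same file order everywhere.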
>> can be > in the first find call. – Kusalananda Jun 04 '18 at 05:36
> redirects the output of find, not cat. The cat command ends at the +, and you can't do redirections in -exec without using a child shell (sh -c). In your second example, you won't need it either as you do one directory at a time. – Kusalananda Jun 04 '18 at 05:52
-execdir is already executing with the directory as the working directory, you should get rid of $1/ in the command. – Kusalananda Jun 04 '18 at 05:56
> instead of >> in first command is right but $1/ is needed in second command and that works I tested before. note that execdir is changing for the find not for the child-shell I used there – αғsнιη Jun 04 '18 at 06:10
[…] (_) that appears just before the opening and closing braces ({})? – Derek Mahar Mar 28 '23 at 01:04
[…] _ there. – αғsнιη Mar 28 '23 at 02:17
[…] sh be {} which is the placeholder that find replaces with each directory name? – Derek Mahar Mar 29 '23 at 12:48
_ there is 0th argument to the sh -c '....' and {} is the 1st. when you remove the _, the {} is being 0th argument while the sh -c '...' perform and do stuff on the $1 argument but now there is no 1st ($1) argument. why we don't use the {} as the first argument because in general always the 1st argument is the script name and all errors/warning/... will use that name prefixed to alert where things go wrong. – αғsнιη Mar 29 '23 at 14:30
$0 is always the name of the script, but in this case, the script is anonymous, so you must provide the name of the script as the first (0th) argument. Is this correct? – Derek Mahar Mar 29 '23 at 22:14
sh -c 'echo $0 $1 $2' a b c prints a b c and sh -c 'echo $1 $2' a b c prints b c. In both cases, sh assigns a to $0, b to $1, and c to $2, but only the first example prints all of the arguments. – Derek Mahar Mar 29 '23 at 22:54
find ./ -type f -name '*.txt' -not -name 'mergedfile.txt' -exec cat {} + >mergedfile.txt. – BadAtLaTeX Feb 29 '24 at 18:24
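The role of the _ placeholder discussed in the comments can be checked directly. This sketch only restates the sh -c examples given there and adds one hypothetical argument ('some dir') to show where a find-supplied name would land:

```shell
# The first word after the inline script becomes $0, the rest $1, $2, ...
with_zero=$(sh -c 'echo $0 $1 $2' a b c)
without_zero=$(sh -c 'echo $1 $2' a b c)

# With "_" occupying $0, the real argument is available as $1.
placeholder=$(sh -c 'echo "$1"' _ 'some dir')

printf '%s\n' "$with_zero" "$without_zero" "$placeholder"
```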