Merging text files and adding separator

Question

I want to merge text files and add a separator like "==============" plus a blank line between them.

I tried to do this, but it failed and caused high CPU usage. I mean the fan spins very fast and gets noisy when I run the script.

This needs to work for around 100,000 text files.

This is the code that I use:

#!/bin/bash
for F in *.txt ; do
    type "$F"
    echo .
    echo ========
    echo . 
done >> Combined.txt;

Please advise.

See Concatenate multiple files with two blank lines as delimiter? and the linked questions/answers. — don_crissti, Oct 07 '18 at 20:44

score 1 · Answer 1 · answered Oct 07 '18 at 21:07

I would simplify your commands as follows:

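  #!/bin/bash
  # Note: on a re-run, Combined.txt itself matches *.txt; remove or rename it first.
  for file in *.txt; do
    # Quote "$file" so names containing spaces or glob characters survive.
    cat -- "$file" >> Combined.txt
    printf '\n\n=========\n\n' >> Combined.txt
  done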

Stéphane Chazelas · Answer 2 · 2022-09-07T15:10:51.087

If you're going to do it for thousands of files, you may want to avoid running several commands per file. With GNU awk:

printf '%s\0' ./*.txt | xargs -r0 gawk '
  BEGINFILE {if (NR) print "\n==========\n"};1' > combined.out

Don't give the output file a .txt extension if you're going to put it in the same directory, or it will be selected as an input file and cause an infinite loop (probably your problem in the first place).
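One thing to watch with the xargs approach: with very many files, xargs may split the list across several gawk invocations, and NR starts at 0 again in each one, so a separator could go missing at each batch boundary. A minimal sketch of one way around that, assuming file names contain no newline characters and that printf is a shell builtin: pass the names on standard input to a single awk process and read each file with getline. The function name merge_via_stdin_list is made up for illustration.

```shell
#!/bin/sh
# merge_via_stdin_list is a hypothetical helper name, not from the answer above.
# $1: directory holding the .txt files (names must not contain newlines).
merge_via_stdin_list() {
  # One file name per line on stdin, so no argument-list limit applies to awk.
  printf '%s\n' "$1"/*.txt |
    awk '
      {
        file = $0
        # NR counts file names here, so this is true for every file but the first.
        if (NR > 1) print "\n==========\n"
        # Copy the file line by line; "getline line < file" does not touch NR.
        while ((getline line < file) > 0) print line
        close(file)
      }'
}
```

Usage would be something like merge_via_stdin_list . > combined.out. Note that awk adds a final newline to any file whose last line lacks one.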

Or use a shell where cat is builtin like ksh93:

#! /bin/ksh93 -
firstpass=true
for file in *.txt; do
  "$firstpass" || print '\n===========\n'
  firstpass=false
  command /opt/ast/bin/cat < "$file"
done > combined.out

All the commands in that loop are built in, so running them involves neither forking new processes nor loading external executables, which should keep the performance tolerable.
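In bash, cat is not a builtin, but read and printf are, so a rough equivalent can be put together from builtins alone. This is a sketch, not benchmarked: the helper name merge_builtin_only is hypothetical, it assumes text files without NUL bytes, and copying line by line in the shell may well be slower than cat for large files.

```shell
#!/bin/bash
# merge_builtin_only is a hypothetical helper, not part of the answer above.
# $1: directory holding the .txt files.
merge_builtin_only() {
  local first=true file line
  for file in "$1"/*.txt; do
    [ -f "$file" ] || continue            # skip the literal pattern if nothing matched
    "$first" || printf '\n===========\n\n'  # separator before every file but the first
    first=false
    # read and printf are builtins, so no process is forked per file.
    # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
      printf '%s\n' "$line"
    done < "$file"
  done
}
```

It could be called as merge_builtin_only . > combined.out, again with an output name that does not match *.txt.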

score 0 · Answer 3 · answered Oct 07 '18 at 23:02

Using `FNR` and `NR` in `awk`

#!/bin/bash

outfile="$( mktemp combined.txt.XXXXXX )"

echo "Output file: ${outfile}"

awk 'FNR==1 && NR>1 { printf("\n%s\n\n","========") } 1' *.txt > "${outfile}"

echo "Finished."

A line-by-line description:

outfile="$( mktemp combined.txt.XXXXXX )"

Use mktemp to create an empty new file with a unique name (e.g., combined.txt.HDpgMn). You can use more X characters for a longer random suffix. Because the random suffix follows the .txt, the output file does not match *.txt and is not read as input. Enclose the command in "$(...)" to store the new file's name in the variable outfile.

echo "Output file: ${outfile}"

Print the name of the output file. (When the script has finished, you may wish to rename the output file to remove the string of random characters following the .txt.)

awk 'FNR==1 && NR>1 { printf("\n%s\n\n","========") } 1' *.txt > "${outfile}"

Print...

a blank line,
a short line of "=" characters,
and another blank line

...at the start of each input file, except for the first input file. FNR is the line number within the current input file and resets to 1 at the start of each file. NR is the total number of lines read so far and does not reset.

In the awk statement, the 1 just before the closing single quotation mark evaluates to TRUE for every line, and performs the default action of printing that line. (In other words, awk '1' works like cat.)
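To see the two counters side by side, here is a small sketch using two throwaway files (a.txt and b.txt in a temporary directory, both invented for the demo):

```shell
#!/bin/sh
# Hypothetical demo files in a throwaway directory.
demo_dir=$(mktemp -d)
printf 'a1\na2\n' > "$demo_dir/a.txt"
printf 'b1\n'     > "$demo_dir/b.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts=$(cd "$demo_dir" && awk '{ print FILENAME, "NR=" NR, "FNR=" FNR }' a.txt b.txt)
printf '%s\n' "$counts"
# a.txt NR=1 FNR=1
# a.txt NR=2 FNR=2
# b.txt NR=3 FNR=1

# FNR==1 && NR>1 is therefore true only on the first line of b.txt.
merged=$(cd "$demo_dir" && awk 'FNR==1 && NR>1 { printf("\n%s\n\n","========") } 1' a.txt b.txt)
printf '%s\n' "$merged"

rm -rf "$demo_dir"
```

The second printf shows a1 and a2, then a blank line, the ======== line, another blank line, and finally b1.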

echo "Finished."

Inform the user when the script is done. (Not strictly necessary, since you'll see the command prompt anyway, but it doesn't hurt.)

score 0 · Answer 4 · answered Oct 10 '18 at 22:13

Why not simply

printf "\n\n=====\n\n" > XTMP
cat $(printf "%s XTMP " *.txt) > combined.tmp

Put the separator into a temp file, and make use of printf's feature to repeat the formatting string for every argument it finds, so the cat command will look like

cat 1.txt XTMP 2.txt XTMP ... n.txt XTMP

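You may run into system limits (e.g. ARG_MAX, the maximum size of a command's argument list), though, and the unquoted command substitution breaks on file names containing whitespace...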
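If the argument-list limit or file names with spaces are a concern, a variation on the same idea is to hand the names to cat through xargs as a NUL-delimited list. This is only a sketch: it assumes an xargs that supports -0 (GNU, BSD, busybox), and the scratch directory and file names are made up. Because cat just concatenates, it does not matter if xargs splits the list into several cat calls.

```shell
#!/bin/sh
work_dir=$(mktemp -d)              # hypothetical scratch directory
(
  cd "$work_dir" || exit 1
  printf 'one\n' > '1.txt'
  printf 'two\n' > 'file 2.txt'    # a name with a space, to show it survives

  # The separator lives in its own file, as in the answer above.
  printf '\n\n=====\n\n' > XTMP

  # printf repeats the format per argument: each name is followed by "XTMP",
  # and every item is NUL-terminated so xargs -0 splits only on NUL.
  printf '%s\0XTMP\0' ./*.txt | xargs -0 cat > combined.tmp
)
# The result is in "$work_dir/combined.tmp"; remove "$work_dir" when done.
```

As in the original, the separator also ends up after the last file.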
