
Assume I have many *.txt files in the directory texts with the contents below.

Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Aliquam tincidunt mauris eu risus.
Vestibulum auctor dapibus neque.

And I want to replace that text with the following contents, recursively.

Vestibulum commodo felis quis tortor.
Ut aliquam sollicitudin leo.
Cras iaculis ultricies nulla.
Donec quis dui at dolor tempor interdum.

As this is quite a large replacement, typing each line by hand can be time-consuming.

Hence, I think it would be better if there were an option like this:

Copy and paste the original text into a file original.txt and the required replacement into another file update.txt.

Then execute a command that finds all the *.txt files in the directory texts that contain the content of original.txt and replaces it with the contents of update.txt.

Similar to simple replacements like:

find texts -name "*.txt" -exec sed -i 's/original/update/g' {} \;

I think this way there would be fewer mistakes than with manual typing, and it would take less time.

But I don't know what command I should use to achieve this. Is this possible?

However, first of all I must be able to verify that the original text is present and count its occurrences.

Similar to simple checks like:

cd texts
grep -r --color=always "original" | wc -l

Thanks.

2 Answers


I'd use perl instead of sed (or awk):

find texts/ -name '*.txt' \
  -exec perl -0777 -p -i.bak -e '
    BEGIN {
      $search = q{Lorem ipsum dolor sit amet, consectetuer adipiscing elit.
Aliquam tincidunt mauris eu risus.
Vestibulum auctor dapibus neque.};
      $replace = q{Vestibulum commodo felis quis tortor.
Ut aliquam sollicitudin leo.
Cras iaculis ultricies nulla.
Donec quis dui at dolor tempor interdum.};
    };

    s/\Q$search\E/$replace/mg' {} +

  • -0777 tells perl to "slurp" in the entire file at once and process it as one long string, which is what lets the pattern match across line breaks.

  • -p makes perl behave similarly to sed (and the counterpart -n option makes it work like sed -n).

  • -i.bak does an "in-place" edit of the file, saving the original with a .bak extension. Again, similar to sed -i.

    If you don't want the backup copies, use just -i instead of -i.bak.

  • \Q in a perl regex tells perl to treat the following pattern (until it sees a \E) as a literal string, even if it contains regex special characters.

    From man perlre:

    \Q quote (disable) pattern metacharacters until \E

    \E end either case modification or quoted section

  • q{} uses the perl q quoting operator that works exactly the same as single-quotes. It's particularly useful in a one-liner where the perl script is already in single-quotes (which can't be backslash-escaped because escape codes are ignored inside single quotes). See man perlop and search for "Quote and Quote-like Operators". See also perldoc -f q (and compare with perldoc -f qq, the double-quote operator).

BTW, I recommend testing just the perl portion of this on a single file first (i.e. without find, and especially without -i.bak) and examining the output to make sure it's going to do what you want.

cas
  • This works perfectly and recursively, as I expected. However, it produces files with .bak extensions as backups. How do I remove them recursively? Thanks – zakadmin Feb 16 '23 at 15:04
  • If you don't want the .bak copies, use just -i. As with sed, making backup copies with -i is optional. If you've already created them and want to delete them: find texts/ -name '*.bak' -delete. – cas Feb 16 '23 at 17:02
  • @Peregrino69 Thanks. I updated accordingly. – zakadmin Feb 16 '23 at 17:05
  • @zakadmin I just made a small change to the script, replacing the double-quotes around the strings in the BEGIN{} block with q{}, which is one of perl's quoting operators. It works just like single-quoted strings (so, e.g., no variable interpolation or escape-code interpretation) because it is an alternative for single-quotes in perl. q{} is particularly useful when the entire script is inside a single-quoted string. See man perlop and search for "Quote and Quote-like Operators". I made this change in case your actual search or replace string contains characters like $ or . – cas Feb 16 '23 at 17:19
  • Thanks @cas, but you missed the next requirement, the grep check. If you can give me an option for that as well, this will be a complete answer to my question. – zakadmin Feb 16 '23 at 17:28
  • I have no idea what you're talking about wrt grep. If you have a second question, you should post a second question - that's how this site is supposed to work: one question per post. – cas Feb 16 '23 at 17:35
  • anyway, it's 4.30 am here and i should try to get some sleep. i'd be asleep already if it wasn't so damn hot (still over 22C even now, and will be back up to 37 or so by mid-day) – cas Feb 16 '23 at 17:41

Here's one way, done in Debian 11:

  • bash v.5.1.4(1)
  • tr (GNU Coreutils) v.8.32
  • sed (GNU sed) v.4.7

Written this way, this script

  1. assumes every file in question has a .txt extension
  2. must be run in the same directory where the .txt files are
  3. displays the name and contents of every file while processing
  4. creates new files instead of replacing in place to ensure original files are still available in case of failure. Also this way the newly created files can be deleted with a single command without affecting the originals, if for example the replacement text needs to be changed.

Directory contents:

pg1@TREX:~/foo$ ls
repla.sh  text1.txt  text2.txt

Text file contents:

pg1@TREX:~/foo$ for i in {1..2}; do cat text$i.txt; echo; done
This is the beginning
of the first text.

These lines
will be replaced.

This is the ending of the first text.

This is the beginning of the second text.

These lines
will be replaced.

This is the ending of the second text.

The replacement text:

New lines
that will replace
old lines.

repla.sh:

#!/bin/sh

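for i in *.txt; do
    # show the name of the file being processed
    echo "$i"
    # join the lines with \r, replace, split again, then show and save the result
    cat "$i" | tr '\n' '\r' | sed -e 's/These lines\rwill be replaced./New lines\rthat will replace\rold lines./g' | tr '\r' '\n' | tee "$i.new"
done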

Result:

pg1@TREX:~/foo$ ./repla.sh 
text1.txt
This is the beginning
of the first text.

New lines
that will replace
old lines.

This is the ending of the first text.
text2.txt
This is the beginning of the second text.

New lines
that will replace
old lines.

This is the ending of the second text.
pg1@TREX:~/foo$ ls
repla.sh  text1.txt  text1.txt.new  text2.txt  text2.txt.new

The script needs to be adjusted as needed, for example if there are also text files with different extensions (or without one) to modify, or if the files are in a different directory or directories. Commenting out echo $i prevents the file names from being displayed while processing; replacing | tee with a > redirection prevents the file contents from showing. In-place replacement still requires creating new files first; the originals can then be replaced simply by adding mv to the script.
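
As a sketch only, the quiet, replace-the-original variant described above might look like this. It assumes GNU sed (for \r in the pattern and the replacement) and that the files contain no carriage returns of their own; the scratch directory and sample file are made up for the demo.

```shell
set -eu

# Illustrative sample file in a scratch directory.
demo=$(mktemp -d)
cd "$demo"
printf 'This is the beginning\nof the first text.\n\nThese lines\nwill be replaced.\n\nThis is the ending of the first text.\n' > text1.txt

for i in *.txt; do
    # Same tr/sed pipeline, but with > instead of tee so nothing is printed,
    # and mv to put the new file in place of the original once it is written.
    tr '\n' '\r' < "$i" |
        sed -e 's/These lines\rwill be replaced\./New lines\rthat will replace\rold lines./g' |
        tr '\r' '\n' > "$i.new" &&
        mv -- "$i.new" "$i"
done

cat text1.txt
```

Note that mv discards the originals, so this is best tried on a copy of the directory first.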

Please note Ed Morton's comments below.

There are other solutions, quite probably more elegant and more suitable for longer paragraphs. For example, in this Stack Overflow question the old and replacement paragraphs are in variables.

Peregrino69
  • Can I dry run this to check that it works fine before execution? – zakadmin Feb 16 '23 at 07:01
  • However, this seems to create a new file. I want to modify existing files, and not single files but in bulk. – zakadmin Feb 16 '23 at 07:03
  • Please check the example output. This script handles in one go every .txt file in the directory containing the text to be replaced. This is only an example of how this can be handled. Modifying it to suit your specific needs is really up to you. Please keep in mind, SE sites aren't a free scripting service. – Peregrino69 Feb 16 '23 at 08:31
  • You should copy/paste that into http://shellcheck.net and fix the issues it tells you about. Also, as soon as you do tr '\n' '\r' you've changed the input into something that's no longer a valid text file (no terminating newline) so YMMV with what any subsequent text processing tool does with it, it'd fail if the input contained \rs, and sed would fail given any of several different chars in the input or replacement text, see https://stackoverflow.com/q/29613304/1745001. – Ed Morton Feb 16 '23 at 12:55
  • Thx @EdMorton, I'd forgotten abt shellcheck; and that thread I didn't see :-) I don't intend to develop this further until I need it myself, but will keep your advice in mind. I just wanted to create something that accomplishes the exact task asked - there's no mention of further processing. At the end \r gets changed back to \n, and the end result appears to achieve the purpose; so could you maybe expand on that a bit? – Peregrino69 Feb 16 '23 at 13:57
  • I mean don't assume that any given sed version will be able to handle input without a terminating newline as that's undefined behavior per the POSIX standard (sed takes a text file as input and a text file MUST end in a newline) so any sed provider can do whatever they like with it. It'll probably be OK but it's a small gamble. – Ed Morton Feb 16 '23 at 14:45
  • Thanks again, @EdMorton :-) POSIX-compliance is somewhat of a stumbling block for me. Still :-) I added the versions in the answer for completeness. – Peregrino69 Feb 16 '23 at 16:17