I have fought with this for a long time and I am now completely out of ideas. Maybe someone here will be able to help me. Here is what I want to achieve:

file_1.txt:

# Some comment
some_variable="test"
some other things

Marker

More things!@#$%^

file_2.txt:

# Marker
# Some other comment
other_variable_1="test"

Some totally other comment

other_variable_2="test"

I want to insert file_2.txt into file_1.txt in place of # Marker and later I want to reverse this process.

Final file file_1.txt:

# Some comment
some_variable="test"
some other things

Marker

Some other comment

other_variable_1="test"

Some totally other comment

other_variable_2="test"

More things!@#$%^

The problem is that both files are multiline and contain various special characters. I would also like to have both of those files in variables.

I tried various things, sed, perl and awk. Nothing worked for me. This is my closest attempt I think:

perl -pi -e 'chomp if eof' file_2.txt
marker_var="# Marker"
file_2_var=$(tr '\n' '\f' <file_2.txt)

sed -e "s|$marker_var|$file_2_var| tr '\f' '\n'" file_1.txt

I say closest because it is still not working. I tried to combine various answers from Stack Exchange, but it throws an error about an s command that is not properly terminated. I suspected this was caused by the final \n newline in the file, so I tried to delete it with the perl command, but that didn't help.

Can someone please help me?

Peksio
  • 2
    Note that text files require a trailing newline by definition. – terdon Oct 11 '22 at 11:56
  • 1
    Not clear how you plan to reverse the insertion once completed. Is there a marker at the end of File2.txt that you plan on tracking? – jubilatious1 Oct 12 '22 at 04:15
  • Regarding "I would also like to have both of those files in variables": that's usually a bad idea that means you are pursuing the wrong solution to some problem. – Ed Morton Oct 12 '22 at 13:51
  • When you say "later I want to reverse this process", that would mean removing the contents of file_2.txt from the output of running the merge command, and it's not at all clear how you could do that robustly. Do you actually just mean "later I want to reverse the order of the input files and do the same thing"? – Ed Morton Oct 12 '22 at 14:01

6 Answers

1

If you already have perl, then do perl all the way:

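# Open both inputs and the output; stop with an error if any open fails.
open(F1, '<', 'file_1.txt') or die "file_1.txt: $!";
open(F2, '<', 'file_2.txt') or die "file_2.txt: $!";
open(OUT, '>', 'new_file_1.txt') or die "new_file_1.txt: $!";

# Copy file_1.txt up to the marker, then all of file_2.txt, then the rest of file_1.txt.
while (<F1>) { last if /# Marker/; print OUT; }
print OUT while (<F2>);
print OUT while (<F1>);

close(OUT); close(F1); close(F2);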

White Owl
  • 1
    Hi, thank you for your answer. I am not sure whether I am doing something wrong or I explained my problem poorly. I saved it as a perl script and I am running it from my shell script, but it prints a lot of GLOBs and after that my program fails. Why is that? – Peksio Oct 11 '22 at 13:23
  • Oh, sorry, it should be filehandles, not references to them: print $h $_ vs print h. Code corrected. – White Owl Oct 11 '22 at 14:58
0

Merging files with awk is actually trivial:

awk 'NR==FNR {if ($1 == "#" && $2 == "Marker") while((getline a<ARGV[2]) > 0) print a; else print}' file_1.txt file_2.txt > out
  • NR==FNR is an awk idiom that ensures we operate only on the first file. See this answer for more information on NR and FNR.
  • {if ($1 == "#" && $2 == "Marker") ... else print}: parse the first file and print every line that is not the # Marker line.
  • while((getline a<ARGV[2]) > 0) print a; reads the contents of ARGV[2] (here file_2.txt) line by line and prints them.
  • > out is a shell redirection that writes the output of this command to the file out (creating the file if it does not exist).
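
To see why NR==FNR only holds for the first file, here is a small sketch that prints both counters for every input line. The files a.txt and b.txt are made up for the demonstration and live in a temporary directory; they are not part of the question.

```shell
# Illustrative only: a.txt and b.txt are throwaway files in a temporary directory.
tmp=$(mktemp -d)
printf 'a1\na2\n' > "$tmp/a.txt"
printf 'b1\nb2\n' > "$tmp/b.txt"

# NR counts lines across all input files; FNR restarts at 1 for each file,
# so the two are equal only while the first file is being read.
nr_fnr_demo() {
    awk '{ print NR, FNR, (NR == FNR ? "first file" : "later file") }' "$@"
}

nr_fnr_demo "$tmp/a.txt" "$tmp/b.txt"
# 1 1 first file
# 2 2 first file
# 3 1 later file
# 4 2 later file
```

Note that the idiom relies on the first file being non-empty; with an empty first file, NR==FNR would be true for the second file instead.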

The result :

$ awk 'NR==FNR {if ($1 == "#" && $2 == "Marker") while((getline a<ARGV[2]) > 0) print a; else print}' file_1.txt file_2.txt
# Some comment
some_variable="test"
some other things

Marker

Some other comment

other_variable_1="test"

Some totally other comment

other_variable_2="test"

More things!@#$%^

0

Using sed

$ sed -Ee '/Marker/{e cat file_2.txt' -e ';d}' file_1.txt
# Some comment
some_variable="test"
some other things

Marker

Some other comment

other_variable_1="test"

Some totally other comment

other_variable_2="test"

More things!@#$%^

sseLtaH
  • The e command seems to insert the output of a shell command into the output, interesting. A GNU extension? But it doesn't show in the man page … where is the documentation? Anyhow, what is the advantage of e echo over the standard r? – Philippos Oct 12 '22 at 06:17
  • @Philippos No advantage as far as I can tell, I just prefer to use it as it is more flexible than r allowing for more than just read operations. Your answer is essentially exactly the same as mine – sseLtaH Oct 12 '22 at 07:32
  • Well, mine is portable and adds an explanation. My question was serious: Where can I find documentation of the e command? Why is it not included in the man page? Since which version does it work? Are there any security concerns? – Philippos Oct 12 '22 at 10:54
0

Using Raku (formerly known as Perl_6)

my $f1  =  open :r, 'Peksio1.txt';
my $f2  =  open :r, 'Peksio2.txt';
my $out =  open :w, 'Peksio_out.txt';

for ($f1.lines) { last if / ^ "#" \s Marker $ /; $out.put($_); };

$out.put($_) for $f2.lines; $out.put($_) for $f1.lines;

close($f1); close($f2); close($out);

Above is a direct translation of the Perl(5) answer posted by @White_Owl. The two source files are opened for :r reading, while the destination file is opened for :w writing. The keyword for is used to iterate over lines of the first file, breaking out of the loop with last when the full Marker line is encountered.

[Note: The Marker regex can be changed to make it more/less stringent. Also note: the # octothorpe must be double-quoted to be recognized as a literal character (backslashing won't work). This is probably due to the ability to add within-regex comments via #].

The destination file $out has lines (referenced as the $_ topic variable) put into it, until no lines remain.

Sample Output:

# Some comment
some_variable="test"
some other things

Marker

Some other comment

other_variable_1="test"

Some totally other comment

other_variable_2="test"

More things!@#$%^

https://docs.raku.org/language/101-basics
https://raku.org

jubilatious1
0

Why not a simple standard sed solution?

sed -e '/# Marker/{r file_2.txt' -e 'd;}' file_1.txt

The read command inserts a file without caring for special characters. Additionally, you need to delete the duplicate # Marker line.
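
As a quick check of that claim, the sketch below inserts text containing &, |, / and \1, all of which would be interpreted by (or would break) an s command. The files main.txt and insert.txt and the function name insert_at_marker are made up for illustration, and the marker is assumed to occupy a whole line.

```shell
# Illustrative only: main.txt and insert.txt are throwaway files in a temporary directory.
tmp=$(mktemp -d)
printf 'before\n# Marker\nafter\n' > "$tmp/main.txt"
printf 'a & b | c\n/path/\\1 $x\n' > "$tmp/insert.txt"

# r queues the file's contents for output verbatim; d then drops the marker line.
# The file name after r runs to the end of that -e chunk, so it must not contain a newline.
insert_at_marker() {
    # $1: file containing the marker, $2: file to insert
    sed -e "/^# Marker\$/{r $2" -e 'd;}' "$1"
}

insert_at_marker "$tmp/main.txt" "$tmp/insert.txt"
# before
# a & b | c
# /path/\1 $x
# after
```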

Philippos
0

Using any awk:

$ awk 'NR==FNR{new=new sep $0; sep=ORS; next} /# Marker/{$0=new} 1' file_2.txt file_1.txt
# Some comment
some_variable="test"
some other things

Marker

Some other comment

other_variable_1="test"

Some totally other comment

other_variable_2="test"

More things!@#$%^

$ awk 'NR==FNR{new=new sep $0; sep=ORS; next} /# Marker/{$0=new} 1' file_1.txt file_2.txt
# Some comment
some_variable="test"
some other things

Marker

More things!@#$%^

Some other comment

other_variable_1="test"

Some totally other comment

other_variable_2="test"

Regarding the script in your question:

  1. As soon as you do tr '\n' '\f' you no longer have a valid text file as it no longer contains the required terminating newline and so YMMV with what sed or any other text processing tool will do with it.
  2. The process of mapping newlines to/from form-feeds would fail when your input contains form-feeds, as is used between functions in code that wants to force functions to start at the top of a new page when printed (or other similar text that requires page breaks at specific locations when printed).
  3. You can't do sed -e "s|$marker_var|$file_2_var| because file_2_var could itself contain |s or backreferences like & or \1 that sed would then interpret, see https://stackoverflow.com/q/29613304/1745001.
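
Point 3 is easy to reproduce. Since the question asks for the contents to live in shell variables, one possible workaround is sketched after it. This is not part of the awk commands above: it assumes the marker must match a whole line, and it passes both strings to awk through the environment so they are compared and printed literally rather than parsed as a regex or a replacement. The names marker, new and replace_marker_line, and the sample data, are made up for illustration.

```shell
# sed treats & in the replacement as "the matched text":
printf 'X\n' | sed 's|X|a & b|'      # prints: a X b

# Hypothetical sample data held in shell variables.
file_1_var=$(printf 'before\n# Marker\nafter')
file_2_var=$(printf 'a & b | c\nback\\1slash')
marker_var='# Marker'

# ENVIRON values are taken literally (unlike -v assignments, which process escapes).
# Appending "" forces a string comparison even if both sides look numeric.
replace_marker_line() {
    # $1: text to search, $2: marker line, $3: replacement text
    printf '%s\n' "$1" |
        marker="$2" new="$3" awk '($0 "") == (ENVIRON["marker"] "") { print ENVIRON["new"]; next } 1'
}

replace_marker_line "$file_1_var" "$marker_var" "$file_2_var"
# before
# a & b | c
# back\1slash
# after
```

Keep in mind that command substitution strips trailing newlines, so text read into a variable with $(...) no longer ends in one; the function above adds a single newline back when printing.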
Ed Morton