
I have an XML file containing, amongst other lines, <asd>blablabla</asd> and

<dsa>-some stuff
-other stuff
final stuff.</dsa>

I want to replace everything between the "asd" tags with whatever is between the "dsa" tags, which will almost always span multiple lines. I do not want to replace the tags themselves, only the text between them, and the newlines must be kept.

The file will change from time to time. Its name, extension and tags will remain the same; only the content between the tags will change.

I need a command that can achieve this in a plain bash shell, the kind GitHub Actions uses.

I was thinking of sed, but I don't know how to tell it to replace with multiline text.

EDIT:
My mistake (maybe?): my file is actually a .NET csproj file, not a true XML file, so I'm unsure whether commands like xmlstarlet would work with it.

5 Answers


Assuming a well formed XML document:

<root>
<asd>blablabla</asd>
<dsa>-some stuff
-other stuff
final stuff.</dsa>
</root>

You can simply use xmlstarlet to replace the contents of the top-level asd node with the contents of the dsa node like so:

$ xmlstarlet ed -u '/root/asd' -x '/root/dsa/text()' file.xml
<?xml version="1.0"?>
<root>
  <asd>-some stuff
-other stuff
final stuff.</asd>
  <dsa>-some stuff
-other stuff
final stuff.</dsa>
</root>

A more complicated example, in which the contents of each asd node must be replaced with the contents of its sibling dsa node:

<?xml version="1.0"?>
<root>
  <node1>
    <asd>blablabla</asd>
    <dsa>-some stuff
-other stuff
final stuff.</dsa>
  </node1>
  <node2>
    <asd>REPLACE ME</asd>
    <dsa>REPLACE WITH THIS</dsa>
  </node2>
</root>

This can be done with

$ xmlstarlet ed -u '//asd' -x '../dsa/text()' file.xml
<?xml version="1.0"?>
<root>
  <node1>
    <asd>-some stuff
-other stuff
final stuff.</asd>
    <dsa>-some stuff
-other stuff
final stuff.</dsa>
  </node1>
  <node2>
    <asd>REPLACE WITH THIS</asd>
    <dsa>REPLACE WITH THIS</dsa>
  </node2>
</root>
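
On the csproj worry from the question: a .csproj file is an XML document, so the same XPath approach should carry over. One caveat, as far as I know: older, non-SDK-style project files declare a default namespace, in which case `//asd` would need a prefix bound with `-N`. Below is a minimal sketch of applying the edit to the file itself instead of printing it. The helper name `update_asd_in_place` and the file name in the usage comment are made up for illustration, and it assumes xmlstarlet is installed on the runner.

```shell
# Sketch: apply the sibling-copy edit directly to the file on disk.
# update_asd_in_place is a hypothetical helper name, not part of xmlstarlet.
update_asd_in_place() {
  local file=$1
  # Bail out early if the tool is missing, so the file is never touched.
  if ! command -v xmlstarlet >/dev/null 2>&1; then
    echo "xmlstarlet not found, leaving $file untouched" >&2
    return 127
  fi
  # --inplace (short form -L) writes the result back to the file
  # instead of printing it to standard output.
  xmlstarlet ed --inplace -u '//asd' -x '../dsa/text()' "$file"
}

# Usage (hypothetical file name):
#   update_asd_in_place MyProject.csproj
```

Since `--inplace` overwrites the file, it may be worth running the command without it first and checking the printed result.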

Kusalananda

This has been answered before: How can I use sed to replace a multi-line string?

But I would highly suggest the second answer in that post: use Perl. Many people might shy away from Perl, but its original use was for things exactly like this.

$ perl -0777 -i.original -pe 's/a test\nPlease do not/not a test\nBe/igs' alpha.txt

As you can see you are allowed newlines in your pattern. Please see the answer for more details.

Joseph Glover
  • The original use of Perl was to parse XML?? – Kusalananda Apr 24 '20 at 13:47
  • PERL has been "backronymed" as Practical Extraction and Reporting Language, so not necessarily for XML, but for text extraction, regular expressions, and the like. A more complete solution would be to build a lexer and parser for the XML document and search for the node that way, but that is more involved. – Joseph Glover May 02 '20 at 01:31
  • Though after seeing xmlstarlet, that is a much better solution. :) – Joseph Glover May 02 '20 at 01:36

If xmlstarlet is an option - maybe through this xmlstarlet GitHub Action - then given

$ cat file.xml
<?xml version="1.0"?>
<foo>
<asd>blablabla</asd>
<dsa>-some stuff
-other stuff
final stuff.</dsa>
</foo>

you could do something like

$ xmlstarlet edit --update '//asd' --value "$(xmlstarlet select -t -v '//dsa' file.xml)" file.xml
<?xml version="1.0"?>
<foo>
  <asd>-some stuff
-other stuff
final stuff.</asd>
  <dsa>-some stuff
-other stuff
final stuff.</dsa>
</foo>

steeldriver

You can try this:

c="$(sed -n '\#<dsa>#,\#</dsa>#p' file | sed 's#<dsa>\|</dsa>##')"
sed -e "s#<asd>.*</asd>#<asd>${c//$'\n'/\\n}</asd>#" -e '\#<dsa>#,\#</dsa>#c<dsa></dsa>' file

Output:

<asd>-some stuff
-other stuff
final stuff.</asd>
<dsa></dsa>

Fetch the contents between 'dsa' tags:

c="$(sed -n '\#<dsa>#,\#</dsa>#p' file | sed 's#<dsa>\|</dsa>##')"

Insert the content into 'asd' tags:

sed -e "s#<asd>.*</asd>#<asd>${c//$'\n'/\\n}</asd>#"

Change 'dsa' tags and content to just <dsa></dsa>:

'\#<dsa>#,\#</dsa>#c<dsa></dsa>'
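
The fetch and insert steps above can also be sketched without building a sed replacement string, which sidesteps escaping of characters such as `&` or `#` in the copied text. This is a variant, not the code above: it swaps the second sed call for awk, leaves the dsa element untouched, and assumes `<asd>...</asd>` sits on a single line as in the question. The helper name `copy_dsa_into_asd` is made up for illustration.

```shell
# Sketch: copy the <dsa> text into <asd>, printing the result to stdout.
# copy_dsa_into_asd is a hypothetical helper name.
copy_dsa_into_asd() {
  local file=$1 content
  # Fetch: print the lines from <dsa> through </dsa>, then drop the tags
  # and anything outside them on those lines.
  content=$(awk '/<dsa>/,/<\/dsa>/' "$file" |
    sed -e 's#.*<dsa>##' -e 's#</dsa>.*##')
  # Insert: pass the text through the environment so awk does not
  # interpret backslashes or & in it, then splice it in by position.
  CONTENT=$content awk '
    {
      s = index($0, "<asd>")
      e = index($0, "</asd>")
      if (s > 0 && e > s) {
        # Keep everything up to and including <asd>, then the new text,
        # then </asd> and whatever follows it.
        print substr($0, 1, s + 4) ENVIRON["CONTENT"] substr($0, e)
      } else {
        print
      }
    }
  ' "$file"
}

# Usage: copy_dsa_into_asd file > file.new && mv file.new file
```

Writing to a second file and moving it over the original avoids relying on `sed -i`, whose behaviour differs between implementations.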

OK, OK, we should use an XML-aware parser, but with Perl:

perl -0pe 'm!<dsa>(.*?)</dsa>!s and $a=$1;
           s!<asd>.*?</asd>!<asd>$a</asd>!s' example
JJoao