1

I use sed -i to replace the port value in an XML file:

<property name="port" value="8954" />

But I don't know how to escape the regex. My regex is below.

(?<=name="port"\s+value=")\d+(?=")

My command is:

sed -i "s/${PORTREX}/$NEWPORT/g" config.xml
Kusalananda
  • 333,661

3 Answers

1

Using XMLStarlet on the XML document file.xml to update the value of the value attribute of all property nodes whose name attribute is port:

xmlstarlet ed -u '//property[@name="port"]/@value' -v "$NEWPORT" file.xml

The XPATH expression //property[@name="port"]/@value means "the value attribute belonging to any property node whose name attribute is port". The selected attributes will be set to $NEWPORT (this is assumed to be a port number).

Example:

$ NEWPORT=8080
$ xmlstarlet ed -u '//property[@name="port"]/@value' -v "$NEWPORT" file.xml
<?xml version="1.0"?>
<root>
  <property name="port" value="8080"/>
</root>

Using sed to modify XML is difficult, especially if you're trying to use a Perl-like regular expression that sed simply does not support. Using line-oriented tools to process XML is also fraught with potential issues relating to the placement of newlines and the relative placement of node attributes within lines (note that <property name="port" value="8954" /> is the same as <property value="8954" name="port" />). Encoding of strings is another potential issue, one that sed could handle only if you did proper decoding within sed itself (e.g. &amp; --> & etc.).
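To illustrate the attribute-order problem, here is a small sketch that feeds the two equivalent elements from the paragraph above through a sed substitution that assumes name comes directly before value. The variable names (line_a, line_b, pattern, out_a, out_b) and the value 8080 are made up for the demonstration.

```shell
# Two equivalent XML elements; only the attribute order differs.
line_a='<property name="port" value="8954" />'
line_b='<property value="8954" name="port" />'

NEWPORT=8080  # hypothetical replacement value

# A substitution that assumes name= appears directly before value=.
pattern='s/\(name="port" *value="\)[0-9]*/\1'"$NEWPORT"'/'

out_a=$(printf '%s\n' "$line_a" | sed "$pattern")
out_b=$(printf '%s\n' "$line_b" | sed "$pattern")

printf '%s\n' "$out_a"   # the port is replaced
printf '%s\n' "$out_b"   # unchanged: the pattern never matched
```

The second element comes out untouched, and sed reports no error, so a failure like this is easy to miss. An XML-aware tool does not care about attribute order.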

Question related to the difficulty of parsing XML-like data using regular expressions:

Kusalananda
  • 333,661
1

I wrote this in a comment because I didn't realize that this is already the solution. The OP wrote it works, so this time as an answer:

sed "s/\(name=\"port\" *value=\"\)[0-9]*/\1$NEWPORT/g" config.xml

The regular expression suggested by the OP used a couple of non-standard extensions, so it didn't work with sed, but you don't need them:

  • Instead of the "lookbehind" (?<=somestring) you can use the string as is, but put it in \(somestring\) to mark it as a subexpression and reuse it in the replacement as \1. So instead of saying "this needs to be there, but not get replaced", you just replace it with itself
  • The + is a shortcut for \{1,\} (one or more occurrences), but you can also replace a+ with aa*. In this case it's probably safe to use a simple *, because zero occurrences are impossible
  • The \d is not better to read than [0-9], in my opinion
  • \s can be replaced by [[:space:]] or [ <literal tab>]. Here a simple whitespace does the job
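Putting the points above together, a stricter translation of the OP's expression into a POSIX basic regular expression could look like the sketch below. It keeps "one or more" via \{1,\} and uses [[:space:]] for \s. The lookahead (?=") is handled the same way as the lookbehind: the closing quote is matched and then written back. The sample line and the value 8080 are made up for the demonstration.

```shell
NEWPORT=8080  # hypothetical replacement value

# Sample input with several spaces between the attributes.
line='<property name="port"   value="8954" />'

# BRE counterpart of (?<=name="port"\s+value=")\d+(?=")
#   \( ... \)        capture the prefix and re-emit it as \1
#   [[:space:]]\{1,\} one or more whitespace characters (\s+)
#   [0-9]\{1,\}       one or more digits (\d+)
#   trailing "        matched and written back (replaces the lookahead)
result=$(printf '%s\n' "$line" |
  sed 's/\(name="port"[[:space:]]\{1,\}value="\)[0-9]\{1,\}"/\1'"$NEWPORT"'"/')

printf '%s\n' "$result"
```

This is longer than the one-liner in the answer, but it does not rely on any extension, so it should behave the same in any POSIX sed.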

Conclusion: Almost all regex extensions can easily be avoided. The resulting pattern may be a little longer or harder to read, but often there is no other disadvantage. An exception is the non-greedy match: in some cases you need a different approach to do without it.
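One common substitute for a non-greedy match, when the match should stop at a single delimiter character, is a negated bracket expression. The sketch below compares a greedy .* with [^"]* on a made-up line that has an extra attribute after value. It only covers the single-character delimiter case, not non-greedy matching in general.

```shell
# Hypothetical input with a second attribute after value=
line='<property name="port" value="8954" note="x" />'

# Greedy: .* runs to the LAST double quote on the line and eats note="x".
greedy=$(printf '%s\n' "$line" | sed 's/value=".*"/value="8080"/')

# Bounded: [^"]* cannot cross a double quote, so it stops at the first one.
bounded=$(printf '%s\n' "$line" | sed 's/value="[^"]*"/value="8080"/')

printf '%s\n' "$greedy"
printf '%s\n' "$bounded"
```

The greedy version silently drops the note attribute, while the bounded version leaves it in place.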

Philippos
  • 13,453
-1

Try this

sed -i "s/name=\"port\" value=\"8954\"/name=\"port\" value=\"$NEWPORT\"/g" config.xml
muru
  • 72,889