
I have a function Parse_XML, shown below:

    Parse_XML()
{
TDIR=$1
_VERSION=
_REVISION=
_FILENAME=
_COMPONENT=
_DESCRIPT=
_ISITOA=0
_NOLOG=0

_OAVERSION=

local TMP=/tmp/tmpfile.txt-$$
local JUNK

# find the cpq_package XML file and assign it to file
local file=
for xmlfile in *.xml
do
    if [ -n "$(head ${xmlfile} | grep '<cpq_package')" ] ; then
        file="${xmlfile}"
        break
    fi
done


if [ -z "${file}" ] || [ ! -f "${file}" ]
then
    _NOLOG=1
    return
fi

${echo} `grep \<version $file|awk -F = '{print $2}'|awk '{print $1}'|tr -d '"'` > $TMP
read _VERSION JUNK < $TMP
${echo} `grep \<version $file|awk -F '=' '{print $3}'|awk '{print $1}'|tr -d '"'` > $TMP
read _REVISION JUNK < $TMP

_OAVERSION=${_VERSION}
_VERSION=${_VERSION}${_REVISION}

Here the version and revision are fetched from lines like these in the XML file:

<version value="GPK5" revision="B" type_of_change="1"/>
<version value="GPK5" revision="" type_of_change="1"/>

Some revisions are an empty string and others are a single character, so the command

 grep \<version CP057761.xml|awk -F = '{print $2}'|awk '{print $1}'|tr -d '"'

fetches every version in the XML and stores them in the TMP file, and the command

grep \<version CP057761.xml|awk -F '=' '{print $3}'|awk '{print $1}'|tr -d '"'

fetches the revisions of every version element in the XML, whatever version they belong to.

As a result, the revision of a different version is sometimes fetched and appended to a version whose revision is empty.
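To make the failure concrete, here is a minimal sketch that reproduces it. The sample file and its GPK6 entry are made up for illustration, and plain echo stands in for ${echo}. The empty revision produces an empty line, which disappears when the unquoted backtick output is word-split, so the first word left in the revision list belongs to a later element.

```shell
#!/bin/sh
# Hypothetical sample: the FIRST <version> element has an empty revision.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<version value="GPK6" revision="" type_of_change="1"/>
<version value="GPK5" revision="B" type_of_change="1"/>
EOF

# Same pipelines as in Parse_XML (plain echo assumed in place of ${echo}).
versions=$(echo `grep '<version' "$sample" | awk -F= '{print $2}' | awk '{print $1}' | tr -d '"'`)
revisions=$(echo `grep '<version' "$sample" | awk -F= '{print $3}' | awk '{print $1}' | tr -d '"'`)

echo "versions:  [$versions]"     # prints: versions:  [GPK6 GPK5]
echo "revisions: [$revisions]"    # prints: revisions: [B]

# "read VAR JUNK" keeps only the first word of each list.
_VERSION=${versions%% *}
_REVISION=${revisions%% *}

# GPK6 has no revision, yet it is paired with the B that belongs to GPK5.
echo "${_VERSION}${_REVISION}"    # prints: GPK6B

rm -f "$sample"
```

The two lists are built independently, so nothing ties a revision back to the element it came from once the empty one has been dropped.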

How can I modify these commands

    ${echo} `grep \<version $file|awk -F = '{print $2}'|awk '{print $1}'|tr -d '"'` > $TMP
    read _VERSION JUNK < $TMP
    ${echo} `grep \<version $file|awk -F '=' '{print $3}'|awk '{print $1}'|tr -d '"'` > $TMP
    read _REVISION JUNK < $TMP
_OAVERSION=${_VERSION}
_VERSION=${_VERSION}${_REVISION}

so that they look up only the value held in _VERSION and fetch the revision belonging to that particular version? When there is a revision, _VERSION should be GPK5B; when the revision is empty, _VERSION should be GPK5.

I fixed the issue by grepping for $_VERSION instead of \<version in the revision command. That returned only the revisions for that particular version, and read _REVISION JUNK < $TMP then gave me the revision. Basically, I wanted only the latest revision along with the version. I'm sorry I wasn't clear in my question before.
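One way to keep the version and its revision tied together is to read both attributes from the same line. The sketch below is only an illustration: get_attr and first_version are hypothetical helper names, and it assumes each version element sits on a single line, uses double-quoted attributes, and that the first element is the one wanted (as read _VERSION JUNK implies). It is still line-based text matching, not real XML parsing.

```shell
#!/bin/sh
# get_attr NAME: read one XML line on stdin and print the value of
# attribute NAME. Hypothetical helper; prints an empty line for NAME="".
get_attr() {
    sed -n "s/.*[[:space:]]$1=\"\([^\"]*\)\".*/\1/p"
}

# first_version FILE: print value followed by revision, both taken from
# the first <version ...> line of FILE, so they can never be mismatched.
first_version() {
    line=$(grep '<version' "$1" | head -n 1)
    value=$(printf '%s\n' "$line" | get_attr value)
    revision=$(printf '%s\n' "$line" | get_attr revision)
    printf '%s%s\n' "$value" "$revision"
}

# Demo on made-up data where the first element has an empty revision.
demo=$(mktemp)
cat > "$demo" <<'EOF'
<version value="GPK6" revision="" type_of_change="1"/>
<version value="GPK5" revision="B" type_of_change="1"/>
EOF

_VERSION=$(first_version "$demo")
echo "$_VERSION"    # prints: GPK6
rm -f "$demo"
```

Inside Parse_XML this would replace the two ${echo}/read pairs and the temporary file, with _OAVERSION filled from the value attribute alone.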

raj

2 Answers


Use an XML parser to parse XML data. xmlstarlet is one.

Given file.xml containing

<root>
<version value="GPK5" revision="B" type_of_change="1"/>
<version value="GPK5" revision="" type_of_change="1"/>
</root>

Then

xmlstarlet sel -t -m '//version' -v '@value' -v '@revision' -n file.xml

Outputs

GPK5B
GPK5
glenn jackman

Don't use sed or regex to parse HTML/XML. You cannot, and must not, parse structured text like XML/HTML with tools designed to process raw text lines. If you need to process XML/HTML, use an XML/HTML parser. A great majority of languages have built-in support for parsing XML, and there are dedicated tools like xidel, xmlstarlet or xmllint if you need a quick shot from a command-line shell. Never accept a job if you don't have access to proper tools.

xidel is the most advanced command-line XML/HTML parser out there.

Its syntax is more intuitive than that of xmlstarlet and xmllint once you know the XPath query language:

xidel -e '//version/(@value||""||@revision)' -s file.xml
GPK5B
GPK5