
The problem is the following:

I have an XML file and need to extract a small part of its data and write it to a new file. The content below has been shortened by request.

snippet if type=dhcp-client:

    <deviceconfig>
      <system>
        <type>
          <dhcp-client>
            <send-hostname>yes</send-hostname>
          </dhcp-client>
        </type>
        <hostname>Firewall</hostname>
      </system>
    </deviceconfig>

snippet if type=static:

    <deviceconfig>
      <system>
        <type>
          <static/>
        </type>
        <hostname>Firewall</hostname>
        <permitted-ip>
          <entry name="192.168.0.0/24"/>
        </permitted-ip>
        <ip-address>192.168.0.2</ip-address>
        <netmask>255.255.255.0</netmask>
        <default-gateway>192.168.0.1</default-gateway>
      </system>
    <network>
      <interface>
        <ethernet>
          <entry name="ethernet1/1">
            <layer3>
              <ip>
                <entry name="192.168.0.5/24"/>
              </ip>
            </layer3>
          </entry>
        </ethernet>
      </interface>
      <virtual-router>
        <entry name="default">
          <routing-table>
            <ip>
              <static-route>
                <entry name="default-route">
                  <nexthop>
                    <ip-address>192.168.0.1</ip-address>
                  </nexthop>
                  <interface>ethernet1/4</interface>
                  <destination>0.0.0.0/0</destination>
                </entry>
              </static-route>
            </ip>
          </routing-table>
        </entry>
      </virtual-router>
    </network>

The four relevant values are unique (or nonexistent) within the <system></system> element. Tags such as ip-address may appear again elsewhere outside <system></system>, but I only care about the ones inside it. If the static values do not appear there, I set the type to dhcp-client.

this is what I need as a result in a file if the type is dhcp:

type=dhcp-client

this is what I need as a result in a file if the type was static:

type=static
ip-address=192.168.0.2
default-gateway=192.168.0.1
netmask=255.255.255.0

I am not sure how to do this efficiently from inside an existing PHP file (so either via exec or, better yet, in PHP only).

I am also limited to tools that are installed by default on an Ubuntu server system and cannot install other packages.

PS: this is actually the whole/complete use-case, I will not need to produce other output other than these two examples. Thanks for any help or pointers :)

Kusalananda
  • 333,661
Questi
  • 45

5 Answers

2

Assuming you don't have access to an XML-aware tool and your input file is as simple and regular as you show, this produces the expected output from your posted sample input:

$ cat tst.awk
BEGIN { FS="[[:space:]]*[<>][[:space:]]*"; OFS="=" }
$2 == "system"  { inBlock=1 }
inBlock { f[$2] = $3 }
$2 == "/system" { inBlock=0 }
END {
    if ("ip-address" in f) {
        print "type", "static"
        print "ip-address", f["ip-address"]
        print "default-gateway", f["default-gateway"]
        print "netmask", f["netmask"]
    }
    else {
        print "type", "dhcp-client"
    }
}


$ awk -f tst.awk absentFile
type=dhcp-client


$ awk -f tst.awk presentFile
type=static
ip-address=192.168.0.2
default-gateway=192.168.0.1
netmask=255.255.255.0

The above was run on these input files:

$ tail -n +1 absentFile presentFile
==> absentFile <==
    <deviceconfig>
      <system>
        <type>
          <dhcp-client>
            <send-hostname>yes</send-hostname>
          </dhcp-client>
        </type>
        <hostname>Firewall</hostname>
      </system>
    </deviceconfig>

==> presentFile <==
    <deviceconfig>
      <system>
        <type>
          <static/>
        </type>
        <hostname>Firewall</hostname>
        <permitted-ip>
          <entry name="192.168.0.0/24"/>
        </permitted-ip>
        <ip-address>192.168.0.2</ip-address>
        <netmask>255.255.255.0</netmask>
        <default-gateway>192.168.0.1</default-gateway>
      </system>
    <network>
      <interface>
        <ethernet>
          <entry name="ethernet1/1">
            <layer3>
              <ip>
                <entry name="192.168.0.5/24"/>
              </ip>
            </layer3>
          </entry>
        </ethernet>
      </interface>
      <virtual-router>
        <entry name="default">
          <routing-table>
            <ip>
              <static-route>
                <entry name="default-route">
                  <nexthop>
                    <ip-address>192.168.0.1</ip-address>
                  </nexthop>
                  <interface>ethernet1/4</interface>
                  <destination>0.0.0.0/0</destination>
                </entry>
              </static-route>
            </ip>
          </routing-table>
        </entry>
      </virtual-router>
    </network>
Ed Morton
  • 31,617
  • i added a much larger snippet now and found out that those three strings only appear within the -system- -/system- tags – Questi Jan 15 '20 at 22:27
  • Posting large chunks of code for us to read is a bad idea. You're supposed to post a Minimal example to make it easy for us to help you. The code I posted doesn't care where the strings appear - did you try it? did it work or not? If not, in what way did it not work? – Ed Morton Jan 16 '20 at 14:58
  • it was my fault, i posted a larger snippet to show that it can happen that the values appear more than once, only uniqueness i could find was that they must be within the system tag, within there they appear only once. also i somehow will need to incorporate this into a php file without creating more than the final file, with as little code as practical – Questi Jan 16 '20 at 16:29
  • OK, so now create a minimal snippet so we can help you. I personally am not going to wade through a large block of text trying to figure out what's important/relevant in it but others might I suppose. – Ed Morton Jan 16 '20 at 17:54
  • Hi, I edited and posted a more minimal version but left as much as I thought might be needed to understand that e.g. some values are not unique but are unique within another unique part, I also edited the text to hopefully explain better which parts are relevant and which are not (but are shown to differentiate/find the relevant string) – Questi Jan 16 '20 at 18:22
  • Much better. I updated my answer. I don't know anything about integrating that inside a PHP script of course. In fact I'm looking to hire a PHP programmer for my own site/project :-). – Ed Morton Jan 16 '20 at 19:08
  • Yup, except for printing default-gateway instead of dhcp-client, it's the result I was looking for. Unfortunately it won't work for me, as this requires me to create a file first and I cannot really create files. – Questi Jan 18 '20 at 09:11
  • I tweaked it to print dhcp-client instead of default-gateway when required. Nothing in my answer requires you to create a file. What file is it you think you need to create to use the script I posted? Is it the awk script? You can call awk as awk 'BEGIN ... }' file instead of saving the BEGIN .... } script in tst.awk and invoking it as awk -f tst.awk file. – Ed Morton Jan 18 '20 at 23:04
  • yes I thought it requires you to create the awk files... this script now does exactly what I needed on the CLI, which is what I thought would need to be done at first. Thanks for the help :) Hopefully can use this as well, I found a solution in php as well. – Questi Jan 18 '20 at 23:46
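The inline form of the awk call discussed in the comments can be sketched as follows. This is only a sketch, assuming the input looks like the posted samples; the function name extract_net_config and the file names in the usage line are placeholders, not part of the original answer.

```shell
#!/bin/bash
# Sketch: the same awk logic as tst.awk, passed inline so no separate
# script file has to be created. extract_net_config is a hypothetical
# helper name; its only argument is the XML file to read.
extract_net_config() {
    awk '
        BEGIN { FS="[[:space:]]*[<>][[:space:]]*"; OFS="=" }
        $2 == "system"  { inBlock=1 }      # entering <system>
        inBlock { f[$2] = $3 }             # remember tag -> value inside the block
        $2 == "/system" { inBlock=0 }      # leaving </system>
        END {
            if ("ip-address" in f) {
                print "type", "static"
                print "ip-address", f["ip-address"]
                print "default-gateway", f["default-gateway"]
                print "netmask", f["netmask"]
            }
            else {
                print "type", "dhcp-client"
            }
        }
    ' "$1"
}

# Example call (both file names are placeholders):
# extract_net_config config.xml > network-config.txt
```

Only the final output file is written; nothing intermediate is created on disk.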
1

You can use the following script ip-parse.sh:

#!/bin/bash

#https://stackoverflow.com/questions/22221277/bash-grep-between-two-lines-with-specified-string
#https://www.cyberciti.biz/faq/bash-remove-whitespace-from-string/
#https://stackoverflow.com/questions/1251999/how-can-i-replace-a-newline-n-using-sed
sed -n '/<system>/,/<\/system>/p' ~/Desktop/x-test.xml | sed -e 's/^[ \t]*//' > ~/Desktop/x-system.xml
sed ':a;N;$!ba;s/\n/ /g' ~/Desktop/x-system.xml > /tmp/xml-one-line.xml

# test to see if the "system" section ...
# ... has the word hostname occurring before the word ip-address
#https://stackoverflow.com/questions/33265650/grep-for-a-string-in-a-specific-order
if [ -n "$(grep hostname.*ip-address /tmp/xml-one-line.xml)" ]; then 
    echo "File contains hostname and ip-address, in that order."
else
    echo "type=dhcp-client" ; echo "type=dhcp-client" > ~/Desktop/network-config.txt ; exit
fi

#http://www.compciv.org/topics/bash/variables-and-substitution/    
ipaddress="$(grep ip-address ~/Desktop/x-system.xml | sed 's/<ip-address>//g; s/<\/ip-address>//g')"  
defaultgateway="$(grep default-gateway ~/Desktop/x-system.xml | sed 's/<default-gateway>//g; s/<\/default-gateway>//g')"
netmask="$(grep netmask ~/Desktop/x-system.xml | sed 's/<netmask>//g; s/<\/netmask>//g')"

echo "type=static" > ~/Desktop/network-config.txt
echo "ip-address=$ipaddress" >> ~/Desktop/network-config.txt
echo "default-gateway=$defaultgateway" >> ~/Desktop/network-config.txt
echo "netmask=$netmask" >> ~/Desktop/network-config.txt

Application example:

paul@mxg6:~/Desktop$ ./ip-parse.sh   
File contains hostname and ip-address, in that order.  
paul@mxg6:~/Desktop$ cat network-config.txt   
type=static  
ip-address=192.168.0.2  
default-gateway=192.168.0.1  
netmask=255.255.255.0   

If you don't need to check if hostname comes before ip-address, and you want to use variables instead of intermediary files, try this:

#!/bin/bash

xsystemxml="$(sed -n '/<system>/,/<\/system>/p' ~/Desktop/x-test.xml \
| sed -e 's/^[ \t]*//')"

if [ -n "$(echo "$xsystemxml" | grep ip-address)" ]; then 
    echo "System section contains ip-address."
else
    echo "type=dhcp-client"
    echo "type=dhcp-client" > ~/Desktop/network-config.txt
    exit
fi

ipaddress="$(echo "$xsystemxml" | grep "ip-address" \
| sed 's/<ip-address>//g; s/<\/ip-address>//g')"

defaultgateway="$(echo "$xsystemxml" | grep "default-gateway" \
| sed 's/<default-gateway>//g; s/<\/default-gateway>//g')"

netmask="$(echo "$xsystemxml" | grep "netmask" \
| sed 's/<netmask>//g; s/<\/netmask>//g')"

echo "type=static" > ~/Desktop/network-config.txt
echo "ip-address=$ipaddress" >> ~/Desktop/network-config.txt
echo "default-gateway=$defaultgateway" >> ~/Desktop/network-config.txt
echo "netmask=$netmask" >> ~/Desktop/network-config.txt
oksage
  • 33
  • @Torin You're right thanks. I'll change that. – oksage Jan 16 '20 at 02:37
  • Hi oksage, thanks for this! This is very close to what I need!!! maybe you can just make a small modification: the check for hostname and ip-address afterwards is not necessary, i only need to check if there is an ip-address at all in this section (and write it the way you have if found/not found). if this could get rid of the extra file being created it would be exactly what i need! meaning if possible i would only like to create a single output file and no "in-between" files in the end i will need to integrate this into a .php file – Questi Jan 16 '20 at 13:37
  • would you happen to know how to do this in php? – Questi Jan 18 '20 at 13:55
  • So if you don't need to check if hostname comes before ip-address (in the section), then a few lines can be removed. And to avoid the intermediary files, use variables instead of the temporary files. I'll add the revised script below the current one. It could probably be done in a simpler way, but I was having trouble with using sed -n '/<ip-address>/,/ip-address>/p' on the variable $xsystemxml ... so I added to $xsystemxml 's command substitution to get the $ipsection, $gatewaysection and $netmasksection variables). Someone could clean this up better than I could, but try it out. – oksage Jan 21 '20 at 00:49
  • I don't know php. – oksage Jan 21 '20 at 01:00
  • I got the $ipaddress variable working with the variable inside the command substitution. I think maybe the reason it wasn't working before was because I was using $xsystemxml when I should have been using "$xsystemxml" . I'll edit the bottom script to remove the 'section' variables and update the ip/gateway/netmask variables. – oksage Jan 21 '20 at 01:44
  • @Questi ... forgot to tag you on the other comments – oksage Jan 21 '20 at 02:16
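A note on angle brackets in sed address ranges like the ones in this answer: in GNU sed, unescaped < and > match literally, while \< and \> are word-boundary anchors. The following is a small sketch of the difference, assuming GNU sed (the default on Ubuntu); the sample lines and variable names are made up for illustration.

```shell
#!/bin/bash
# Two sample lines: a real tag, and a line that merely contains the word "system".
sample=$(printf '%s\n' '<system>' 'file system check')

# Unescaped angle brackets are literal characters: only the tag line matches.
literal_hits=$(printf '%s\n' "$sample" | sed -n '/<system>/p')

# \< and \> are GNU word-boundary anchors: any line with the word "system" matches.
boundary_hits=$(printf '%s\n' "$sample" | sed -n '/\<system\>/p')

printf 'literal match:\n%s\n' "$literal_hits"
printf 'word-boundary match:\n%s\n' "$boundary_hits"
```

On the posted samples both forms select the same lines, but the literal form is the safer choice if the word "system" can appear in other places in the file.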
1

Ok, I found the answer myself...

actually much simpler in php than without it...

it did take me many hours to finish though ^^

#load the file as simplexml object and then switch into system
#https://www.w3schools.com/php/func_simplexml_load_file.asp
$xml=simplexml_load_file('./myfile') or die("Error: Cannot create object");
$xml=$xml->system;

#put the whole string(s) into a variable, getname gets the name of the object itself if it exists
#https://www.w3schools.com/php/func_simplexml_getname.asp
$output='type=' . $xml -> type -> static -> getName() . $xml -> type -> {'dhcp-client'} -> getName() . "\nip-address=" . $xml -> {'ip-address'} . "\ndefault-gateway=" . $xml -> {'default-gateway'} . "\nnetmask=" . $xml -> netmask;

#write the output into a file
#https://www.w3schools.com/php/func_filesystem_file_put_contents.asp
file_put_contents('./myoutputfile', $output );

This gave me the following output for the first snippet (it is fine for the last three lines to be empty; otherwise I could have checked first whether the values exist):

type=dhcp-client
ip-address=
default-gateway=
netmask=

and this output for the second snippet:

type=static
ip-address=192.168.0.2
default-gateway=192.168.0.1
netmask=255.255.255.0

Thanks for everyone's help :)

Questi
  • 45
0

Following the answer at https://stackoverflow.com/questions/893585/how-to-parse-xml-in-bash, I made a simple script:

#!/bin/bash

read_dom () {
    local IFS=\>
    read -d \< ENTITY CONTENT
}

found=0
in_system=0
while read_dom; do
    # Track whether we are inside <system>...</system>
    [[ $ENTITY = "system" ]] && in_system=1
    [[ $ENTITY = "/system" ]] && in_system=0

    if [[ $in_system = 1 ]] && [[ $ENTITY = "ip-address" || $ENTITY = "netmask" || $ENTITY = "default-gateway" ]]; then
        if [[ $found = 0 ]]; then
            echo "type=static"
        fi

        echo "$ENTITY=$CONTENT"
        found=1
    fi
done

if [[ $found = 0 ]]; then
    echo "type=dhcp-client"
fi

If you name your script parse.sh, you can call it like this:

parse.sh < input.xml > output.txt
nobody
  • 1,710
0

Here's a bash script that uses sed.

scriptname input.xml

Output is sent to stdout.

#!/bin/bash
sed -n '{
/<hostname>/ {

# next line makes a comment out of hostname
# add a "#" to the beginning of that line to suppress it
    s/\s*<hostname>\(.\+\)<\/hostname>/# \1/p  #make hostname into a comment with hash

    n   # read next line to pattern space
    /<ip-address>/{               # if this line contains <ip-address>
        i\type=static
        s/\s\{1,\}<\(ip-address\)>\(.\+\)<\/\1>/\1=\2/p
        n   # read next line to pattern space

        # netmask
        s/\s\{1,\}<\(netmask\)>\(.\+\)<\/\1>/\1=\2/p
        n   # read next line to pattern space

        # default-gateway
        s/\s\{1,\}<\(default-gateway\)>\(.\+\)<\/\1>/\1=\2\n/p
        n
        b end # branch to end
        }

    /<ip-address>/ !{             # if line does not contain <ip-address>
        i\type=dhcp-client\


        }
    :end # end label
    }
}
' "$1"
X Tian
  • 10,463