So what I have here is part of my script that reads the PCIe lanes on the boards to help work out which NVMe drives are up and running without having to manually check each drive. This works for the most part; however, my knowledge of bash has failed me again and I'm not really sure what I should be using in place of:

    elif [[ ! "10000:01:00.0" =~ "${V2Lnvmearray[$key]}" ]]

This condition is true not only when a drive is missing, but also for every other lookup that doesn't match when a drive is present. This results in either 8 counts of "No drive detected" per slot, or 7 counts of "No drive detected" and one positive match. Thanks for any input.

#!/bin/bash

V2Lnvme0n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme0n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme1n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme1n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme2n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme2n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme3n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme3n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme4n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme4n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme5n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme5n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme6n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme6n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )
V2Lnvme7n1=$( readlink -f /sys/dev/block/$(ls -l /dev/nvme7n1 |awk -F'[, ]+' '{print $5":"$6}') |grep -Eo '1000+[0-4]+[:]+[0]+[1-4]+[:]+[0]+[.]+[0]+' )

declare -A V2Lnvmearray=(
    [nvme0n1]=$V2Lnvme0n1 [nvme1n1]=$V2Lnvme1n1
    [nvme2n1]=$V2Lnvme2n1 [nvme3n1]=$V2Lnvme3n1
    [nvme4n1]=$V2Lnvme4n1 [nvme5n1]=$V2Lnvme5n1
    [nvme6n1]=$V2Lnvme6n1 [nvme7n1]=$V2Lnvme7n1
) #V2Large Machines setup | get the PCIe address into the array

for key in "${!V2Lnvmearray[@]}"; do echo "$key ${V2Lnvmearray[$key]}"; done # displays the keys and values of the array

for key in "${!V2Lnvmearray[@]}"; do
    if [[ "10000:01:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 0"
    elif [[ ! "10000:01:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 0"
    fi
    if [[ "10000:02:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 1"
    elif [[ ! "10000:02:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 1"
    fi
    if [[ "10002:01:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 2"
    elif [[ ! "10002:01:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 2"
    fi
    if [[ "10002:02:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 3"
    elif [[ ! "10002:02:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 3"
    fi
    if [[ "10001:01:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 4"
    elif [[ ! "10001:01:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 4"
    fi
    if [[ "10001:02:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 5"
    elif [[ ! "10001:02:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 5"
    fi
    if [[ "10001:03:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 6"
    elif [[ ! "10001:03:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 6"
    fi
    if [[ "10001:04:00.0" = "${V2Lnvmearray[$key]}" ]]; then
        echo "$key is in slot 7"
    elif [[ ! "10001:04:00.0" =~ "${V2Lnvmearray[$key]}" ]]; then
        echo "No drive detected in slot 7"
    fi
done

  • for one, it looks to me the conditions you have in the elif branches are exactly the same as the ones in the corresponding if branches, just negated. So you could just use else instead. That very first array initialization just screams for a loop, for nvm in nvme{0..7}n1 ; do V2Lnvmearray[$nvm]=$(... /dev/$nvm ...); done. So does the inside of the main loop too, for x in 10000:01:00.0 10000:02:00.0 ... – ilkkachu Oct 18 '21 at 10:55
  • anyway, yeah, you're right, that would print e.g. "x is in slot 0" and 7 times "no drive detected in slot N" for each key. You don't have any connection between the loop iterations, so how could it know in one iteration that another is going to find something in a slot this one doesn't? The easy way out is to just not print the "no drive detected" messages, and let the user deduce that from the missing output lines. Or smarter, make an array for the slots, and fill that during the loop, when finding the drives. And then print everything at the end, when you know which slots were empty. – ilkkachu Oct 18 '21 at 10:59
  • Yeah atm it seems like that is what I'm having to accept. Although my knowledge of bash is pretty low compared to others – Earthwormben Oct 18 '21 at 11:07
  • You'd have to do the logic the same way with any other tool. And I think you already have the Bash-specific parts you need there in the current script. (the array stuff, mostly. Also see https://mywiki.wooledge.org/BashGuide/Arrays and https://www.gnu.org/software/bash/manual/html_node/Arrays.html) Of course you could also write the logic in some other language first? With some mock data, perhaps. – ilkkachu Oct 18 '21 at 11:33
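
The slot-array idea from the comments above could look roughly like the sketch below. It is only an illustration: the slot addresses are the ones from the question, but the contents of V2Lnvmearray are made-up mock data, and the names slot_addr, drive_at and report_slots are invented here. The point is to first record which drive sits at which address, and only then walk the slots once, so each slot produces exactly one line.

```bash
#!/usr/bin/env bash
# Sketch only; needs bash 4+ for associative arrays.

# Slot number -> expected PCIe address (values taken from the question).
slot_addr=(
    10000:01:00.0 10000:02:00.0
    10002:01:00.0 10002:02:00.0
    10001:01:00.0 10001:02:00.0
    10001:03:00.0 10001:04:00.0
)

# Mock data (hypothetical): drive name -> PCIe address.
# In the real script this array is filled from /sys as before.
declare -A V2Lnvmearray=(
    [nvme0n1]=10000:01:00.0
    [nvme1n1]=10002:02:00.0
    [nvme2n1]=10001:04:00.0
)

report_slots() {
    local -A drive_at=()    # PCIe address -> drive name
    local key addr slot

    # Pass 1: invert the lookup, skipping drives with no address.
    for key in "${!V2Lnvmearray[@]}"; do
        addr=${V2Lnvmearray[$key]}
        [[ -n $addr ]] && drive_at[$addr]=$key
    done

    # Pass 2: one line per slot, now that all drives are known.
    for slot in "${!slot_addr[@]}"; do
        addr=${slot_addr[$slot]}
        if [[ -n ${drive_at[$addr]:-} ]]; then
            echo "${drive_at[$addr]} is in slot $slot"
        else
            echo "No drive detected in slot $slot"
        fi
    done
}

report_slots
```

With the mock data this prints a drive for slots 0, 3 and 7 and "No drive detected" for the other five, one line per slot instead of eight.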

1 Answer

Why even do it this way? Not only is parsing the output of ls a terrible idea, you're looking for details of nvme devices that may or may not even be installed. stat and/or find are far more suitable tools for getting the data you want than ls -l.

Wouldn't it be simpler to just get what you need from find /sys/devices/pci* -name 'nvme[0-9]*n1'? That will give you the full /sys/devices path to only the nvme drives which are actually installed in the system.

If you want that in an array:

$ readarray -d '' -t nvme  < <(
    find /sys/devices/pci*/ -name 'nvme[0-9]*n1' -print0)

$ declare -p nvme
declare -a nvme=(
  [0]="/sys/devices/pci0000:40/0000:40:01.2/0000:42:00.0/nvme/nvme1/nvme1n1"
  [1]="/sys/devices/pci0000:40/0000:40:01.1/0000:41:00.0/nvme/nvme0/nvme0n1")

(line-feeds added to the output of declare -p to make it more readable)

If you also want that in an associative array (aka hash), indexed by the basenames:

$ readarray -d '' -t nvme  < <(
    find /sys/devices/pci*/  -name 'nvme[0-9]*n1' -print0)

$ declare -A nvme_hash

$ for d in "${nvme[@]}" ; do
  nvme_hash["$(basename "$d")"]="$d"
done

$ declare -p nvme_hash
declare -A nvme_hash=(
  [nvme1n1]="/sys/devices/pci0000:40/0000:40:01.2/0000:42:00.0/nvme/nvme1/nvme1n1"
  [nvme0n1]="/sys/devices/pci0000:40/0000:40:01.1/0000:41:00.0/nvme/nvme0/nvme0n1" )

or if you wanted only the, say, fifth element of that path:

$ for d in "${nvme[@]}" ; do
  nvme_hash["$(basename "$d")"]="$(printf '%s\n' "$d" | awk -F/ '{print $5}')"
done

$ declare -p nvme_hash
declare -A nvme_hash=([nvme1n1]="0000:40:01.2" [nvme0n1]="0000:40:01.1" )

This won't directly tell you if a drive is "missing", but will give you a complete list of all nvme drives installed - you can use that to deduce whether any are missing. If you expect 8 nvme devices but the hash only has 7 elements, then one is missing. Or if you expect nvme[0-7]n1 to be installed, but 3 and 5 are missing, you can iterate over the keys of the hash to discover that.
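
As a rough sketch of that last check: the loop below walks a list of expected device names and reports the ones with no entry in the hash. The contents of nvme_hash here are mock data, and the expected list and the missing_devices function are names made up for this example.

```bash
#!/usr/bin/env bash
# Sketch only; needs bash 4+ for associative arrays.

# Mock data (hypothetical) standing in for the hash built from find.
declare -A nvme_hash=(
    [nvme0n1]="0000:40:01.1"
    [nvme1n1]="0000:40:01.2"
)

# Assumed list of devices that should be present on this machine.
expected=(nvme{0..3}n1)

missing_devices() {
    local name
    for name in "${expected[@]}"; do
        # ${var+set} expands to "set" only if the key exists in the hash.
        [[ -n ${nvme_hash[$name]+set} ]] || printf '%s\n' "$name"
    done
}

missing_devices
# prints:
# nvme2n1
# nvme3n1
```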

The /sys/devices/... path also gives you access to the complete information about the nvme drives.
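
For example, the PCI address of the drive itself can be pulled out of such a path without counting fields. This is a minimal sketch, assuming the address is the last path component shaped like domain:bus:device.function; the function name pci_addr_from_path is invented here, and the sample path is the one from the output earlier in this answer.

```bash
#!/usr/bin/env bash
# Sketch only: print the last path component that looks like a PCI
# address (4 or 5 hex digits of domain, then bus:device.function).
pci_addr_from_path() {
    local path=$1 part addr=
    local re='^[0-9a-fA-F]{4,5}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]$'
    local -a parts

    IFS=/ read -r -a parts <<< "$path"
    for part in "${parts[@]}"; do
        # Later matches overwrite earlier ones, so the deepest one wins.
        [[ $part =~ $re ]] && addr=$part
    done

    # Returns non-zero (and prints nothing) if no component matched.
    [[ -n $addr ]] && printf '%s\n' "$addr"
}

pci_addr_from_path /sys/devices/pci0000:40/0000:40:01.2/0000:42:00.0/nvme/nvme1/nvme1n1
# prints: 0000:42:00.0
```

On a real machine the argument could come from something like readlink -f /sys/class/block/nvme0n1, which avoids ls entirely.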

In short, you are doing this backwards, starting from a hard-coded list of what might be there instead of discovering what's actually there.

cas
  • It's done this way because we build, monitor, and repair the customers' machines. I know exactly what should be there, and I am trying to make tools to speed this up whilst learning a scripting language – Earthwormben Oct 18 '21 at 12:05
  • an important part of learning a language is listening when people who've been programming in it for decades tell you that doing some particular thing (like parsing the output of ls) is a terrible idea. but if you're sure you know better, then go right ahead. – cas Oct 18 '21 at 12:29
  • @schrodingerscatcuriosity they weren't comments, they were root shell prompts. – cas Oct 18 '21 at 12:30
  • @cas I was about to ask :), what do you think of using $ instead of # so they don't appear as comments? – schrodingerscatcuriosity Oct 18 '21 at 12:31
  • $ is for non-root shells. # is for root shells. OTOH, nothing in this answer actually requires root (but reading any of the files under /sys/devices/pci* might) – cas Oct 18 '21 at 12:32
  • @cas oh yes, I just suggested an alternative because I find non comment code commented can be confusing, but it's just me. – schrodingerscatcuriosity Oct 18 '21 at 12:36
  • to me, it's just a way of distinguishing input from output. lines/statements starting with a prompt char are what gets typed in, lines without are output. i've changed # to $ now, anyway. – cas Oct 18 '21 at 12:37
  • @cas I have already looked through the advice you have given, and whilst it may be suitable for other situations, it was not helpful in this instance. It gave me things to think about, such as using find /sys/devices/pci*/, but ultimately, at this point, it will not benefit me – Earthwormben Oct 18 '21 at 13:56