
I was given a command to extract an IPv6 address:

/usr/bin/ip a | grep inet6 | grep -vE 'fe80|host' | sed -e 's/^.*inet6 \([^ ]*\)\/.*$/\1/;t;d'

Could someone please break down the sed substitution for me?

The sed command also works without the ;t;d

muru
  • 72,889
Bjoern
  • 31
  • Quick non-answer: start with the leading pipeline so you can see what the sed is acting on. What do you know (if anything) about Regular Expressions? – Chris Davies Apr 20 '22 at 22:32
  • Are you sure it's /usr/bin/ip and not one of /usr/sbin/ip or even /sbin/ip? – Chris Davies Apr 20 '22 at 22:34
  • awk would be better for this. e.g. ip addr | awk '/inet6/ && ! /fe80|host/ { print $2 }'. In English, that's roughly: on lines that contain "inet6" but don't contain "host" or "fe80", print field two. – cas Apr 21 '22 at 02:49
  • roaima, on my MX-Linux VM system it is indeed /usr/bin/ip – Bjoern Apr 21 '22 at 14:51

2 Answers


I'll start by simplifying your expression a little...

LINE=" inet6 fd86:73ea:ff6b:0:141b:ca40:741b:ec0c/64 scope global noprefixroute"
echo "$LINE" | sed -e 's/^.*inet6 \([^ ]*\)\/.*$/\1/;t;d'

The -e means that the next argument is the sed script.

's/pattern/replace/' means find "pattern" in the input and substitute with "replace".

Here pattern is /^.*inet6 \([^ ]*\)\/.*$/

The '/' characters mark the beginning and end of the pattern.

The ^ and $ characters match the start and end of the input line respectively. Obviously every input line has a start and an end; these anchors become useful when you want to match or replace elements relative to those positions.

.* means zero or more occurrences of any character. In $LINE above, this matches the single leading space (provided the variable is quoted, so that the shell keeps that space).

inet6 means the literal string "inet6 " (with a trailing space).

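The \(pattern\) brackets tell sed not only to match the sub-pattern in the input, but also to store it for later use. Here the sub-pattern is [^ ]*, which means zero or more characters that are not a space. Together with the \/ that follows, this captures the address up to (but not including) the slash.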

The \/ means match a literal '/' (see above - without the prefix, the '/' character denotes a structural element in the pattern).

.*$ simply means match any remaining characters up to the end of the line.

/ marks the end of the pattern.

\1 is the replacement. Here 1 refers to the first stored match found by the pattern (there is only one).
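
To see the pieces working together, here is a minimal sketch that runs the substitution on the sample line and on a line that does not match. It assumes GNU sed or a POSIX sed; the t and d commands are passed as separate -e scripts for portability, and the link/ether line with its MAC address is made up for illustration. As far as I know, t with no label branches to the end of the script when a substitution was made, so the result is printed, and otherwise d deletes the line.

```shell
# Sample line from the answer above; the quotes preserve the leading space.
LINE=" inet6 fd86:73ea:ff6b:0:141b:ca40:741b:ec0c/64 scope global noprefixroute"

# The whole line is replaced by the captured group \1 (the address).
addr=$(printf '%s\n' "$LINE" |
  sed -e 's/^.*inet6 \([^ ]*\)\/.*$/\1/;t' -e 'd')
echo "matching line gives:     [$addr]"

# Hypothetical non-matching line: no substitution happens, so t does not
# branch and d deletes the line, leaving no output at all.
other=$(printf '%s\n' "    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff" |
  sed -e 's/^.*inet6 \([^ ]*\)\/.*$/\1/;t' -e 'd')
echo "non-matching line gives: [$other]"
```

Without the t and d, the non-matching line would be printed unchanged, which is why the command in the question appears to work without them only because grep has already filtered the input.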

symcbean
  • 5,540

It seems you want to learn something, which I appreciate. symcbean explained the sed command to you, but I'd like to add some more things you can learn from the code. You can learn to avoid bad habits. (-;

  1. The whole command is piped through two greps and a sed, which is almost always nonsense. In sed, you can address lines by preceding a command with a regular expression, which does the grep's job on the fly: sed -n '/foo/s/pattern/replace/p' performs the replacement only on lines containing foo. So '/inet6/!d' deletes all lines without (!) inet6, and the second grep can be replaced by /fe80/d;/host/d (or, with the extended regular expression option -E: /fe80|host/d).
  2. The -e is optional if there is only one command (or several commands concatenated by ;). Things get easier to read if you leave out superfluous stuff.
  3. As already pointed out, beginning a regular expression with ^.* is nonsense. The greedy * will match anything from the beginning anyhow, so don't distract the reader with an unnecessary anchor.
  4. Same applies to .*$ at the end. Remove the anchor.
  5. If the pattern contains a slash, use a different separator for the s command. Almost any character is allowed so why should one make the expression harder to read by filling it with backslashes? Underscores are good alternatives, for example: s_.*inet6 \([^ ]*\)/.*_\1_
  6. Working with \(…\) and \1 can be useful, but if this is just about extracting the middle part, it can be more comprehensible to simply remove the beginning and the end separately: s/.*inet6 //;s_/.*__
  7. According to the sed definition, the t command is optionally followed by a jump mark (label), which would be ;d in this case. Uncommon, but legal. The GNU version of sed doesn't allow a semicolon in jump marks, but interprets it as a command-separating semicolon. In that case there is no jump mark, which means "if a substitution was made, jump to the end of the script". But other sed versions will throw an error. It's really nasty to make a script incompatible for such a detail! This can be avoided by using separate scripts, for example -e '…;t' -e d, or by writing the commands on separate lines.
  8. In this case the idea of the t;d was to avoid messed output if the substitution failed. A good idea, but there is a real tool for that, the p flag to the s command: s_.*inet6 \([^ ]*\)/.*_\1_p;d. If a replacement was made, print the buffer. Easier to read and portable.
  9. The trailing d command could be replaced by the -n option to suppress default output, but that's a matter of taste.
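
A quick way to convince yourself that items 5, 6, 8 and 9 are equivalent is to run them side by side. This is only a sketch: it assumes GNU sed (for p;d written in one script), reuses the sample line from the other answer, and the "no address here" line is a made-up non-matching input.

```shell
LINE=" inet6 fd86:73ea:ff6b:0:141b:ca40:741b:ec0c/64 scope global noprefixroute"

# Item 5: underscore as the s delimiter, so the slash needs no backslash.
with_underscore=$(printf '%s\n' "$LINE" | sed 's_.*inet6 \([^ ]*\)/.*_\1_')

# Item 6: remove the beginning, then everything from the slash onwards.
two_steps=$(printf '%s\n' "$LINE" | sed 's/.*inet6 //;s_/.*__')

# Item 8: the p flag prints only on success; d suppresses the default output.
with_p_and_d=$(printf '%s\n' "no address here" "$LINE" |
  sed 's_.*inet6 \([^ ]*\)/.*_\1_p;d')

# Item 9: -n suppresses the default output instead of a trailing d.
with_n=$(printf '%s\n' "no address here" "$LINE" |
  sed -n 's_.*inet6 \([^ ]*\)/.*_\1_p')

printf '%s\n' "$with_underscore" "$two_steps" "$with_p_and_d" "$with_n"
```

All four variables should hold the same address; the non-matching line produces no output in the last two variants.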

Finally we can compare the commands

/usr/bin/ip a | grep inet6 | grep -vE 'fe80|host' | sed -e 's/^.*inet6 \([^ ]*\)\/.*$/\1/;t;d'

/usr/bin/ip a | sed '/inet6/!d; /fe80/d; /host/d; s/.*inet6 //; s_/.*__p; d'

or with ERE and -n instead of d:

/usr/bin/ip a | sed -En '/inet6/!d; /fe80|host/d; s/.*inet6 //; s_/.*__p'

Code golfers would probably write

sed -En '/fe80|host/!s_.*inet6 ([^ ]*)/.*$_\1_p'

but this seems less readable to me.
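
If you want to check that these variants really agree without touching a live interface, something like the following sketch may help. It assumes GNU grep and GNU sed (for -E and for commands separated by ;). The text in sample is hypothetical ip a output written for this test, not output from the asker's machine, and the original pipeline is shown with t and d as separate -e scripts, as suggested in item 7.

```shell
# Hypothetical `ip a` output, made up for illustration.
sample='1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP
    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
    inet 192.168.210.21/24 brd 192.168.210.255 scope global eth0
    inet6 fd86:73ea:ff6b:0:141b:ca40:741b:ec0c/64 scope global noprefixroute
    inet6 fe80::211:22ff:fe33:4455/64 scope link'

# Original pipeline: two greps plus sed.
out_orig=$(printf '%s\n' "$sample" | grep inet6 | grep -vE 'fe80|host' |
  sed -e 's/^.*inet6 \([^ ]*\)\/.*$/\1/;t' -e d)

# sed only, with basic regular expressions and a trailing d.
out_bre=$(printf '%s\n' "$sample" |
  sed '/inet6/!d; /fe80/d; /host/d; s/.*inet6 //; s_/.*__p; d')

# sed only, with ERE and -n instead of d.
out_ere=$(printf '%s\n' "$sample" |
  sed -En '/inet6/!d; /fe80|host/d; s/.*inet6 //; s_/.*__p')

# The code-golf variant.
out_golf=$(printf '%s\n' "$sample" |
  sed -En '/fe80|host/!s_.*inet6 ([^ ]*)/.*$_\1_p')

printf '%s\n' "$out_orig" "$out_bre" "$out_ere" "$out_golf"
```

Each variable should contain only the global address from the sample; the loopback and link-local lines are filtered out in every variant.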

Philippos
  • 13,453
  • Thank you Philippos, great explanation!

    I have run into another issue, hoping you have a fix for: -bash-4.2$ ssh -q -o "StrictHostKeyChecking no" 192.168.210.21 "hostname;/usr/sbin/ip a | sed '/inet6/!d; /fe80/d; /host/d; s/.*inet6 //; s_/.*__p; d'" -bash: !d: event not found

    Googling around, I found out it has to do with history expansion; however, I couldn't find a fix that worked.

    Regards, Bjoern

    – Bjoern Apr 21 '22 at 18:16
  • Yes, there was a bug in bash for a long time; see https://unix.stackexchange.com/questions/390931/bash-history-expansion-inside-single-quotes-after-a-double-quote-inside-the-sam for details. If you can't update to bash 5.x, run set +H to switch off history expansion. I recommend that anyhow, because very few people use it, but many get annoyed by it. – Philippos Apr 22 '22 at 06:19
  • So here's something interesting; I first attempted to string the set command before the actual ip command: set +H && ssh -q -o "StrictHostKeyChecking no" 192.168.210.21 "hostname;/usr/sbin/ip a | sed '/inet6/!d; /fe80/d; /host/d; s/.*inet6 //; s_/.*__p; d'" This did not work, however if I execute the set command set +H first, then the actual ip command, it will work. I was hoping to do this with a one-liner.
    Any ideas?

    Regards,

    – Bjoern Apr 22 '22 at 21:32
  • Put the set +H in your ~/.bashrc to get rid of that buggy feature forever. (-: – Philippos Apr 24 '22 at 14:09
  • Thank you Philippos! – Bjoern Apr 26 '22 at 14:41
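
On the one-liner question in the comments: as far as I understand bash, history expansion is applied to a complete input line as soon as it is read, before any command on that line runs, so a set +H on the same line comes too late. One possible way around it, sketched here under the assumption that only the address is wanted and that the remote sed accepts ; between commands (GNU sed does), is a sed script that contains no ! at all. The sample text is hypothetical ip output, not taken from the thread.

```shell
# Hypothetical excerpt of `ip a` output (not from the thread).
sample='    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
    inet6 fd86:73ea:ff6b:0:141b:ca40:741b:ec0c/64 scope global noprefixroute
    inet6 fe80::211:22ff:fe33:4455/64 scope link'

# No "!" anywhere: unwanted lines are deleted first, and the s command only
# succeeds (and prints, because of the p flag) on lines containing "inet6 ".
bang_free=$(printf '%s\n' "$sample" |
  sed -n '/fe80/d; /host/d; s_.*inet6 \([^ ]*\)/.*_\1_p')
echo "$bang_free"

# The same script dropped into the ssh command from the comments (not run here):
# ssh -q -o "StrictHostKeyChecking no" 192.168.210.21 "hostname;/usr/sbin/ip a | sed -n '/fe80/d; /host/d; s_.*inet6 \([^ ]*\)/.*_\1_p'"
```

Because there is no exclamation mark left in the command line, an interactive bash has nothing to expand, regardless of whether set +H is in effect.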