3

I am new to sed and am having some troubles making it work.

What I want is this:

abc.ztx.com. A 132.123.12.44 ---> abc.ztx.com

I used the pattern below, but it doesn't seem to work:

echo "abc.ztx.com. A 132.123.12.44" | sed 's/\.\s.+//g'

I verified the regex on regex101.com, and the pattern \.\s.+ matches the part . A 132.123.12.44 perfectly. Why is it not working with sed?

Appreciate your help. Thank you.

Kusalananda
  • 333,661
Amey
  • 133
  • Do you have to use sed? This is a perfect job for cut, which is what I'd use here. If you aren't stuck with sed only, let me know and I'll post a cut answer. – ron rothman Mar 18 '20 at 00:37
  • @ron-rothman Yes, please. Anything that makes the job easier. – Amey Mar 18 '20 at 05:23

4 Answers

5

sed uses POSIX basic regular expressions (BRE) by default. \s is a PCRE (Perl-compatible regular expression) shorthand that is equivalent to the BRE [[:blank:]] (I think, matching spaces and tabs, or possibly [[:space:]], which matches a larger set of whitespace characters). The + is a POSIX extended regular expression (ERE) modifier, which is equivalent to \{1,\} in a BRE.

So try

sed 's/\.[[:blank:]].*//'

instead. You may replace [[:blank:]] by a space character if you don't need to match tabs:

sed 's/\. .*//'

Note that the g flag is unnecessary: .* consumes the rest of the line, so there can only ever be a single match. Also, the .+ that you use can simply be replaced by .* rather than .\{1,\}, as we don't care whether any further characters follow at all (we just delete all of them).
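To make the two BRE spellings concrete, here is a small sketch that runs both on the line from the question. The variable names (line, bre_star, bre_interval) are only for illustration, and it assumes a sed that supports POSIX character classes and BRE intervals:

```shell
line='abc.ztx.com. A 132.123.12.44'

# BRE with a POSIX character class and .* (zero or more characters)
bre_star=$(printf '%s\n' "$line" | sed 's/\.[[:blank:]].*//')

# Same substitution, keeping the "one or more" meaning of + via the BRE interval \{1,\}
bre_interval=$(printf '%s\n' "$line" | sed 's/\.[[:blank:]].\{1,\}//')

printf '%s\n' "$bre_star" "$bre_interval"
```

Both commands should print abc.ztx.com for this input, since the only dot followed by a blank is the one at the end of the host name.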

Kusalananda
  • 333,661
  • Thank you so much Kusalananda for answering the question and also explaining the first principles of sed Regex. – Amey Mar 16 '20 at 14:19
2

If you are using GNU/Linux, or any other GNU system, then you will have GNU sed. GNU sed has the -r option, which allows this.

Add the -r option to switch the regex dialect to extended regular expressions (ERE).

e.g.

echo "abc.ztx.com. A 132.123.12.44" | sed -r 's/\.\s.+//g'
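A variant of the same idea, as a sketch: it uses -E in place of -r and the character class [[:space:]] in place of \s, on the assumption that this is accepted by more sed implementations than the GNU-specific spelling. -E is understood by GNU sed and, as far as I know, by the BSD seds; the variable names are illustrative only.

```shell
line='abc.ztx.com. A 132.123.12.44'

# -E enables extended regular expressions, so + means "one or more"
# without a backslash; [[:space:]] stands in for the GNU-only \s
ere_result=$(printf '%s\n' "$line" | sed -E 's/\.[[:space:]].+//')

printf '%s\n' "$ere_result"
```

For the sample line this should print abc.ztx.com, the same result as the -r version.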

  • Wow! This is helpful too! Thank you.. – Amey Mar 16 '20 at 16:53
  • While -r does work, -E is the POSIX standard switch to do this. – David Conrad Mar 16 '20 at 18:07
  • @DavidConrad POSIX sed does not support extended regular expressions nor PCREs, and does not have -r nor -E options. The sed used in this answer is GNU sed. Most sed implementations support the non-standard -E option to enable the use of EREs, but only GNU sed (AFAIK) includes the \s expression (along with a few other PCRE shortcuts that GNU decided to put in their regular expression library). – Kusalananda Mar 16 '20 at 18:18
  • @Kusalananda The att sed version (from 2012-03-28) already included the -r and -E options. Also supports the \s. I don't recall now if that came from super-sed or from sed 3.02. –  Mar 17 '20 at 01:01
  • @DavidConrad There is no accepted -E in POSIX (yet; it may be added in future editions). But the idea sure came from other places, not from POSIX. –  Mar 17 '20 at 01:03
  • Yes, -E for Extended sounds clearer than -r. Both are not POSIX standard anyway. –  Mar 17 '20 at 01:05
0

Your question specifically asks about sed, but I would use cut for this.

If you can live with a trailing dot, then:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1

abc.ztx.com.

If you can't live with the trailing dot, then:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1 | rev | cut -d. -f2- | rev

abc.ztx.com

or:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1 | sed -e "s/\.$//"

abc.ztx.com
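The cut-then-sed pipeline above works the same way on several lines at once. A minimal sketch, assuming one record per line with single spaces between fields; the second record is made up for illustration and is not from the question:

```shell
# Sample input: the first line is from the question, the second is hypothetical
records='abc.ztx.com. A 132.123.12.44
mail.ztx.com. A 132.123.12.45'

# Take the first space-separated field, then strip one trailing dot
names=$(printf '%s\n' "$records" | cut -d' ' -f1 | sed 's/\.$//')

printf '%s\n' "$names"
```

Each output line should be the host name without its trailing dot.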
0

Yet another possibility using awk, with . (period + space) specified as field separator:

echo "abc.ztx.com. A 132.123.12.44" | awk -F '\\. ' '{print $1}'

abc.ztx.com
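For comparison, here is a sketch that puts the field-separator approach next to an alternative that keeps awk's default whitespace splitting and strips the trailing dot from the first field with sub(). The variable names are illustrative; the second command is one possible variant, not part of the answer above.

```shell
line='abc.ztx.com. A 132.123.12.44'

# Field separator is the regex "\. " (a literal dot followed by a space)
by_separator=$(printf '%s\n' "$line" | awk -F '\\. ' '{print $1}')

# Alternative: split on whitespace as usual, then remove a trailing dot from field 1
by_sub=$(printf '%s\n' "$line" | awk '{sub(/\.$/, "", $1); print $1}')

printf '%s\n' "$by_separator" "$by_sub"
```

Both should give abc.ztx.com here. The sub() variant does not depend on the dot being followed by exactly one space, so it may be the safer choice if the fields are separated by tabs.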
AdminBee
  • 22,803