sed: Remove everything from dot after FQDN

Question

I am new to sed and am having some trouble making it work.

What I want is this:

abc.ztx.com. A 132.123.12.44 ---> abc.ztx.com

I used the pattern below, but it doesn't seem to work:

echo "abc.ztx.com. A 132.123.12.44" | sed 's/\.\s.+//g'

I verified the regex on regex101.com, where the pattern \.\s.+ matches the part ". A 132.123.12.44" perfectly. Why does it not work with sed?

Appreciate your help. Thank you.

Do you have to use sed? This is a perfect job for cut, which is what I'd use here. If you aren't stuck with sed only, let me know and I'll post a cut answer. — ron rothman, Mar 18 '20 at 00:37
@ron-rothman Yes, please. Anything that makes the job easier. — Amey, Mar 18 '20 at 05:23

Kusalananda · Accepted Answer · 2020-03-16T18:37:25.103

sed uses POSIX basic regular expressions (BRE) by default. \s is a PCRE (Perl-compatible regular expression) shorthand for whitespace; the closest POSIX bracket expression is [[:space:]], although [[:blank:]], which matches only spaces and tabs, is enough for a single line of input. The + is a POSIX extended regular expression (ERE) quantifier, equivalent to \{1,\} in a BRE; in a BRE, an unescaped + just matches a literal plus sign.

So try

sed 's/\.[[:blank:]].*//'

instead. You may replace [[:blank:]] by a space character if you don't need to match tabs:

sed 's/\. .*//'

Note that there is no need to do the substitution with the g flag as there will only ever be a single match. Also, the .+ that you use could just be replaced by .* instead of .\{1,\} as we don't care whether there are any further characters at all (just delete all of them).
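As a quick check of the points above, the following sketch runs the original expression and the two BRE rewrites side by side. It assumes a POSIX-style shell; the variable names are only illustrative and not part of the question.

```shell
record='abc.ztx.com. A 132.123.12.44'

# In a BRE, "+" is an ordinary character, so ".+" only matches if the
# input contains a literal plus sign. Nothing is replaced here.
unchanged=$(printf '%s\n' "$record" | sed 's/\.\s.+//g')

# Portable BRE: a dot, one blank (space or tab), then the rest of the line.
with_blank=$(printf '%s\n' "$record" | sed 's/\.[[:blank:]].*//')

# The literal BRE translation of ".+" is ".\{1,\}"; same result for this input.
with_interval=$(printf '%s\n' "$record" | sed 's/\.[[:blank:]].\{1,\}//')

printf '%s\n' "$unchanged" "$with_blank" "$with_interval"
```

The first line printed is the untouched input; the other two are the bare host name.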

See also: Why does my regular expression work in X but not in Y?

Thank you so much Kusalananda for answering the question and also explaining the first principles of sed Regex. — Amey, Mar 16 '20 at 14:19

ctrl-alt-delor · Answer 2 · score 2 · 2020-03-16T20:59:08.787

If you are using GNU/Linux, or any other GNU system, you will have GNU sed, which has the -r option.

Adding -r switches the regex dialect from basic to extended regular expressions, so the + in your pattern works as a quantifier.

e.g.

echo "abc.ztx.com. A 132.123.12.44" | sed -r 's/\.\s.+//g'
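A variant of the same idea that leans less on GNU-only features might look like the sketch below. It assumes your sed accepts -E (GNU sed and the BSD seds do) and swaps \s for the POSIX class [[:space:]]; the variable names are illustrative.

```shell
record='abc.ztx.com. A 132.123.12.44'

# -E selects extended regular expressions, so "+" is a quantifier.
# [[:space:]] stands in for \s, which not every sed understands.
fqdn=$(printf '%s\n' "$record" | sed -E 's/\.[[:space:]].+//')

printf '%s\n' "$fqdn"
```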

edited Mar 16 '20 at 20:59

answered Mar 16 '20 at 14:22


Wow! This is helpful too! Thank you.. – Amey Mar 16 '20 at 16:53
While -r does work, -E is the POSIX standard switch to do this. – David Conrad Mar 16 '20 at 18:07
@DavidConrad POSIX sed does not support extended regular expressions nor PCREs, and does not have -r nor -E options. The sed used in this answer is GNU sed. Most sed implementations support the non-standard -E option to enable the use of EREs, but only GNU sed (AFAIK) includes the \s expression (along with a few other PCRE shortcuts that GNU decided to put in their regular expression library). – Kusalananda Mar 16 '20 at 18:18
@Kusalananda The att sed version (from 2012-03-28) already included the -r and -E options. Also supports the \s. I don't recall now if that came from super-sed or from sed 3.02. – Mar 17 '20 at 01:01
@DavidConrad There is no accepted -E in POSIX (yet; it may be added in future editions). But the idea sure came from other places, not from POSIX. – Mar 17 '20 at 01:03
Yes, -E for Extended sounds clearer than -r. Both are not POSIX standard anyway. – Mar 17 '20 at 01:05

score 0 · Answer 3 · answered Mar 18 '20 at 11:43

Your question specifically asks about sed, but I would use cut for this.

If you can live with a trailing dot, then:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1

abc.ztx.com.

If you can't live with the trailing dot, then:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1 | rev | cut -d. -f2- | rev

abc.ztx.com

or:

$ echo "abc.ztx.com. A 132.123.12.44" | cut -d" " -f1 | sed -e "s/\.$//"

abc.ztx.com
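If the record is already held in a shell variable, the trailing dot can also be dropped without a second process. This is a sketch that assumes a POSIX shell and uses its suffix-removal expansion rather than anything from the answer itself; record, first and fqdn are illustrative names.

```shell
record='abc.ztx.com. A 132.123.12.44'

# Take the first space-separated field, as in the cut examples above.
first=$(printf '%s\n' "$record" | cut -d' ' -f1)

# ${var%.} removes a single trailing dot if there is one.
fqdn=${first%.}

printf '%s\n' "$fqdn"
```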

AdminBee · Answer 4 · score 0 · 2020-03-18T11:59:33.767

Yet another possibility is awk, with ". " (period + space) specified as the field separator:

echo "abc.ztx.com. A 132.123.12.44" | awk -F '\\. ' '{print $1}'

abc.ztx.com
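If the fields might be separated by tabs or several spaces, a variation is to keep awk's default whitespace splitting and strip the dot from the first field instead. This is only a sketch; the tab and double space in the sample input are made up to show the difference.

```shell
# Fields are split on runs of spaces or tabs (awk's default), then the
# trailing dot is removed from the first field only.
fqdn=$(printf 'abc.ztx.com.\tA  132.123.12.44\n' |
  awk '{ sub(/\.$/, "", $1); print $1 }')

printf '%s\n' "$fqdn"
```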

edited Mar 18 '20 at 11:59

answered Mar 18 '20 at 11:52

