I have a string of the format [0-9]+\.[0-9]+\.[0-9]. I need to extract the first, second, and third numbers separately. As I understand it, capture groups should be capable of this. I should be able to use sed "s/\([0-9]*\)/\1/g" to get the first number, sed "s/\([0-9]*\)/\2/g" to get the second number, and sed "s/\([0-9]*\)/\3/g" to get the third number. In each case, though, I am getting the whole string. Why is this happening?
64

Melab
- 4,048
3 Answers
89
We can't give you a full answer without an example of your input, but I can tell you that your understanding of capture groups is wrong. You don't use them sequentially; back-references only refer to the groups in the regex on the left-hand side of the same substitution operator. If you capture, for example, /(foo)(bar)(baz)/, then foo will be \1, bar will be \2 and baz will be \3. You can't do s/(foo)/\1/; s/(bar)/\2/, because in the second s/// call there is only one captured group, so \2 will not be defined.
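To make that concrete, here is a small sketch of what the commands from the question end up doing. The sample input 123.456.78 and the variable names are assumptions for illustration, not something taken from the question. With a single group, \1 just puts each run of digits back where it was, so the line comes out unchanged; \2 refers to a group that does not exist in that pattern.

```shell
# Hypothetical sample input; the question did not give one.
input="123.456.78"

# One group, back-reference \1: every run of digits is replaced by itself,
# so the whole string comes back unchanged.
unchanged=$(printf '%s\n' "$input" | sed 's/\([0-9]*\)/\1/g')
printf '%s\n' "$unchanged"

# One group, back-reference \2: there is no second group in this pattern.
# GNU and BSD sed reject the script with an error; other implementations
# may behave differently, so the result is only reported here.
if printf '%s\n' "$input" | sed 's/\([0-9]*\)/\2/g' >/dev/null 2>&1; then
    ref2_status="accepted"
else
    ref2_status="rejected"
fi
printf 'script with \\2 was %s\n' "$ref2_status"
```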
So, to capture your three groups of digits, you would need to do:
sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\1 : \2 : \3/'
Or, the more readable:
sed -E 's/([0-9]*)\.([0-9]*)\.([0-9]*)/\1 : \2 : \3/'
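If the goal is to end up with the three numbers in separate shell variables, one possible approach is to run the substitution once and let read split the result. This is only a sketch: the sample input and the variable names (version, first, second, third) are made up for the example, and it assumes the numbers contain no whitespace.

```shell
# Hypothetical sample input.
version="123.456.78"

# Run a single substitution that prints the three groups separated by
# spaces, then let read split them into three variables.
read -r first second third <<EOF
$(printf '%s\n' "$version" | sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\1 \2 \3/')
EOF

printf 'first=%s second=%s third=%s\n' "$first" "$second" "$third"
```

Doing it in one sed call means the pattern only has to be written once, rather than three times with a different back-reference each time.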

terdon
- 242,166
@JoshM. you need to escape them in order for them to be used to capture patterns. Normally /(foo)/ in sed will match a literal ( character, followed by foo and then a literal ). If you want to capture a group, you need to either escape the parentheses or use the -E option. – terdon Sep 26 '18 at 11:16
I almost always use the -r flag so I assume that's why I haven't run into this yet. – Josh M. Sep 26 '18 at 16:57
@JoshM. yes, the -r flag will also do that, but it isn't portable. GNU sed supports it but many others do not. The -E is more universal. – terdon Sep 26 '18 at 17:12
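The difference described in these comments can be checked with a short sketch like the one below. It assumes a sed that accepts -E (current GNU and BSD versions do); the sample strings and variable names are invented for the example.

```shell
# Hypothetical sample input.
input="123.456.78"

# Basic regular expressions (default): groups need escaped parentheses.
bre=$(printf '%s\n' "$input" | sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\2/')

# Extended regular expressions (-E): plain parentheses form groups.
ere=$(printf '%s\n' "$input" | sed -E 's/([0-9]*)\.([0-9]*)\.([0-9]*)/\2/')

# Without -E, unescaped parentheses only match literal ( and ) characters.
literal=$(printf '%s\n' "(foo)" | sed 's/(foo)/X/')
no_match=$(printf '%s\n' "foo" | sed 's/(foo)/X/')

printf 'bre=%s ere=%s literal=%s no_match=%s\n' "$bre" "$ere" "$literal" "$no_match"
```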
18
Example:
$ echo "123.456.78" |sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\1/'
123
$ echo "123.456.78" |sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\2/'
456
$ echo "123.456.78" |sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\3/'
78
Or, all together:
$ echo "123.456.78" |sed 's/\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)/\1 : \2 : \3/'
123 : 456 : 78

Andrea Ligios
- 103

jai_s
- 1,500
11
Use sed with -r (--regexp-extended) to avoid having to escape the parentheses:
echo "1234.567.89" | sed -r 's/([0-9]+)\.([0-9]+)\.([0-9]+)/\1, \2, \3/'
1234, 567, 89 #output

Surya
- 479
's/\([0-9]\)\([0-9]\)\([0-9]\).*/\1\2\3/' to capture individual numbers. – Munir Feb 16 '16 at 18:10