extract value between two search patterns on same line

Question

I have the following in a file output.dat. I need to extract the value between dn: uid= and ,ou= on each line.

dn: uid=user1,ou=Active,ou=Member,dc=domain,dc=org
dn: uid=user2@abc.com,ou=Active,ou=Member,dc=domain,dc=org
dn: uid=usertest,ou=Active,ou=Member,dc=domain,dc=org
dn: uid=abc1,ou=Active,ou=Member,dc=domain,dc=org

I tried using
```
sed -e '/dn: uid=/,/,ou=/p' output.dat
```
but it returns the complete line instead of the value.

When I tried to use

sed -e '/dn: uid=/,/,ou=/\1/p' output.dat

I got the following error:

sed: -e expression #1, char 18: unknown command: `\'

steeldriver · Accepted Answer · 2014-05-21T23:16:09.087

If you have a version of GNU grep with PCRE (-P) support, then, assuming you mean the first occurrence of ,ou= on each line:

grep -oP '(?<=dn: uid=).+?(?=,ou=)' file

If you want to match up to the last ,ou= on the line (the second one in your sample), you can remove the non-greedy ? modifier:

grep -oP '(?<=dn: uid=).+(?=,ou=)' file

The expressions in parentheses are zero-length assertions (aka lookarounds), meaning that they must match at that position but are not returned as part of the result. You could do the same thing natively in perl, e.g.

perl -ne 'print "$1\n" if /(?<=dn: uid=)(.+?)(?=,ou=)/' file
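The difference between the lazy and greedy forms above can be checked on a single line from the question. This is only a sketch: it assumes a GNU grep built with PCRE support, and the variable names are made up for illustration.

```shell
# Sample line copied from the question.
line='dn: uid=user1,ou=Active,ou=Member,dc=domain,dc=org'

# Lazy .+? stops at the first ",ou=" that follows the value.
lazy=$(printf '%s\n' "$line" | grep -oP '(?<=dn: uid=).+?(?=,ou=)')

# Greedy .+ runs on to the last ",ou=" on the line.
greedy=$(printf '%s\n' "$line" | grep -oP '(?<=dn: uid=).+(?=,ou=)')

printf 'lazy:   %s\ngreedy: %s\n' "$lazy" "$greedy"
# lazy:   user1
# greedy: user1,ou=Active
```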

It's possible to do something similar in sed, using regular (non zero-length) grouping e.g. (for GNU sed - other varieties may need additional escaping)

sed -rn 's/(.*dn: uid=)([^,]+)(,ou=.*)/\2/p' file

or simplifying slightly

sed -rn 's/.*dn: uid=([^,]+),ou=.*/\1/p' file

Note the [^,] is a bit of a hack here, since sed doesn't have a true non-greedy match option.
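To illustrate why the bracket expression is needed, here is a small comparison of a greedy group against [^,]+ on one sample line. It is a sketch rather than part of the original answer; it uses -E, which GNU sed accepts as an equivalent of -r, and the variable names are hypothetical.

```shell
# Sample line copied from the question.
line='dn: uid=user1,ou=Active,ou=Member,dc=domain,dc=org'

# Greedy (.*) gives back only as much as it must, so it ends at the last ",ou=".
greedy_sed=$(printf '%s\n' "$line" | sed -E -n 's/.*dn: uid=(.*),ou=.*/\1/p')

# [^,]+ cannot cross a comma, so it ends at the first one.
bracket_sed=$(printf '%s\n' "$line" | sed -E -n 's/.*dn: uid=([^,]+),ou=.*/\1/p')

printf 'greedy:  %s\nbracket: %s\n' "$greedy_sed" "$bracket_sed"
# greedy:  user1,ou=Active
# bracket: user1
```

The bracket form only works because the wanted value never contains a comma; with a multi-character terminator that could also appear inside the value, a tool with lazy quantifiers would be the safer choice.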

Afterthought: although it's not exactly what you asked, it looks like what you actually want to do is read comma-separated name=value pairs from a file, and then further split the value of the first field from its name. You could achieve that in many ways - including

awk -F, '{sub(".*=","",$1); print $1}' file

or a pure-bash solution such as

while IFS=, read -r a b c d; do printf '%s\n' "${a#*=}"; done < file
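The ${a#*=} expansion in the loop above can be tried on a single string without reading a file. A minimal sketch, using only POSIX parameter expansion; the variable names are chosen here for illustration.

```shell
# Sample line copied from the question.
line='dn: uid=user2@abc.com,ou=Active,ou=Member,dc=domain,dc=org'

# %%,* removes the longest suffix that starts with a comma,
# leaving only the first comma-separated field.
first_field=${line%%,*}

# #*= removes the shortest prefix that ends in "=",
# leaving only the value of that field.
uid_value=${first_field#*=}

printf '%s\n' "$uid_value"
# user2@abc.com
```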

score 4 · Answer 2 · answered May 21 '14 at 20:51

This is a good job for awk. You can split the string instead of attempting to use a regex. Here is a solution:

$ awk -F= '{ split($2,arr,","); print arr[1]  }' test.txt
user1
user2@abc.com
usertest
abc1

jordanm

Thanks for quick answer. Is it possible with sed ? – Raza May 21 '14 at 21:00

score 3 · Answer 3 · edited Jun 11 '20 at 12:04

With sed:

sed 's/[^=]*=\([^,]\+\),.*/\1/' file

This assumes that the = in uid= is the first = on the line, and that you want to stop at the first comma after it (here, the one that starts the first ,ou=).

Explanation

This looks for any number of non-= characters ([^=]*) followed by =, then matches and saves as many non-commas as it can find (\([^,]\+\)), followed by a comma and the rest of the line (,.*). This means it will replace everything up to and including the first = and everything from the first comma onwards with whatever non-comma characters it finds after the first = on the line.
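Since the pattern never mentions uid= or ,ou=, it may help to compare it with a variant that names both delimiters. This is a sketch under assumptions: the second input line is hypothetical (it is not in the question) and exists only to show what happens when another = precedes uid=, and \+ is replaced by * so that the expressions stay within POSIX basic regular expressions.

```shell
# First line is from the question; the second is a made-up line
# with an extra "=" before uid=.
input='dn: uid=user1,ou=Active,ou=Member,dc=domain,dc=org
note=x; dn: uid=abc1,ou=Active,ou=Member,dc=domain,dc=org'

# Pattern keyed on the first "=" of each line.
first_eq=$(printf '%s\n' "$input" | sed 's/[^=]*=\([^,]*\),.*/\1/')

# Variant keyed explicitly on uid= and ,ou=.
anchored=$(printf '%s\n' "$input" | sed -n 's/.*uid=\([^,]*\),ou=.*/\1/p')

printf 'first "=":\n%s\nanchored:\n%s\n' "$first_eq" "$anchored"
# first "=":
# user1
# x; dn: uid=abc1
# anchored:
# user1
# abc1
```

On the data in the question both forms give the same result; the explicit form only matters if a line can contain an earlier =.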

answered May 21 '14 at 21:19

Joseph R.

this didn't work. I am getting the following error sed 's/=([^,]+),/\1' output.dat sed: -e expression #1, char 17: unterminated `s' command – Raza May 21 '14 at 21:22
@Salton Yes it had a missing /. I corrected it and tested it. – Joseph R. May 21 '14 at 21:23
Cool. It works now. It will help if you could please explain this. In your search pattern I don't see "uid" and ",ou" – Raza May 21 '14 at 21:46
@Salton Please see if this helps. – Joseph R. May 21 '14 at 22:06

terdon · Answer 4 · 2014-05-22T00:45:17.683

Some more choices, in order of length:

GNU grep with PCREs
```
grep -oP 'uid=\K[^,]+' file 
```
The \K discards everything matched up to that point, which, combined with the -o switch, causes grep to print only the longest stretch of non-, characters that comes after the uid=.
awk
```
awk -F'[=,]' '{print $2}' file 
```
-F'[=,]' sets the field separator to be either = or , so the 2nd field is the user name.
sed
```
sed -r 's/.{8}([^,]*).*/\1/' file 
```
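That will match the first 8 characters (.{8}, here dn: uid=), capture the longest stretch of non-, characters as \1 and replace the whole line with \1. It relies on every line starting with exactly dn: uid= and no leading whitespace.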
perl
```
perl -pe 's/.+?=([^,]+).*/$1/' file 
```
The -pe means "print every line after applying the script given by -e". The s/// is the substitution operator, and the regular expression matches everything up to the 1st = (.+?, where the ? makes it match the shortest possible string) and then captures the longest stretch of non-, characters after that. The s/// replaces what was matched with what was captured (what was inside the parentheses).
cut
```
cut -d'=' -f 2 file | cut -d ',' -f 1 
```
The -d sets the delimiter to = so the 2nd (-f 2) field is username,ou. The second cut uses , as delimiter and prints the username alone.
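To see why $2 is the user name in the awk variant above, the fields produced by -F'[=,]' can be listed for one line. A small sketch; the loop and the variable names are added here for illustration only.

```shell
# Sample line copied from the question.
line='dn: uid=user1,ou=Active,ou=Member,dc=domain,dc=org'

# Print every field with its index; both "=" and "," act as separators.
fields=$(printf '%s\n' "$line" |
  awk -F'[=,]' '{ for (i = 1; i <= NF; i++) printf "%d:%s\n", i, $i }')
printf '%s\n' "$fields"
# 1:dn: uid
# 2:user1
# 3:ou
# 4:Active
# ... (10 fields in total)

# The second field is the value that follows the first "=".
uid_field=$(printf '%s\n' "$line" | awk -F'[=,]' '{ print $2 }')
printf '%s\n' "$uid_field"
# user1
```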
