20

I need to identify the position of a character in a string.

For example, the string is RAMSITALSKHMAN|1223333.

grep -n '[^a-zA-Z0-9\$\~\%\#\^]'

How do I find the position of | in the given string?

Braiam
  • 35,991
user82782
  • 201

8 Answers

35

You can use -b to get the byte offset of each match. For single-byte text this is the same as the zero-based character position, but not for multibyte encodings such as UTF-8.

$ echo "RAMSITALSKHMAN|1223333" | grep -aob '|'
14:|

In the above, the -a switch tells grep to treat the input as text, which is necessary when operating on binary files, and the -o switch makes it output only the matching character(s). With -o, the offset reported by -b is that of the match itself rather than of the line.

If you only want the position, you can use grep to extract only the position:

$ echo "RAMSITALSKHMAN|1223333" | grep -aob '|' | grep -oE '[0-9]+'
14

If you get weird output, check whether grep has colors enabled. You can disable colors by passing --color=never to grep, or by prefixing the grep command with a \ (which bypasses any aliases), e.g.:

$ echo "RAMSITALSKHMAN|1223333" | grep -aob '|' --color=never | \grep -oE '^[0-9]+'
14

For a string that returns multiple matches, pipe through head -n1 to get the first match.

Note that I use both techniques in the colors example above, and that the backslash prefix only bypasses shell aliases; it will not help if grep is wrapped by an executable (a script or otherwise).
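The pieces above can be combined into one small helper. This is only a sketch, assuming GNU grep; the function name first_byte_offset is made up for illustration. It uses command to bypass aliases and shell functions, -F so the search text is taken literally, and cut instead of a second grep to keep only the offset.

```shell
# first_byte_offset STRING TEXT  (hypothetical helper, assumes GNU grep)
# Prints the zero-based byte offset of the first occurrence of TEXT in STRING.
# Prints nothing if TEXT does not occur.
first_byte_offset() {
    printf '%s\n' "$1" |
        command grep -aobF -- "$2" |   # lines of the form "offset:match"
        head -n 1 |                    # keep the first match only
        cut -d: -f1                    # keep the offset, drop the match
}

first_byte_offset 'RAMSITALSKHMAN|1223333' '|'
```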

runejuhl
  • 583
13

If you're using the shell, you can use purely built-in operations without spawning any external processes:

$ str="RAMSITALSKHMAN|1223333"
$ tmp="${str%%|*}"
$ if [ "$tmp" != "$str" ]; then
> echo ${#tmp}
> fi
14
$ 

This uses a parameter expansion to strip the longest suffix that matches | followed by any string (that is, everything from the first | onward) and saves the result in a temporary variable. It is then just a matter of measuring the length of the temporary variable to get the index of |.

Note that the if checks whether the | exists at all in the original string. If it doesn't, the temporary variable will be the same as the original.

Note also that this provides the zero-based index of |, which is generally useful when indexing bash strings. However, if you require the one-based index, you can do this:

$ echo $((${#tmp}+1))
15
$ 
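The same idea can be wrapped in a function. This is a minimal sketch, not part of the original answer, and the name index_of is hypothetical. It quotes the search text inside the expansion, so that characters such as * or ? are matched literally instead of being treated as glob patterns.

```shell
# index_of STRING TEXT  (hypothetical helper)
# Prints the zero-based index of the first occurrence of TEXT in STRING.
# Prints nothing and returns 1 if TEXT does not occur.
# The body runs in a subshell, so the variables do not leak to the caller.
index_of() (
    str=$1
    needle=$2
    # Quoting "$needle" makes the shell match it literally, not as a pattern.
    prefix=${str%%"$needle"*}
    if [ "$prefix" = "$str" ]; then
        exit 1                      # nothing was stripped: TEXT not found
    fi
    printf '%s\n' "${#prefix}"
)

index_of 'RAMSITALSKHMAN|1223333' '|'
```

Note that ${#prefix} counts characters in bash but may count bytes in other shells, so the two can differ for multibyte text.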
  • 1
    probably the best answer, this syntax is beautiful and so fast and easy to use when you understand its meaning, long live to the core – vdegenne Dec 30 '16 at 22:49
  • It's also broken in the presence of special unescaped characters, which is usually why people start asking for 'positions of literal text' and not 'substrings'. – i30817 Dec 12 '21 at 20:20
11

Try:

printf '%s\n' 'RAMSITALSKHMAN|1223333.' | grep -o . | grep -n '|'

output:

15:|

This gives you the one-based position: grep -o . prints each character on its own line, and grep -n then reports the line number of the matching one.
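If only the number of the first occurrence is wanted, the pipeline can be extended as follows. This is a sketch assuming a grep that supports -m (GNU and BSD grep do); the function name first_char_pos is made up for illustration.

```shell
# first_char_pos STRING CHAR  (hypothetical helper)
# Prints the one-based position of the first occurrence of the single
# character CHAR in STRING, or nothing if it does not occur.
first_char_pos() {
    printf '%s\n' "$1" |
        grep -o . |                  # one character per line
        grep -n -m 1 -F -- "$2" |    # first matching line, as "number:char"
        cut -d: -f1                  # keep only the line number
}

first_char_pos 'RAMSITALSKHMAN|1223333.' '|'
```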

cuonglm
  • 153,898
5

You can use awk's index function to return the position in characters where the match occurs:

echo "RAMSITALSKHMAN|1223333"|awk 'END{print index($0,"|")}'
15

If you don't mind using Perl's index function, this handles reporting zero, one or more occurrences of a character:

echo "|abc|xyz|123456|zzz|" | \
perl -nle '$pos=-1;while (($off=index($_,"|",$pos))>=0) {print $off;$pos=$off+1}'

For readability only, the pipeline has been split across two lines.

As long as the target character is found, Perl's index returns a non-negative, zero-based position. Hence, the string "|abc|xyz|123456|zzz|" when parsed returns positions 0, 4, 8, 15 and 19.
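A comparable loop can be written in awk alone. The following is a sketch rather than part of the original answer; the function name all_positions is hypothetical. It repeatedly calls index on the remainder of the line and prints one-based positions, since that is what awk's index returns.

```shell
# all_positions STRING TEXT  (hypothetical helper)
# Prints the one-based position of every occurrence of TEXT in STRING,
# one per line. Note that awk -v interprets backslash escapes in TEXT.
all_positions() {
    printf '%s\n' "$1" | awk -v ch="$2" '{
        s = $0        # part of the line not yet searched
        base = 0      # number of characters already consumed
        while ((i = index(s, ch)) > 0) {
            print base + i
            base += i
            s = substr(s, i + 1)
        }
    }'
}

all_positions '|abc|xyz|123456|zzz|' '|'
```

For the sample string this prints 1, 5, 9, 16 and 20, which are the Perl positions shifted by one.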

cuonglm
  • 153,898
JRFerguson
  • 14,740
3

We can also do it using "expr match" or "expr index".

expr match $string $substring where $substring is a RE.

echo `expr match "RAMSITALSKHMAN|1223333" '[A-Z]*.|'`

The above gives you the position because expr match returns the length of the matched substring.
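The same match-length idea can also be written with the POSIX : operator of expr, which likewise prints the number of characters matched. This is a sketch under the assumption that the string does not look like an expr operator or option (for example, it does not start with -); the pattern [^|]*| matches everything up to and including the first |.

```shell
string='RAMSITALSKHMAN|1223333'

# The regex is anchored at the start of the string; the number of
# characters matched is the one-based position of the first "|".
# expr prints 0 (and exits with status 1) when there is no match.
pos=$(expr "$string" : '[^|]*|')
echo "$pos"
```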

But to search for the index more directly:

mystring="RAMSITALSKHMAN|122333"
echo `expr index "$mystring" '|'`
bluefoggy
  • 662
  • I don't have enough reputation for commenting anywhere else. I personally liked answer given by @Gnouc . However why to use awk and make it complex when we can do simple things using 'expr' – bluefoggy Sep 02 '14 at 15:29
  • @kingsdeb it's just a suggestion. – Avinash Raj Sep 02 '14 at 16:08
  • @kingsdeb: Because (1) the awk solutions can trivially be modified to report this information on every line of a file (all you have to do is remove the END, which was never really necessary, from JRFerguson’s answer, and Avinash Raj’s does it already); whereas, to do that with the expr solution, you would need to add an explicit loop (and Gnouc’s answer is not easily adaptable to do that at all, that I can see), and (2) the awk solutions can be adapted to report all the matches in each line somewhat more easily than the expr solution (in fact, Avinash Raj’s does that already, too). – G-Man Says 'Reinstate Monica' Sep 02 '14 at 17:37
  • Why would you use echo `...` here? – Stéphane Chazelas Sep 03 '14 at 11:29
  • This is to just show the output here – bluefoggy Sep 03 '14 at 12:13
3

Another awk command:

$ echo 'RAMSITALSKHMAN|1223333'| awk 'BEGIN{ FS = "" }{for(i=1;i<=NF;i++){if($i=="|"){print i;}}}'
15

Setting the field separator to the null string makes awk treat each character in the record as a separate field. This works in gawk, but POSIX leaves the behavior of an empty FS unspecified.

Avinash Raj
  • 3,703
1

Some alternatives include:

Similar to Gnouc's answer, but with the shell:

echo 'RAMSITALSKHMAN|1223333' |
tr -c \| \\n | 
sh

sh: line 15: syntax error near unexpected token `|
sh: line 15: `|'

With sed and dc, possibly spanning multiple lines:

echo 'RAMSITALSKHMAN|1223333' |
sed 's/[^|]/1+/g;s/|/p/;1i0 1+' |dc

15

With $IFS:

IFS=\|; set -f; set -- ${0+RAMSITALSKHMAN|1223333}; echo $((${#1}+1))

That will also tell you how many there are, like so:

echo $(($#-1))
mikeserv
  • 58,310
0

Python answer

text='skfwlefk|3oeio|ajda'
print([idx for idx, char in enumerate(text) if char == '|'])
# prints '[8, 14]'
tinnick
  • 300
  • 2
  • 10