I need to identify the position of a character in a string.
For example, the string is RAMSITALSKHMAN|1223333.
grep -n '[^a-zA-Z0-9\$\~\%\#\^]'
How do I find the position of | in the given string?
You can use -b to get the zero-based byte offset, which is the same as the character position for simple single-byte text (but not for text containing multi-byte characters, e.g. UTF-8).
$ echo "RAMSITALSKHMAN|1223333" | grep -aob '|'
14:|
In the above, I use the -a switch to tell grep to treat the input as text (necessary when operating on binary files), and the -o switch to output only the matching character(s).
If you only want the position, you can use grep to extract only the position:
$ echo "RAMSITALSKHMAN|1223333" | grep -aob '|' | grep -oE '[0-9]+'
14
If you get weird output, check whether grep has colors enabled. You can disable colors by passing --color=never to grep, or by prefixing the grep command with a \ (which bypasses any aliases), e.g.:
$ echo "RAMSITALSKHMAN|1223333" | grep -aob '|' --color=never | \grep -oE '^[0-9]+'
14
For a string that returns multiple matches, pipe through head -n1
to get the first match.
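As a rough sketch, the steps above can be wrapped in a small shell function. The name first_byte_offset is hypothetical, and the sketch assumes GNU grep (where -b combined with -o reports the offset of each match) and single-line input. The last call illustrates the caveat about multi-byte text: the two bytes \303\251 encode a single UTF-8 character, so | is reported at byte offset 2 even though it is the second character.

```shell
# Hypothetical helper (not part of grep): print the zero-based byte offset
# of the first occurrence of the fixed string $2 in the text $1.
# Prints nothing if there is no match. Assumes GNU grep.
first_byte_offset() {
    # -a: treat input as text, -o: print only the match,
    # -b: prefix each match with its byte offset, -F: no regex interpretation
    printf '%s\n' "$1" | grep -aobF -- "$2" | head -n1 | cut -d: -f1
}

first_byte_offset 'RAMSITALSKHMAN|1223333' '|'    # prints 14
first_byte_offset 'a|b|c' '|'                     # prints 1 (first match only)
first_byte_offset "$(printf '\303\251|')" '|'     # prints 2 (byte offset, not character position)
```

Using -F keeps characters such as . or * from being interpreted as regular expression syntax when they are the character being searched for.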
Note that I use both in the above, and note that the latter will not work if grep is "aliased" through an executable (script or otherwise), only when using aliases.
^
:)
– runejuhl
Sep 02 '14 at 19:48
0:|
as output-- because 0 is the byte position of the beginning of the line where |
is found.
– Alex
May 25 '17 at 14:20
grep (GNU grep) 2.27
. Are you perhaps using OS X?
– runejuhl
May 26 '17 at 12:03
If you're using the bash shell, you can use purely built-in operations without the need for spawning external processes such as grep or awk:
$ str="RAMSITALSKHMAN|1223333"
$ tmp="${str%%|*}"
$ if [ "$tmp" != "$str" ]; then
> echo ${#tmp}
> fi
14
$
This uses parameter expansion to remove the first | and everything after it (the longest suffix matching |*), saving the result in a temporary variable. The length of that temporary variable is then the index of |.
Note that the if checks whether | exists at all in the original string. If it doesn't, the temporary variable will be the same as the original.
Note also that this gives the zero-based index of |, which is generally useful when indexing bash strings. However, if you require the one-based index, you can do this:
$ echo $((${#tmp}+1))
15
$
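The same idea can be packaged as a reusable function. This is only a sketch: index_of is a hypothetical name, and printing -1 for "not found" is an assumed convention rather than anything bash defines.

```shell
# Hypothetical helper: print the zero-based index of the first occurrence
# of $2 in $1, or -1 if $2 does not occur. Uses only shell built-ins.
index_of() {
    local str="$1" needle="$2" prefix
    # Strip the longest suffix that starts with the needle; quoting the
    # needle makes the shell match it literally rather than as a glob.
    prefix=${str%%"$needle"*}
    if [ "$prefix" = "$str" ]; then
        echo -1                 # nothing was stripped: needle not found
    else
        echo "${#prefix}"       # length of what precedes the needle
    fi
}

index_of 'RAMSITALSKHMAN|1223333' '|'    # prints 14
index_of 'no pipe here' '|'              # prints -1
```

Note that ${#prefix} counts characters according to the current locale, so unlike a byte offset it should stay correct for multi-byte text in a matching locale.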
Try:
printf '%s\n' 'RAMSITALSKHMAN|1223333.' | grep -o . | grep -n '|'
output:
15:|
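This gives you the one-based position: grep -o . prints each character on its own line, and grep -n then reports the line number of the line containing |.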
$(( 'command' - 1))
to change it to index 0.
– Alex
May 25 '17 at 14:22
You can use awk's index
function to return the position in characters where the match occurs:
echo "RAMSITALSKHMAN|1223333"|awk 'END{print index($0,"|")}'
15
If you don't mind using Perl's index function, this handles reporting zero, one or more occurrences of a character:
echo "|abc|xyz|123456|zzz|" | \
perl -nle '$pos=-1;while (($off=index($_,"|",$pos))>=0) {print $off;$pos=$off+1}'
The pipeline has been split across two lines for readability only.
As long as the target character is found, index returns a non-negative, zero-based offset. Hence, the string "|abc|xyz|123456|zzz|" yields positions 0, 4, 8, 15 and 19.
RAMSITALSKHMAN|1|223333
– cuonglm
Sep 02 '14 at 15:05
We can also do it using "expr match" or "expr index"
expr match $string $substring where $substring is a RE.
echo `expr match "RAMSITALSKHMAN|1223333" '[A-Z]*.|'`
The above gives the position because expr match returns the length of the matched substring.
To search for the index more directly:
mystring="RAMSITALSKHMAN|122333"
echo `expr index "$mystring" '|'`
awk
solutions can trivially be modified to report this information on every line of a file (all you have to do is remove the END
, which was never really necessary, from JRFerguson’s answer, and Avinash Raj’s does it already); whereas, to do that with the expr
solution, you would need to add an explicit loop (and Gnouc’s answer is not easily adaptable to do that at all, that I can see), and (2) the awk
solutions can be adapted to report all the matches in each line somewhat more easily than the expr
solution (in fact, Avinash Raj’s does that already, too).
– G-Man Says 'Reinstate Monica'
Sep 02 '14 at 17:37
$ echo 'RAMSITALSKHMAN|1223333'| awk 'BEGIN{ FS = "" }{for(i=1;i<=NF;i++){if($i=="|"){print i;}}}'
15
By setting the field separator to the null string, awk treats each character in the record as a separate field.
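Splitting on an empty FS is not supported by every awk (POSIX leaves the behavior unspecified), so here is a sketch of an alternative that relies only on index and substr. The wrapper name all_positions is hypothetical; it takes the string and the character to look for, and prints every one-based position.

```shell
# Hypothetical helper: print the one-based position of every occurrence
# of the character $2 in the string $1, one position per line.
# Caveat: awk interprets backslash escapes in -v values, so a literal
# backslash as the search character would need extra care.
all_positions() {
    printf '%s\n' "$1" | awk -v c="$2" '{
        off = 0                      # characters already consumed
        rest = $0                    # part of the line still to search
        while ((p = index(rest, c)) > 0) {
            print off + p            # position within the whole line
            off += p
            rest = substr(rest, p + 1)
        }
    }'
}

all_positions 'RAMSITALSKHMAN|1223333' '|'    # prints 15
all_positions 'a|b|c' '|'                     # prints 2 and 4 on separate lines
```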
Some alternatives include:
Similar to Gnouc's answer, but with the shell:
echo 'RAMSITALSKHMAN|1223333' |
tr -c \| \\n |
sh
sh: line 15: syntax error near unexpected token `|
sh: line 15: `|'
With sed and dc, possibly spanning multiple lines:
echo 'RAMSITALSKHMAN|1223333' |
sed 's/[^|]/1+/g;s/|/p/;1i0 1+' |dc
15
with $IFS
...
IFS=\|; set -f; set -- ${0+RAMSITALSKHMAN|1223333}; echo $((${#1}+1))
That will also tell you how many there are like...
echo $(($#-1))
Python answer
text='skfwlefk|3oeio|ajda'
print([idx for idx, char in enumerate(text) if char == '|'])
# prints '[8, 14]'