Looking for a guide to optimising regexp matches in bash.
I have a script that loops over a very long list of URLs looking for patterns. Currently it looks a little like the fragment below. Is there a guide to optimising these kinds of matches?
    if [[ ${url} == */oai/request ]]
    then
        echo first option
    elif [[ ${url} =~ .*/index.php/[^/]+/journal=.* ]]
    then
        echo second option
    elif [[ ${url} =~ .*/[Ee][Tt][dD]-[Dd][Bb]/.* ]]
    then
        echo third option
    elif [[ ${url} =~ .*/handle/[0-9]+/[0-9].* || ${url} =~ .*/browse.* ]]
    then
        echo fourth option
    else
        echo no-match option
    fi
Remove the `.*` from the beginning and end of each regex. – choroba Jan 05 '15 at 10:28
Use `case` statements, if you don't mind using extended globs instead. Globs might be faster than regexes: http://stackoverflow.com/a/4555979/2072269 – muru Jan 05 '15 at 11:16
… `perl`/`python`/`ruby`/`awk`. – Stéphane Chazelas Oct 27 '15 at 15:40
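The `case` suggestion from the comments could look roughly like the sketch below. It is not a benchmarked or definitive rewrite: the function name `classify_url` and the example URLs mentioned afterwards are made up for illustration, and whether globs actually beat the regexes on a given list would need to be measured. Two assumptions are worth flagging: `extglob` has to be switched on before bash parses the `case` patterns, and a `.` in a glob is a literal dot, whereas the unescaped `.` in the original `index.php` regex matches any character.

```bash
#!/usr/bin/env bash
# Extended globs must be enabled before the function below is parsed,
# otherwise the +( ... ) patterns are a syntax error.
shopt -s extglob

# classify_url is a hypothetical helper name, not part of the original script.
# It prints the same labels as the if/elif chain in the question.
classify_url() {
    local url=$1
    case $url in
        */oai/request)
            echo "first option" ;;
        # +([^/]) means "one or more non-slash characters", like [^/]+ in the regex
        */index.php/+([^/])/journal=*)
            echo "second option" ;;
        */[Ee][Tt][Dd]-[Dd][Bb]/*)
            echo "third option" ;;
        # two alternatives in one branch replace the || of the original test
        */handle/+([0-9])/[0-9]* | */browse*)
            echo "fourth option" ;;
        *)
            echo "no-match option" ;;
    esac
}
```

A loop such as `while IFS= read -r url; do classify_url "$url"; done < urls.txt` (with `urls.txt` standing in for the real list) would then classify one URL per line. If the regexes are kept instead, the leading and trailing `.*` can be dropped because `=~` already matches anywhere in the string.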