To escape variables to be used on the left-hand side and right-hand side of an s command in sed (here $lhs and $rhs respectively), you'd do:
escaped_lhs=$(printf '%s\n' "$lhs" | sed 's:[][\\/.^$*]:\\&:g')
escaped_rhs=$(printf '%s\n' "$rhs" | sed 's:[\\/&]:\\&:g; $!s/$/\\/')
sed "s/$escaped_lhs/$escaped_rhs/"
Note that $lhs cannot contain a newline character.
That is, on the LHS, escape all the regexp operators (][.^$*), the escaping character itself (\), and the separator (/).
On the RHS, you only need to escape &, the separator, backslash and the newline character (which you do by inserting a backslash at the end of each line except the last one ($!s/$/\\/)).
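As a quick illustration of the rules above, here is a sketch using made-up sample values for $lhs and $rhs (both strings are hypothetical, picked only because they contain several special characters). It assumes a POSIX-like shell and sed:

```shell
# Hypothetical sample values containing characters that are special to sed
lhs='a.b*c/d[1]'
rhs='x&y/z\w'

# Same escaping commands as above
escaped_lhs=$(printf '%s\n' "$lhs" | sed 's:[][\\/.^$*]:\\&:g')
escaped_rhs=$(printf '%s\n' "$rhs" | sed 's:[\\/&]:\\&:g; $!s/$/\\/')

printf '%s\n' "$escaped_lhs"   # a\.b\*c\/d\[1\]
printf '%s\n' "$escaped_rhs"   # x\&y\/z\\w

# Only the literal string is replaced; "aXbc/d1", which the unescaped
# regexp would have matched, is left alone
result=$(printf '%s\n' 'see a.b*c/d[1] and aXbc/d1' |
  sed "s/$escaped_lhs/$escaped_rhs/")
printf '%s\n' "$result"        # see x&y/z\w and aXbc/d1
```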
Note: you don't want to add backslashes before characters that do not have a special meaning, because in doing so, you could end up giving them a special meaning. For instance, <, + and t have no special meaning in BREs, but \<, \+ and \t do in some implementations of sed (and for \t, including on the RHS).
That assumes you use / as a separator in your sed s commands and that you don't enable Extended REs with -r (GNU sed/ssed/ast/busybox sed) or -E (BSDs, ast, recent GNU, recent busybox) or PCREs with -R (ssed) or Augmented REs with -A/-X (ast), all of which have extra RE operators.
For EREs, the most widely supported of those extensions, the equivalent would be:
escaped_lhs=$(printf '%s\n' "$lhs" | sed 's:[][\\/.^$*+?(){}|]:\\&:g')
escaped_rhs=$(printf '%s\n' "$rhs" | sed 's:[\\/&]:\\&:g; $!s/$/\\/')
sed -E "s/$escaped_lhs/$escaped_rhs/"
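Here is a similar sketch for the ERE variant, again with hypothetical sample values. It assumes a sed that supports -E:

```shell
# Hypothetical sample values; the LHS contains ERE-only operators
lhs='f(x)+1?'
rhs='g{x}|0'

# ERE variant of the escaping commands
escaped_lhs=$(printf '%s\n' "$lhs" | sed 's:[][\\/.^$*+?(){}|]:\\&:g')
escaped_rhs=$(printf '%s\n' "$rhs" | sed 's:[\\/&]:\\&:g; $!s/$/\\/')

printf '%s\n' "$escaped_lhs"   # f\(x\)\+1\?
printf '%s\n' "$escaped_rhs"   # g{x}|0 (nothing to escape on the RHS here)

result=$(printf '%s\n' 'y = f(x)+1?' | sed -E "s/$escaped_lhs/$escaped_rhs/")
printf '%s\n' "$result"        # y = g{x}|0
```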
A few ground rules when dealing with arbitrary data:
- don't use echo
- quote your variables
- consider the impact of the locale, especially its character set: for instance, it's important that the escaping sed commands are run in the same locale as the sed command using the escaped strings (and with the same sed command)
- don't forget about the newline character (here you may want to check if $lhs contains any and take action).
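One possible way to do that last check in a POSIX-like shell is sketched below. The has_no_newline helper and the status variable are names made up for this illustration, and what action to take on failure is up to you:

```shell
# A literal newline character, written in a way any POSIX shell accepts
nl='
'

# Hypothetical helper: succeeds only if its argument has no newline in it
has_no_newline() {
  case $1 in
    (*"$nl"*) return 1 ;;
    (*) return 0 ;;
  esac
}

# Sample value that spans two lines
lhs='foo
bar'

if has_no_newline "$lhs"; then
  status=ok
else
  printf >&2 '%s\n' 'lhs contains a newline character, refusing to go on'
  status=rejected
fi
```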
A much safer option is to use perl instead of sed, pass the strings in the environment, and use the \Q/\E perl regexp operators for taking strings literally:
A="$lhs" B="$rhs" perl -pe 's/\Q$ENV{A}\E/$ENV{B}/g'
perl (by default) will not be affected by the locale's character set as, in the above, it only considers the strings as arrays of bytes without caring about what characters (if any) they may represent for the user. With sed, you could achieve the same by fixing the locale to C with LC_ALL=C for all sed commands (though that will also affect the language of error messages, if any).
In some shells, you can also do the escaping without having to resort to external utilities.
In zsh (here for BRE escaping):
set -o extendedglob
escaped_lhs=${lhs//(#m)[][\\.^$\/*]/\\$MATCH}
escaped_rhs=${rhs//(#m)[\\&\/$'\n']/\\$MATCH}
In ksh93:
escaped_lhs=${lhs//[][\\.^$\/*]/\\\0}
escaped_rhs=${rhs//[\\&\/$'\n']/\\\0}
In fish 3.4.0+:
set escaped_lhs (
  string replace -ar -- '[][\\\\/.^$*]' '\\\\$0' "$lhs" |
    string collect --allow-empty
)
set escaped_rhs (
  string replace -ar -- '[\\\\&/'\n']' '\\\\$0' "$rhs" |
    string collect --allow-empty --no-trim-newlines
)