I want to replace a variable expansion with a URL, directly inside a file.
The URL is percent-encoded UTF-8 because the name of its webpage is in Hebrew, a right-to-left (RTL) language.


Here is the search-and-replace command I work with (currently without any escaping):

sed -i 's/$contact_form_success_webpage/https://example.com/index.php?title=%D7%99%D7%A6%D7%99%D7%A8%D7%AA_%D7%A7%D7%A9%D7%A8:%D7%94%D7%A6%D7%9C%D7%97%D7%94/g' FILE

I could add a backslash before the $ of $contact_form_success_webpage to make it \$contact_form_success_webpage, which sed can process. But adding backslashes by hand to every part of a "long" encoded URL that needs escaping is something I'd rather not do; I would prefer some automation for that.

The above URL is quite "light" or "easy", but some URLs might have many forward slashes (/) and perhaps many other parts that need escaping.


How would you suggest escaping UTF-8 encoded URLs?
(What pattern would you use to cover generally all use cases?)

  • What needs to be escaped there anyway? The three slashes, ok, but doing that by hand is hardly a lot of work? (Or just use a different separator for s.) – DonHolgo Mar 30 '21 at 12:34
  • @DonHolgo shouldn't the index dot (of index.php) and the query string's ? and = also be escaped? Beyond that, this might be a "light" case but let me tell ya as a native RTL language speaker (in case you aren't), some URLs can be drastically longer than that, let alone include directories... – timesharer Mar 30 '21 at 13:55
  • It's not clear whether $contact_form_success_webpage is a shell variable or a literal string that you want to replace. Also, regarding your last comment, there's no = or ? in your query string, is there? – Kusalananda Mar 30 '21 at 14:32
  • Related: https://unix.stackexchange.com/questions/32907/what-characters-do-i-need-to-escape-when-using-sed-in-a-sh-script – Kusalananda Mar 30 '21 at 14:33
  • @timesharer No, all of these aren't a problem in the replacement part, you'd just have to deal with them if they appeared in the part to be replaced. If I understand your problem correctly and you want to replace the literal $contact_form_success_webpage with the URL, you'd just need to put \/ instead of / in the replacement or pick a different separator like s|...|...|g. – DonHolgo Mar 30 '21 at 14:36
  • @DonHolgo Since an URL can't contain spaces, a space could be used as delimiter: sed 's $contact_form_success_webpage https://whatever g' file. Note that the $ at the start does not need escaping in a basic regular expression and that the expression should be single-quoted. – Kusalananda Mar 30 '21 at 14:53
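
The different-separator suggestion from the comments can be sketched as follows. This is only an illustration under assumptions: the temporary file and its one line of content are made up for the demo, and the result is printed instead of using -i so that nothing is edited in place.

```shell
#!/usr/bin/env bash
# Hypothetical sample file, created only for this demonstration.
demo_file=$(mktemp)
printf '%s\n' 'redirect = $contact_form_success_webpage' > "$demo_file"

url='https://example.com/index.php?title=%D7%99%D7%A6%D7%99%D7%A8%D7%AA_%D7%A7%D7%A9%D7%A8:%D7%94%D7%A6%D7%9C%D7%97%D7%94'

# With | as the delimiter, the slashes in the URL need no escaping.
# \$ keeps the dollar sign literal; the URL is spliced in between the
# single-quoted parts of the sed script.
result=$(sed 's|\$contact_form_success_webpage|'"$url"'|g' "$demo_file")
printf '%s\n' "$result"

rm -f "$demo_file"
```

This works for the URL above because it contains none of the characters that are special in a sed replacement. A URL containing &, a backslash, or the chosen delimiter would still need those characters escaped.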

1 Answer

One way is to define a bash function that takes a string and escapes it so that it can be plugged into the left-hand side (LHS) or right-hand side (RHS) of sed's s/// command:

esc_sedvar() {
  # Characters to backslash-escape; the backslash itself must come first.
  local -a a
  case $1 in
    '--lhs')
      a=( '\' '[' '^' '$' '.' '*' / ) ;;
    '--rhs'|*)
      a=( '\' '&' / ) ;;
  esac

  local var=$2 c
  for c in "${a[@]}"; do
    var=${var//"$c"/\\"$c"}
  done
  printf '%s\n' "$var"
}

Don't escape anything in the following variables; as far as you are concerned, they are plain strings.

srch='$contact_form_success_webpage'

repl='https://example.com/index.php?title=%D7%99%D7%A6%D7%99%D7%A8%D7%AA_%D7%A7%D7%A9%D7%A8:%D7%94%D7%A6%D7%9C%D7%97%D7%94'

sed -i -e \
's/'\
"$(esc_sedvar --lhs "$srch")"\
'/'\
"$(esc_sedvar --rhs "$repl")"\
'/g' \
FILE
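
To see what the escaping does, here is a small self-contained sketch that runs the same idea against a throwaway file. The temporary file, its content, and the extra &action=view query parameter are assumptions added for illustration (the & is there only to exercise the RHS escaping); the function repeats the approach described above so the snippet runs on its own.

```shell
#!/usr/bin/env bash
# Backslash-escape the characters that are special on either side of s///.
esc_sedvar() {
  local -a chars
  case $1 in
    --lhs) chars=( '\' '[' '^' '$' '.' '*' '/' ) ;;  # BRE specials plus the delimiter
    *)     chars=( '\' '&' '/' ) ;;                  # replacement specials plus the delimiter
  esac
  local var=$2 c
  for c in "${chars[@]}"; do   # the backslash is handled first on purpose
    var=${var//"$c"/\\"$c"}
  done
  printf '%s\n' "$var"
}

# Hypothetical sample file and URL, made up for this demonstration.
demo_file=$(mktemp)
printf '%s\n' 'Location: $contact_form_success_webpage' > "$demo_file"

srch='$contact_form_success_webpage'
repl='https://example.com/index.php?title=%D7%94%D7%A6%D7%9C%D7%97%D7%94&action=view'

lhs=$(esc_sedvar --lhs "$srch")
rhs=$(esc_sedvar --rhs "$repl")
printf 'LHS: %s\nRHS: %s\n' "$lhs" "$rhs"

# Print the result instead of editing in place, so the demo has no side effects.
result=$(sed -e "s/$lhs/$rhs/g" "$demo_file")
printf '%s\n' "$result"

rm -f "$demo_file"
```

The escaped strings are only meant for a basic regular expression with / as the delimiter. A replacement that contains newlines would need extra handling, which should not come up with URLs.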

guest_7