I have a list of strings containing German umlauts in their HTML entity form, e.g. ü is written as &uuml; and so on. Is there an easy, maybe built-in, way to convert them within strings, i.e. "Pr&uuml;fung" should become "Prüfung"?

I found that xml.el does something similar, but as far as I understand it depends on the entity code and works the other way around.

Martin Buchmann

1 Answer

Here's a simple function to do it. It replaces the HTML entities for the German umlauts, the acute-accented E, and the ß (sz ligature) with the corresponding characters, and the list is easy to extend. There might be a better function somewhere in xml.el, but I can't find it now.

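(require 'cl-lib)  ; provides `cl-destructuring-bind'

(defun de-escape (string)
  "Return STRING with selected HTML character entities replaced.
Handles the German umlauts, the acute-accented E, and the sz ligature."
  (let ((replacements '(("&Auml;" "Ä")
                        ("&auml;" "ä")
                        ("&Eacute;" "É")
                        ("&eacute;" "é")
                        ("&Ouml;" "Ö")
                        ("&ouml;" "ö")
                        ("&Uuml;" "Ü")
                        ("&uuml;" "ü")
                        ("&szlig;" "ß")))
        ;; Match case-sensitively so that &Auml; and &auml; stay distinct.
        (case-fold-search nil))
    (with-temp-buffer
      (insert string)
      (dolist (replacement replacements)
        (cl-destructuring-bind (old new) replacement
          (goto-char (point-min))
          (while (search-forward old nil t)
            ;; FIXEDCASE and LITERAL: insert NEW exactly as written.
            (replace-match new t t))))
      (buffer-string))))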

Some example usages:

ELISP> (de-escape "Pr&uuml;fung")
"Prüfung"
ELISP> (de-escape "die B&auml;ren")
"die Bären"
ELISP> (de-escape "&Ouml;kologie")
"Ökologie"
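
The same table-driven idea can also be written as a single pass over the string, without a temporary buffer. The following is only a sketch under a few assumptions: the names my-html-entity-alist and my-decode-html-entities are made up for illustration and are not part of Emacs or xml.el, and the table covers only the same nine entities as the function above. It uses replace-regexp-in-string with a function as the replacement, so every "&name;" token is looked up once and unknown entities are left alone.

```elisp
;; Hypothetical table, not part of any library: entity -> replacement.
(defvar my-html-entity-alist
  '(("&Auml;" . "Ä") ("&auml;" . "ä")
    ("&Eacute;" . "É") ("&eacute;" . "é")
    ("&Ouml;" . "Ö") ("&ouml;" . "ö")
    ("&Uuml;" . "Ü") ("&uuml;" . "ü")
    ("&szlig;" . "ß"))
  "Alist mapping HTML entities to their replacement strings.")

;; Hypothetical helper name, chosen for this example only.
(defun my-decode-html-entities (string)
  "Return STRING with entities from `my-html-entity-alist' decoded.
Entities that are not in the table are left unchanged."
  ;; Keep matching case-sensitive; `assoc' compares with `equal',
  ;; so &Auml; and &auml; are distinct keys.
  (let ((case-fold-search nil))
    (replace-regexp-in-string
     "&[A-Za-z]+;"
     (lambda (entity)
       ;; Fall back to the matched text itself for unknown entities.
       (or (cdr (assoc entity my-html-entity-alist)) entity))
     string
     t    ; FIXEDCASE: do not adjust the case of the replacement
     t))) ; LITERAL: insert the replacement text verbatim
```

Used like the function above:

```elisp
(my-decode-html-entities "Pr&uuml;fung und &Ouml;kologie &amp; mehr")
;; => "Prüfung und Ökologie &amp; mehr"
```

Extending it means adding pairs to the alist; the function body stays the same.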
zck