
In Emacs 25.1, if I customize search-default-mode to "Char-Fold Search", then an isearch-forward for

u matches ü

and one for

o matches ö.

But what character should I type to match ß?

Drew
glmorous

1 Answer


By default, this is not possible. The easiest way to see this is to run C-h v char-fold-table RET and search the output for ß, which you won't find.

However, if you modify char-fold-table (the variable responsible for determining how folding occurs), then you can enable this.

The structure of char-fold-table

char-fold-table is a char-table. Quoting the Emacs Lisp manual:

A char-table is a one-dimensional array of elements of any type, indexed by character codes.

In this case, the "elements" are regular expressions for what the given character is supposed to match when Char-Fold Search is on (i.e. more-or-less a list of the characters that are the same as the given character). For instance, by default, the element corresponding to s, obtained with:

(char-table-range char-fold-table ?s)

(or just (aref char-fold-table ?s)) is:

"\\(?:s[̧̣̦́̂̇̌]\\|[sśŝşšſșˢṡṣₛⓢs]\\)"
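The same lookup can be used to check programmatically whether a character is already covered. This is a minimal sketch, assuming Emacs 25.1 or later, where the char-fold feature provides char-fold-table; the local name s-regexp is only illustrative:

```elisp
(require 'char-fold)   ; provides char-fold-table

(let ((s-regexp (aref char-fold-table ?s)))
  ;; Non-nil if ß already appears in the regexp used for a typed "s".
  (and (stringp s-regexp)
       (string-match-p (regexp-quote "ß") s-regexp)))
```

With the default table described in this answer, the form should evaluate to nil, which agrees with the C-h v check.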

char-fold-table also has an "extra slot", which is itself a char-table, and which lists the character combinations (starting with the given letter) that should match specific regexps (e.g. fi should match the Unicode character ﬁ). For instance, the alist corresponding to f, found using:

(aref (char-table-extra-slot char-fold-table 0) ?f)

is:

(("fl" . "ﬄ") ("fi" . "ﬃ") ("l" . "ﬂ") ("i" . "ﬁ") ("f" . "ﬀ") ("m" . "㎙") ("̇" . "ḟ"))

which means that ffl should match ﬄ, ffi should match ﬃ, fl should match ﬂ, etc.

Having eszett matched by ss

This is probably the most logical possibility. To make this possible, we need to modify the alist in the extra slot of the char-fold-table to also contain ("s" . "ß"):

(set-char-table-extra-slot
 char-fold-table 0
 ;; Slot 0 holds the char-table of multi-character combinations.
 (let* ((multi-char-table (char-table-extra-slot char-fold-table 0))
        (s-alist (aref multi-char-table ?s))
        ;; After a typed "s", a following "s" should also match ß.
        (modified-s-alist (cons '("s" . "ß") s-alist)))
   (aset multi-char-table ?s modified-s-alist)
   multi-char-table))
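To check that the new entry is picked up without going through isearch, you can build the folded regexp directly. This is a sketch, assuming that char-fold-to-regexp (the function in char-fold.el that turns a search string into a regexp) consults the extra slot as described above:

```elisp
(require 'char-fold)

;; Position of the first match, or nil if "ss" is not being folded to ß.
(string-match-p (char-fold-to-regexp "grussen") "grüßen")
```

If the modification took effect, this should return a non-nil position; nil suggests the alist entry was not added.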

Unfortunately, searching isn't too smooth, since sometimes you need to jog isearch, by pressing C-s again, after typing the entire search string, to convince it that the search is not actually failing (the same issue occurs with the default combinations).

(To see the issue, insert gru grüßen in a fresh buffer, place point at the start and try to search for grussen, typing the search string rather than pasting it. After you've typed grus, isearch concludes that the search is failing and won't do anything more unless prompted. An equivalent problem with the default configuration occurs when the buffer contains an aﬃne (written with the ﬃ ligature) and you search for affine.)

Having eszett matched by just s

(aset char-fold-table ?s "\\(?:s[̧̣̦́̂̇̌]\\|[sśŝşšſșˢṡṣₛⓢsß]\\)")

Note the ß inserted into the end of the default regexp.
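If you would rather not copy the default regexp by hand, a possible alternative is to wrap whatever is currently stored for s in an extra alternation. This is only a sketch, assuming the entry for s is a regexp string as shown earlier; default-s is an illustrative name:

```elisp
(require 'char-fold)

(let ((default-s (aref char-fold-table ?s)))
  ;; Skip the change if ß is already present, e.g. when the init file
  ;; is evaluated a second time.
  (unless (string-match-p (regexp-quote "ß") default-s)
    (aset char-fold-table ?s
          (concat "\\(?:" default-s "\\|ß\\)"))))
```

This keeps whatever characters your Emacs version already folds into s and only adds ß as one more alternative.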

In principle, you could do both, which partially circumvents the isearch issue described above.

Is this a bug?

If you believe that the absence of eszett in the default char-fold-table is a bug, you could report it.

It is absent because char-fold-table is automatically generated from the Unicode list of decomposable characters, and Unicode does not treat ß as decomposing into s + s in the way that á decomposes into a + acute accent:

http://www.unicode.org/Public/UNIDATA/UnicodeData.txt

(search for: LATIN SMALL LETTER SHARP S).

aplaice
    Thanks for the (very thorough) answer. I'll report this as a bug, but in the meantime, I'm using your answer to match 'ß' with 's'. – glmorous Feb 10 '18 at 10:22