9

The docstring for modify-syntax-entry says the following:

(modify-syntax-entry CHAR NEWENTRY &optional SYNTAX-TABLE)
...
The first character of NEWENTRY should be one of the following:
...
  /           character-quote.      @   inherit from parent table.
  |           generic string fence. !   generic comment fence.

What is a fence, and when would I use one? I can't find anything in the info manual.

Wilfred Hughes
  • This is a question that Emacs itself should answer. The doc string should define or at least describe the term "*fence*". Please consider filing a doc bug: `M-x report-emacs-bug`. – Drew Jun 24 '15 at 02:49
  • Normally, a string or comment delimiter can specify which character terminates a string. E.g. a `"` can only be terminated by another `"`. However, when a custom `syntax-propertize` function is used to recognize a string, this is not possible. Instead, you can mark the end points of strings and comments using `|` and `!`, respectively. (The documentation says that these syntax classes primarily should be used when using the `syntax-table` text property, which is what a custom `syntax-propertize` function sets.) – Lindydancer Jun 24 '15 at 09:51

2 Answers

8

They are documented in the manual, but it doesn't use the word “fence”. The characters ! and | are listed as “generic comment delimiters” and “generic string delimiters” in the syntax class reference.

These characters were introduced in Emacs 20.1. Quoting the NEWS file:

There are two new syntax-codes, ! and | (numeric values 14 and 15). A character with a code ! starts a comment which is ended only by another character with the same code (unless quoted). A character with a code | starts a string which is ended only by another character with the same code (unless quoted).

These codes are mainly meant for use as values of the `syntax-table' text property.
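As a quick check of the numeric values quoted above, here is a minimal sketch that asks Emacs to convert the two syntax descriptors into raw syntax descriptors. `string-to-syntax` is a standard function; the helper name `my-fence-codes` is made up for this illustration.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-fence-codes ()
  "Return the numeric syntax codes of the two fence classes."
  ;; `string-to-syntax' returns a raw descriptor (CODE . MATCHING-CHAR);
  ;; the car is the numeric syntax code.
  (list (car (string-to-syntax "!"))    ; generic comment fence
        (car (string-to-syntax "|"))))  ; generic string fence

(my-fence-codes)
```

Evaluating the last form should give `(14 15)`, matching the NEWS entry.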

I can't find any use of ! in the standard Emacs modes. There are several uses of |. The intended use case is languages which have literals that use delimiters other than the usual string delimiters, usually set via text properties added by font locking based on the context. For example, in Perl, a regular expression match can be written /REGEXP/, or m/REGEXP/ or m~REGEXP~ or m[REGEXP] or any number of variations. A literal string can be written 'STRING' but also q'STRING', q~STRING~, q[STRING], etc. When font lock recognizes such constructs, it sets the quote characters (/ and /, ' and ', ~ and ~, or [ and ] in the examples I gave) to generic string delimiter syntax. Even if a habitual string delimiter is present (e.g. q[foo"bar]), that delimiter is treated as an ordinary part of the string; it won't terminate the string.

I admit that I don't see a definitive benefit: for example, CPerl mode does some very fancy things and doesn't use this facility.
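To make the q[foo"bar] case concrete, here is a rough sketch of what a mode's propertizing code ends up doing. It is not taken from any real mode: the helper name `my-fence-demo` and the hard-coded buffer positions are assumptions for illustration, and a real mode would find the delimiters by searching rather than by position.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-fence-demo ()
  "Return the string state at three positions of a Perl-like sample line."
  (with-temp-buffer
    (insert "x = q[foo\"bar] + 1")
    ;; Make the parser honor `syntax-table' text properties.
    (setq-local parse-sexp-lookup-properties t)
    ;; Give the [ and ] of this one literal generic string fence syntax.
    (let ((fence (string-to-syntax "|")))
      (put-text-property 6 7 'syntax-table fence)     ; the [
      (put-text-property 14 15 'syntax-table fence))  ; the ]
    ;; Element 3 of the parser state is non-nil inside a string.
    (list (nth 3 (parse-partial-sexp (point-min) 8))     ; inside, before the "
          (nth 3 (parse-partial-sexp (point-min) 12))    ; inside, after the "
          (nth 3 (parse-partial-sexp (point-min) 17))))) ; after the ]

(my-fence-demo)
```

This should return `(t t nil)`: inside a fence-delimited string, element 3 of the parser state is `t` rather than the terminating character, and the embedded `"` does not close the string. Without the two text properties, the `"` would open an ordinary string instead.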

5

Taken from syntax.h:

/* A syntax table is a chartable whose elements are cons cells
   (CODE+FLAGS . MATCHING-CHAR).  MATCHING-CHAR can be nil if the char
   is not a kind of parenthesis.

   The low 8 bits of CODE+FLAGS is a code, as follows:  */

enum syntaxcode
  {
    Swhitespace, /* for a whitespace character */
    Spunct,      /* for random punctuation characters */
    Sword,       /* for a word constituent */
    Ssymbol,     /* symbol constituent but not word constituent */
    Sopen,       /* for a beginning delimiter */
    Sclose,      /* for an ending delimiter */
    Squote,      /* for a prefix character like Lisp ' */
    Sstring,     /* for a string-grouping character like Lisp " */
    Smath,       /* for delimiters like $ in Tex.  */
    Sescape,     /* for a character that begins a C-style escape */
    Scharquote,  /* for a character that quotes the following character */
    Scomment,    /* for a comment-starting character */
    Sendcomment, /* for a comment-ending character */
    Sinherit,    /* use the standard syntax table for this character */
    Scomment_fence, /* Starts/ends comment which is delimited on the
                       other side by any char with the same syntaxcode.  */
    Sstring_fence,  /* Starts/ends string which is delimited on the
                       other side by any char with the same syntaxcode.  */
    Smax         /* Upper bound on codes that are meaningful */
  };

Assuming that the syntax codes and the regex syntax classes refer to the same thing, I've spotted a use of | in cc-awk.el, which uses "\\s|" to highlight unbalanced string delimiters.
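A small sketch to test that assumption: give a character string fence syntax with `modify-syntax-entry`, then search for it with the "\\s|" regexp class. The helper name `my-fence-regexp-demo` and the choice of `~` as the fence character are made up for this illustration.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-fence-regexp-demo ()
  "Return the buffer positions matched by the regexp \"\\\\s|\"."
  (with-temp-buffer
    ;; Use a fresh syntax table in which ~ is a generic string fence.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?~ "|" table)
      (set-syntax-table table))
    (insert "a ~b~ c")
    (goto-char (point-min))
    ;; Collect the start of every character whose syntax class is |.
    (let (hits)
      (while (re-search-forward "\\s|" nil t)
        (push (match-beginning 0) hits))
      (nreverse hits))))

(my-fence-regexp-demo)
```

This should return `(3 5)`, the positions of the two `~` characters, which suggests the regexp class and the syntax code do refer to the same thing.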

wasamasa
  • They are used in a few places, e.g. `python-syntax-stringify`, `ruby-syntax-propertize-percent-literal` and [others](https://gist.github.com/Wilfred/6717d7410b34a4a2068e). I don't see how it differs from `Sstring` here though. – Wilfred Hughes Jun 23 '15 at 21:36