In my document, I have an em dash that needs to be followed by italics. Here's a minimal example:

He was surprised—/excuse me/?

The LaTeX/PDF output includes the slashes and the text is not italicized. How can I resolve this without adding spaces? This is a quote, so I can't change the text.

T. Arboreus

See `org-emphasis-regexp-components`. Another similar question that contains some more pointers can be found [here](https://emacs.stackexchange.com/questions/54632/org-mode-monospaces-more-than-it-should#comment85326_54632). – NickD Jan 05 '20 at 00:30

1 Answer

As you have found out, adding a space between the em-dash and the slash renders the italics properly (both in the buffer, if you use a font that has a slanted version; and in the exported file, be that PDF or HTML or ...).

The reason that works is the setting of `org-emphasis-regexp-components`. This is a complex variable: a list of parts from which Org constructs, at initialization time, some rather formidable regexps that are used at runtime to determine whether emphasis is to be applied. The hope (which is partly but not completely realized IMHO) was that it would be easier to understand and modify each part than to understand the whole monstrous regexp that Org mode constructs.

The description of the variable (at the default setting) is as follows (you can see it by doing C-h v org-emphasis-regexp-components RET in your emacs):

org-emphasis-regexp-components is a variable defined in ‘org.el’. Its value is ("-[:space:]('\"{" "-[:space:].,:!?;'\")}\\[" "[:space:]" "." 1)

Documentation: Components used to build the regular expression for emphasis. This is a list with five entries. Terminology: In an emphasis string like " *strong word* ", we call the initial space PREMATCH, the final space POSTMATCH, the stars MARKERS, "s" and "d" are BORDER characters and "trong wor" is the body. The different components in this variable specify what is allowed/forbidden in each part:

pre Chars allowed as prematch. Beginning of line will be allowed too.

post Chars allowed as postmatch. End of line will be allowed too.

border The chars forbidden as border characters.

body-regexp A regexp like "." to match a body character. Don’t use non-shy groups here, and don’t allow newline here.

newline The maximum number of newlines allowed in an emphasis exp.

You need to reload Org or to restart Emacs after setting this.

Note that the pre set consists of `-`, `(`, `'`, `"`, `{` and any member of the character class `[:space:]`, which includes the ordinary space character but can also include other characters (see Char Classes in regexps for details).
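To check what the pre set looks like in your own session, a minimal sketch along these lines may help. It assumes Org is installed so that `(require 'org)` succeeds; the local name `pre` is only for illustration.

```elisp
;; Loading Org defines `org-emphasis-regexp-components'.
(require 'org)

;; The first element of the list is the PRE set: the characters
;; allowed immediately before an opening emphasis marker.
(let ((pre (car org-emphasis-regexp-components)))
  (message "pre set: %S" pre)
  ;; Reports t only if the em dash is already part of the pre set.
  (message "em dash allowed: %S" (and (string-match-p "—" pre) t)))
```

Evaluating this before and after a change shows whether the variable itself was updated; it says nothing about whether the regexps derived from it have been rebuilt.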

In your case, you want the em dash to be a legal pre character. That's relatively easy: assuming that Org mode is already loaded, you can modify the pre part of the variable (the car of the list) with:

(setcar org-emphasis-regexp-components "-—[:space:]('\"{")

We just add the em-dash to the existing pre set of characters and set the car of the variable to the new value.

The trouble is that modifying this variable does nothing to the regexps that are based on it, which were already constructed at initialization time. That's why the doc says: You need to reload Org or to restart Emacs after setting this. Either of these actions will run the initialization code again to (re-)calculate the derivative regexps. If you only need this modification for the current Emacs session, you can do the setcar and then reload Org mode: M-x org-reload RET.

However, if you want to make the change permanent, you will need to add the setcar above to your initialization file (e.g. ~/.emacs.d/init.el or similar). That can only be done if the variable is already defined, which happens only when org.el[c] is loaded. So the invocation in your initialization file has to be something like this:

(eval-after-load 'org 
                 '(progn (setcar org-emphasis-regexp-components "-—[:space:]('\"{")
                         (org-reload)))

This evaluates the form when the file that provides the org feature is loaded (or if it is already loaded). That ensures that the variable is defined before we try to modify it. Then we reload Org mode in order to have the recalculation of the derivative regexps take place.

EDIT: Actually, this last method causes a recursive file mode error. I think adding a complete definition of org-emphasis-regexp-components somewhere near the beginning of your initialization file, before Org mode is loaded, will work better:

(setq org-emphasis-regexp-components
  '("-—[:space:]('\"{"
    "-[:space:].,:!?;'\")}\\["
    "[:space:]"
    "."
    1))

...

(require 'org)

This gives the variable a value before Org defines it (IOW, it does not try to modify an existing definition), so it can (and should) be done before org is loaded. When org is loaded, it keeps this value instead of applying the default one, and builds its regexps from it.

Restarting emacs after this will make the em-dash a permanent member of the pre set.
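If you would rather not depend on where the setting sits relative to `(require 'org)`, one possible alternative is sketched below. It rests on two assumptions that are not part of the answer above: that `org-set-emph-re`, the setter function Org attaches to this variable, is available in your Org version, and that your Org version still builds its emphasis regexps from this variable. Treat it as a sketch to try, not a confirmed fix.

```elisp
;; Sketch only: assumes `org-set-emph-re' exists in your Org version.
;; The body runs once Org has been loaded (or at once if it already is).
(with-eval-after-load 'org
  (org-set-emph-re
   'org-emphasis-regexp-components
   ;; Replace only the PRE element (now including the em dash) and
   ;; keep the other four components unchanged.
   (cons "-—[:space:]('\"{" (cdr org-emphasis-regexp-components))))
```

Because the setter both stores the value and recomputes the derived regexps, this avoids calling `org-reload` from inside the after-load hook.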

NickD