Here's a better method for dealing with Unicode characters that do not map to a glyph in the current font. (The old answer is left below for reference: it contains some useful information, but I don't think the \setmainfont method should be used.)
This assumes that you are using XeLaTeX as your processor. See the old answer below for a setting of org-latex-pdf-process that uses XeLaTeX.
The basic idea is font substitution. If you use Unicode characters whose glyphs are not provided by the main document font, we want XeLaTeX to insert the glyphs from a different font that does provide them. Emacs does this automatically. (If, despite its best efforts, none of the fonts it knows about provides the necessary glyph, Emacs prints a box with the Unicode number of the character inside it. Try C-h h to display the HELLO file in various scripts: in my case, I don't have TaiViet fonts, so that line just shows the boxes. In contrast, LaTeX leaves the space blank.)
LaTeX does not come with a boatload of fallback fonts predefined, so you have to tell it which fonts to use. For XeLaTeX, that can be done by defining the following package file, symbolasubst.sty:
% define a new font family
\usepackage{fontspec}
\newfontfamily{\SymbolaSubstFont}{Symbola}
% use the interchartoken mechanism for font substitution
\XeTeXinterchartokenstate=1
\newXeTeXintercharclass\SymbolaSubst
% define the chars that are going to be substituted
\XeTeXcharclass"1F311=\SymbolaSubst
\XeTeXcharclass"1F313=\SymbolaSubst
\XeTeXcharclass"1F315=\SymbolaSubst
\XeTeXcharclass"1F317=\SymbolaSubst
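% enclose every substituted character in a group that uses the substitute font
% (class 0 is the default class of ordinary characters; class 4095 marks the boundary of a text run)
\XeTeXinterchartoks 0 \SymbolaSubst = {\begingroup\SymbolaSubstFont}
\XeTeXinterchartoks 4095 \SymbolaSubst = {\begingroup\SymbolaSubstFont}
% close the group again when leaving the substituted class
\XeTeXinterchartoks \SymbolaSubst 0 = {\endgroup}
\XeTeXinterchartoks \SymbolaSubst 4095 = {\endgroup}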
This tells XeLaTeX to take the specified Unicode characters from the Symbola font rather than from the main font (whatever that font might be). Here is a reference to an older and somewhat outdated version that I touched up lightly to come up with the version above (see the comments on that answer for the necessary changes). The reference also describes how to deal with whole swaths of characters if you have to.
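If you need a whole swath of characters rather than a handful, one possible approach is a small loop that assigns every code point in a range to the substitution class. The following is only a sketch, not taken from the reference: the names \SymbolaSubstRange and \SymbolaSubstCount are made up for illustration, and the lines are assumed to go into symbolasubst.sty after the \newXeTeXintercharclass\SymbolaSubst line.

```latex
% Hypothetical helper (not part of any package): assign every code point
% from #1 to #2, inclusive, to the \SymbolaSubst character class.
\newcount\SymbolaSubstCount
\newcommand*{\SymbolaSubstRange}[2]{%
  \SymbolaSubstCount=#1\relax
  \loop
    \XeTeXcharclass\SymbolaSubstCount=\SymbolaSubst
    \ifnum\SymbolaSubstCount<#2\relax
      \advance\SymbolaSubstCount by 1
  \repeat
}
% Example: all eight moon phase symbols, U+1F311 through U+1F318
\SymbolaSubstRange{"1F311}{"1F318}
```

This would replace the four individual \XeTeXcharclass lines. It assumes that the substitute font actually provides a glyph for every code point in the range; any that it lacks will still come out blank.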
Once you have the above file (in the same directory as your Org mode file or, if you want to reuse it, in a directory that XeLaTeX knows about), using it is easy: just add the \usepackage line to the preamble:
#+LATEX_HEADER: \usepackage{symbolasubst}
* Test
I'm trying to export an Org table that contains the Unicode characters for moon phases (U+1F311 etc). Export to HTML works fine, but I just see blank spaces in the exported PDF. I know next to nothing about LaTeX. What can I do? Preferably from within Org. Emacs 27.2, Org 9.5.3, Manjaro.
Here's the table in full:
|---------------+-----------------+---------|
| *Phase* | *UCS Codepoint* | *Glyph* |
|---------------+-----------------+---------|
| New moon | =U+1F311= | 🌑 |
| First quarter | =U+1F313= | 🌓 |
| Full moon | =U+1F315= | 🌕 |
| Last quarter | =U+1F317= | 🌗 |
|---------------+-----------------+---------|
| | | <c> |
I'm pretty sure that this version will work not only with the minimal file but with any file that uses these characters. If you try it out, please let me know if there are any problems.
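If the Org export misbehaves, it can help to check the package outside of Org first. Here is a minimal sketch of a standalone test file (the file name test.tex is just an example), assuming that symbolasubst.sty is in the same directory and that the Symbola font is installed where XeLaTeX can find it:

```latex
% test.tex: compile with "xelatex test.tex"
\documentclass{article}
\usepackage{symbolasubst} % loads fontspec and sets up the substitution
\begin{document}
New moon: 🌑, first quarter: 🌓, full moon: 🌕, last quarter: 🌗.
\end{document}
```

If the glyphs show up in the resulting PDF, the package itself works and any remaining problem is on the Org side (for example, in the value of org-latex-pdf-process).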
OLD ANSWER
Here's one way to get what you want out of the Org->LaTeX->PDF workflow.
I assume you have set org-latex-pdf-process to '("latexmk --shell-escape -pdf -xelatex -output-directory=%o %f") as mentioned in my comment (somewhat inaccurately: the value of org-latex-pdf-process is a list of strings, not a string). The following should work with lualatex as well as xelatex, although I've only tried it with xelatex. As mentioned in the comment, pdflatex's support for Unicode is rudimentary: xelatex and lualatex are better choices in this day and age.
The problem is that the glyphs for the Unicode code points you use come from a font that TeX does not know about (but apparently Emacs, web browsers and LibreOffice do). If you do C-u C-x = on any of the moon phase glyphs, you'll see that they all come from the Symbola font. This is installed as a system font, but not as a TeX font. However, xelatex and lualatex can use arbitrary TTF/OTF fonts fairly easily.
To tell TeX about this font, you can use the fontspec package. Here's a complete minimal example based on your table:
N.B. See the better answer above. I do not recommend setting the main font as is done below.
#+LATEX_HEADER: \usepackage{fontspec}
#+LATEX_HEADER: \setmainfont{Symbola}
* moon phases
|---------------+-----------------+---------|
| *Phase* | *UCS Codepoint* | *Glyph* |
|---------------+-----------------+---------|
| New moon | =U+1F311= | 🌑 |
| First quarter | =U+1F313= | 🌓 |
| Full moon | =U+1F315= | 🌕 |
| Last quarter | =U+1F317= | 🌗 |
|---------------+-----------------+---------|
| | | <c> |
Exporting this to PDF should now work.
This is based on this blog post which I found in this TeX/LaTeX SE question.