5

Is it possible to syntax highlight doxygen in C/C++ code?

Qt Creator and Vim both support basic highlighting, so \param, \note, etc. are highlighted differently.

ideasman42

3 Answers

4

Posting my own answer since I didn't find an existing method *.

This is generic Doxygen highlighting. It doesn't try to be too strict, since Doxygen supports so many different expressions.

  • \anything and @anything are matched (up to the next whitespace).
  • #symbol uses stricter matching ([a-zA-Z0-9_\.\:]+).
  • Doxygen comment styles /** … */, /*! … */ and /// … are supported.
  • Quoted symbols are matched: 'symbol', `symbol` and ``symbol``.
  • HTML tags are also supported (as in javadoc).
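The two matching rules can be tried on plain strings before wiring anything into cc-mode. This is only a sketch: the regexps are copied from the configuration in this answer, while the helper name `my-doxy-all-matches` is hypothetical.

```elisp
;; Hypothetical helper for experimenting; not part of the configuration.
(defun my-doxy-all-matches (regexp text)
  "Return every non-overlapping match of REGEXP in TEXT, in order."
  (let ((case-fold-search nil)   ; keep [a-z] case-sensitive
        (start 0)
        (matches '()))
    (while (string-match regexp text start)
      (push (match-string 0 text) matches)
      (setq start (match-end 0)))
    (nreverse matches)))

;; \cmd and @cmd: a lowercase word or a run of punctuation after \ or @.
(my-doxy-all-matches "[\\@]\\(?:[a-z]+\\|[[:punct:]]+\\)"
                     "@param x \\note see #foo.bar")
;; => ("@param" "\\note")

;; #symbol: the stricter character set from the list above.
(my-doxy-all-matches "#[a-zA-Z0-9_.:]+"
                     "see #my_struct.member and #Class::method, too")
;; => ("#my_struct.member" "#Class::method")
```

Evaluating the forms in `M-x ielm` or the `*scratch*` buffer is a quick way to check a pattern against comments from your own code base.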

Configuration:


(require 'cc-mode)
;; Generic doxygen formatting
(defconst custom-font-lock-doc-comments
  (let (;; Characters allowed in quoted symbols; more may be added here.
        ;; The hyphen comes last so it is literal instead of forming a range.
        (symbol_extended "[a-zA-Z0-9_.*:-]+")
        ;; Struct references #struct.value or class::member.
        (symbol_member   "[a-zA-Z0-9_.:]+"))
    `((,(concat "</?\\sw"  ;; HTML tags.
                "\\("
                (concat "\\sw\\|\\s \\|[=\n\r*.:]\\|"
                        "\"[^\"]*\"\\|'[^']*'")
                "\\)*>")
       0 ,c-doc-markup-face-name prepend nil)
      (,(concat "'" symbol_extended "'") ; 'symbol'
       0 ,c-doc-markup-face-name prepend nil)
      (,(concat "`" symbol_extended "`") ; `symbol`
       0 ,c-doc-markup-face-name prepend nil)
      (,(concat "``" symbol_extended "``") ; ``symbol``
       0 ,c-doc-markup-face-name prepend nil)
      (,(concat
         "[\\@]"           ;; start of Doxygen special command
         "\\(?:"
         "[a-z]+\\|"       ;; typical word Doxygen special @cmd or \cmd
         "[[:punct:]]+"    ;; non-word commands
         "\\)")
       0 ,c-doc-markup-face-name prepend nil)
      (,(concat "#" symbol_member) ; #some_c_symbol.some_member
       0 ,c-doc-markup-face-name prepend nil))))

;; Matches across multiple lines:
;;   /** doxy comments */
;;   /*! doxy comments */
;;   /// doxy comments
;; Doesn't match:
;;   /*******/
(defconst custom-font-lock-keywords
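  `((,(lambda (limit)
        (c-font-lock-doc-comments
            "/\\(//\\|\\*[\\*!][^\\*!]\\)"
            limit custom-font-lock-doc-comments)))))

;; Select the style defined above: cc-mode resolves the style name `custom'
;; to the variable `custom-font-lock-keywords'.
(setq-default c-doc-comment-style (quote (custom)))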

Note that projects that use a single Doxygen comment prefix
might use a simpler match on the initial comment,
e.g. `"/\\*\\*"` or `"/\\(//\\|\\*\\*\\)"`.
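To see what the trade-off looks like, the prefixes can be compared on a few comment openers. This is a rough sketch that assumes the prefix is tested at the very start of a comment; all `my-doxy-…` names are hypothetical and exist only for the experiment.

```elisp
;; Hypothetical names, used only for this experiment.
(defconst my-doxy-prefix-any "/\\(//\\|\\*[\\*!][^\\*!]\\)"
  "Prefix from the configuration: /** ... */, /*! ... */ and /// ...")
(defconst my-doxy-prefix-simple "/\\*\\*"
  "Simpler prefix for projects that only use /** ... */.")

(defun my-doxy-prefix-p (prefix comment)
  "Return t if PREFIX matches at the very start of COMMENT."
  (and (string-match-p (concat "\\`" prefix) comment) t))

(defconst my-doxy-samples
  '("/** doc */" "/*! doc */" "/// doc" "/*******/" "/* plain */"))

(mapcar (lambda (c) (my-doxy-prefix-p my-doxy-prefix-any c)) my-doxy-samples)
;; => (t t t nil nil)
(mapcar (lambda (c) (my-doxy-prefix-p my-doxy-prefix-simple c)) my-doxy-samples)
;; => (t nil nil t nil)
```

One thing the comparison suggests: the simpler `"/\\*\\*"` prefix also accepts banner comments such as `/*******/`, which the longer pattern deliberately excludes.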

* There is doxymacs, but it was last updated in 2007 and seems more heavyweight than what I'm looking for (it is not available from a package manager and requires compiled binaries).
* Minor improvement based on @John32ma's answer.

ideasman42
2

Here's an attempt at improved Doxygen syntax highlighting, based on ideasman42's answer. The code:

(require 'cc-mode)

(defface doxygen-verbatim-face
  '((default :inherit default))
  "Face used to show Doxygen block regions"
  :group 'font-lock-faces)

(defface doxygen-match-face
  '((default :inherit default)
    (t :underline t))
  "Face used to show Doxygen region start end commands"
  :group 'font-lock-faces)

(defconst custom-font-lock-doc-comments
  `(
    ;; Highlight Doxygen special commands,
    ;;   \cmd or @cmd
    ;; and the non [a-z]+ commands
    ;;   \\ \@ \& \# \< \> \% \" \. \| \-- \--- \~[LanguageId]
    (,(concat
       "\\(?:"
       "[\\@][a-z]+"     ;; typical word Doxygen special @cmd or \cmd
       "\\|"
       ;; non-word commands, e.g. \\ or @\
       "[\\@]\\(?:\\\\\\|@\\|&\\|#\\|<\\|>\\|%\\|\"\\|\\.\\|::\\||\\|---?\\|~[a-z]*\\)"
       "\\)")
     0 ,c-doc-markup-face-name prepend nil)
    ;; Highlight autolinks. These are referring to functions, so we use a different font face
    ;; from the Doxygen special commands.
    (,(concat
       "\\(?:"
       ;; function() or function(int, std::string&, void*) or more complex where we only
       ;; match the first paren, function(x->(), 2*(y+z)).
       "[A-Za-z_0-9]+(\\([A-Za-z_0-9:&*, ]*)\\)?"
       ;; ClassName::memberFcn or the destructor ClassName::~ClassName. Can also do unqualified
       ;; references, e.g. ::member. The parens are optional, ::member(int, int), ::member(a, b).
       ;; We only require matching of first paren to make cases like ::member(x->(), 2*(y+z))
       ;; work. We don't want \::thing to be highlighted as a function, hence the reason to look
       ;; for class::member or space before ::member.  Note '#' can be used instead of '::'
       "\\|"
       "\\(?:[A-Za-z_0-9]+\\|\\s-\\)\\(?:::\\|#\\)~?[A-Za-z_0-9]+(?\\(?:[A-Za-z_0-9:&*, \t]*)\\)?"
       ;; file.cpp, foo/file.cpp, etc. Don't want to pickup "e.g." or foo.txt because
       ;; these are not autolinked so look for common C++ extensions.
       "\\|"
       "[A-Za-z_0-9/]+\\.\\(?:cpp\\|cxx\\|cc\\|c\\|hpp\\|hxx\\|hh\\|h\\)"
       "\\)")
     0 font-lock-function-name-face prepend nil)
    ;; Highlight URLs, e.g. http://doxygen.nl/autolink.html note we do this
    ;; after autolinks highlighting (we don't want nl/autolink.h to be file color).
    ("https?://[^[:space:][:cntrl:]]+"
     0 font-lock-keyword-face prepend nil)
    ;; Highlight HTML tags - these are processed by Doxygen, e.g. <b> ... </b>
    (,(concat "</?\\sw"
                "\\("
                (concat "\\sw\\|\\s \\|[=\n\r*.:]\\|"
                        "\"[^\"]*\"\\|'[^']*'")
                "\\)*>")
     0 ,c-doc-markup-face-name prepend nil)
    ;; E-mails, e.g. first.last@domain.com. We don't want @domain to be picked up as a Doxygen
    ;; special command, thus explicitly look for e-mails and give them a different face than the
    ;; Doxygen special commands.
    ("[A-Za-z0-9.]+@[A-Za-z0-9_]+\\.[A-Za-z0-9_.]+"
     0 font-lock-keyword-face prepend nil)
    ;; Quotes: Doxygen special commands, etc. can't be in strings when on same line, e.g.
    ;; "foo @b bar line2 @todo foobar" will not bold or create todo's.
    ("\"[^\"[:cntrl:]]+\""
     0 ,c-doc-face-name prepend nil)

    ;; Single-line formula, but not an escaped formula, e.g. \\f$
    ("[^\\@]\\([\\@]f.+?[\\@]f\\$\\)"
     1 'doxygen-verbatim-face prepend nil)

    ;; Doxygen verbatim/code/formula blocks should be shown using doxygen-verbatim-face, but
    ;; we can't do that easily, so for now flag the block start/ends
    (,(concat
       "[^\\@]"  ;; @@code shouldn't be matched
       "\\([\\@]\\(?:verbatim\\|endverbatim\\|code\\|endcode\\|f{\\|f\\[\\|f}\\|f]\\)\\)")
     1 'doxygen-match-face prepend nil)

    ;; Here's an attempt to get blocks shown using doxygen-verbatim-face. However, font-lock doesn't
    ;; support multi-line font-locking by default and I'm not sure the best way to make these work.
    ;;
    ;; Doxygen special commands, etc. can't be in verbatim/code blocks
    ;;   @verbatim
    ;;      @cmd  -> not a Doxygen special command
    ;;   @endverbatim
    ;; so set verbatim/code to a different font.  Verbatim/code blocks spans multiple lines and thus
    ;; a refresh of a buffer after editing a verbatim/code block may be required to have the font
    ;; updated.
    ;;("[^\\@][\\@]\\(verbatim\\|code\\)\\([[:ascii:][:nonascii:]]+?\\)[\\@]end\\1"
    ;; 2 'doxygen-verbatim-face prepend nil)
    ;; Doxygen formulas are like verbatim blocks, but contain LaTeX, e.g.
    ;;("[^\\@][\\@]f.+[\\@f]\\$"  ;; single line formula
    ;; 0 'doxygen-verbatim-face prepend nil)
    ;; multi-line formula,
    ;;   \f[ ... \f]     or    \f{ ... \f}
    ;;("[^\\@][\\@]f\\(?:{\\|\\[\\)\\([[:ascii:][:nonascii:]]+?\\)[\\@]f\\(?:}\\|\\]\\)"
    ;; 1 'doxygen-verbatim-face prepend nil)

    ))



;; Matches across multiple lines:
;;   /** doxy comments */
;;   /*! doxy comments */
;;   /// doxy comments
;; Doesn't match:
;;   /*******/
(defconst custom-font-lock-keywords
  `((,(lambda (limit)
        (c-font-lock-doc-comments "/\\(//\\|\\*[\\*!][^\\*!]\\)"
            limit custom-font-lock-doc-comments)))))

(setq-default c-doc-comment-style (quote (custom)))

This solution isn't perfect: it doesn't correctly color verbatim/code/formula blocks, because I don't understand enough about font-lock to handle multi-line fontification. For example, text inside a verbatim block that looks like a Doxygen special command shouldn't be marked as a special command.
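As a starting point for anyone who wants to tackle this, the following sketch only locates the block bodies that a multi-line matcher would have to treat specially. It does not touch font-lock at all, it ignores escaped commands such as @@code, and the function name `my-doxy-block-bodies` is hypothetical.

```elisp
;; Hypothetical helper: finds the text between @verbatim/@code (or \verbatim/\code)
;; and the matching end command. It does not do any fontification.
(defun my-doxy-block-bodies (text)
  "Return the bodies of verbatim and code blocks found in TEXT, in order."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((case-fold-search nil)
          (bodies '()))
      (while (re-search-forward "[\\@]\\(verbatim\\|code\\)\\_>" nil t)
        (let ((body-start (point))
              ;; Build the end pattern from the command that opened the block.
              (end-re (concat "[\\@]end" (regexp-quote (match-string 1)) "\\_>")))
          (when (re-search-forward end-re nil t)
            (push (buffer-substring-no-properties body-start (match-beginning 0))
                  bodies))))
      (nreverse bodies))))

(my-doxy-block-bodies "a @verbatim @foo @endverbatim b")
;; => (" @foo ")
```

Inside such a body, `@foo` should keep the plain doc-comment face rather than the markup face; turning these ranges into actual faces is the part that remains open.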

Here's a test case:

/**
@file DoxygenTest.cpp

@page Test_page Test Doxygen examples in Emacs

Some of the examples below are taken from http://doxygen.nl/autolink.html. Note Doxygen will convert
URLs and e-mails, e.g. user@domain.com to links and Emacs will syntactically color them
differently from the Doxygen special commands.

@section Autolinks

A link to a member of the Autolink_Test class: Autolink_Test::member. Note, to have the
class syntactically highlighted, you need to prefix it with a hash character, e.g. #Autolink_Test.

More specific links to the each of the overloaded members:
Autolink_Test::member(int) and Autolink_Test#member(int,int). 
Can also do an instance call, e.g. Autolink_Test::member(x->(), 2*(y+z)).

The ref command is optional, @ref Autolink_Test::member(int) and @ref Autolink_Test#member(int,int)
but required when the class is @ref lowercase

A link to a protected member variable of Autolink_Test: Autolink_Test#var,

A link to the global enumeration type #GlobEnum.

A link to the define #ABS(x) or ABS(x)

A link to the destructor of the Autolink_Test class: Autolink_Test::~Autolink_Test,

A link to the typedef ::B.

A link to the enumeration type Autolink_Test::EType

A link to some enumeration values Autolink_Test::Val1 and ::GVal2

@section General Doxygen Items

You can have verbatim, code, and formula blocks. Emacs fontification does get confused
for blocks, so we fontify the start and end differently.

@verbatim
  Some verbatim text @foo @bar here.
@endverbatim

Code block:

@code{.cpp}
  if (a > 2) {
      doit();
  }
@endcode

A formula can be inlined: The distance between \f$(x_1,y_1)\f$ and \f$(x_2,y_2)\f$ 
is \f$\sqrt{(x_2-x_1)^2+(y_2-y_1)^2}\f$.


Math environment multi-line formulas are specified using formula brackets, e.g.

  \f[
    |I_2|=\left| \int_{0}^T \psi(t)
             \left\{
                u(a,t)-
                \int_{\gamma(t)}^a
                \frac{d\theta}{k(\theta,t)}
                \int_{a}^\theta c(\xi)u_t(\xi,t)\,d\xi
             \right\} dt
          \right|
  \f]

Non-math environment formulas or other LaTeX are specified using formula braces, e.g.

   \f{eqnarray*}{
        g &=& \frac{Gm_2}{r^2} \\ 
          &=& \frac{(6.673 \times 10^{-11}\,\mbox{m}^3\,\mbox{kg}^{-1}\,
              \mbox{s}^{-2})(5.9736 \times 10^{24}\,\mbox{kg})}{(6371.01\,\mbox{km})^2} \\ 
          &=& 9.82066032\,\mbox{m/s}^2
   \f}

Non-word Doxygen special commands are used to create certain non-alphanum characters. See
http://www.doxygen.nl/manual/commands.html. These include \\ \@ \& \# \< @> \% \"
\. \:: \-- \--- and the language selectors, e.g. \\~english \~english for english should be
highlighted, likewise \\~ \~ for all languages.

For example, to reference a file that contains 'at' symbols: /foo/@@bar/@@goo/file.txt

Another example is the need to escape 
\#include @<foo.hpp@> or better yet, put in quotes "#include <foo.hpp>".

HTML tags allowed in doxygen e.g. "<b>" <b>to bold a set of words</b> "</b>"

Should be able to bold thing in parens: (@b thing) or not in parens: @b thing.

You can use special commands in part of a word, e.g. foothing where thing is bold: foo@b thing.

Doxygen special commands have no meaning in quotes when on same line, e.g.
"foo @b bar line2 @todo foobar".

*/

/*!
  Autolinks for class documentation.

  Since this documentation block belongs to the class Autolink_Test no link to
  Autolink_Test is generated.

  Two ways to link to a constructor are: #Autolink_Test and Autolink_Test().

  Links to the destructor are: #~Autolink_Test and ~Autolink_Test().

  A link to a member in this class: member().

  More specific links to the each of the overloaded members:
  member(int) and member(int,int).

  A link to the variable #var.

  A link to the global typedef ::B.

  A link to the global enumeration type #GlobEnum.

  A link to the define ABS(x).

  A link to a variable \link #var using another text\endlink as a link.

  A link to the enumeration type #EType.

  A link to some enumeration values: \link Autolink_Test::Val1 Val1 \endlink and ::GVal1.

  And last but not least a link to a file: Autolink_Test.cpp.

  function() or function(int, std::string&, void*)

  \sa Inside a see also section any word is checked, so EType,
      Val1, GVal1, ~Autolink_Test and member will be replaced by links in HTML.

*/
class Autolink_Test {
  public:
    Autolink_Test();               //!< constructor
    ~Autolink_Test();              //!< destructor
    void member(int);     /**< A member function. Details. */
    void member(int,int); /**< An overloaded member function. Details */
    /** An enum type. More details */
    enum EType {
        Val1,               /**< enum value 1 */
        Val2                /**< enum value 2 */
    };
  protected:
    int var;              /**< A member variable */
};

/*! details. */
Autolink_Test::Autolink_Test() { }

/*! details. */
Autolink_Test::~Autolink_Test() { }

class lowercase {};

/*! A global variable. */
int globVar;

/*! A global enum. */
enum GlobEnum {
    GVal1,    /*!< global enum value 1 */
    GVal2     /*!< global enum value 2 */
};

/*!
 *  A macro definition.
 */
#define ABS(x) (((x)>0)?(x):-(x))

typedef Autolink_Test B;
/*! \fn typedef Autolink_Test B
 *  A type definition.
 */

To see the default Doxygen output, run:

doxygen -g doxygen.cfg
doxygen doxygen.cfg
albert
John32ma
  • Could you show some examples that fail? A more comprehensive solution is nice, but I find Emacs is already on the slow side, so I prefer to keep this stupid/simple if possible. Could you show a multi-line example that fails? I tried to spot a difference but couldn't. – ideasman42 Nov 06 '17 at 02:00
  • Noticed this breaks `(@a foo)`, which is valid Doxygen. In fact, `foo@a bar` is also valid. – ideasman42 Nov 06 '17 at 02:28
  • I fixed the (@a foo) and foo@a bar items, plus more. I also added a test case. It's not perfect, so any help on how to syntactically handle verbatim blocks would be welcome. – John32ma Nov 09 '17 at 20:04
2

I just released a package for highlighting Doxygen comments. In addition to highlighting Doxygen commands and their arguments, it highlights code examples according to the language they are written in.

See highlight-doxygen for more information.
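A minimal way to try it might look like the sketch below. It assumes the package is already installed (for example from MELPA) and that it provides the minor mode `highlight-doxygen-mode`; check the package's own documentation if the names differ in your version.

```elisp
;; Assumes highlight-doxygen is installed, e.g. with
;; M-x package-install RET highlight-doxygen RET.
(when (require 'highlight-doxygen nil t)   ; do nothing if the package is missing
  ;; Enable the minor mode in C, C++ and other cc-mode based buffers.
  (add-hook 'c-mode-common-hook #'highlight-doxygen-mode))
```

Using a hook rather than a global switch keeps the extra highlighting limited to cc-mode buffers.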

Lindydancer