TL;DR

Is there some method in Emacs to replace text with nested braces/parentheses?

Example

I had a situation where I had to replace

\dchi{ARG1}{ARG2}

by

\chi^{ARG2}(ARG1)

For simple cases, this is easily done with query-replace-regexp:

(query-replace-regexp 
  "\\\\dchi{\\(?1:.*?\\)}{\\(?2:.*?\\)}" 
  "\\\\chi^{\\2}(\\1)")

This will, however, choke on nested brace structures.

Before replacement:   \dchi{\xx{0}{2}}{\yy}
Obtained replacement: \chi^{2}(\xx{0)}{\yy}
Wanted replacement:   \chi^{\yy}(\xx{0}{2})
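
The failure can be reproduced outside of query-replace-regexp. The sketch below wraps the same pattern in a small helper (my/naive-dchi-replace is a hypothetical name, not an existing command) and applies it to a string with replace-regexp-in-string:

```elisp
(defun my/naive-dchi-replace (string)
  "Apply the regexp-only \\dchi rewrite to STRING (hypothetical helper)."
  (replace-regexp-in-string
   "\\\\dchi{\\(?1:.*?\\)}{\\(?2:.*?\\)}"   ; same pattern as above
   "\\\\chi^{\\2}(\\1)"                      ; same replacement as above
   string
   t))                                       ; FIXEDCASE: leave letter case alone

(my/naive-dchi-replace "\\dchi{a}{b}")
;; => "\\chi^{b}(a)"            i.e. \chi^{b}(a), correct for flat arguments
(my/naive-dchi-replace "\\dchi{\\xx{0}{2}}{\\yy}")
;; => "\\chi^{2}(\\xx{0)}{\\yy}" i.e. \chi^{2}(\xx{0)}{\yy}, broken
```

The non-greedy .*? stops at the first "}{" it sees, which here lies inside ARG1, so the groups are cut at the wrong braces.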

Another example would be refactoring function calls with sub-calls, e.g.

Pattern             f1(ARG1, ARG2, ARG3, ARG4)
Replace by          f2(ARG4, ARG3, ARG2, ARG1)

Text with nesting   f1(A, B, f3(C, D), E)
Wanted replacement  f2(E, f3(C, D), B, A)
Likely result       f2(D, f3(C, B, A), E)

As I understand it, this is simply a case that goes beyond the capabilities of regular expressions (“they don't count”).
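
To make "counting" concrete, here is a minimal sketch of the bookkeeping a regexp cannot do: it tracks the parenthesis depth and splits an argument list only at commas where the depth is zero. The name my/split-top-level-args is hypothetical, and the sketch assumes the input has balanced parentheses and no commas or parentheses inside string literals.

```elisp
(require 'subr-x)                       ; for `string-trim' on older Emacs versions

(defun my/split-top-level-args (string)
  "Split STRING at commas that are not nested inside parentheses."
  (let ((depth 0) (start 0) (args '()))
    (dotimes (i (length string))
      (let ((ch (aref string i)))
        (cond ((eq ch ?\() (setq depth (1+ depth)))   ; entering a sub-call
              ((eq ch ?\)) (setq depth (1- depth)))   ; leaving a sub-call
              ((and (eq ch ?,) (= depth 0))           ; top-level separator
               (push (string-trim (substring string start i)) args)
               (setq start (1+ i))))))
    (push (string-trim (substring string start)) args) ; last argument
    (nreverse args)))

;; Reverse the arguments of f1(A, B, f3(C, D), E) and emit an f2 call.
(let ((args (my/split-top-level-args "A, B, f3(C, D), E")))
  (concat "f2(" (mapconcat #'identity (reverse args) ", ") ")"))
;; => "f2(E, f3(C, D), B, A)"
```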

Such one-off use cases are typically too easy to fix by hand to warrant writing a script that does the brace matching, yet they take enough effort to be annoying.

Is there some other tool in Emacs that can help in such cases, or a shell utility that makes such generalized-regexp situations easier than writing a script from scratch?

kdb
  • Try `peg` https://github.com/emacsmirror/peg . I recently posted an example usage: https://emacs.stackexchange.com/a/36477/563 . – wvxvw Nov 01 '17 at 15:43

2 Answers

This is a situation where I'd prefer to use a keyboard macro. Start the macro by searching forward to the beginning of your match, then perform generalized actions with commands like forward-sexp, backward-sexp, and move-end-of-line to create movements that are syntax-aware rather than tied to a specific set or count of characters. I often use the mark to save a position that I can pop back to later in the macro.

Here is an example of using a macro to solve your first problem:

[image: demonstration of the macro]

My macro consisted of the following actions:

  • search forward to the next dchi
  • move to the beginning of the search result
  • delete one char (d)
  • move forward 3 chars
  • insert ^
  • start my region
  • move forward one expression using forward-sexp (which is smart about brackets)
  • kill my region (cut ARG1 together with its surrounding brackets)
  • move forward one expression (go behind the end bracket of ARG2)
  • yank my kill
  • mark my current position
  • move backward one expression
  • delete one character and insert a left paren
  • pop back to my previous position
  • delete the previous character and replace with a right paren
  • end macro

You can see that even though the contents of the two bracketed areas differ greatly from instance to instance, and even though the location and surrounding text is very different, the macro applies to each desired location correctly.
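
The same steps can also be written down as a small command. This is only a sketch of the idea, not the recorded macro itself: my/dchi-to-chi is a hypothetical name, and the code assumes that the buffer's syntax table treats { and } as paired delimiters (so that forward-sexp skips a whole brace group) and that {ARG2} follows {ARG1} without whitespace in between.

```elisp
(defun my/dchi-to-chi (start end)
  "Rewrite \\dchi{ARG1}{ARG2} as \\chi^{ARG2}(ARG1) between START and END.
Hypothetical helper: brace groups are skipped with `forward-sexp', so
nested braces inside either argument are handled."
  (interactive "r")
  (save-excursion
    (save-restriction
      (narrow-to-region start end)
      (goto-char (point-min))
      (while (search-forward "\\dchi{" nil t)
        (let ((cmd-start (match-beginning 0)))
          (backward-char)                 ; back onto the "{" that opens ARG1
          (let ((arg1-start (point)))
            (forward-sexp)                ; jump over the balanced {ARG1}
            (let ((arg1-end (point)))
              ;; Only rewrite when a second brace group follows directly.
              (when (eq (char-after) ?{)
                (forward-sexp)            ; jump over the balanced {ARG2}
                (let ((arg1 (buffer-substring-no-properties
                             (1+ arg1-start) (1- arg1-end)))
                      (arg2 (buffer-substring-no-properties
                             (1+ arg1-end) (1- (point)))))
                  (delete-region cmd-start (point))
                  (insert "\\chi^{" arg2 "}(" arg1 ")")
                  ;; Rescan the new text so nested \dchi calls are converted too.
                  (goto-char cmd-start))))))))))
```

Select a region and run M-x my/dchi-to-chi. If a brace group is unbalanced, forward-sexp signals a scan-error and the command stops at that point.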

Jordon Biondo

Here's a PEG parser that defines a subset of LaTeX syntax rules and generates a tree representing the LaTeX source:

(defun latex-parse-tree (input)
  (peg-parse-string
   ((s latex-fn `(a -- a))
    (latex-fn latex-fname (list (+ latex-curlies))
              `(fn arg -- (list :func fn :args arg)))
    (latex-fname "\\" latex-word `(a -- a))
    (latex-word (substring (and [alpha] (* [alnum])))
                `(a -- (list :name a)))
    (latex-curlies
     (or (and "{}" `(-- (list :opts nil)))
         (and (list "{" latex-arg (* (and "," latex-arg)) "}")
              `(a -- (list :opts a)))))
    (latex-arg (or latex-word latex-fn) `(a -- (list :argument a))))
   input))


(pp (latex-parse-tree "\\foo{bar}"))
"((:func
  (:name \"foo\")
  :args
  (:opts
   ((:argument
     (:name \"bar\"))))))
"

(pp (latex-parse-tree "\\foo{\\bar{}}"))
"((:func
  (:name \"foo\")
  :args
  (:opts
   ((:argument
     (:func
      (:name \"bar\")
      :args
      (:opts nil)))))))
"
(pp (latex-parse-tree "\\foo{\\bar{},\\baz{}}"))
"((:func
  (:name \"foo\")
  :args
  (:opts
   ((:argument
     (:func
      (:name \"bar\")
      :args
      (:opts nil)))
    (:argument
     (:func
      (:name \"baz\")
      :args
      (:opts nil)))))))
"

You can then use this to perform arbitrary operations on LaTeX source, although you would still need to extend the grammar to handle the full LaTeX language.

Alternatively, use this code as a template and adapt it to your specific problem:

(defun latex--dchi->chi (input)
  (car
   (peg-parse-string
    ((s latex-fn `(a -- a))
     (latex-fn latex-fname latex-curlies latex-curlies
               `(fn opts args -- (if (string= fn "dchi")
                                     (format "\\chi^{%s}(%s)" args opts)
                                   (format "\\%s{%s}{%s}" fn opts args))))
     (latex-fname "\\" latex-word `(a -- a))
     (latex-word (substring (and [alpha] (* [alnum]))) `(a -- a))
     (latex-curlies
      (or (and "{}" `(-- nil))
          (and "{" latex-arg (* (and "," latex-arg)) "}" `(a -- a))))
     (latex-arg (or latex-word latex-fn) `(a -- a)))
    input)))

(defun latex-dchi->chi (start end)
  (interactive "rReplace \\dchi{ARG1}{ARG2} -> \\chi^{ARG2}(ARG1)")
  (setq end (min (1- (point-max)) end))
  (save-excursion
    (while (< (point) end)
      (let* ((start (1- (search-forward "\\")))
             (body (buffer-substring-no-properties start end)))
        (ignore-errors
          (let ((rep (latex--dchi->chi body)))
            (delete-region start (+ start (length rep)))
            (insert rep)
            (goto-char (+ start (length rep)))))))))

Since the LaTeX grammar described in latex--dchi->chi is incomplete, latex-dchi->chi needs a heuristic to try the match on different fragments of the document. I don't know the full grammar of LaTeX, and implementing it would take a long time, so I simply search for the next instance of \ and try to match there. Note also that latex--dchi->chi may not recognize all cases where the substitution should be performed. For example, if an argument is anything other than a latex-word (i.e. a letter followed by letters and digits) or another LaTeX function, it will not match: mathematical expressions such as x+y, comments such as %foo, and even whitespace characters are not handled. You can, of course, extend the grammar to match any of those.

wvxvw