I am looking for a way to programmatically fix invalidly formatted code in programming languages that use relatively simple list expressions of comma-separated elements (i.e., not C++).

For example, I'd like to be able to perform the following transformations:

  • [abc, def,, ghi,] ---> [abc, def, ghi]
  • [abc def ghi] ---> [abc, def, ghi]
  • (abc [11, 12,, 13,] def ghi,) ---> (abc, [11, 12, 13], def, ghi)

In other words, consecutive elements should be separated by exactly one comma, and the various paren pairs should be supported: ( ), [ ], { } and < >.

Does something like that already exist?

Drew
PRouleau
  • If nothing already exists for this, the easiest way is probably to write a function that uses the Elisp syntax table and regular expressions with `re-search-forward` and `replace-match`. The buffer's syntax table defines which characters are parens, and Elisp regexps provide the syntax classes `\s(` and `\s)` (written `"\\s("` and `"\\s)"` in Lisp strings) to match the opening and closing parens of the current buffer's syntax. – PRouleau Sep 30 '21 at 21:26
  • https://emacs.stackexchange.com/tags/elisp/info – Drew Sep 30 '21 at 22:20
  • @Drew, thanks, I did not look long enough... – PRouleau Sep 30 '21 at 22:29
  • I have implemented something using syntax and regex, a function I call [pel-syntax-fix-block-content](https://github.com/pierre-rouleau/pel/blob/master/pel-syntax.el). It's rather long to put as an answer but if nothing else comes up perhaps I should add it. – PRouleau Oct 01 '21 at 12:03
  • Please post your answer as such. Comments can be deleted at any time, and they're invisible to searches. Thx. – Drew Oct 01 '21 at 17:22
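
The syntax-class idea from the comments can be sketched as follows. This is only an illustration: `my-count-open-parens` is a hypothetical helper (not part of PEL), and it assumes a temporary buffer in Fundamental mode, where `<` and `>` do not have paren syntax unless the syntax table is told otherwise.

```elisp
(defun my-count-open-parens (text &optional angle-brackets)
  "Return the number of characters in TEXT that have open-paren syntax.
Hypothetical helper for illustration only.
When ANGLE-BRACKETS is non-nil, treat < and > as a paren pair."
  (with-temp-buffer
    ;; Work on a fresh table so the standard syntax table is not modified.
    (let ((table (make-syntax-table)))
      (when angle-brackets
        (modify-syntax-entry ?< "(>" table)
        (modify-syntax-entry ?> ")<" table))
      (set-syntax-table table))
    (insert text)
    (goto-char (point-min))
    (let ((count 0))
      ;; "\\s(" matches any single character whose syntax class is open-paren.
      (while (re-search-forward "\\s(" nil t)
        (setq count (1+ count)))
      count)))

(my-count-open-parens "(abc [11, 12] {x} <y>)")    ; => 3
(my-count-open-parens "(abc [11, 12] {x} <y>)" t)  ; => 4
```

Because the regexp relies on the syntax table rather than a hard-coded character set, the same search adapts to whatever the current major mode considers a paren.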

1 Answer

I did not find anything that does this, so I wrote my own. Here's a slightly modified copy. The last function, `pel-syntax-fix-block-content`, is the one to use:


(require 'cl-lib)   ; provides `cl-incf', used by the block fixer below

;; Predicates
;; ----------

(defun pel-inside-string-p (&optional pos)
  "Return non-nil if POS, or point, is inside a string, nil otherwise."
  (nth 3 (syntax-ppss pos)))

;; Utilities
;; ---------

(defun pel-syntax-matching-parens-position (&optional parens-pos)
  "Return the position of the paren that matches the one at PARENS-POS.
PARENS-POS defaults to point."
  (setq parens-pos (or parens-pos (point)))
  (save-excursion
    (let ((parens-char (char-after (goto-char parens-pos))))
      (if (memq parens-char '(?\( ?\[ ?\{ ?<))
          (progn
            (forward-sexp)
            (backward-char))
        (if (memq parens-char '(?\) ?\] ?\} ?>))
            (progn
              (forward-char)
              (backward-sexp))
          (error "Invalid sexp character: %S" parens-char)))
      (point))))

;; Block syntax fixer
;; ------------------
;;

(defun pel-syntax-block-text-at (&optional pos)
  "Return the outermost paren block enclosing POS or point.
The value is a list (OPEN-POSITION CLOSE-POSITION TEXT)."
  (setq pos (or pos (point)))
  (let* ((syntax       (syntax-ppss pos))
         (open-p-pos  (car (nth 9 syntax)))
         (close-p-pos (pel-syntax-matching-parens-position open-p-pos)))
    (list open-p-pos
          close-p-pos
          (buffer-substring-no-properties open-p-pos (+ 1 close-p-pos)))))

(defun pel-syntax-skip-string (&optional pos)
  "Move point to character just after end of string.
Start from POS or current point."
  (goto-char (or pos (point)))
  (while (and (pel-inside-string-p)
              (not (eobp)))
    (forward-char)))

(defun pel---replace-with (from replacer)
  "Replace text in current buffer, skipping matches that end inside strings.
FROM is the regex identifying the text to change.
REPLACER is a function of no arguments that returns the new text.
It can use the match data set by `re-search-forward', for example
through `match-string'.

Return the number of replacements done."
  (let ((found-pos nil)
        (change-count 0))
    (while
        (progn
          (goto-char (point-min))
          (setq found-pos (re-search-forward from nil :noerror))
          ;; if found item is in string, skip the string and search again:
          ;; do not transform the content of strings.
          (while (and found-pos
                      (pel-inside-string-p found-pos))
            (pel-syntax-skip-string found-pos)
            (setq found-pos (re-search-forward from nil :noerror)))
          (when found-pos
            (replace-match (funcall replacer) :fixedcase :literal)
            (setq change-count (1+ change-count))
            t)))
    change-count))

(defmacro pel-replace (from &rest to)
  "Replace text in buffer.

FROM must be a regex string.
TO is one or more forms, the last of which produces the replacement string.
These forms are evaluated after each successful `re-search-forward',
so they can use the match data, for example through `match-string'."
  `(pel---replace-with ,from (lambda () ,@to)))

;; --

(defun pel-syntax-fix-block-content (&optional pos)
  "Comma-separate all expressions inside the block-pair at POS or point.

Does not transform text inside any string located inside the matched-pair
block, but it may transform other text.

Returns the number of text modifications performed."
  (save-excursion
    (save-restriction
      (let* ((open.close.text (pel-syntax-block-text-at pos))
             (open-pos  (nth 0 open.close.text))
             (close-pos (nth 1 open.close.text))
             (total-changes 0)
             (changes 1))
        (narrow-to-region open-pos (1+ close-pos))
        (while (> changes 0)
          (setq changes 0)
          ;; -> ensure one space between comma and next element.
          ;;    Also eliminate multiple commas between 2 symbols.
          (cl-incf changes (pel-replace "\\(\\w\\),+\\(\\w\\)"
                                        (format "%s, %s"
                                                (match-string 1)
                                                (match-string 2))))
          ;; -> remove spaces between word, closing parens or quotes and the
          ;;    following comma
          (cl-incf changes (pel-replace "\\(\\w\\|\\s)\\|\\\"\\|'\\) +,"
                                        (format "%s," (match-string 1))))
          ;; -> replace multiple commas by a single one
          (cl-incf changes (pel-replace "\\(\\w\\),,+ "
                                        (format "%s, " (match-string 1))))
          ;; -> add a comma after word or closing parens if there is none
          ;;    before the next word or opening parens
          (cl-incf changes (pel-replace "\\(\\w\\|\\s)\\) +\\(\\w\\|\\s(\\)"
                                        (format "%s, %s"
                                                (match-string 1)
                                                (match-string 2))))
          ;; -> remove trailing commas placed just before the closing parens
          (cl-incf changes (pel-replace ",\\s-*\\(\\s)\\)"
                                        (match-string 1)))
          ;; -> replace multiple spaces after a comma by 1 space after comma
          (cl-incf changes (pel-replace ",  +\\([^ ]\\)"
                                        (format ", %s" (match-string 1))))
          ;; -> remove isolated commas not separating anything
          (cl-incf changes (pel-replace ", +," ","))
          ;; -> In erlang buffers move period after closing parens
          ;;    if it is before
          (when (eq major-mode 'erlang-mode)
            (cl-incf changes (pel-replace "\\.\\(\\s)\\)"
                                          (format "%s." (match-string 1)))))
          ;;
          (cl-incf total-changes changes))
        total-changes))))
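
As a usage sketch, assuming all the definitions above have been evaluated: `my-demo-fix-list` is a hypothetical helper (not part of PEL) that runs the fixer on a string in a temporary buffer. That buffer is in Fundamental mode, so `( )`, `[ ]` and `{ }` have paren syntax but `< >` do not; handling angle brackets would need a syntax table that declares them as a pair.

```elisp
(defun my-demo-fix-list (text)
  "Return TEXT after running `pel-syntax-fix-block-content' on it.
Hypothetical helper for illustration; TEXT must start with an opening paren."
  (with-temp-buffer
    (insert text)
    ;; Put point just inside the opening paren so the fixer finds the block.
    (goto-char (1+ (point-min)))
    (pel-syntax-fix-block-content)
    (buffer-string)))

(my-demo-fix-list "[abc, def,, ghi,]")  ; => "[abc, def, ghi]"
(my-demo-fix-list "[abc def ghi]")      ; => "[abc, def, ghi]"
```

In an interactive session the same effect is obtained by placing point inside the block and evaluating `(pel-syntax-fix-block-content)`.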
PRouleau