In an org tree consisting of hundreds of headings, how do I find out if there are any that are duplicates?

So, for instance, in the tree below,

* Heading 1
** some
** other
** headings
** some

I want to detect that 'some' appears twice. I could sort the headings alphabetically and spot the duplicates by eye, but with hundreds of headings that is not practical.

Also, can I control whether it is a perfect match (everything under 'some' is exactly the same in both instances) or just a heading match (we do not care about what is below 'some')?

This question is very similar to what I am asking, but unfortunately no answer is provided there, although there are some hints in a comment.

deshmukh

2 Answers

Here's a function that collects all duplicate headings and their respective position in the buffer:

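(require 'cl-lib) ; for `cl-count'

(defun collect-duplicate-headings ()
  "Return an alist of (TITLE . POSITION) for headings whose title repeats."
  (let (hls dups)
    (save-excursion
      ;; Walk the buffer backwards so that pushing keeps `hls' in buffer order.
      (goto-char (point-max))
      (while (re-search-backward org-complex-heading-regexp nil t)
        (let* ((el (org-element-at-point))
               (hl (org-element-property :title el))
               (pos (org-element-property :begin el)))
          (push (cons hl pos) hls)))
      ;; Keep every heading whose title occurs more than once.
      (dolist (hl hls)
        (when (> (cl-count (car hl) (mapcar #'car hls)
                           :test #'equal)
                 1)
          (push hl dups)))
      (nreverse dups))))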

And here's how you can use it with Helm (in case you use it):

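(require 'helm)

(defun show-duplicate-headings ()
  "Browse duplicate headings of the current buffer with Helm."
  (interactive)
  (helm :sources (helm-build-sync-source "Duplicate headings"
                   ;; Candidates are (TITLE . POSITION) pairs; Helm displays
                   ;; the title and passes the position to the action.
                   :candidates (lambda ()
                                 (with-helm-current-buffer
                                   (collect-duplicate-headings)))
                   :follow 1
                   :action #'goto-char)))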
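
If you do not use Helm, the same list can be offered through the built-in `completing-read`. The following is only a minimal sketch, not part of the answer above: the names `my/org-duplicate-heading-alist` and `my/org-jump-to-duplicate-heading` are hypothetical, and it assumes that comparing the `:raw-value` of each headline (the title without TODO keyword, priority and tags) is what you want. It counts titles in a hash table instead of calling `cl-count` once per heading.

```elisp
(require 'org)
(require 'org-element)
(require 'cl-lib)

(defun my/org-duplicate-heading-alist ()
  "Return (TITLE . POSITION) pairs for headings whose title repeats.
Hypothetical helper; titles are compared with `equal'."
  (let ((counts (make-hash-table :test #'equal))
        ;; Parse headlines only; deeper granularity is not needed here.
        (entries
         (org-element-map (org-element-parse-buffer 'headline) 'headline
           (lambda (hl)
             (cons (org-element-property :raw-value hl)
                   (org-element-property :begin hl))))))
    ;; Count how often each title occurs.
    (dolist (entry entries)
      (cl-incf (gethash (car entry) counts 0)))
    ;; Keep the headings whose title was seen more than once.
    (cl-remove-if-not (lambda (entry) (> (gethash (car entry) counts) 1))
                      entries)))

(defun my/org-jump-to-duplicate-heading ()
  "Prompt for a duplicated heading and move point to it."
  (interactive)
  (let ((dups (my/org-duplicate-heading-alist)))
    (if (null dups)
        (message "No duplicate headings found")
      ;; Titles repeat by definition, so the line number is appended
      ;; to make every completion candidate unique.
      (let* ((candidates
              (mapcar (lambda (entry)
                        (cons (format "%s (line %d)" (car entry)
                                      (line-number-at-pos (cdr entry)))
                              (cdr entry)))
                      dups))
             (choice (completing-read "Duplicate heading: " candidates nil t)))
        (goto-char (cdr (assoc choice candidates)))
        (org-reveal)))))
```

Run `M-x my/org-jump-to-duplicate-heading` in an Org buffer; on the tree from the question it should offer the two 'some' headings.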

To collect "perfect matches" use the following code instead:

(require 'cl-lib) ; for `cl-loop' and `cl-count'

(defun collect-duplicate-headings ()
  "Return an alist of (TITLE . POSITION) for headings whose subtree repeats."
  (let (dups contents hls)
    (save-excursion
      (goto-char (point-max))
      (while (re-search-backward org-complex-heading-regexp nil t)
        (let* ((el (org-element-at-point))
               (hl (org-element-property :title el))
               (pos (org-element-property :begin el)))
          (push (cons hl pos) hls)))
      ;; Attach to each heading the text of its subtree, heading line included.
      (setq contents
            (cl-loop for hl in hls
                     for beg = (progn (goto-char (cdr hl))
                                      (line-beginning-position))
                     for end = (org-end-of-subtree nil t)
                     for content = (buffer-substring-no-properties beg end)
                     collect (list (car hl) (cdr hl) content)))
      ;; Keep every heading whose subtree text occurs more than once.
      (dolist (elt contents)
        (when (> (cl-count (nth 2 elt)
                           (mapcar (lambda (c) (nth 2 c)) contents)
                           :test #'equal)
                 1)
          (push (cons (car elt)
                      (nth 1 elt))
                dups)))
      (nreverse dups))))
jagrg

This function lists duplicate headings in the current buffer based on string identity. You can extend the storage structure to include more information about the contents of the text and test accordingly. Hints on how to do it can be found at: http://ergoemacs.org/emacs/elisp_parse_org_mode.html

(require 'dash) ; for -contains?

(defun my-print-duplicate-headings ()
  "Print duplicate headings from the current org buffer."
  (interactive)
  (with-output-to-temp-buffer "*temp-out*"
    (let ((header-list '())) ; start with empty list
      (org-element-map (org-element-parse-buffer) 'headline
        (lambda (x)
          (let ((header (org-element-property :raw-value x)))
            (when (-contains? header-list header)
              (princ header)
              (terpri))
            (push header header-list)))))))
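
One possible way to extend the storage structure to cover contents, sketched under assumptions: the key becomes the pair of title and body text, where the body is taken from the headline's `:contents-begin` and `:contents-end` properties. The function name `my/org-duplicate-subtrees` is hypothetical, and because the body text includes the stars of child headings, identical subtrees at different levels will not match unless they have no children.

```elisp
(require 'org-element)
(require 'cl-lib)

(defun my/org-duplicate-subtrees ()
  "Return (TITLE . POSITION) pairs for headings whose title and body repeat.
Hypothetical helper; must be called in the Org buffer being checked."
  (let ((counts (make-hash-table :test #'equal))
        (entries
         (org-element-map (org-element-parse-buffer 'headline) 'headline
           (lambda (hl)
             (let* ((title (org-element-property :raw-value hl))
                    (cbeg (org-element-property :contents-begin hl))
                    (cend (org-element-property :contents-end hl))
                    ;; A heading without contents has no :contents-begin.
                    (body (if cbeg
                              (buffer-substring-no-properties cbeg cend)
                            "")))
               ;; Store (KEY TITLE POSITION), where KEY is title plus body.
               (list (cons title body)
                     title
                     (org-element-property :begin hl)))))))
    ;; Count how often each title/body combination occurs.
    (dolist (entry entries)
      (cl-incf (gethash (car entry) counts 0)))
    ;; Report the headings whose combination was seen more than once.
    (cl-loop for (key title pos) in entries
             when (> (gethash key counts) 1)
             collect (cons title pos))))
```

Evaluating `(my/org-duplicate-subtrees)` with `M-:` in an Org buffer returns the matching headings in buffer order, so the result can be printed or passed to `goto-char` in the same way as the title-only list.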
Heikki
  • An alternative interface to this functionality: Report org headings with duplicate names in the *Messages* buffer. Reveal them and move to the last duplicate. Run after saving the buffer. https://gist.github.com/heikkil/0e8e2342f0f3136ba0d1a4da174fc185 – Heikki Mar 17 '19 at 09:43