5

I need to fix a large .bib file where all proper names are in ALL CAPITALS (thanks ProQuest!)

Is there an elisp function that searches a buffer/region for words in all capitals and turns all characters except the first one into lowercase letters?

E.g.

DOE, JOHN --> Doe, John

I found a couple of functions on Xah Lee's excellent site that work on cases (Toggle cases, Title cases), but I don't know enough elisp to turn either of them into what I need.

Drew
stefano

2 Answers

6

How about using keyboard macros:

  • <f3> -- start recording
  • C-M-s \b[A-Z]\{2,\}\b RET -- find the next word of only upper-case letters and at least 2 letters.
  • M-- M-c -- Call capitalize-word with a negative argument, i.e. on the word before point
  • M-0 <f4> -- Stop recording and run the macro repeatedly until an error occurs

n.b. If that's catching non-upper-case words as well, you'd need to set isearch-case-fold-search.


Or more directly (and you could run this via M-: as a one-time thing, or make a new interactive function if you want to re-use it):

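(let ((case-fold-search nil))          ; make the search case-sensitive
  ;; Search forward from point for whole words of 2+ upper-case letters.
  (while (re-search-forward "\\b[A-Z]\\{2,\\}\\b" nil :noerror)
    ;; Point is now just after the word, so capitalize the word behind it.
    (capitalize-word -1)))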

Regarding the comments: some corner cases may well need custom attention, but Emacs does have you covered on diacritics, since we can simply use [[:upper:]] in place of [A-Z].

The following kind of modification might be worth trying to catch the other elements:

(let ((case-fold-search nil)
      (pattern "\\(?:\\b\\|Ma?c\\|'\\)\\([[:upper:]]\\{2,\\}\\)\\b"))
  (while (re-search-forward pattern nil :noerror)
    (save-restriction
      (narrow-to-region (match-beginning 1) (match-end 1))
      (capitalize-word -1))))

but perhaps it's good enough to simply relax the pattern to:

"\\([[:upper:]]\\{2,\\}\\)\\b"
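
To see how either pattern treats the problem names from the comments, the loop can be wrapped in a small string helper. This is only a sketch: my/capitalize-caps-in-string is a hypothetical name, not part of Emacs, and it works in a temporary buffer in fundamental-mode, so word boundaries may behave differently under another mode's syntax table.

```elisp
;; Hypothetical helper for experimenting with the patterns; the name
;; and the optional PATTERN argument are assumptions for illustration.
(defun my/capitalize-caps-in-string (string &optional pattern)
  "Return a copy of STRING with all-caps words capitalized.
PATTERN is a regexp whose group 1 matches the letters to convert;
it defaults to the relaxed pattern shown above."
  (let ((pattern (or pattern "\\([[:upper:]]\\{2,\\}\\)\\b")))
    (with-temp-buffer
      (insert string)
      (goto-char (point-min))
      (let ((case-fold-search nil))   ; match upper-case letters only
        (while (re-search-forward pattern nil :noerror)
          (save-restriction
            ;; Narrow to group 1 so that `capitalize-word' leaves any
            ;; prefix such as "Mc" or "O'" untouched.
            (narrow-to-region (match-beginning 1) (match-end 1))
            (capitalize-word -1))))
      (buffer-string))))

(my/capitalize-caps-in-string "O'FARRELL, McCULLOGH, BUÑUEL")
;; => "O'Farrell, McCullogh, Buñuel"
```

Any run of two or more upper-case letters matches, so acronyms are converted as well ("USA" becomes "Usa"); those would still need individual attention.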
phils
  • Excellent! I turned it into a command and that's exactly what I needed. Once I learn more elisp I will make it optionally work on regions, but for now it's perfect. – stefano Mar 27 '17 at 22:58
  • Instead of `M-:` (the editing facilities that it offers are... limited), I'd suggest wrapping the form with `with-current-buffer`, and evaluating it in `*scratch*` or IELM. Or making it an interactive defun, even for a one-time use. – T. Verron Mar 28 '17 at 09:10
  • After some (admittedly, rather hasty) testing, there seems to be a problem with names containing an apostrophe (O'FARRELL), Scottish-like names with alternating cases (say, McCULLOGH), and names with diacritics (say, BUÑUEL). I suppose the first two are corner cases that must be worked through individually, but what about diacritics? Is there a diacritic-aware version of [A-Z]? – stefano Mar 28 '17 at 20:28
  • There is indeed! See `[:upper:]` and `[:lower:]` in `C-h i g (elisp) Char Classes`. I've updated the answer accordingly. – phils Mar 28 '17 at 21:55
1

If you want to take @phils's answer and wrap it in a command that handles the region, here's one approach:

(defun caps-to-title-case (start end)
  "Convert words in CAPS to Title Case in the current region or buffer."
  (interactive (if (use-region-p)
                   (list (region-beginning) (region-end))
                 (list (point-min) (point-max))))
  (save-excursion
    (goto-char start)
    (let ((case-fold-search nil))
      (while (re-search-forward "\\b[A-Z]\\{2,\\}\\b" end :noerror)
        (capitalize-word -1)))))

When called interactively this will restrict the changes to the current region if there is one, and otherwise will modify the entire buffer.
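
For names with diacritics or prefixes, the same command could presumably be combined with the [[:upper:]] pattern from the other answer. The following is a sketch under that assumption; the name my/caps-to-title-case-unicode is hypothetical, and using a marker for the search bound is a precaution added here (in case a case conversion changes the length of the text), not something either answer requires.

```elisp
;; Hypothetical variant of `caps-to-title-case' using [[:upper:]].
(defun my/caps-to-title-case-unicode (start end)
  "Capitalize runs of 2+ upper-case letters between START and END.
Interactively, act on the region if active, else on the whole buffer."
  (interactive (if (use-region-p)
                   (list (region-beginning) (region-end))
                 (list (point-min) (point-max))))
  (save-excursion
    (goto-char start)
    (let ((case-fold-search nil)
          ;; A marker keeps tracking the end of the range even if a
          ;; conversion were to change the length of the text.
          (bound (copy-marker end)))
      (while (re-search-forward "\\([[:upper:]]\\{2,\\}\\)\\b" bound :noerror)
        (save-restriction
          ;; Restrict the change to the matched letters only.
          (narrow-to-region (match-beginning 1) (match-end 1))
          (capitalize-word -1)))
      ;; Detach the marker now that it is no longer needed.
      (set-marker bound nil))))
```

Called from Lisp with explicit positions, it only touches words that end inside the given range, which makes it easy to try on a few entries before running it over the whole .bib file.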

glucas
  • Thanks. I had basically written the same function, but had missed the save-excursion call. This is perfect – stefano Mar 28 '17 at 20:24