I'm using the latest csv-mode in Emacs 25.3.1 on CentOS 6.10. The problem is that I have many fields that are long documentation strings, and many of them contain embedded newlines. This confuses csv-mode and causes the affected rows to be split across multiple lines in the buffer.

How can I make this usable? Is there a way to replace newlines embedded in strings with some other character (e.g. |) when reading the file and then put them back when writing the file?

Drew
David R

1 Answer

The following code, run from csv-mode-hook, replaces newlines in quoted fields with the customizable string csv+-quoted-newline when the file is read.

It does the inverse replacement in write-contents-functions when the buffer is saved.

Excerpt of the doc string of write-contents-functions:

List of functions to be called before writing out a buffer to a file.

Only used by `save-buffer'. If one of them returns non-nil, the file is considered already written and the rest are not called and neither are the functions in `write-file-functions'. This hook can thus be used to create save behavior for buffers that are not visiting a file at all.

This variable is meant to be used for hooks that pertain to the buffer's contents, not to the particular visited file; thus, `set-visited-file-name' does not clear this variable; but changing the major mode does clear it.

(defcustom csv+-quoted-newline "\^@"
  "Replacement for newlines in quoted fields."
  ;; `CSV' is the customization group defined by csv-mode.
  :group 'CSV
  :type 'string)

(defun csv+-quoted-newlines (&optional b e inv)
  "Replace newlines in quoted fields of region B E by `csv+-quoted-newline'.
B and E default to `point-min' and `point-max', respectively.
If INV is non-nil replace quoted `csv+-quoted-newline' strings by newlines."
  (interactive
   (if (region-active-p)
       (list (region-beginning) (region-end) current-prefix-arg)
     (list nil nil current-prefix-arg)))
  (unless b (setq b (point-min)))
  ;; Use a marker for the end so it stays valid if replacements change the length.
  (setq e (copy-marker (or e (point-max))))
  (save-excursion
    (goto-char b)
    (let ((from (if inv csv+-quoted-newline "\n"))
          (to (if inv "\n" csv+-quoted-newline)))
      (while (search-forward from e t)
        ;; Element 3 of the `syntax-ppss' state is non-nil inside a string.
        (when (nth 3 (save-excursion (syntax-ppss (match-beginning 0))))
          ;; Replace literally, without case conversion.
          (replace-match to t t)))))
  (set-marker e nil))

(defun csv+-quoted-newlines-write-contents ()
  "Inverse operation of `csv+-quoted-newlines' for the full buffer.
Intended for use in `write-contents-functions'."
  (save-excursion
    (save-restriction
      (widen)
      (let ((file (buffer-file-name))
            (contents (buffer-string))
            ;; Write the temp buffer with the visited buffer's coding system.
            (coding-system-for-write buffer-file-coding-system))
        (with-temp-buffer
          (insert contents)
          (csv+-quoted-newlines (point-min) (point-max) t)
          (write-region (point-min) (point-max) file)))))
  (set-visited-file-modtime)
  (set-buffer-modified-p nil)
  ;; Non-nil: the file has been written (see `write-contents-functions').
  t)

(defun csv+-setup-quoted-newlines ()
  "Hook function for `csv-mode-hook'.
Transform newlines in quoted fields to `csv+-quoted-newline'
when reading files and the other way around when writing contents."
  (add-hook 'write-contents-functions #'csv+-quoted-newlines-write-contents t t)
  (let ((modified-p (buffer-modified-p)))
    (csv+-quoted-newlines)
    (set-buffer-modified-p modified-p)))

(add-hook 'csv-mode-hook #'csv+-setup-quoted-newlines)

Tested with Emacs 26.1, csv-mode 1.7 and the following CSV file:

11, 12, "Some
string", 14
21, 22, "Another
string", 24
31, 32, "And just another
string.", 34
41, 42, "And just another
string.", 44

Emacs displays the file after find-file as follows:

11, 12, "Some^@string", 14
21, 22, "Another^@string", 24
31, 32, "And just another^@string.", 34
41, 42, "And just another^@string.", 44

Here, ^@ is the null character. You can type it with C-q C-@ to insert further newlines into quoted strings.
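
To try the replacement logic outside csv-mode, here is a minimal sketch that works on a string in a temporary buffer. The helper name my-csv-swap-quoted is hypothetical and not part of csv-mode or of the code above. It assumes that double quotes delimit fields, and it uses | as the placeholder, as suggested in the question, instead of the null character.

```elisp
;; Hypothetical helper for experimenting; not part of csv-mode.
(defun my-csv-swap-quoted (text from to)
  "Return TEXT with FROM replaced by TO inside double-quoted fields only."
  (with-temp-buffer
    ;; Make sure `"' has string-quote syntax so `syntax-ppss' can
    ;; tell whether a position lies inside a quoted field.
    (let ((table (make-syntax-table)))
      (modify-syntax-entry ?\" "\"" table)
      (set-syntax-table table))
    (insert text)
    (goto-char (point-min))
    (while (search-forward from nil t)
      ;; Element 3 of the parse state is non-nil inside a string.
      (when (nth 3 (save-excursion (syntax-ppss (match-beginning 0))))
        (replace-match to t t)))
    (buffer-string)))

;; Forward direction: only the newline inside the quotes is replaced.
(my-csv-swap-quoted "1, \"a\nb\", 2\n3, 4\n" "\n" "|")

;; Inverse direction: a placeholder outside quotes is left alone.
(my-csv-swap-quoted "1, \"a|b\", 2|3\n" "|" "\n")
```

Note that a visible placeholder such as | can also occur literally inside a quoted field, in which case the inverse replacement would turn it into a newline on save. That is presumably why the answer defaults to the null character.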

Tobias
  • Thanks - this worked great. One caveat: after saving the file, I have to revert-buffer before continuing to edit the file or I get a message asking me if I want to edit a file that has changed on disk. – David R Jun 13 '19 at 22:23
  • @DavidR I've added `set-visited-file-modtime` to `csv+-quoted-newlines-write-contents`. That avoids the problem with buffer-reverting (at least in my tests). – Tobias Jun 14 '19 at 03:50
  • Please use `add-hook` rather than `add-to-list` when manipulating hooks. – Stefan Jun 19 '19 at 19:53
  • @Stefan Thanks for the comment. I have changed `add-to-list` to `add-hook`. I was a bit irritated by the fact that `write-contents-functions` is buffer-local per se. But, a search in the Emacs lisp dir acknowledged that `write-contents-functions` is set via `add-hook`. – Tobias Jun 19 '19 at 23:07
  • @Stefan Do you know what the reason is for making hooks buffer local? Maybe `write-contents-functions` looks formally better as buffer local hook. But, technically that does not matter with `add-hook`, does it? – Tobias Jun 19 '19 at 23:22
  • You don't "make hooks buffer-local": instead, hooks always have a global and a local part and when you `add-hook` you use the `local` argument to tell when you want to add the function to the local or the global part of the hook. For historical reasons, some hooks have no global part, tho this phenomenon should hopefully disappear over time. – Stefan Jun 20 '19 at 01:51
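
The global and local parts of a hook mentioned in the last comment can be observed with a small sketch. All names starting with my-demo- are hypothetical and exist only for this illustration.

```elisp
;; Hypothetical hook and functions, for illustration only.
(defvar my-demo-hook nil "Demo hook.")
(defvar my-demo-log nil "Records which hook functions have run.")

(defun my-demo-global () (push 'global my-demo-log))
(defun my-demo-local () (push 'local my-demo-log))

;; Without the LOCAL argument the function goes to the global part.
(add-hook 'my-demo-hook #'my-demo-global)

(defun my-demo-run ()
  "Run `my-demo-hook' in a temp buffer that also has a local function."
  (setq my-demo-log nil)
  (with-temp-buffer
    ;; With LOCAL non-nil the function goes to the buffer-local part.
    ;; The local value then contains the symbol t, which tells
    ;; `run-hooks' to run the global part as well.
    (add-hook 'my-demo-hook #'my-demo-local nil t)
    (run-hooks 'my-demo-hook))
  (reverse my-demo-log))

(my-demo-run) ; both the local and the global function are called
```

Outside the temporary buffer, (default-value 'my-demo-hook) still contains only my-demo-global, so adding a function locally does not affect other buffers.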