
Using Emacs 25.3.1, I am trying to access the match data after searching a string with string-match, but the match data is wrong. To find out why, I tried the manual's simple example.

    (string-match "\\(qu\\)\\(ick\\)"
                       "The quick fox jumped quickly.")
                       ;0123456789
    (match-string 0 "The quick fox jumped quickly.")

When running this example I get the following results:

4

Debugger entered--Lisp error: (args out of range "The quick fox jumped quickly." 1513 1515)

I think this is because I used string-match on a string before executing this example. However, I expected match-data to only consider my last string-match search and return "quick".

Surprisingly, this behavior persists across sessions. If, after getting this error, I start Emacs with emacs -Q (no init file), paste the "quick fox" example, and execute it, I get the same error (with different numbers).

I tried to reset the match data myself using set-match-data, but that has not worked:

    (match-data) ; => (1520 1520)
    (set-match-data (list 0 0)) ; => nil
    (match-data) ; => (1547 1547)

How can I reset the match data myself so that a string-match search produces the expected values?

Aquaactress
  • Are you evaluating each of these expressions separately with `C-x C-e` or similar? – npostavs Apr 08 '18 at 02:44
  • I specifically used `eval-print-last-sexp`. – Aquaactress Apr 08 '18 at 03:18
  • @Aquaactress, the point is that if you're calling `eval-print-last-sexp` *multiple times* in the course of a single test -- e.g. once for `(string-match...)` and then again for `(match-string...)` -- then Emacs is doing a heap of things between each of those commands, and you cannot assume that the match data was not affected in that time. You want to wrap the code in `(progn ...)` or similarly ensure that a single use of `eval-print-last-sexp` runs *all* the code you are testing. – phils Apr 08 '18 at 12:20
  • This astonishes me and really clears things up. Very appreciated. – Aquaactress Apr 08 '18 at 13:29

2 Answers

The match data you're accessing is global state and may be changed by any piece of Lisp code that performs a search. You should therefore read it immediately after the search, within the same evaluation. The following code does that and produces the expected result:

    (let ((string "The quick fox jumped quickly.")
          (regexp "\\(qu\\)\\(ick\\)"))
      (when (string-match regexp string) ;=> See Note.
        (match-string 0 string))) ;=> "quick"

If you evaluate the forms one by one, the match data will almost certainly be overwritten in between, for example by Emacs's syntax-highlighting code.

*Note: the match data is stateful and can persist across consecutive searches, even when the next string-match returns nil. That is why the code checks the result of string-match before calling match-string.
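
As a minimal sketch of the same idea, the search and the reads of the match data can be kept together in one function. The name `my-match-groups` is hypothetical and not part of Emacs; the sketch assumes the regexp has at most two groups, as in the manual's example. Wrapping the body in `save-match-data` is optional and only keeps the helper from disturbing the caller's match data.

```elisp
;; `my-match-groups' is a hypothetical helper name, not an Emacs built-in.
(defun my-match-groups (regexp string)
  "Return a list of the whole match and groups 1 and 2 of REGEXP in STRING.
Return nil when REGEXP does not match STRING."
  (save-match-data
    ;; Read the match data only if this particular search succeeded,
    ;; so data left over from an earlier search is never used.
    (when (string-match regexp string)
      (list (match-string 0 string)
            (match-string 1 string)
            (match-string 2 string)))))

(my-match-groups "\\(qu\\)\\(ick\\)" "The quick fox jumped quickly.")
;; => ("quick" "qu" "ick")

(my-match-groups "\\(qu\\)\\(ack\\)" "The quick fox jumped quickly.")
;; => nil
```

Because the search and every `match-string` call run inside a single function call, nothing else gets a chance to run a search in between.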

wasamasa
  • FWIW, I think this says the same thing I said: you must immediately follow the `string-match` with the `match-data`. – Drew Apr 08 '18 at 14:07
  • I'm sorry. I interpreted `you must immediately follow` as meaning that I can't have any `(string-match ...)` expressions, or anything else that changes the `match-data`, between `string-match` and `match-string` (i.e. the second line must directly follow the first). I did not realize you meant it must be executed immediately afterwards. – Aquaactress Apr 08 '18 at 17:49
  • Also, it really helped me that wasamasa's answer made clear how volatile the match data is, as I had the impression it could only be changed when the user invokes `re-search-forward` or a similar function. I had no idea Emacs itself could change it in the background. – Aquaactress Apr 08 '18 at 17:58

Actually, you are not passing the same (eq) string to match-string that you passed to string-match. You are passing a different string that has exactly the same characters. Try this instead:

    (setq foo "The quick fox jumped quickly.")

    (string-match "\\(qu\\)\\(ick\\)" foo)

    (match-string 0 foo)

That said, it does work with two different strings that have the same characters. Something you did between the two evaluations must have changed the match data.

Try just this - nothing else:

    (string-match "\\(qu\\)\\(ick\\)" "The quick fox jumped quickly.")

    (match-string 0 "The quick fox jumped quickly.")

If you still see a problem, provide a complete recipe to reproduce it, starting from emacs -Q (no init file).
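
To rule out interference between separate evaluations, the two forms can be wrapped in a single `progn`, as suggested in the comments on the question, so that one evaluation runs both. The sketch below also shows one way to save the match data right after the search and restore it later with `set-match-data`. The names `my-sentence` and `my-saved-match-demo` are hypothetical and exist only for illustration.

```elisp
;; Hypothetical variable holding the manual's example string.
(defvar my-sentence "The quick fox jumped quickly.")

;; Evaluate this whole form once, with point after the final paren.
(progn
  (string-match "\\(qu\\)\\(ick\\)" my-sentence)
  (match-string 0 my-sentence))
;; => "quick"

;; Hypothetical demo: capture the match data immediately after the
;; search, let another search overwrite it, then restore it.
(defun my-saved-match-demo ()
  "Return the text of the first match after restoring saved match data."
  (let ((saved (progn
                 (string-match "\\(qu\\)\\(ick\\)" my-sentence)
                 (match-data))))
    (string-match "fox" my-sentence)  ; overwrites the match data
    (set-match-data saved)            ; put the saved positions back
    (match-string 0 my-sentence)))

(my-saved-match-demo)
;; => "quick"
```

Note that `set-match-data` followed by a separately evaluated `(match-data)` runs into the same problem as in the question: the restore and the read have to happen within one evaluation.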

Drew
  • I tried assigning the string to `foo`. And tried again just using `string-match` and then `match-string` with no string in between. Surprisingly, I got the same error even after starting from `emacs -Q`. After I started emacs with `emacs -Q` I pasted the `setq foo` example and called `eval-print-last-sexp` on each line. The result was `(args-out-of-range "The quick fox jumped quickly." 266 267)` – Aquaactress Apr 08 '18 at 02:53
  • My version of emacs is 25.3.1. – Aquaactress Apr 08 '18 at 02:56
  • After trying it a few more times with `emacs -Q`, I could not reproduce the error. Even so, I'll keep this question open for a while longer to see if I can figure out why I did get the error. I'm certain that I got it at least twice after using `emacs -Q`. Additionally, I still don't understand why I would get this error even without `emacs -Q` if I executed the "quick fox" example and nothing else. – Aquaactress Apr 08 '18 at 03:27
  • If you can provide a recipe to reproduce the problem starting with `emacs -Q` then please consider reporting that bug: `M-x report-emacs-bug`. That command will add information about your session, including your Emacs version etc., to be included as part of the bug-report. It's entirely possible that some change in Emacs introduced a bug, especially if you are using a recent development snapshot. This problem would no doubt be a regression. Thx. – Drew Apr 08 '18 at 04:46
  • Ok. I think one thing that's happening is that `match-data` is not reset by the last search. If I use `string-match` on some text before the "quick fox" example, and then execute the "quick fox" example, the match data is usually out of bounds. Surprisingly, this persists across Emacs sessions, which is why I get the same error when trying the "quick fox" example in `emacs -Q`. – Aquaactress Apr 08 '18 at 10:51
  • I also updated the question to reflect my findings. – Aquaactress Apr 08 '18 at 11:18