To match a word, one can use \v(\w+). From the Vim help (:h \w):
\w word character: [0-9A-Za-z_]
This works exactly as described in the manual. However, I want to match words that contain characters beyond a-z, e.g. prästgården. Matching the regular expression \v(\w+) against prästgården yields three matches instead of one:
prästgården
^^ ^^^ ^^^^
How can I match words containing characters beyond a-z? My locale is set to English and, if possible, I'd like to keep it that way.
Edit: The words might not belong to a single locale, e.g.
prästgården
treść
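For comparison, the same behaviour can be reproduced outside Vim. The following is only a rough sketch in Python (not Vim script), and it assumes Python 3's re module, where \w is Unicode-aware by default and the re.ASCII flag restricts it to [0-9A-Za-z_]. The helper name caret_line is made up for this illustration.

```python
import re


def caret_line(pattern, text, flags=0):
    """Return a line with '^' under every character covered by a match.

    Hypothetical helper for illustration; it is not part of any library.
    """
    marks = [" "] * len(text)
    for match in re.finditer(pattern, text, flags):
        for i in range(match.start(), match.end()):
            marks[i] = "^"
    return "".join(marks).rstrip()


# Escapes are used so the letters are guaranteed to be precomposed:
# \u00e4 is "ä" and \u00e5 is "å".
word = "pr\u00e4stg\u00e5rden"

# ASCII-only \w behaves like Vim's documented [0-9A-Za-z_].
ascii_marks = caret_line(r"\w+", word, re.ASCII)

# Unicode-aware \w (the Python 3 default for str patterns).
unicode_marks = caret_line(r"\w+", word)

print(word)
print(ascii_marks)
print(unicode_marks)
```

With the ASCII-only class the word splits at ä and å, which is the three-match result shown above; with the Unicode-aware class the carets should cover the whole word, independent of the locale setting.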
POSIX character classes ([[:alpha:]]\+ in this case) are supposed to do what you want here, but according to the Vim docs (:help regex) they don't: "These items only work for 8-bit characters." It does happen to work here with Vim 7.3 on OS X 10.8, but not with Vim 7.3 on Linux, so I assume there's something Apple-specific about this Vim that allows it. You'll find that doing it through the Vim Perl binding also fails, even though Perl has very good Unicode support. You might need to switch to an external Perl script, so you can turn on full Unicode support. – Warren Young Jan 07 '13 at 02:32
[...] \p{Word} instead of a POSIX character class. There are a lot of exception cases in Perl's POSIX character class handling, which you avoid when you use Unicode properties instead. – Warren Young Jan 07 '13 at 02:34
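The comments above suggest an external Perl script. As a loosely equivalent sketch, here is the same idea in Python instead of Perl (a swapped-in language, not what the commenter wrote). It assumes Python 3, where \w on str patterns is Unicode-aware, and it adds an NFC normalization step, which is my own assumption about what the input might need rather than something from the comments. The name extract_words is hypothetical.

```python
import re
import unicodedata

# Unicode-aware by default for str patterns in Python 3.
WORD = re.compile(r"\w+")


def extract_words(text):
    """Return all runs of word characters in text as a list of strings.

    NFC normalization composes a base letter followed by a combining
    mark (e.g. "a" + U+0308) into one precomposed letter ("ä"), so the
    combining mark does not split the word. This step is an assumption
    about the input, added for robustness.
    """
    return WORD.findall(unicodedata.normalize("NFC", text))


# Words from two different locales, written with escapes:
# prästgården (Swedish) and treść (Polish).
sample = "pr\u00e4stg\u00e5rden tre\u015b\u0107"
words = extract_words(sample)
for w in words:
    print(w)
```

To use something like this as a filter, the sample string could be replaced by text read from standard input (for example via sys.stdin.read()), and the buffer piped through the script from Vim.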