I'm working with strings which may have any number of prefix and suffix spaces, tabs, newlines, etc. Currently I have this:
(replace-regexp-in-string
"^[^[:alnum:]]*\\(.*\\)[^[:alnum:]]*$"
"\\1" my-string)
What's the idiomatic (or best) way to trim surrounding whitespace from a string?
The built-in library subr-x.el has included the inline functions string-trim-left, string-trim-right, and string-trim since Emacs 24.4:
(eval-when-compile (require 'subr-x))
(string-trim "\n\r\s\tfoo\n\r\s\t") ; => "foo"
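The one-sided variants work the same way. A minimal sketch, assuming Emacs 24.4 or later and a made-up input string; a plain `require` is used here so the snippet also runs when evaluated interactively:

```elisp
(require 'subr-x) ; harmless on Emacs 28+, where these are already loaded

;; Trim only the leading whitespace.
(string-trim-left "\n\t foo \n")   ; => "foo \n"

;; Trim only the trailing whitespace.
(string-trim-right "\n\t foo \n")  ; => "\n\t foo"
```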
Since Emacs 26.1 these inline functions also accept optional regexp arguments:
(eval-when-compile (require 'subr-x))
(string-trim "aabbcc" "a+" "c+") ; => "bb"
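The regexp arguments are anchored to the string's ends by the functions themselves, so matches in the middle of the string are left alone, and passing nil for one side keeps the default whitespace pattern. A short sketch with made-up inputs, assuming Emacs 26.1 or later:

```elisp
(require 'subr-x)

;; Only the leading and trailing runs of "-" are removed; the inner "-" stays.
(string-trim "--a-b--" "-+" "-+")  ; => "a-b"

;; nil for TRIM-LEFT falls back to the default whitespace regexp.
(string-trim "  foo--" nil "-+")   ; => "foo"

;; The one-sided functions accept a regexp as well.
(string-trim-left "xxfoo" "x+")    ; => "foo"
```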
Since Emacs 28.1 these functions are preloaded (no need to load subr-x), and they are no longer inline.
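For code that has to run on both older and newer Emacs versions, one possible approach is to load subr-x only when string-trim is not yet defined. This is a sketch rather than an established convention; simply requiring subr-x unconditionally also works, since the library still exists in Emacs 28 and later:

```elisp
;; Load subr-x only where string-trim is not already available
;; (Emacs 24.4 through 27); on Emacs 28+ this does nothing.
(unless (fboundp 'string-trim)
  (require 'subr-x))

(string-trim "  foo  ") ; => "foo"
```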
The string manipulation library s.el implements trimming of whitespace and newlines at the beginning and end of a string as the function s-trim. I cite that function here with its dependencies:
(defun s-trim-left (s)
  "Remove whitespace at the beginning of S."
  (declare (pure t) (side-effect-free t))
  (save-match-data
    (if (string-match "\\`[ \t\n\r]+" s)
        (replace-match "" t t s)
      s)))
(defun s-trim-right (s)
  "Remove whitespace at the end of S."
  ;; `declare' must directly follow the docstring to take effect.
  (declare (pure t) (side-effect-free t))
  (save-match-data
    (if (string-match "[ \t\n\r]+\\'" s)
        (replace-match "" t t s)
      s)))
(defun s-trim (s)
  "Remove whitespace at the beginning and end of S."
  (declare (pure t) (side-effect-free t))
  (s-trim-left (s-trim-right s)))
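The cited functions anchor their patterns with \` and \' rather than ^ and $. A small sketch of why that matters, using only the built-in string-match on a made-up two-line string:

```elisp
;; "^" matches at the start of any line, so it finds "b" after the newline.
(string-match "^b" "a\nb")    ; => 2

;; "\\`" matches only at the very start of the string.
(string-match "\\`b" "a\nb")  ; => nil

;; "$" matches at the end of any line, here right after "a".
(string-match "a$" "a\nb")    ; => 0

;; "\\'" matches only at the very end of the string.
(string-match "a\\'" "a\nb")  ; => nil
```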
Some differences from your first attempt
(replace-regexp-in-string
"^[^[:alnum:]]*\\(.*\\)[^[:alnum:]]*$"
"\\1" my-string)
are noteworthy:
- ^ as the first character does not match the beginning of the string but the beginning of a line in the string. Similarly, $ matches not the end of the string but the end of a line. Use \` for the beginning of the string and \' for the end.
- \\(.*\\) matches the actual string to be returned. It may be long, and you force replace-regexp-in-string to scan it.
- [:alnum:] does not include characters of syntax class symbol (or punctuation), so [^[:alnum:]] matches them and your function would also trim those characters away, not just whitespace.

string-trim has been moved to subr.el from subr-x.el as of this commit in March 2021.
Note: Do not have enough rep to put this as a comment under @basil's answer.