For me, tr works fine for both ASCII and UTF-8 files, as long as your OS is configured to use a UTF-8 locale.
Here is my sample #1 (Solaris 11):
$ locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_ALL=
As you can see, the OS is configured to work with UTF-8.
I created both files in UTF-8 encoding:
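If you want to check this from a script rather than by reading the `locale` output, `locale charmap` reports the character encoding of the current locale. The helper name `is_utf8_charmap` below is made up for illustration, and the message texts are only a sketch:

```shell
# Hypothetical helper: succeeds if the given charmap name denotes UTF-8.
is_utf8_charmap() {
    case $1 in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# `locale charmap` prints the encoding of the current locale.
if is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
    echo "UTF-8 locale: tr may handle multi-byte characters (depends on the implementation)"
else
    echo "non-UTF-8 locale: tr will work byte by byte"
fi
```

Even in a UTF-8 locale, whether tr actually handles multi-byte characters depends on the tr implementation.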
$ cat file
Bob’s Bob′s Bob's
$ cat apos
’′'
Then I got the expected result, replacing all the apostrophes like this:
$ cat file | tr "$(cat apos)" "xxx"
Bobxs Bobxs Bobxs
Here is my sample #2 (Solaris 10):
$ locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
Here you can see that this OS is configured for plain ASCII (the C locale), not UTF-8, so you can expect trouble when using tr on UTF-8 files with multi-byte characters. But there is a workaround. Since tr accepts the octal representation of a byte, you can replace each byte of the character in question using its octal code.
In your case you have:
char  hex       octal
’     E2 80 99  \342\200\231
′     E2 80 B2  \342\200\262
'     27        \47
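The octal column can be derived with `od`. Here is a minimal sketch, assuming a POSIX `od`; the function name `octal_bytes` is made up for illustration. Note that `od` prints three digits per byte, so the ASCII apostrophe comes out as `\047` rather than `\47` (both mean the same byte to tr).

```shell
# Hypothetical helper: print the bytes of its argument as tr-style octal escapes.
octal_bytes() {
    printf '%s' "$1" |
        od -An -v -to1 |                 # one 3-digit octal number per byte
        tr -s ' \n' '  ' |               # squeeze spaces and newlines into single spaces
        sed 's/^ //; s/ $//; s/ /\\/g; s/^/\\/'   # "342 200 231" -> "\342\200\231"
}

# The right single quotation mark, given here by its bytes so the
# example does not depend on the encoding of the script itself.
octal_bytes "$(printf '\342\200\231')"   # prints \342\200\231
```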
The first and second apostrophes are each represented by three bytes. The third one is standard ASCII (one byte).
So if you want to replace the first apostrophe you can use:
$ cat file | tr "\342\200\231" "\0\0x"
Bobxs Bob▒s Bob's
Second:
$ cat file | tr "\342\200\262" "\0\0x"
Bob▒s Bobxs Bob's
Third:
$ cat file | tr "\47" "x"
Bob’s Bob′s Bobxs
To replace all in one shot you may use:
$ cat file | tr "\342\200\231\262\47" "\0\0xxx"
Bobxs Bobxs Bobxs
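One side effect of mapping the lead bytes to `\0` is that the output contains NUL bytes, which a terminal usually does not display. A variation that deletes the lead bytes instead could look like this; it is only a sketch of the same byte-wise idea, and the function name `strip_apostrophes` is made up for illustration:

```shell
# Hypothetical variation: delete the shared lead bytes \342 and \200,
# then translate the remaining distinguishing bytes to "x".
strip_apostrophes() {
    LC_ALL=C tr -d '\342\200' | LC_ALL=C tr '\231\262\047' 'xxx'
}

# Same sample line as in "file", written as bytes.
printf 'Bob\342\200\231s Bob\342\200\262s Bob\047s\n' | strip_apostrophes
```

Like the command above, this touches every occurrence of those bytes, wherever they appear in the input.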
Of course this byte-wise approach is not perfect, because tr replaces every occurrence of the bytes \342, \200, \231 and \262 in the file, so any other multi-byte characters that contain these bytes will be broken. But if your file does not contain any other multi-byte characters, it will work.
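If other multi-byte characters may be present, sed can match each apostrophe as a whole byte sequence instead of as individual bytes. This is a minimal sketch, assuming your sed passes bytes above 127 through unchanged in the C locale; the function name `replace_apostrophes` is made up for illustration:

```shell
# Hypothetical helper: replace the three apostrophes with "x",
# matching complete byte sequences rather than single bytes.
replace_apostrophes() {
    a1=$(printf '\342\200\231')   # right single quotation mark
    a2=$(printf '\342\200\262')   # prime
    LC_ALL=C sed -e "s/$a1/x/g" -e "s/$a2/x/g" -e "s/'/x/g"
}

# The euro sign (bytes \342\202\254) shares its first byte with both
# apostrophes but is left untouched.
printf 'Bob\342\200\231s Bob\342\200\262s Bob\047s \342\202\254\n' | replace_apostrophes
```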
sed -e "s/’/X/" works. – Nicolas Raoul Sep 19 '12 at 03:07
tr (when not patched to fix it, and a few other implementations) doesn't support multibyte characters (like UTF-8, GB18030, BIG5...) – Stéphane Chazelas May 09 '17 at 19:31