I am working on AIX Unix and trying to remove non-printable characters from a file. When I view the file in Notepad++ using UTF-8 encoding, the data looks like this: Caucasian male lives in Arizona w/ fiancÃÂÃÂÃÂÃÂÃÂ
When I view the file in Unix, I get ^▒▒^▒▒^▒▒^▒▒^▒▒^▒▒ instead of the special characters.
I want to replace all of those special characters with a space.
I tried sed 's/[^[:print:]]/ /g' file
but it does not remove those characters. My available locales, as listed by locale -a, are:
C
POSIX
en_US.8859-15
en_US.ISO8859-1
en_US
I even tried sed -e 's/[^ -~]/ /g' file
and it did not remove the characters either.
I see that other Stack Overflow answers used a UTF-8 locale with GNU sed, and that worked for them, but I do not have that locale.
Also, I am using ksh.
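For illustration, one approach that does not need a UTF-8 locale is to force the C locale, so that the tool works on single bytes, and let tr replace everything outside printable ASCII. This is only a sketch under assumptions: the function name blank_nonprintable is hypothetical, it assumes the tr in use accepts the POSIX `[ *]` repeat form, and it treats every byte other than printable ASCII and newline (including tabs) as something to blank out.

```shell
# Hypothetical helper: replace every byte that is not printable ASCII
# or a newline with a space.
# LC_ALL=C makes tr classify input byte by byte, so bytes such as 0xc3
# are no longer considered printable the way they are in an ISO8859 locale.
# '[ *]' pads the replacement set with spaces to the length of the first set.
blank_nonprintable() {
    LC_ALL=C tr -c '[:print:]\n' '[ *]'
}

# Usage (assumed file names):
#   blank_nonprintable < file > file.clean
```

Each offending byte becomes one space, so a two-byte UTF-8 sequence turns into two spaces. The same LC_ALL=C prefix may also be worth trying with the sed commands above, since the bracket expressions are then evaluated against bytes rather than against the ISO8859 character set.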
Ã and ▒ look pretty printable to me. A UTF-8 Ã is encoded as 0xc3 0x83. 0xc3 in iso8859-1 or 15 is also Ã as it happens, which is printable; 0x83 would be a control character in both though. – Stéphane Chazelas Sep 25 '18 at 19:53
echo "fiancÃÂÃÂÃÂÃÂÃÂ" | od -tx1, or, maybe if your sed supports it: echo "fiancÃÂÃÂÃÂÃÂÃÂ" | sed -n l. – Sep 25 '18 at 21:08
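The byte-level point in the comments can be checked directly. The sketch below emits the two bytes that encode Ã in UTF-8 and dumps them with od; the helper name hex_bytes is hypothetical, and the bytes are written as octal escapes so the example does not depend on the terminal's encoding.

```shell
# Hypothetical helper: print stdin as one compact lowercase hex string.
# od -An suppresses the offset column, -tx1 prints one byte per field;
# tr then strips the separating spaces and od's trailing newline.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# \303\203 is 0xc3 0x83 in octal, the UTF-8 encoding of "Ã".
printf '\303\203' | hex_bytes    # prints c383
```

Running the file itself through the same helper shows which bytes are actually present, which is what decides whether a given bracket expression can match them in the current locale.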