The Unicode circled digits (U+2460 .. U+2468) cannot be converted to or from any of the Japanese encodings (EUC-JP, Shift-JIS, ISO-2022-JP) with iconv, even though the characters exist in those encodings and I run across them all the time.

% echo ①②③③④⑤⑥⑦⑧⑨ | iconv -f utf-8 -t euc-jp
iconv: (stdin):1:0: cannot convert
% echo ①②③③④⑤⑥⑦⑧⑨ | iconv -f utf-8 -t shift-jis
iconv: (stdin):1:0: cannot convert
% echo ①②③③④⑤⑥⑦⑧⑨ | iconv -f utf-8 -t iso-2022-jp
iconv: (stdin):1:0: cannot convert

% printf "\xad\xa1\xad\xa2\xad\xa3\xad\xa3 \xad\xa4\xad\xa5\xad\xa6\xad\xa7\xad\xa8\xad\xa9" | iconv -f euc-jp -t utf-8 
iconv: (stdin):1:0: cannot convert
% printf "\x87\x40\x87\x41\x87\x42\x87\x42 \x87\x43\x87\x44\x87\x45\x87\x46\x87\x47\x87\x48" | iconv -f shift-jis -t utf-8 
iconv: (stdin):1:0: cannot convert

What gives?

oals

1 Answer

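Those characters do not exist in those three encodings as iconv defines them. The circled digits began as a vendor extension (NEC row 13) and were only standardized later in JIS X 0213, so you want EUC-JIS-2004 (aka EUC-JISX0213) instead of plain EUC-JP, SHIFT_JIS-2004 or CP932 instead of SHIFT_JIS, and ISO-2022-JP-2004 instead of plain ISO-2022-JP.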

% printf "\xad\xa1\xad\xa2\xad\xa3 \xad\xa4\xad\xa5\xad\xa6\xad\xa7\xad\xa8\xad\xa9" | iconv -f euc-jisx0213 -t utf-8 
①②③ ④⑤⑥⑦⑧⑨
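
If you want to check the same thing without depending on which iconv build is installed, here is a minimal sketch using Python's bundled codecs. The codec names below are Python's spellings, not iconv's, and I am assuming Python's tables draw the strict/extended line in the same place iconv does here. The helper `can_round_trip` is just a name made up for this example.

```python
# Circled digits U+2460..U+2468, i.e. the characters from the question.
CIRCLED = "".join(chr(cp) for cp in range(0x2460, 0x2469))

# Python codec names (assumed to correspond to the iconv names in the answer).
STRICT = ["euc_jp", "shift_jis", "iso2022_jp"]
EXTENDED = ["euc_jisx0213", "shift_jisx0213", "cp932", "iso2022_jp_2004"]


def can_round_trip(text, codec):
    """Return True if text survives encode + decode with the given codec."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeError:
        # Raised when the codec has no mapping for one of the characters.
        return False


results = {codec: can_round_trip(CIRCLED, codec) for codec in STRICT + EXTENDED}

for codec, ok in results.items():
    print(f"{codec:16} {'ok' if ok else 'cannot convert'}")
```

The strict codecs should report "cannot convert" and the extended ones "ok". The first two bytes from `euc_jisx0213` are `\xad\xa1` and from `cp932` are `\x87\x40`, matching the byte sequences used with printf in the question.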
oals