The answer is: in Unix-like systems, file names are composed of bytes, not characters, at least from the perspective of the kernel and its APIs.
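As a small illustration of the "bytes, not characters" point, here is a sketch assuming CPython on a POSIX system, where the filesystem error handler is surrogateescape. The helper name roundtrip is made up for this example. It shows that a byte string which is not valid UTF-8 can still be carried through Python's str-based path handling and recovered unchanged:

```python
import os


def roundtrip(name: bytes) -> bytes:
    """Decode a raw file name to str and encode it back (hypothetical helper).

    Assumes a POSIX platform, where os.fsdecode() maps undecodable bytes
    to lone surrogates and os.fsencode() turns them back into the same bytes.
    """
    as_text = os.fsdecode(name)
    return os.fsencode(as_text)


if __name__ == "__main__":
    raw = b"report-\xff\xfe.txt"  # not valid UTF-8, still a legal name for the kernel
    print(roundtrip(raw) == raw)
```

The kernel never sees the intermediate str; that layer exists only for the convenience of the program.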
A Unix-like kernel is normally neutral about every byte value except \000 (ASCII: NUL) and \057 (ASCII: slash). In Linux, there are no other restrictions at the filesystem layer, but certain filesystem drivers and their mount modes reject some names, usually because the name cannot be translated. For example, one can’t create a filename containing invalid UTF-8 on anything mounted with -o iocharset=utf8 (e.g. types cifs or vfat). None of the DOS/Windows-compatible filesystems will let you make \134 (ASCII: backslash) part of a name, and the msdos type additionally applies the DOS restrictions on 8.3 names.
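Because these rejections depend on the driver and the mount options, the simplest way to find out what a particular mount accepts is to try it. The following is only a sketch: probe_name is a hypothetical helper, and the errno a given driver returns for an untranslatable name is not assumed here.

```python
import os
import tempfile


def probe_name(directory: bytes, name: bytes):
    """Try to create `name` in `directory`.

    Returns None if the filesystem accepted the name, otherwise the errno
    reported by the kernel. The probe file is removed again on success.
    """
    path = os.path.join(directory, name)
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    os.unlink(path)
    return None


if __name__ == "__main__":
    # The temporary directory stands in for the mount point being tested.
    with tempfile.TemporaryDirectory() as tmp:
        outcome = probe_name(os.fsencode(tmp), b"caf\xe9")  # Latin-1 bytes, invalid UTF-8
        print("accepted" if outcome is None else os.strerror(outcome))
```

To test a specific mount, pass a directory on that mount instead of the temporary one.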
Ext3/ext4 is not known to have any restrictions beyond the aforementioned \000 and \057.
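The kernel-level rule can be written down in a few lines. This is a minimal sketch, not a full validator: kernel_accepts is a made-up name, it checks a single path component, and it deliberately ignores length limits and any filesystem-specific rules such as the ones described above. Treating the empty string as invalid is an assumption added here.

```python
# The two byte values the kernel itself reserves: NUL and slash.
FORBIDDEN = b"\000\057"


def kernel_accepts(component: bytes) -> bool:
    """Return True if `component` contains neither NUL nor slash.

    Assumption: an empty component is not a usable name, so it is rejected.
    """
    if not component:
        return False
    return all(byte not in FORBIDDEN for byte in component)


if __name__ == "__main__":
    print(kernel_accepts(b"back\\slash\xff"))  # backslash and invalid UTF-8 pass this check
    print(kernel_accepts(b"a/b"))              # slash is the path separator
```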