How it's made
What you call "literal characters" are implemented as ordinary Unicode characters. Let's look at how this works for Tabulation and New line. Check the hex encoding of Tabulation:
printf $'\t' | hexdump
The output is
0000000 0009
0000001
The output means that the \t character is an ordinary UTF-8 character, U+0009. You can print it this way:
printf '\x00\x09'
or with echo:
echo -e '\u0009'
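A byte-oriented dump is a useful cross-check here, because hexdump groups its input into 16-bit words by default and pads a lone byte. The sketch below assumes a POSIX shell with od and xargs available; hex_bytes is a hypothetical helper name, not part of any tool.

```shell
# hex_bytes (hypothetical helper): print stdin as space-separated hex bytes.
hex_bytes() {
    od -An -v -tx1 | xargs
}

# A tab is one byte, 0x09.
printf '\t' | hex_bytes

# A newline is one byte, 0x0a.
printf '\n' | hex_bytes

# Octal spelling of the two-byte sequence \x00\x09: a NUL byte, then a tab.
printf '\000\011' | hex_bytes
```

Viewed this way, the tab is a single byte, while the last command emits two bytes.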
Consider the following example for the New line character:
bob@alice:~$ printf $'\n' | hexdump
0000000 000a
0000001
bob@alice:~$ printf '\x00\x0A empty lines are above and below'; echo $'\n'

 empty lines are above and below

bob@alice:~$ echo -e '\u000a empty line is above'

 empty line is above
bob@alice:~$
How to input Unicode characters
There is a so-called ComposeKey (or MultiKey) in Linux. The key can be defined in the xorg.conf.d/10-keyboard.conf file; just add this line to the file:
Option "xkbOptions" "grp:alt_shift_toggle,terminate:ctrl_alt_bksp,compose:menu"
UTF-8 (Unicode) compose sequence hints can be found in the Compose file:
less /usr/share/X11/locale/en_US.UTF-8/Compose
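To find a particular sequence without paging through the whole file, the Compose file can be searched by character name. This is a rough sketch: compose_lookup is a hypothetical helper, and its default path is simply the one used above, which may differ on other systems or locales.

```shell
# compose_lookup (hypothetical helper): case-insensitive search of a Compose file.
# $1 = text to look for, $2 = Compose file (defaults to the path used above).
compose_lookup() {
    file="${2:-/usr/share/X11/locale/en_US.UTF-8/Compose}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    grep -i -- "$1" "$file"
}

# Example: list compose sequences whose comment mentions "eighth note".
compose_lookup 'eighth note' || true
```

Each matching line shows the key sequence on the left and the resulting character on the right.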
In GUI terminals, the CTRL+SHIFT+U keybinding also works: press it and you'll see the letter u. Input 266a and complete it with the Space or Enter key, and the Eighth Note sign appears.
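The same character can also be produced non-interactively from the shell. A minimal sketch, assuming a terminal that displays UTF-8: U+266A encodes to the three UTF-8 bytes e2 99 aa, written here as octal escapes so that it works with any POSIX printf.

```shell
# U+266A (EIGHTH NOTE) as its three UTF-8 bytes e2 99 aa, in octal escapes.
printf '\342\231\252\n'

# Verify the bytes actually emitted (od prints one hex byte per column).
printf '\342\231\252' | od -An -tx1

# In bash 4.2 or later with a UTF-8 locale, the shorter form
#   printf '\u266a\n'
# is expected to print the same character.
```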
Additional information
- ANSI-C Quoting
- Ubuntu - ComposeKey
- Wikipedia - Compose key
- How to set a Compose Key in Ubuntu 18.04
hexdump dumps 16-bit words by default. Use od -vtx1 instead. It would only be encoded as 0009 in UTF-16BE, an encoding that is not Unix-compatible. – Stéphane Chazelas Jul 25 '18 at 16:26
U+0009 for Unicode tabulation control code. I intentionally didn't use a "byte language". Take into account that using \x00 isn't a mistake either, since according to section 23.1 of the Unicode Standard v.11, usage of U+0000 is outside the scope of the Unicode Standard, which does not require any particular usage of null (page 858). Read: http://www.unicode.org/versions/Unicode11.0.0/UnicodeStandard-11.0.pdf – Bob Jul 25 '18 at 18:23