I'm setting up the keyboard layouts in Xubuntu 20.04.

I generally want to be able to input various Unicode characters, both in the pure console and in X, using the same keystrokes. I know that the console has limited support for Unicode (512 glyphs), so I'm OK with the console only working within this limited range.

I'm keeping "Use system defaults" ON in xfce4-keyboard-settings, understanding this option as "Use the /etc/default/keyboard setup" and hoping to have a single config for both pure console and X.

I've run sudo dpkg-reconfigure keyboard-configuration and I've chosen the "US intl., with AltGr dead keys" layout variant.

I've discovered that in the pure console, bash messes up some AltGr sequences by going into "(arg: x)" mode. Searching suggests this is related to some readline peculiarities. Considering this off-topic for now, I've ruled this behavior out by starting the simplest sh on top of bash.

At this point, I gladly have the keyboard layout working seemingly the same way under console and under X: AltGr+2 types in ², AltGr+` then e types in è.

For the Compose key, I've chosen Caps Lock.

Under X, the behavior is as expected: Compose o o types in the degree sign, Compose e = types in the euro sign, etc.

But in the console, almost no Compose sequences seem to work. No degree, no euro (although both can be displayed perfectly well and even typed through AltGr). The Compose key is not completely useless, as it still works with the most basic combinations like Compose ` e making up è. But this is far more limited than even 512 glyphs.

So why is the console Compose behavior so different and can it be tuned to support at least the seemingly simple things like degree and euro?

Anton K

1 Answer

The console behavior is different because it's handled by completely different code. /etc/default/keyboard contains settings for XKB, which is the part of X that handles keyboard input. The console-setup package translates these settings into what the console is capable of doing, thanks to the programs setupcon (which reads and parses /etc/default/keyboard) and ckbcomp (which translates XKB settings into console settings). These tools are limited to what the console can do.

The Linux console, which is implemented inside the kernel, only has very partial support for multibyte character sets. Regarding the compose key in particular, there is a hard-coded limitation in the kernel: the compose table (accent_table) has a hard-coded size of MAX_DIACR, which is 256 unless you recompile your kernel¹. This may explain why the console-setup package doesn't have a Unicode compose table: with only 256 entries, you aren't going to cover many characters whatever you do. As far as I can tell, when you use console-setup with the Unicode character set, you end up with the kernel's built-in compose table, which only lists the Latin-1 non-ASCII accented letters (and not the punctuation characters such as °).

You can define your own compose table. Pick up to 256 combinations of two characters to combine to a third, and list them in a console compose map file.

Compose 'o' 'o' to degree
Compose 0x6f 0x006f to 0xb0
Compose U+006F U+006f to U+00B0

The three lines above illustrate different ways to say that Compose o o inserts °. Of course you only need one. The ways to specify a character are²:

  • A number in decimal, in octal with a leading 0, or in hexadecimal with a leading 0x.
  • A single-byte character inside single quotes ('). There can be a backslash before this character, and the backslash is mandatory for backslash and single quote: '\o' is equivalent to 'o', but backslash and single quote can only be written '\\' and '\'' respectively (or using another syntax).
  • '\ooo' where ooo are exactly 3 octal digits specifying a value of at most 255 (\377).
  • U+hhhh where hhhh are exactly 4 hexadecimal digits.
  • On the right-hand side of to only, a symbolic name for a character. The symbolic name is a subset of the X11 keysym names. See the source (syms.*.h) for the list of symbolic names.
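
As a minimal sketch of the syntax rules above, the following script writes a three-entry compose map and checks the entry count against the 256-entry limit. The file location and the choice of sequences are illustrative assumptions, not something console-setup provides. The results are given in U+hhhh form so that nothing depends on which symbolic names your kbd version knows, and whether a code point outside Latin-1 such as the euro sign actually composes may depend on the console being in Unicode mode.

```shell
#!/bin/sh
# Hypothetical scratch location for the example compose map.
map="${TMPDIR:-/tmp}/compose-example.kmap"

# Quoted 'EOF' keeps the shell from interpreting the backtick and backslashes.
cat > "$map" <<'EOF'
# Compose o o -> degree sign
Compose 'o' 'o' to U+00B0
# Compose e = -> euro sign
Compose 'e' '=' to U+20AC
# Compose ` e -> e with grave accent
Compose '`' 'e' to U+00E8
EOF

# Count the entries: the kernel table holds at most 256 (MAX_DIACR).
entries=$(grep -c '^Compose ' "$map")
if [ "$entries" -gt 256 ]; then
    echo "too many compose entries: $entries" >&2
fi
echo "$entries compose entries written to $map"
```

Because loading a map replaces the whole table, a real file would also need to repeat any built-in sequences you still want, such as the accented letters.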

To load your own compose map, run

loadkeys /path/to/my/compose.kmap

This replaces the currently loaded compose table.
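
To check that the table was really replaced, one option is to print the kernel's compose table straight after loading. This is only a sketch: load_compose_map is a hypothetical helper name, and it assumes that loadkeys and dumpkeys from the kbd package are installed and that you run it as root on a virtual console.

```shell
#!/bin/sh
# Hypothetical helper: load a compose map, then print the table the kernel holds.
load_compose_map() {
    # $1: path to the compose map file
    if [ ! -r "$1" ]; then
        echo "cannot read compose map: $1" >&2
        return 1
    fi
    # loadkeys changes kernel tables, so it normally needs root privileges.
    loadkeys "$1" || return 1
    # Print only the compose combinations currently loaded.
    dumpkeys --compose-only
}
```

After defining the function, call it with the path to your file, for example load_compose_map /path/to/my/compose.kmap, and look for your own sequences in the output.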

I can't find a way to tell console-setup to load a custom compose map. ckbcomp loads /etc/console-setup/compose.${charmap}.inc for non-Unicode encodings, but for Unicode it skips that step.

¹ The number of elements of the kbdiacruc array in struct kbdiacrsuc is the maximum number of items that userland can set. It probably needs to be set to the same value.
² Source: the source code (kbd package, files src/libkeymap/analyze.l and src/libkeymap/parser.y). I couldn't find these details in the documentation.

  • This answer helped a lot, thanks. I've got another question related to the topic, would appreciate if somebody could link it here: https://unix.stackexchange.com/questions/715720/x-follows-keyboard-layout-toggle-but-console-does-not-crazily. And sure I'd appreciate if you could take a look, too. – Anton K Sep 01 '22 at 21:27
  • @AntonK It's the same principle: XKB settings don't apply to the console. There's software (I think it's console-setup, but I'm not sure) that does a partial translation of /etc/default/keyboard to something the console can do, but the console is a lot less capable than X so not everything can be translated. Look, it's 2022, it must have been about a decade since anyone worked on console support other than routine maintenance. The Linux console just isn't going to ever have decent Unicode support. – Gilles 'SO- stop being evil' Sep 01 '22 at 21:40
  • @gilles-so-stop-being-evil First, thanks. Second: it would look less dramatic (to me) if console would just ignore part of the settings, as you point out. But behaving differently for the same setting looks more like it's actually capable but very buggy. Please also note I'm not talking about unicode in this case. – Anton K Sep 01 '22 at 22:18