This is a problem I have been thinking about.

This Wikipedia page suggests that Ctrl+{Symbol} uses the shifted variant of {Symbol} when written in Caret notation.

This suggests that CTRL+\ becomes ^| (instead of ^\) in Caret notation and CTRL+4 becomes ^$ (instead of ^4).

But this assumes an American keyboard layout. How does this apply to keyboard layouts from other countries, where the symbols sit on different keys?

I am wondering if there is a standard way to write symbols in Caret notation. It seems like Caret notation does not make much sense once you consider keyboard layouts that place symbols in different locations than English keyboards do.

wefwefa3

1 Answer

Caret notation usually represents characters, not key presses. It denotes the 33 control characters in the ASCII character set. Only the following control characters exist: ^@, ^A through ^Z, ^[, ^\, ^], ^^ and ^_, plus ^?.

The correspondence between control characters and their notation is that ^char is the character whose encoding is 64 less than that of char. This makes sense in binary: each char is a character whose code is written 010vwxyz, and the corresponding control character is the one whose code is written 000vwxyz. For ^?, the flipped bit is the same, but it is set instead of cleared: ? is 00111111 and ^? is 01111111.
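As a rough illustration of that bit flip, here is a minimal Python sketch. The function names `caret` and `uncaret` are made up for this example and are not part of any standard library. It treats caret notation purely as a mapping on ASCII codes, which is why it never consults a keyboard layout.

```python
def is_control(code: int) -> bool:
    """True for the 33 ASCII control characters: 0x00-0x1F and 0x7F."""
    return 0 <= code <= 0x1F or code == 0x7F


def caret(code: int) -> str:
    """Return the caret notation for an ASCII control character code."""
    if not is_control(code):
        raise ValueError(f"{code:#04x} is not an ASCII control character")
    # Flipping the bit with value 64 maps 0x00-0x1F to '@'..'_'
    # and 0x7F to '?'.
    return "^" + chr(code ^ 0x40)


def uncaret(notation: str) -> int:
    """Return the control character code that a caret notation stands for."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError(f"{notation!r} is not caret notation")
    code = ord(notation[1]) ^ 0x40
    if not is_control(code):
        # For example "^4": flipping the bit in '4' (0x34) gives 0x74,
        # which is the letter 't', not a control character.
        raise ValueError(f"{notation!r} does not name a control character")
    return code
```

With this sketch, `caret(0x1C)` gives `^\` and `uncaret("^?")` gives `0x7F`, while `uncaret("^4")` raises an error because no such control character exists.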

There is no ^4 character, so pressing Ctrl+4 sends something different. See What does CTRL+4 (and CTRL+\) do in bash?

  • I had always understood that Ctrl+char returned the lower 32 bits of the ASCII char. Ctrl+4 would correspond to DC4 (0x14). This is expected in CP/M. The reality is that modern OS actually return a keycode plus flags for the combination of modifier keys pressed. The end result is application specific e.g. on OS X (which is BSD UNIX) Ctrl+4 causes terminal to switch to Finder. – Milliways Aug 30 '15 at 10:21
  • @Milliways Ctrl+char returning the lower 5 bits (not 32 — 32 is the number of possible values in the result, not the number of bits) only applies to characters in the range 64–95 on most systems, plus lowercase ASCII letters. That systematic thing in CP/M is a simplification for a small OS, it isn't something you can expect elsewhere. – Gilles 'SO- stop being evil' Aug 30 '15 at 10:31
  • My error - I meant 5 bits i.e. lower 32 bit characters. Ctrl+chars(0x30-0x3F) would translate to (0x10-0x1F). Most of these except ESC 0x1B are not used, so have no meaning although Ctrl+[ and Ctrl+; commonly translate to ESC – Milliways Aug 30 '15 at 11:05
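The rule described in the comments above (keep the lower 5 bits, but only for characters 64 through 95 and lowercase ASCII letters) could be sketched as follows. This is only a model of that rule: the name `ctrl_char` is hypothetical, and what a real terminal or OS sends for other keys is system-specific and not captured here.

```python
from typing import Optional


def ctrl_char(ch: str) -> Optional[int]:
    """Model of the Ctrl+char rule from the comments (an assumption,
    not the behavior of any particular terminal).

    Returns the control character code, or None when the rule does
    not apply to this character.
    """
    code = ord(ch)
    if 64 <= code <= 95 or "a" <= ch <= "z":
        # Keep only the lower 5 bits: one of 32 possible values.
        return code & 0x1F
    # Outside that range (for example '4') the result is not defined
    # by this rule.
    return None
```

Under this model `ctrl_char("\\")` is `0x1C` and `ctrl_char("a")` is `0x01`, while `ctrl_char("4")` is `None`, matching the point that Ctrl+4 has no systematic meaning.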