
The following are the control characters in ASCII (highlighted in yellow):

[Image: ASCII table with the control characters highlighted in yellow]

To send one of these control characters to the line discipline from the terminal, we type Ctrl+someChar. For example, to send the 0x03 control character, we type Ctrl+C.
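The Ctrl+key mapping can be sketched in a few lines. This assumes the usual caret-notation convention (flip bit 0x40 of the upper-case key); the helper name `ctrl` is made up for illustration, and what a given terminal actually sends for the less common combinations (such as Ctrl+?) may differ.

```python
def ctrl(key: str) -> int:
    """Return the code that Ctrl+<key> conventionally stands for.

    Assumption: the caret-notation rule, i.e. flip bit 6 (0x40) of the
    upper-case key, so '@'..'_' map to 0x00..0x1F and '?' maps to DEL.
    """
    code = ord(key.upper())
    if not (0x3F <= code <= 0x5F):
        raise ValueError(f"no control character for Ctrl+{key!r}")
    return code ^ 0x40


if __name__ == "__main__":
    for key in ("C", "D", "[", "?"):
        print(f"Ctrl+{key} -> 0x{ctrl(key):02X}")
    # Ctrl+C -> 0x03
    # Ctrl+D -> 0x04
    # Ctrl+[ -> 0x1B
    # Ctrl+? -> 0x7F
```

The 33 keys from `?` through `_` cover exactly the 33 ASCII control characters (0x00 to 0x1F plus DEL).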

Now are all of these control characters shown in the image supported in Linux, or is only a subset of these control characters supported?


Edit:

By "supported" I mean whether they can be sent to the line discipline from the terminal. But I have just found the console_codes(4) manual page (http://man7.org/linux/man-pages/man4/console_codes.4.html), which says that only 14 control characters are supported (and not 33 as there are in the ASCII table), so I guess the answer to my question is No, not all the control characters in the ASCII table are supported.

Joseph
  • Why do you ask, and what is the actual problem you want to solve? – Basile Starynkevitch Oct 29 '17 at 06:34
  • "Supported" in the sense "do they have a special meaning": No, not all of them have a special meaning. Supported in the sense "can you transmit them, and leave it to the application(s) to give them a special meaning if they want to": yes. – dirkt Oct 29 '17 at 06:34
  • Please edit your question (currently unclear and too broad) to improve it and explain why you are asking it and for what purpose. – Basile Starynkevitch Oct 29 '17 at 06:40
  • Without additional information and motivation, your question is too broad, so I voted to close it. – Basile Starynkevitch Oct 29 '17 at 06:55
  • @dirkt I meant by "supported" your second point, in that if they can be transmitted. But I have just found the following documentation: http://man7.org/linux/man-pages/man4/console_codes.4.html, which says that only 14 control characters are supported (and not 33 as there are in the ASCII table). – Joseph Oct 29 '17 at 07:15
  • @Joseph: Please edit your question to improve it (it needs it). Don't comment on your own question; comments are for others. – Basile Starynkevitch Oct 29 '17 at 07:22
  • Even with edit, the question stays unclear. What particular control character do you have in mind? For example ESC is often interpreted for ANSI escape codes (which are not defined by ASCII) – Basile Starynkevitch Oct 29 '17 at 07:33
  • And why do you ask about ASCII, since today we have UTF-8 everywhere? In practice you are very unlikely to find a Linux system using only ASCII today. – Basile Starynkevitch Oct 29 '17 at 07:48
  • BTW, if your goal is to code a terminal-based application, you should have said so. – Basile Starynkevitch Oct 29 '17 at 08:05
  • @Basile Starynkevitch No, I don't want to create a terminal based application, I am just trying to understand how the terminal works. – Joseph Oct 29 '17 at 08:11
  • But that is too broad, since, as all the answers explained, there are various software layers involved. And why do you care about ASCII, which is not used today? Current distributions use UTF-8 everywhere, which makes things more complex. – Basile Starynkevitch Oct 29 '17 at 08:13
  • @Basile Starynkevitch I don't think there are various software layers involved, there is only: Terminal <-> Line Discipline <-> Program. And it is not that I care about ASCII, it is that I care about the control characters in ASCII, which I think exist also in Unicode (I don't know if Unicode added new control characters). – Joseph Oct 29 '17 at 08:18
  • I believe you are wrong. Did you follow all the links I have given in my answer? – Basile Starynkevitch Oct 29 '17 at 08:20
  • @Basile Starynkevitch Yes, I have read them before. Sure, there are more layers than the three that I gave, but when it comes to control characters, only these three layers handle them (the Terminal sends the byte representing the control character to the Line discipline, and the Line discipline either handles the control character or passes it to the Program to handle, depending on the termios settings). – Joseph Oct 29 '17 at 08:27
  • BTW, formfeeds and tabs are not handled by the line discipline in the kernel, but by application code and terminal emulators. That does not mean that they are not handled. And terminal emulators are important too... – Basile Starynkevitch Oct 29 '17 at 08:43
  • You've misread that document. The 14 characters it lists are characters that have a special effect (on the Linux console, not “in Linux” in general). Any character is supported by your definition “can be transmitted” (which you've given in a comment — as others have requested, please edit your question to add this information: as it stands your question doesn't make sense). – Gilles 'SO- stop being evil' Nov 05 '17 at 23:11

3 Answers


You seem to be confused about the various levels and building blocks of Linux.

The line discipline only interprets a few control characters itself, for example Ctrl-C (which sends a SIGINT signal to every process in the foreground process group) and, if enabled, the software flow control characters Ctrl-S and Ctrl-Q.

Various terminals interpret various control sequences: xterm, for example, interprets control sequences that are mostly based on the VT100, and the Linux console interprets the console sequences you found.

Other applications may interpret other control sequences; for example, legacy applications emulating mainframe processing could interpret the FS, GS, RS and US separators (which nobody else uses on Linux, because it's not record-oriented).

There's no central point that somehow says "this control sequence always has to mean this particular thing". Nor is there a need to somehow interpret all ASCII control characters.

Edit

Line discipline doesn't have anything to do with line editing. The line in line discipline means an electrical connection (e.g. telephone line) by which external devices (terminals) were connected to the computer. And it's the job of the line discipline to control communication on that connection, which is why it interprets software flow control characters. There are also other line disciplines in the kernel that perform different kinds of control.

Line editing totally depends on the application you are running. E.g. bash has a line editor that interprets keystrokes in a way modelled either on emacs or vi. This is why Ctrl-W (in emacs mode) deletes a word. And this assignment has nothing to do with ASCII, at all.

Again: there are many parts making up your Linux system, and each interprets control characters in whatever way it pleases.

dirkt
  • What about when I erase a word using Ctrl+W: isn't it the line discipline that removes the word from the line buffer and then tells the terminal to erase it from its output window? – Joseph Oct 29 '17 at 07:36
  • No, I don't think so. Again that comment should go into the question. IIRC, Ctrl+W under bash is handled by the readline library. – Basile Starynkevitch Oct 29 '17 at 07:46
  • @Basile Starynkevitch I have created a C program that doesn't contain any code (except code that prevents it from terminating), and I ran it in the terminal, typed some words and then pressed Ctrl+W, and the last word on the right was erased (so it is the line discipline that is doing the erasing). – Joseph Oct 29 '17 at 07:55
  • That depends upon your terminal setting. See stty – Basile Starynkevitch Oct 29 '17 at 07:56
  • "The line in line discipline means an electrical connection" There is also something called a line buffer (or it could have some other name also I don't know). This line buffer sits in the line discipline layer, and receives every character you type in the terminal, and once it receives the newline character, it sends whatever string it holds to the shell (or whatever program is running on the other side), note that you can change the termios settings to disable this line buffer and have every character you type go directly to the shell (this is what bash does)... – Joseph Oct 29 '17 at 08:09
  • ...This article talks about the line buffer: https://blog.nelhage.com/2009/12/a-brief-introduction-to-termios/, also see this question which has something to do with the line buffer: https://stackoverflow.com/questions/44101057/how-to-read-from-the-terminal-keystrokes-buffer (see @paul's answer). – Joseph Oct 29 '17 at 08:09
  • And no, the line discipline is a software and kernel thing. It is not hardware related (most terminals are virtual today, without any actual hardware specific to them). – Basile Starynkevitch Oct 29 '17 at 08:19

Yes.

The terminal can send any character that it likes, control or otherwise, through the serial device (if it is a real terminal) to the line discipline and thence to the application.

If the line discipline is in non-canonical input mode, as I explained in my answer to your "Prevent the line discipline from handling control characters" question, then the application can read the very characters that the terminal sent. If the line discipline is in canonical input mode then editing characters such as the word or line erase characters will be enacted by the line discipline.
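The non-canonical case can be tried without a real terminal by using a pseudo-terminal pair. The following is a minimal sketch, assuming a Linux-like pty where the master side plays the terminal and the slave side plays the program; the function name `send_through_raw_pty` is made up for illustration.

```python
import os
import pty
import select
import tty

# All 33 ASCII control characters: 0x00-0x1F plus DEL (0x7F).
ASCII_CONTROLS = bytes(range(0x20)) + b"\x7f"


def send_through_raw_pty(data: bytes) -> bytes:
    """Write `data` on the terminal side of a pty and return what a program
    reading the other side receives in non-canonical (raw) mode."""
    master_fd, slave_fd = pty.openpty()  # master = "terminal", slave = "program"
    try:
        # Raw mode turns off ICANON, ISIG, IXON, ECHO and input translation,
        # so the line discipline should not act on any of the bytes.
        tty.setraw(slave_fd)
        os.write(master_fd, data)
        received = b""
        while len(received) < len(data):
            # Give up after one second instead of blocking forever.
            ready, _, _ = select.select([slave_fd], [], [], 1.0)
            if not ready:
                break
            received += os.read(slave_fd, 1024)
        return received
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    received = send_through_raw_pty(ASCII_CONTROLS)
    print("all control characters arrived unchanged:", received == ASCII_CONTROLS)
```

On a Linux pty I would expect every one of the 33 bytes to arrive unchanged, including 0x03, which in the default (cooked) settings would be turned into SIGINT rather than delivered.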

Modern shells (since the 1980s) use non-canonical input mode and enact all of the editing functionality themselves, operating upon the raw character stream generated by the terminal. When those shells invoke other programs they put the terminal into canonical input mode, which is why you see the line discipline's editing functionality in effect when you run your C program.
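The canonical-mode editing can be observed in the same way. This is only a sketch under assumptions: it uses a Linux pty, explicitly sets the word erase character to Ctrl+W (0x17), and the helper name `read_line_canonical` is hypothetical.

```python
import os
import pty
import select
import termios


def read_line_canonical(typed: bytes) -> bytes:
    """Feed `typed` into a pty as if it came from the terminal and return the
    line that a program reading the other side gets in canonical mode."""
    master_fd, slave_fd = pty.openpty()
    try:
        attrs = termios.tcgetattr(slave_fd)
        # attrs[3] is c_lflag: canonical mode plus the extended editing keys.
        attrs[3] |= termios.ICANON | termios.IEXTEN
        # Echo is switched off only to keep the sketch free of output noise.
        attrs[3] &= ~termios.ECHO
        # attrs[6] is c_cc: make Ctrl+W (0x17) the word erase character.
        attrs[6][termios.VWERASE] = b"\x17"
        termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)

        os.write(master_fd, typed)
        # In canonical mode the slave becomes readable once a full line exists.
        ready, _, _ = select.select([slave_fd], [], [], 1.0)
        return os.read(slave_fd, 1024) if ready else b""
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    # "hello world", then Ctrl+W, then "again" and Enter.
    print(read_line_canonical(b"hello world\x17again\n"))
```

If the line discipline does the editing, the reading side should see `hello again` followed by a newline: the word `world` and the Ctrl+W byte never reach the program.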

I mean by "supported" if they can be sent to the line discipline from the terminal. But I have just found [the Linux console_codes(4) manual page], which says that only 14 control characters are supported

You are getting input and output mixed up. The manual page that tells you how the built-in terminal emulator in the kernel interprets control codes that are sent out to the terminal is not telling you about control codes that are received in from the terminal.

the control characters in ASCII

ASCII is a 7-bit character set. Since well before the 1980s we have also had 8-bit character sets, which have a second set of control codes: the "C1" control codes.

Configure the serial device to have 8 data bits on the wire (if this is a real terminal) and the line discipline to support 8-bit characters, and again in non-canonical mode one can send every character in the entire 8-bit character set — be it a C0 control code, a C1 control code, or otherwise — from the terminal to the application.
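The effect of the 8-bit setting on the line discipline side can be sketched with the ISTRIP input flag, which strips the eighth bit of each received byte. This assumes a Linux pty (so there are no real data bits on a wire to configure); the helper name `receive` is made up.

```python
import os
import pty
import select
import termios
import tty


def receive(data: bytes, strip_high_bit: bool) -> bytes:
    """Send `data` from the terminal side of a pty and return what the
    program side reads in non-canonical mode, with or without ISTRIP."""
    master_fd, slave_fd = pty.openpty()
    try:
        tty.setraw(slave_fd)  # non-canonical, CS8, ISTRIP cleared
        if strip_high_bit:
            attrs = termios.tcgetattr(slave_fd)
            attrs[0] |= termios.ISTRIP  # attrs[0] is c_iflag
            termios.tcsetattr(slave_fd, termios.TCSANOW, attrs)
        os.write(master_fd, data)
        received = b""
        while len(received) < len(data):
            ready, _, _ = select.select([slave_fd], [], [], 1.0)
            if not ready:
                break
            received += os.read(slave_fd, 1024)
        return received
    finally:
        os.close(master_fd)
        os.close(slave_fd)


if __name__ == "__main__":
    csi = b"\x9b"  # CSI, one of the C1 control codes
    print("8-bit clean:", receive(csi, strip_high_bit=False))
    print("with ISTRIP:", receive(csi, strip_high_bit=True))
```

With the flag cleared, the C1 code should arrive as 0x9B; with ISTRIP set I would expect it to arrive as 0x1B (ESC), which is why an 8-bit clean configuration matters for C1 control codes.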

JdeBP

The line discipline is related to terminals and pseudo-terminals. Read the tty demystified page first, then read termios(3). A terminal may be in one of several states; see stty(1). In some states it does not handle control characters. In other states, it won't handle all of them (for example, DC3 might not get any specific handling).
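To see which control characters the line discipline is currently configured to treat specially, one can read the `c_cc` array described in termios(3), much as `stty -a` does. A minimal sketch, assuming Python's `termios` module and a freshly opened pty in its default (canonical) state; the list of names is a selection and `special_characters` and `caret` are hypothetical helpers.

```python
import os
import pty
import termios

# A selection of the special-character slots documented in termios(3).
SPECIAL_NAMES = [
    "VINTR", "VQUIT", "VERASE", "VKILL", "VEOF", "VSTART",
    "VSTOP", "VSUSP", "VWERASE", "VLNEXT", "VREPRINT",
]


def special_characters(fd: int) -> dict:
    """Map each special-character name to the byte assigned to it on `fd`."""
    cc = termios.tcgetattr(fd)[6]  # index 6 is the c_cc array
    return {
        name: cc[getattr(termios, name)]
        for name in SPECIAL_NAMES
        if hasattr(termios, name)  # not every platform defines every slot
    }


def caret(value: bytes) -> str:
    """Render a control byte in caret notation, e.g. b'\\x03' -> '^C'."""
    code = value[0]
    if code < 0x20 or code == 0x7F:
        return "^" + chr(code ^ 0x40)
    return chr(code)


if __name__ == "__main__":
    master_fd, slave_fd = pty.openpty()
    try:
        for name, value in special_characters(slave_fd).items():
            print(f"{name:9} {caret(value)}")
    finally:
        os.close(master_fd)
        os.close(slave_fd)
```

Whatever the list shows is only the set of characters the line discipline may act on in the current state; every other byte is passed through to the application.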

Terminals are quite complex, and largely a legacy: physical terminals such as the VT100 are no longer used (in 2017 you'll find them only in museums), and in practice only virtual terminals are used on Linux today. I recommend using a library like ncurses if you want to code a text-based user interface (or something like readline if you want a line-based one). See also termcap and read about ANSI escape codes. BTW, most interactive shells (like bash or zsh) and terminal applications (e.g. vim) use a library such as libreadline, libtinfo or libncurses. It is the application or library code and the terminal emulator program (e.g. gnome-terminal or xterm) that handle most control characters and escape codes. The kernel only handles the line discipline.

BTW, in 2017 we use UTF-8 everywhere (not ASCII anymore), so even terminal emulators know about UTF-8. Unicode requires more sophisticated behavior (think about a user mixing left-to-right and right-to-left languages, e.g. English and Hebrew or Arabic, in the same input line). The behavior of most terminal emulators is configurable (e.g. you can enable or disable the audio beep for BEL, or make it only a blink).
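As a small illustration of how the ASCII control characters relate to UTF-8, the sketch below checks how control characters are encoded. It relies only on Python's built-in codecs and `unicodedata`; the variable and function names are made up.

```python
import unicodedata

# The 33 ASCII (C0) control characters, including DEL.
C0_CONTROLS = [chr(i) for i in range(0x20)] + ["\x7f"]


def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 encoding of a single code point."""
    return chr(code_point).encode("utf-8")


# Code points below U+0100 that Unicode classifies as control characters (Cc).
CONTROL_CODE_POINTS = [
    cp for cp in range(0x100) if unicodedata.category(chr(cp)) == "Cc"
]

if __name__ == "__main__":
    same = all(utf8_bytes(ord(ch)) == bytes([ord(ch)]) for ch in C0_CONTROLS)
    print("ASCII controls are single, unchanged bytes in UTF-8:", same)
    print("CSI (U+009B) in UTF-8:", utf8_bytes(0x9B))
    print("control code points below U+0100:", len(CONTROL_CODE_POINTS))
```

The ASCII control characters keep their single-byte values in UTF-8, whereas the C1 control codes (U+0080 to U+009F) become two-byte sequences, so a byte such as 0x9B on its own is not valid UTF-8.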

And the support of various control characters may happen in different layers (or might not happen, in the sense that some don't have any special meaning)...

Finally, graphical user interfaces (with widget toolkits like Qt or GTK+) and web interfaces (you might use some HTTP server library like libonion) are more widely used than before.

If you want to code a text-based user interface application today, I strongly recommend using some additional library (like ncurses etc...).

BTW, some behavior can be tuned or configured for each user or pseudo-terminal, and the various terminal emulators are more or less configurable; see also console_codes(4), locale(7), ascii(7), UTF-8(7), charsets(7), environ(7), pty(7), signal(7), term(7), termio(7), unicode(7).

You could also configure, or improve, some particular terminal emulator (they are free software, so you can study and patch their source code) to fit your particular needs (and add specific behavior for those weird control characters that you want to do something useful). The behavior of a specific terminal emulator on weird control characters is specific to that emulator; I guess that most of them would skip or ignore some of them.

However, you can have device files (such as for modems, keyboards, etc.) and files (or socket(7)-s) handling arbitrary sequences of bytes (they don't need to be ASCII or UTF-8; the character encoding is only a convention).

AFAIK, many control characters (notably horizontal and vertical tabs, returns, form feeds, escapes, ...) - probably most of them - are not handled by the line discipline implemented in kernel code. But they are known to application code (in particular those using ncurses or readline) and to the terminal emulator (e.g. gnome-terminal or xterm).

  • Let's say the line discipline is configured to handle all control characters it knows about: does the line discipline know about all the control characters in the ASCII table, or does it only know about a subset of these control characters? – Joseph Oct 29 '17 at 06:35
  • Please improve your question and explain why you are asking it. It also could depend on the system and the user's configuration. – Basile Starynkevitch Oct 29 '17 at 06:37
  • In practice, better use a library like ncurses or readline if you want to code a terminal application – Basile Starynkevitch Oct 29 '17 at 08:07