2

This came from ~/.bashrc

PS1='${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '

Notice the \033[01;32m

I know \033[ is a Control Sequence Introducer. I know 32 is the color code for green.

But, what are the 01; and m?

Which part of the ANSI escape code does \033[01;32m belong to?

  • 3
    "m" is the actual function to do ("Select Graphic Rendition" or SGR) (some other functions are listed here). The "01" and "32" are parameters to that "m"/SGR function, which tell it to set "bold" and "foreground color = green" respectively (more are listed here). – Gordon Davisson May 24 '20 at 04:44
  • 2
    https://stackoverflow.com/questions/4842424/list-of-ansi-color-escape-sequences – jsotola May 24 '20 at 04:48
  • 2
    If you got the answer, please post it as an answer instead of editing the question – muru May 24 '20 at 08:10
  • 1
    @GordonDavisson please don't post answers as comments. We can't vote on them properly, and they can dissuade others from posting proper answers, meaning the question can remain unanswered. – terdon May 25 '20 at 19:32
  • @terdon Sorry; I didn't really consider my comment fleshed-out enough to be a full answer, and was hoping someone'd write up a more complete one (or two, as it turns out). – Gordon Davisson May 26 '20 at 04:28

2 Answers

3

Based on research I've found out:

  • \033[01;32m is, as a whole, a Select Graphic Rendition (SGR) control sequence, which has the general form CSI n m.
  • \033[ — is a Control Sequence Introducer
  • 01 — is code for "bold or increased intensity".
  • ; — is a delimiter for codes. We can have as many codes as we want. There is a table for those codes on the Wikipedia page ANSI escape codes at Select Graphic Rendition (SGR) parameters.
  • 32 is code for foreground green text.
  • m is the final character; it marks the entire sequence as a CSI n m (SGR) sequence.
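
To see those pieces outside of a prompt, here is a minimal sketch in bash. The helper name sgr is made up for illustration; it simply joins its arguments with ; and wraps them between \033[ and m.

```shell
#!/usr/bin/env bash
# sgr: hypothetical helper that builds "CSI n ; n ... m" from its arguments.
sgr() {
    local IFS=';'            # "$*" joins the arguments with the first IFS character
    printf '\033[%sm' "$*"   # \033[ is the CSI, "$*" the codes, m the final character
}

# Bold green text, then code 0 to reset all attributes.
printf '%sbold green%s plain\n' "$(sgr 01 32)" "$(sgr 0)"
```

Running it in a colour-capable terminal should show "bold green" in the same style as the user@host part of the prompt in the question.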
3

The standards to read are ECMA-35 and ECMA-48. ("ANSI" is largely a misnomer. So too is "VT100 style" for this case.) These explain that a control sequence has four parts:

  1. a Control Sequence Introducer (CSI) character, which is U+009B in modern parlance and 9/11 in the parlance of these 1970s standards
  2. zero or more parameter characters, taken from the range U+0030 to U+003F
  3. zero or more intermediate characters, taken from the range U+0020 to U+002F
  4. a single final character, taken from the range U+0040 to U+007E

The Control Sequence Introducer is in a range of so-called C1 control characters, whose values are from U+0080 to U+009F.

By the middle of the 1980s, the world of terminals and serial communication was almost wholly 8-bit clean, in large part thanks to selection pressure from the world of personal computers, BBSes, Fidonet, et al. Back in the 1970s, when fitting into 7 bits was still a significant concern, ECMA-35 and ECMA-48 provided a system of alternative 7-bit encodings for the C1 control characters, whose values did not fit into 7 bits. This remains fossilized, some 40 years later, in the Escape character (U+001B) followed by [ being a 7-bit encoding for CSI.
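
The 7-bit encoding is a simple piece of arithmetic: the Escape character followed by one byte whose value is the C1 value minus 0x40. A rough sketch of that mapping follows; the function name c1_to_7bit is hypothetical, and it prints hexadecimal byte values rather than the raw bytes.

```shell
#!/usr/bin/env bash
# c1_to_7bit: hypothetical helper. Takes a C1 control code in hex (80..9F)
# and prints the byte values of its 7-bit encoding, ESC plus one more byte.
c1_to_7bit() {
    local code
    code=$(( 0x$1 ))
    # C1 control characters occupy 0x80..0x9F (128..159).
    if [ "$code" -lt 128 ] || [ "$code" -gt 159 ]; then
        echo "not a C1 control: $1" >&2
        return 1
    fi
    # ESC is 0x1B; the second byte is the C1 value minus 0x40 (0x40..0x5F).
    printf '1B %02X\n' $(( code - 0x40 ))
}

c1_to_7bit 9B   # CSI: second byte 5B is "[", hence ESC [
```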

You will find that there are alternative 7-bit encodings for all of the C1 range. You will also find that there are a lot of people who don't know this. There are programs that do not recognize the actual CSI character as Control Sequence Introducer. There are programs that do not process all of the 7-bit encodings, only the one for Control Sequence Introducer. And there are, on the other hand, a few programs that have caught up with the middle 1980s and recognize the actual C1 control characters even when not 7-bit encoded.

Breaking your specific control sequence down, therefore, there are:

  1. \033[ — an encoding of an encoding, the 7-bit encoding of the CSI character further encoded as a C-style escape sequence which is processed by the Bourne Again shell
  2. 01;32 — five parameter characters
  3. m — the final character
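
The same breakdown can be done mechanically. Below is a minimal sketch that takes whatever follows the CSI and splits it into parameter, intermediate, and final characters. The function name split_cs and its output format are invented for this example, and it does no validation of the intermediate or final characters.

```shell
#!/usr/bin/env bash
# split_cs: hypothetical helper. Argument: the characters that follow the CSI.
split_cs() {
    local rest body final params inter nonparam
    rest=$1
    body=${rest%?}                   # everything except the last character
    final=${rest#"$body"}            # the last character is the final character
    nonparam='[!0-9:;<=>?]*'         # pattern: from the first non-parameter character onward
    params=${body%%$nonparam}        # leading run of parameter characters (U+0030..U+003F)
    inter=${body#"$params"}          # what remains is taken to be the intermediates
    printf 'parameters=%s intermediates=%s final=%s\n' "$params" "$inter" "$final"
}

split_cs '01;32m'   # prints: parameters=01;32 intermediates= final=m
```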

The combination of the (possibly no) intermediate characters and the final character designate the function of the control sequence. There are rather a lot of such functions, including a whole set that is reserved for vendor extensions. (Because of the aforegiven structure to control sequences, even unknown vendor extensions can be processed/skipped in a stream.) The one denoted in this case is Select Graphic Rendition (SGR), one of the standard control sequences.

The parameter characters encode in base 10 a string of semi-colon-separated numeric parameters. (Actually, the parameter string can contain more than that. The colon is a legal parameter character, after all, it having the value U+003A. It is used, per a later ITU standard, to denote sub-parameters. This actually has applicability to SGR. Similarly, DEC VTs use parameter character U+003F, ?, as an extension marker for some DEC variants on standard control sequences.) In the case of SGR these parameters denote colours and attributes to be set for printed output (i.e. the "rendition" of "graphic" characters).
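
As an illustration of the base 10 point, here is a small sketch that splits a parameter string on semi-colons and prints each parameter as a plain decimal number, so that 01 comes out as 1 rather than being mistaken for octal. The function name sgr_params is hypothetical. It treats an empty parameter as 0, the SGR default, and it deliberately ignores colon-separated sub-parameters.

```shell
#!/usr/bin/env bash
# sgr_params: hypothetical helper. Prints one decimal parameter per line.
sgr_params() {
    local rest p
    rest=$1
    while :; do
        p=${rest%%;*}              # text up to the next ';'
        p=${p#"${p%%[!0]*}"}       # drop the leading run of zeros: 01 becomes 1
        printf '%s\n' "${p:-0}"    # an empty parameter means the default, 0 for SGR
        case $rest in
            *\;*) rest=${rest#*;} ;;   # more parameters follow
            *)    break ;;
        esac
    done
}

sgr_params '01;32'   # prints 1, then 32
```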

There is an extensive set of these, and in this particular case they mean:

  • 01 — boldface on
  • 32 — green foreground

There's a whole digression to be had here on how the CGA display system on the IBM PC, and the use of IBM PC compatibles as terminals, led to font weights such as boldface being turned into colour changes. That convention has fortunately been gradually disappearing in favour of boldface actually meaning boldface once more (as it did in the times before CGA). Under it, and thus on some, but fortunately fewer and fewer, terminal emulators, this SGR sequence would effectively set colour #10, bright green, as the foreground.

Many years ago, the AIXterm terminal emulator introduced SGRs 90 to 97 and 100 to 107 for setting colours 8 to 15 as foreground and background colours. Not only is that 16 colour convention now widespread, we've even had a 256 colour palette convention for quite a long time. The more reliable way to get foreground colour #10 is to use SGR 92, not SGR 1;32.
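
The AIXterm numbering is regular enough to compute. A minimal sketch, with the helper name aix_sgr being an invention for this example: colours 8 to 15 map to 90 to 97 for the foreground and 100 to 107 for the background.

```shell
#!/usr/bin/env bash
# aix_sgr: hypothetical helper. Maps a colour number 8..15 to the
# AIXterm-style SGR parameter for the given layer (fg by default, or bg).
aix_sgr() {
    local colour layer
    colour=$1
    layer=${2:-fg}
    if [ "$colour" -lt 8 ] || [ "$colour" -gt 15 ]; then
        echo "colour must be 8..15" >&2
        return 1
    fi
    case $layer in
        fg) echo $(( colour + 82 )) ;;   # 8..15 become 90..97
        bg) echo $(( colour + 92 )) ;;   # 8..15 become 100..107
        *)  echo "layer must be fg or bg" >&2; return 1 ;;
    esac
}

# Colour #10 as the foreground, without touching the boldface attribute.
printf '\033[%sm%s\033[0m\n' "$(aix_sgr 10)" 'bright green via SGR 92'
```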

Note that hard-coding control sequences is not a necessity for shell prompts in general. For example, rather than directly encoding specific control sequences with C-style escaping, the Z shell allows a user to encode colour and attribute changes using percent sequences in the PS1, RPROMPT, and similar shell variables for the various prompts. The Z shell goes and looks up the corresponding control sequences in the terminfo database. So a similar prompt string in the Z shell could look like:

PS1=%B%F{green}%n@%m%f%b

or, if colour #10 and no boldface was what was actually wanted:

PS1=%F{10}%n@%m%f
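
For comparison, something similar can be approximated in the Bourne Again shell by asking tput to look the sequences up in terminfo. This is only a sketch and not part of the Z shell mechanism described above. It assumes that tput is installed and that the terminfo entry for the current TERM defines the bold, setaf, and sgr0 capabilities; the variable names are made up.

```shell
#!/usr/bin/env bash
# Look the control sequences up in terminfo instead of hard-coding them.
# Each variable is simply left empty if tput fails (e.g. TERM is unset).
bold=$(tput bold 2>/dev/null)
green=$(tput setaf 2 2>/dev/null)
reset=$(tput sgr0 2>/dev/null)

# \[ and \] tell bash that the enclosed characters take up no width.
PS1='\['"$bold$green"'\]\u@\h\['"$reset"'\]\$ '
```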

Finally: These are not "VT100 style". The VT100 is too often bandied about by people as a generalization. It is an incorrect one; these are ECMA-48 control sequences, not "VT100 style". An important fact here is, for starters, that the VT100 was monochrome and did not have multi-colour capabilities. Do not get into the bad habit of abusing either "vt100" or "vt102" as the name for this.

Jerry
  • 5
JdeBP
  • 68,745