12

Let's say I type cd in my shell. Is cd loaded into memory at that moment? My intuition is that these built-in commands are pre-loaded into system memory after the kernel has been loaded, but someone insisted that they are loaded only when I actually invoke the command (press Enter in a shell). Is there a reference that explains this?

terdon
Forethinker

4 Answers

10

Let's say I type cd in my shell. Is cd loaded into memory at that moment? My intuition is that these built-in commands are pre-loaded into system memory after the kernel has been loaded, but someone insisted that they are loaded only when I actually invoke the command...

In broad terms the other answers are correct -- the built-ins are loaded with the shell, the stand-alones are loaded when invoked. However, a very stickler-ish, weasel-y "someone" could insist that it isn't that simple.

This discussion is somewhat about how the OS works, and different OSes work in different ways, but I think the following is probably true for all contemporary *nixes.

First, "loaded into memory" is an ambiguous phrase; what we are really referring to is that the program has its virtual address space mapped. This is significant because "virtual address space" refers to stuff that may need to be placed into memory, but in fact is not initially: mostly what is actually loaded into memory is the map itself -- and the map is not the territory. The "territory" would be the executable on disk (or in disk cache) and, in fact, most of that is probably not loaded into memory when you invoke an executable.

Also, much of "the territory" is references to other territories (shared libraries), and again, just because they have been referred to does not mean they are really loaded either. They don't get loaded until they are actually used, and then only the pieces of them that actually need to be loaded in order for whatever "the use" is to succeed.

For example, here's a snippet of top output on linux referring to a bash instance:

VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                  
113m 3672 1796 S  0.0  0.1   0:00.07 bash   

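The 113 MB VIRT is the virtual address space that has been mapped, not memory that is actually in use. RES is the amount of RAM the process really occupies -- only about 3.6 MB (top reports RES and SHR in KiB by default). And of that, some is part of the shared territory mentioned above -- about 1.8 MB SHR. My /bin/bash on disk is 930 kB, and the basic libc it links to (a shared lib) is twice as big again, so even those two together are larger than the shared portion that is actually resident.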

That shell isn't doing anything right now. Let's say I invoke a built-in command, which we said earlier was already "loaded into memory" along with the rest of the shell. The CPU executes whatever code is involved starting at a point in the map, and when execution reaches code that hasn't really been loaded, a page fault occurs and the kernel loads it -- from an executable image on disk -- even though in a more casual sense, that executable (be it the shell, a stand-alone tool, or a shared library) was already "loaded into memory".

This is called demand paging.
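
To see the gap between "mapped" and "actually resident" for yourself, here is a minimal sketch that reads both numbers for the current shell. It assumes a Linux system with /proc mounted; the variable names are just illustrative, and the exact values will differ from machine to machine.

```shell
# Linux-only sketch: compare mapped vs. resident memory of this shell.
# /proc/<pid>/status reports both values in kB.
status="/proc/$$/status"

# VmSize is the size of the mapped virtual address space (top's VIRT).
vsz_kb=$(awk '/^VmSize:/ {print $2}' "$status")

# VmRSS is the part that is actually resident in RAM (top's RES).
rss_kb=$(awk '/^VmRSS:/ {print $2}' "$status")

echo "mapped   (VmSize): ${vsz_kb} kB"
echo "resident (VmRSS):  ${rss_kb} kB"
```

The resident figure should come out well below the mapped one, which is the map-versus-territory difference described above.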

goldilocks
9

While waiting for one of the heavyweights to come and give a full historical perspective, I'll give you my more limited understanding.

Built-in commands like alias, cd, echo, etc. are part of your shell (bash, zsh, ksh or whatever). They get loaded at the same time as the shell and are simply internal functions of that shell.
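
If you want to check which of these are built-ins on your own system, a quick sketch along these lines should do it. It assumes bash is installed; `type -t` is a bash extension, which is why the loop calls bash explicitly instead of relying on whatever shell runs the script.

```shell
# Ask bash how it would resolve each command name.
# `type -t` prints one word: builtin, file, alias, function or keyword.
for cmd in alias cd echo ls; do
    kind=$(bash -c 'type -t "$1"' _ "$cmd")
    printf '%-6s %s\n' "$cmd" "$kind"
done
```

In a non-interactive bash, alias, cd and echo should be reported as builtin, while ls is normally reported as file, i.e. a separate program on disk.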

terdon
4

I did the following experiment to show that the builtin commands are in fact loaded as part of the bash executable. That's why they're called builtins, but a demo is always the best way to prove something.

Example

  1. Start up a new bash shell, and note its process ID (PID):

    $ bash
    $ echo $$
    6402
    
  2. In a second terminal run the ps command so we can watch and see if bash starts taking up any additional memory:

    $ watch "ps -Fp 6402"
    

    The output looks like this:

    Every 2.0s: ps -Fp 6402                        Sat Sep 14 14:40:49 2013
    
    UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
    saml      6402  6349  0 28747  6380   1 14:33 pts/38   00:00:00 bash
    

    NOTE: Memory usage is shown with the SZ & RSS columns here.

  3. Start running commands in the shell (pid 6402):

    As you cd around you'll notice the memory does in fact go up. This isn't because an executable cd is being loaded into memory; rather, it is because the directory structure on disk is getting loaded into memory. If you keep cd'ing into other directories you'll see it keep going up incrementally.

    Every 2.0s: ps -Fp 30208                        Sat Sep 14 15:11:22 2013
    
    UID        PID  PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
    saml     30208  6349  0 28780  6492   0 15:09 pts/38   00:00:00 bash
    

    You can do more elaborate tests like this:

    $ for i in $(seq 1000); do cd ..; cd 90609; done
    

    This command will cd up a level and then back down into the directory 90609, 1000 times over. If you monitor the memory usage in the ps window while it runs, you should notice no additional memory usage.

  4. strace

    Here's another tell that we're dealing with a builtin function to bash rather than an actual executable. When you try and run strace cd .. you'll get the following message:

    $ strace cd ..
    strace: cd: command not found
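
The repeated-cd test from step 3 can also be done from inside the shell itself, without a second terminal. This is only a sketch under a few assumptions: Linux with /proc mounted, `rss_kb` is a hypothetical helper name I made up, and it bounces between / and the starting directory rather than the 90609 directory used above.

```shell
# Hypothetical helper: resident set size (kB) of this shell, read from /proc.
# $$ still refers to the main shell inside the command substitution.
rss_kb() { awk '/^VmRSS:/ {print $2}' "/proc/$$/status"; }

start=$PWD
before=$(rss_kb)

# cd away and back 1000 times; cd runs inside the shell, no new process.
i=0
while [ "$i" -lt 1000 ]; do
    cd / && cd "$start"
    i=$((i + 1))
done

after=$(rss_kb)
echo "RSS before: ${before} kB, after: ${after} kB"
```

The two figures may differ by a small amount because the shell allocates a little memory as it runs, but there should be no per-iteration growth of the kind you would expect if something were being loaded on every cd.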
    
slm
3

"Built-in command" refers to commands that are built into the shell rather than existing as separate programs. ls, for example, isn't a built-in command but a separate program. It is read into RAM when it is invoked, unless it is already in the disk cache.

An example of a built-in command would be printf or cd. These are part of the shell, and are loaded along with the rest of the shell.
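
One way to see the difference is to ask the shell where each command lives. A rough sketch, using the POSIX `command -v` (the variable names are only for illustration, and the path printed for ls depends on your system):

```shell
# `command -v` prints a path for an external program,
# and just the bare name for a shell built-in.
ls_path=$(command -v ls)
cd_res=$(command -v cd)

echo "ls -> $ls_path"
echo "cd -> $cd_res"

# The external program is an ordinary file on disk, with a size of its own.
ls -l "$ls_path"
```

For cd there is no file to point at, because the code that implements it is inside the shell binary.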

No commands are pre-loaded by default, though systems have been created to do this.