That entirely depends on what services you want to have on your device.
Programs
You can make Linux boot directly into a shell. It isn't very useful in production (who'd just want to have a shell sitting there?), but it's useful as an intervention mechanism when you have an interactive bootloader: pass init=/bin/sh to the kernel command line. All Linux systems (and all unix systems) have a Bourne/POSIX-style shell in /bin/sh.
You'll need a set of shell utilities. BusyBox is a very common choice; it contains a shell and common utilities for file and text manipulation (cp, grep, …), networking setup (ping, ifconfig, …), process manipulation (ps, nice, …), and various other system tools (fdisk, mount, syslogd, …). BusyBox is extremely configurable: you can select which tools you want and even individual features at compile time, to get the right size/functionality compromise for your application. Apart from sh, the bare minimum that you can't really do anything without is mount, umount and halt, but it would be atypical not to also have cat, cp, mv, rm, mkdir, rmdir, ps, sync and a few more. BusyBox installs as a single binary called busybox, with a symbolic link for each utility.
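As a rough sketch of that layout, the script below populates a staging directory with one symbolic link per utility. The staging path (ROOT, defaulting to ./rootfs) and the applet list are assumptions for illustration, and the script only creates an empty placeholder where your real cross-compiled busybox binary would go.

```shell
#!/bin/sh
# Sketch: one symlink per utility, all pointing at a single busybox binary.
# ROOT and the APPLETS list are illustrative assumptions, not fixed names.
set -eu

ROOT=${ROOT:-./rootfs}
APPLETS="sh mount umount halt cat cp mv rm mkdir rmdir ps sync"

mkdir -p "$ROOT/bin"

# Placeholder only: copy your real (cross-compiled) busybox binary here.
[ -e "$ROOT/bin/busybox" ] || : > "$ROOT/bin/busybox"
chmod 755 "$ROOT/bin/busybox"

for applet in $APPLETS; do
    # Relative link target, so the links stay valid once $ROOT becomes /.
    ln -sf busybox "$ROOT/bin/$applet"
done
```

On a running target, a real BusyBox binary can also create its own links with busybox --install -s.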
The first process on a normal unix system is called init. Its job is to start other services. BusyBox contains an init system. In addition to the init binary (usually located in /sbin), you'll need its configuration files (usually called /etc/inittab; some modern init replacements do away with that file, but you won't find them on a small embedded system) that indicate what services to start and when. For BusyBox, /etc/inittab is optional; if it's missing, you get a root shell on the console and the script /etc/init.d/rcS (default location) is executed at boot time.
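If you do want an explicit configuration, a small BusyBox-style inittab might look like the one this script writes. This is a minimal sketch: the staging directory ROOT is an assumption, and the three entries are just one plausible selection.

```shell
#!/bin/sh
# Sketch: write a small BusyBox-style /etc/inittab into a staging tree.
# ROOT is an assumed staging directory; adjust the entries to your device.
set -eu

ROOT=${ROOT:-./rootfs}
mkdir -p "$ROOT/etc/init.d"

# BusyBox init line format: <id>:<runlevels>:<action>:<process>
# (BusyBox ignores the runlevels field; <id> names the controlling tty.)
cat > "$ROOT/etc/inittab" <<'EOF'
# Run the boot script once, before anything else.
::sysinit:/etc/init.d/rcS
# Offer a shell on the console, started after Enter is pressed.
::askfirst:-/bin/sh
# Unmount filesystems cleanly on halt or reboot.
::shutdown:/bin/umount -a -r
EOF
```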
That's all you need, beyond of course the programs that make your device do something useful. For example, on my home router running an OpenWrt variant, the only programs are BusyBox, nvram (to read and change settings in NVRAM), and networking utilities.
Unless all your executables are statically linked, you will need the dynamic loader (ld.so, which may be called by different names depending on the choice of libc and on the processor architectures) and all the dynamic libraries (/lib/lib*.so, perhaps some of these in /usr/lib) required by these executables.
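One way to find out which libraries an executable asks for is to read the NEEDED entries of its dynamic section. The sketch below assumes binutils' readelf is available on the build host; the function names extract_needed and needed_libs are made up for this example.

```shell
#!/bin/sh
# Sketch: list the shared libraries an ELF executable asks for, so they can
# be copied into the image. Assumes binutils' readelf is on the build host.
set -eu

# Filter "readelf -d" output down to the names in its NEEDED entries.
extract_needed() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}

# Print the libraries needed by each file given as an argument.
needed_libs() {
    for exe in "$@"; do
        readelf -d "$exe" | extract_needed
    done
}

# Example: needed_libs rootfs/bin/busybox | sort -u
```

The libraries found this way may have NEEDED entries of their own, so repeat the check on each of them. The dynamic loader is not in this list: it is named in the program interpreter header, which readelf -l displays.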
Directory structure
The Filesystem Hierarchy Standard describes the common directory structure of Linux systems. It is geared towards desktop and server installations: a lot of it can be omitted on an embedded system. Here is a typical minimum.
- /bin: executable programs (some may be in /usr/bin instead).
- /dev: device nodes (see below)
- /etc: configuration files
- /lib: shared libraries, including the dynamic loader (unless all executables are statically linked)
- /proc: mount point for the proc filesystem
- /sbin: executable programs. The distinction with /bin is that /sbin is for programs that are only useful to the system administrator, but this distinction isn't meaningful on embedded devices. You can make /sbin a symbolic link to /bin.
- /mnt: handy to have on read-only root filesystems as a scratch mount point during maintenance
- /sys: mount point for the sysfs filesystem
- /tmp: location for temporary files (often a tmpfs mount)
- /usr: contains subdirectories bin, lib and sbin. /usr exists for extra files that are not on the root filesystem. If you don't have that, you can make /usr a symbolic link to the root directory.
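The layout above can be sketched as a short script that builds the skeleton in a staging directory. ROOT (defaulting to ./rootfs) is an assumed name, and the two symbolic links reflect the simplifications suggested in the list rather than anything mandatory.

```shell
#!/bin/sh
# Sketch: create the minimal directory skeleton in a staging tree.
# ROOT is an assumed staging directory, not a path from the text.
set -eu

ROOT=${ROOT:-./rootfs}

for dir in bin dev etc lib mnt proc sys tmp; do
    mkdir -p "$ROOT/$dir"
done

# /tmp is world-writable with the sticky bit, as on any Unix system.
chmod 1777 "$ROOT/tmp"

# No separate /usr filesystem and no admin/user split on this device:
# make /sbin an alias for /bin and /usr an alias for the root directory.
[ -e "$ROOT/sbin" ] || ln -s bin "$ROOT/sbin"
[ -e "$ROOT/usr" ]  || ln -s . "$ROOT/usr"
```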
Device files
Here are some typical entries in a minimal /dev:
- console
- full (writing to it always reports “no space left on device”)
- log (a socket that programs use to send log entries), if you have a syslogd daemon (such as BusyBox's) reading from it
- null (acts like a file that's always empty)
- ptmx and a pts directory, if you want to use pseudo-terminals (i.e. any terminal other than the console), e.g. if the device is networked and you want to telnet or ssh in
- random (returns random bytes, risks blocking)
- tty (always designates the program's terminal)
- urandom (returns random bytes, never blocks but may be non-random on a freshly-booted device)
- zero (contains an infinite sequence of null bytes)
Beyond that you'll need entries for your hardware (except network interfaces, these don't get entries in /dev): serial ports, storage, etc.
For embedded devices, you would normally create the device entries directly on the root filesystem. High-end systems have a script called MAKEDEV to create /dev entries, but on an embedded system the script is often not bundled into the image. If some hardware can be hotplugged (e.g. if the device has a USB host port), then /dev should be managed by udev (you may still have a minimal set on the root filesystem).
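For the static case, the generic entries listed above could be created along these lines. Creating device nodes requires root, so this sketch only prints the mknod commands unless APPLY=1 is set. ROOT, APPLY and the function name dev_commands are assumptions for illustration, as are the permission modes; the major/minor numbers are the standard Linux assignments for these character devices.

```shell
#!/bin/sh
# Sketch: static /dev entries for a staging tree. Prints the mknod commands
# by default; set APPLY=1 (as root) to actually create the nodes.
set -eu

ROOT=${ROOT:-./rootfs}
APPLY=${APPLY:-0}

mkdir -p "$ROOT/dev/pts"

# Emit one mknod command per line of the table: name mode major minor.
dev_commands() {
    while read -r name mode major minor; do
        echo "mknod -m $mode '$ROOT/dev/$name' c $major $minor"
    done <<'EOF'
console 600 5 1
full    666 1 7
null    666 1 3
ptmx    666 5 2
random  666 1 8
tty     666 5 0
urandom 666 1 9
zero    666 1 5
EOF
}

if [ "$APPLY" = 1 ]; then
    dev_commands | sh -e
else
    dev_commands
fi
```

The log entry is missing on purpose: it is a socket that the syslog daemon creates when it starts, not a device node.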
Boot-time actions
Beyond the root filesystem, you need to mount a few more for normal operation:
- procfs on /proc (pretty much indispensable)
- sysfs on /sys (pretty much indispensable)
- tmpfs on /tmp (to allow programs to create temporary files that will be in RAM, rather than on the root filesystem which may be in flash or read-only)
- tmpfs, devfs or devtmpfs on /dev if dynamic (see udev in “Device files” above)
- devpts on /dev/pts if you want to use pseudo-terminals (see the remark about pts above)
You can make an /etc/fstab file and call mount -a, or run mount manually.
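A matching /etc/fstab could look like the one this script writes into a staging tree. Treat it as a sketch: ROOT is an assumed staging directory, the mount options are common choices rather than requirements, and a dynamic /dev is left out because how it is mounted depends on your setup.

```shell
#!/bin/sh
# Sketch: an /etc/fstab for the pseudo-filesystems listed above, written
# into an assumed staging directory ROOT. Options are common defaults.
set -eu

ROOT=${ROOT:-./rootfs}
mkdir -p "$ROOT/etc"

cat > "$ROOT/etc/fstab" <<'EOF'
# <device> <mount point> <type> <options>  <dump> <pass>
proc       /proc         proc   defaults   0      0
sysfs      /sys          sysfs  defaults   0      0
tmpfs      /tmp          tmpfs  mode=1777  0      0
devpts     /dev/pts      devpts defaults   0      0
EOF
```

The manual equivalent of the first line is mount -t proc proc /proc, and similarly for the others.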
Start a syslog daemon (as well as klogd for kernel logs, if the syslogd program doesn't take care of it), if you have any place to write logs to.
After this, the device is ready to start application-specific services.
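Put together, the boot-time actions above might end up in a boot script like the one generated here. This is only a sketch: ROOT is an assumed staging directory, and the generated script assumes BusyBox's mount, syslogd and klogd applets plus an /etc/fstab in the image.

```shell
#!/bin/sh
# Sketch: generate an /etc/init.d/rcS that performs the boot-time actions.
# ROOT is an assumed staging directory for the root filesystem image.
set -eu

ROOT=${ROOT:-./rootfs}
mkdir -p "$ROOT/etc/init.d"

cat > "$ROOT/etc/init.d/rcS" <<'EOF'
#!/bin/sh
# With a dynamic /dev, the pts directory has to exist before mounting
# devpts; this is harmless if the directory is already there.
mkdir -p /dev/pts
# Mount everything listed in /etc/fstab.
mount -a
# Start logging: syslogd reads /dev/log, klogd forwards kernel messages.
syslogd
klogd
# Application-specific services start after this point.
EOF
chmod 755 "$ROOT/etc/init.d/rcS"
```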
How to make a root filesystem
This is a long and diverse story, so all I'll do here is give a few pointers.
The root filesystem may be kept in RAM (loaded from a (usually compressed) image in ROM or flash), or on a disk-based filesystem (stored in ROM or flash), or loaded from the network (often over TFTP) if applicable. If the root filesystem is in RAM, make it the initramfs — a RAM filesystem whose content is created at boot time.
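For the initramfs case, the image is a cpio archive of the root tree, usually compressed. A minimal sketch of packing one, assuming cpio and gzip are installed on the build host and using the made-up names ROOT and OUT:

```shell
#!/bin/sh
# Sketch: pack a staging tree into a compressed cpio archive, the format
# the kernel unpacks into its initramfs. ROOT and OUT are assumed names.
set -eu

ROOT=${ROOT:-./rootfs}
OUT=${OUT:-./initramfs.cpio.gz}

mkdir -p "$ROOT"

if ! command -v cpio >/dev/null 2>&1; then
    echo "cpio not found; install it on the build host" >&2
else
    # Paths in the archive must be relative to the root of the tree, and
    # the kernel expects the "newc" cpio format.
    (cd "$ROOT" && find . | cpio -o -H newc | gzip -9) > "$OUT"
fi
```

Files in the archive keep the ownership they have on the build host, so a real image is normally packed as root or under a tool such as fakeroot. When booting from an initramfs the kernel runs /init from the archive, so the tree needs an executable at that path.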
Many frameworks exist for assembling root images for embedded systems. There are a few pointers in the BusyBox FAQ. Buildroot is a popular one, allowing you to build a whole root image with a setup similar to the Linux kernel and BusyBox. OpenEmbedded is another such framework.
Wikipedia has an (incomplete) list of popular embedded Linux distributions. An example of embedded Linux you may have near you is the OpenWrt family of operating systems for network appliances (popular on tinkerers' home routers). If you want to learn by experience, you can try Linux from Scratch, but it's geared towards desktop systems for hobbyists rather than towards embedded devices.
A note on Linux vs Linux kernel
The only behavior that's baked into the Linux kernel is the choice of the first program that's launched at boot time. (I won't get into initrd and initramfs subtleties here.) This program, traditionally called init, has process ID 1 and has certain privileges (immunity to KILL signals) and responsibilities (reaping orphans). You can run a system with a Linux kernel and start whatever you want as the first process, but then what you have is an operating system based on the Linux kernel, and not what is normally called “Linux”: Linux, in the common sense of the term, is a Unix-like operating system whose kernel is the Linux kernel. For example, Android is an operating system which is not Unix-like but based on the Linux kernel.