13

I am using Debian Stretch. My root partition is mounted read-only. Only when I install or upgrade packages is / remounted read-write (via an apt hook), and then remounted back to ro.

Sometimes after package upgrade I am unable to remount / back to read-only:

mount -o remount,ro /
mount: / is busy

On older Debian versions (Wheezy), I could list open files that have been unlinked with lsof:

 lsof +L1

or, more specifically, files that prevent / from being remounted back to ro:

{ lsof +L1 ; lsof|sed -n '/SYSV/d; /DEL\|(path /p;' ; } | grep -Ev '/(dev|home|tmp|var)'

However, on Debian Stretch, lsof +L1 does not list any files.

I don't see any changes to +|-L in man lsof that would explain why it stopped working.

Why does lsof +L1 no longer list open files that have been unlinked?

How can I list those files that prevent / from being remounted to read-only?

UPDATE

I have stopped all processes that can be stopped, and only have init and getty still running, but I still cannot remount / to ro.

Jeff Schaller
Martin Vegter

4 Answers

2

How can I list those files that prevent / from being remounted to read-only?

A) fuser can be found in the psmisc package; this is a use case where I find fuser shines & is more useful than lsof.

# fuser -v -m / 2>&1 | grep '[Ff]r.e'

That will show all processes that have files open on / for reading (f) and writing (F). The files that would prevent / from being remounted to read-only are those that are opened for writing (F).
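As a rough cross-check that does not depend on fuser, the kernel records each descriptor's open flags in /proc/<pid>/fdinfo/<fd>; the low two bits of that octal value are the access mode (1 = write-only, 2 = read-write). The sketch below is only an illustration: the function name list_write_fds and its optional directory argument are made up for this example, and it reports every path-backed descriptor open for writing, not just those on /.

```shell
# list_write_fds [PROC_DIR]
# Hypothetical helper: prints "PID FD PATH" for descriptors opened for writing.
# PROC_DIR defaults to /proc; the argument exists only to make the sketch testable.
list_write_fds() {
    proc_root=${1:-/proc}
    for info in "$proc_root"/[0-9]*/fdinfo/*; do
        [ -r "$info" ] || continue
        # The "flags:" line holds the open(2) flags in octal.
        flags=$(sed -n 's/^flags:[[:space:]]*//p' "$info" 2>/dev/null)
        # Last octal digit: 1 or 5 = O_WRONLY, 2 or 6 = O_RDWR.
        case $flags in
            *[1256]) ;;
            *) continue ;;
        esac
        fd=${info##*/}
        dir=${info%/fdinfo/*}
        pid=${dir##*/}
        target=$(readlink "$dir/fd/$fd" 2>/dev/null) || continue
        # Keep real paths only; skip socket:[...], pipe:[...], anon_inode:...
        case $target in
            /*) printf '%s %s %s\n' "$pid" "$fd" "$target" ;;
        esac
    done
}
```

Run as root, `list_write_fds | grep -Ev ' /(dev|home|tmp|var)'` would narrow the output the same way the question's filter does, assuming those directories are separate filesystems on your machine.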

Kill the processes that are running executables and have files on the root filesystem open for writing, i.e.

# for fupid in $(fuser -v -m / 2>&1 | grep Fr.e | awk '{print $2}'); do kill $fupid; done

The above comes with a caveat about systemd. If systemd is init, then fuser will see it and there are other considerations. With systemd running, it can (re)start processes behind your back, even if they've just been identified and killed with fuser. systemd is much more advanced than the traditional sysvinit.

B) The UPDATE in the description states the system only has ... init and getty still running ...

I see the comment that says the system is not using systemd, it's using init. On stretch, systemd is init by default. The comment didn't explicitly say sysvinit, so I'm assuming the system in question may be using the default stretch systemd for init. If not, other people who stumble on this post and are using stretch's systemd may still find this part useful.

Per the Debian Wiki,

The system initialization process is handled by the init daemon. In squeeze and earlier releases, that daemon is provided by the sysvinit package, and no alternatives are supported. In wheezy, the default init daemon is still sysvinit, but a "technology preview" of systemd is available. In jessie and stretch, the default init system is systemd, but switching to sysvinit is supported.

Since jessie, only systemd is fully supported; sysvinit is mostly supported, but Debian packages are not required to provide sysvinit start scripts. runit is also packaged, but has not received the same level of testing and support as the others, and is not currently supported as PID 1.

With systemd running, there are a few additional steps that should be taken to free up / so that it can be remounted without issue.

It's likely system.slice is holding open files for systemd-journald.service or systemd-udevd.service (both of which have socket dependencies). Or, if NetworkManager is running, it can respawn dhclient, which writes leases to /var/... (& /var/ isn't always its own device), etc. fuser might find dhclient and you might kill it, but NetworkManager starts it right back up.

The moral is lots of things are automated that could 'want' / (and even more so with systemd).

To be sure, if it's feasible, the systemd equivalent of run level 1 is matched by rescue.target (and runlevel1.target is a symbolic link to rescue.target).

1) Start by isolating the system to rescue.target

# systemctl isolate rescue.target

It should prompt you to enter the root password; follow on screen instructions.

2) At the rescue shell, find out what wants /.

# systemctl show -p Wants /

Typically, it's system.slice; stop everything that Wants /. e.g.

# systemctl stop system.slice

3) At this point, the remount should not report mount: / is busy and mount -o remount,ro / should work. If not, check again with fuser.

4) FWIW; I've also seen times when umount fails when/if another device is mounted on a sub-directory of another mount, i.e. nested mounts. For example, umount / would fail if /var/ or /boot/ is on another device (and mounted). Though mount -o remount,ro / should still work in this case.

lsblk can be helpful to visualize nested mounts.

Why does lsof +L1 no longer list open files that have been unlinked?

Because they aren't available (sockets or most FIFOs & pipes), they're not open files anymore (the parent process closed the file descriptor), or they (still) have a link count greater than 1.

man lsof(8) details ...

+|-L [l]

This option enables ('+') or disables ('-') the listing of file link counts, where they are available - e.g., they aren't available for sockets, or most FIFOs and pipes.

When +L is specified without a following number, all link counts will be listed. When -L is specified (the default), no link counts will be listed.

When +L is followed by a number, only files having a link count less than that number will be listed. (No number may follow -L.) A specification of the form "+L1" will select open files that have been unlinked. A specification of the form "+aL1 <file_system>" will select unlinked open files on the specified file system.

0

Do you have /proc mounted?

Since you are apparently someone who takes care to have / mounted read-only most of the time, I can imagine you might also choose not to mount procfs. But procfs is needed for lsof to find open files.

Files held open by processes are exposed by the kernel through symbolic links in procfs. Each directory /proc/<pid>/fd contains one symlink per open file. The names of the symlinks are the file descriptor numbers, and the path each symlink references is the file's path.

Dangling symlinks remain in /proc for open files that have already been deleted, and the path they reference gets " (deleted)" appended.

What lsof +L1 does is essentially no different from a quick one-liner like:

stat -c%N /proc/[0-9]*/fd/* | grep deleted

So, you can use a similar one-liner to list all open files that may prevent the root file system from being remounted (provided a working /proc).
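A slightly more structured variant of that one-liner is sketched below. It is only an illustration: the function name list_deleted_fds and its optional directory argument are invented for this example, and it assumes nothing beyond the " (deleted)" suffix convention described above.

```shell
# list_deleted_fds [PROC_DIR]
# Hypothetical helper: prints "PID FD TARGET" for every descriptor whose
# /proc symlink target ends in " (deleted)". PROC_DIR defaults to /proc.
list_deleted_fds() {
    proc_root=${1:-/proc}
    for fd in "$proc_root"/[0-9]*/fd/*; do
        # Entries are symlinks, possibly dangling; skip an unmatched glob.
        [ -L "$fd" ] || continue
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)') ;;
            *) continue ;;
        esac
        pid=${fd#"$proc_root"/}
        pid=${pid%%/*}
        printf '%s %s %s\n' "$pid" "${fd##*/}" "$target"
    done
}
```

Run as root so that other users' /proc/<pid>/fd directories are readable; the PID column then tells you which process to restart.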

However, if you did/do have /proc mounted, the only other causes I can think of are bugs... Anyway, FYI, on my current Debian Stretch system, lsof +L1 works as expected.

bash# lsb_release -d
Description:    Debian GNU/Linux 9.5 (stretch)

bash# uname -a
Linux bwp-249-8 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64 GNU/Linux

bash# lsof -v
lsof version information:
    revision: 4.89
    [...]
Hkoof
  • yes, I have /proc mounted. I don't follow your reasoning why I might not have. Anyways, stat -c%N /proc/[0-9]*/fd/* | grep deleted shows me nothing. – Martin Vegter Dec 09 '18 at 09:46
0

I could reproduce this problem only once, and solved it by just using mount with the -n option.

Quoting man mount:

-n, --no-mtab
      Mount without writing in /etc/mtab.  This is necessary for example when /etc is on a read-only filesystem.

The mount program itself opening file(s) for writing in the root file system sounded like a plausible explanation to me. Specifically, mount writes /etc/mtab after all, and /etc is often part of the root file system. However, I could not reproduce it again on the same machine after I did it once...
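Whether mount has anything to write there depends on what /etc/mtab actually is: on many current systems it is a symlink into /proc rather than a regular file. A small check, as a sketch (the function name mtab_kind and its optional path argument are made up for this example):

```shell
# mtab_kind [PATH]
# Hypothetical helper: reports whether PATH (default /etc/mtab) is a
# symlink, a regular file, or missing.
mtab_kind() {
    mtab=${1:-/etc/mtab}
    if [ -L "$mtab" ]; then
        printf 'symlink -> %s\n' "$(readlink "$mtab")"
    elif [ -f "$mtab" ]; then
        echo 'regular file'
    else
        echo 'missing'
    fi
}
```

If it reports a regular file, mount would try to update it on the root file system and -n is worth trying; if it reports a symlink into /proc, the cause is probably elsewhere.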

Could this solve your issue?

Hkoof
0

Without visibility into your system, it's very difficult to tell you exactly what the problem is. The comments and previous answers are good starts.

That said, I would go back all the way through the Debian wiki page that describes the prerequisites for mounting / read-only.

The link to the documentation is here: https://wiki.debian.org/ReadonlyRoot

The big ones I'll walk you through here:

1 - There are specific locations under / that must be read-write. Based on the documentation, it looks something like this:

[image: debian ro root]

Your block devices will probably be different, depending on your storage stack configuration (partitions, partitionless LVM, etc.), but the main idea is that the filesystems mounted on those 4 mount points need the rw mount option.

2 - There are a number of special files in /etc for which you need to either create a symbolic link or implement some other change (detailed in the linked article). These may or may not apply, depending on what applications your Linux server is running. Some of these files may not even exist on your machine, but I included everything in the docs. I strongly recommend making these changes EVEN IF you have killed the pid of the process. Here are the paths directly from the Debian wiki:

  • adjtime
  • init.d/alsa-utils
  • /etc/courier/shared/index
  • any cups state files: classes.conf, cupsd.conf, printers.conf, subscriptions.conf
  • /etc/lvm/lvm.conf
  • mtab (which it looks like you tried to address by giving mount the -n flag)
  • network/run (used by ifup and ifdown, in squeeze. may not apply to stretch, ymmv)
  • nologin
  • resolv.conf
  • both passwd and shadow files
  • samba/dhcp.conf
  • suck
  • udev

Once you have checked all the above and confirmed they conform to the spec in the wiki, the next thing to check is /etc/apt/apt.conf

DPkg {
// Auto re-mounting of a readonly /
Pre-Invoke { "mount -o remount,rw /"; };
Post-Invoke { "test ${NO_APT_REMOUNT:-no} = yes || mount -o remount,ro / || true"; };
}; 

Based on your error, the final thing you can check comes from this passage in the documentation:

"After an upgrade of packages you might be faced with the problem that mount refuses to remount the filesystem readonly telling you “/ is busy.” This is caused by deleted files they are still used by a process. To find out which processes use deleted files use the tool checkrestart(1) from the package debian-goodies or use the following command. Often these are daemons using upgraded libraries. You have to restart them to make the files are released. "

Command provided in the doc:

{ lsof +L1; lsof|sed -n '/SYSV/d; /DEL\|(path /p;' ; } | grep -Ev '/(dev|home|tmp|var)'
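The DEL rows that this command picks out are memory-mapped files (typically replaced libraries) rather than open descriptors, and the kernel also marks them with " (deleted)" in /proc/<pid>/maps. If lsof shows nothing, the following sketch reads those files directly. The function name list_deleted_maps and its optional directory argument are invented for this example; like the wiki command, it drops SYSV shared-memory segments.

```shell
# list_deleted_maps [PROC_DIR]
# Hypothetical helper: prints "PID PATH (deleted)" once per process and
# mapped file that no longer exists on disk. PROC_DIR defaults to /proc.
list_deleted_maps() {
    proc_root=${1:-/proc}
    for maps in "$proc_root"/[0-9]*/maps; do
        [ -r "$maps" ] || continue
        pid=${maps#"$proc_root"/}
        pid=${pid%%/*}
        # A maps line is: range perms offset dev inode [padding] path
        # Skip SYSV segments, then keep paths ending in " (deleted)".
        sed -n -e '/ \/SYSV/d' \
            -e 's|^[^ ]* [^ ]* [^ ]* [^ ]* [^ ]* *\(/.* (deleted)\)$|'"$pid"' \1|p' \
            "$maps" 2>/dev/null
    done | sort -u
}
```

Restarting the processes it lists (or using checkrestart, as the wiki suggests) should release the old library files.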

Without knowing your exact filesystem configuration, partitioning, and storage device configuration, it's hard to give you much else to follow. I would start by going back and rechecking the prerequisites in the documentation (as outlined above).

frontsidebus