For some time I've been experiencing a strange issue with NFS where a seemingly random subset of directories (always the same ones) under / consistently show up with stale file handles immediately after NFS mount.
I've been able to correct the problem by explicitly exporting the seemingly-random set of problem directories, but I'd like to see if I can fix things more completely so I don't have to occasionally add random directories to the export table.
Below, I mount a filesystem, show that there are no open file handles, run ls, and rerun lsof. Empty lines added between commands for clarity:
# mount -t nfs -o vers=4,noac,hard,intr 192.168.0.2:/ /nfs -vvv
mount.nfs: trying text-based options 'vers=4,noac,hard,intr,addr=192.168.0.2,clientaddr=192.168.0.4'
192.168.0.2:/ on /nfs type nfs (rw,vers=4,noac,hard,intr)

# lsof | grep /nfs

# ls -lh /nfs
ls: cannot access /nfs/usr: Stale file handle
ls: cannot access /nfs/root: Stale file handle
ls: cannot access /nfs/etc: Stale file handle
ls: cannot access /nfs/home: Stale file handle
lrwxrwxrwx 1 root root 7 Mar 27 2017 bin -> usr/bin
drwxr-xr-x 6 root root 16K Jan 1 1970 boot
drwxr-xr-x 438 i336 users 36K Feb 28 12:12 data
drwxr-xr-x 2 root root 4.0K Mar 14 2016 dev
d????????? ? ? ? ? ? etc
d????????? ? ? ? ? ? home
lrwxrwxrwx 1 root root 7 Mar 27 2017 lib -> usr/lib
lrwxrwxrwx 1 root root 7 Mar 27 2017 lib64 -> usr/lib
drwxr-xr-x 15 root root 4.0K Oct 15 15:51 mnt
drwxr-xr-x 2 root root 4.0K Aug 9 2017 nfs
drwxr-xr-x 14 root root 4.0K Jan 28 17:00 opt
dr-xr-xr-x 2 root root 4.0K Mar 14 2016 proc
d????????? ? ? ? ? ? root
drwxr-xr-x 2 root root 4.0K Mar 14 2016 run
lrwxrwxrwx 1 root root 7 Mar 27 2017 sbin -> usr/bin
drwxr-xr-x 6 root root 4.0K Jun 22 2016 srv
dr-xr-xr-x 2 root root 4.0K Mar 14 2016 sys
drwxrwxrwt 2 root root 4.0K Dec 10 2016 tmp
d????????? ? ? ? ? ? usr
drwxr-xr-x 15 root root 4.0K May 24 2017 var

# lsof | grep /nfs
The subdirectories in question are not mount points; they seem completely normal:
$ ls -dlh /usr /root /etc /home
drwxr-xr-x 123 root root 12K Mar 3 13:34 /etc
drwxr-xr-x 7 root root 4.0K Jul 28 2017 /home
drwxrwxrwx 32 root root 4.0K Mar 3 13:55 /root
drwxr-xr-x 15 root root 4.0K Feb 24 17:48 /usr
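One way to double-check the "not mount points" claim on the server is to compare device IDs, since a directory that sits on a different filesystem than / reports a different device number. The following is only a minimal sketch, assuming GNU coreutils `stat`; the helper name `same_fs` is made up for illustration, and a bind mount of the same filesystem would not be detected this way.

```shell
#!/bin/sh
# same_fs A B: succeed if A and B live on the same filesystem,
# i.e. GNU stat reports the same device ID (%d) for both.
# (Hypothetical helper, not part of any NFS tooling.)
same_fs() {
    [ "$(stat -c %d -- "$1")" = "$(stat -c %d -- "$2")" ]
}

# Report each directory given on the command line relative to /.
for dir in "$@"; do
    if same_fs / "$dir"; then
        echo "$dir: same filesystem as /"
    else
        echo "$dir: different filesystem from /"
    fi
done
```

Saved as, say, `check_fs.sh`, it could be run on the server as `sh check_fs.sh /usr /root /etc /home`.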
There are no related errors in syslog about these directories. The only info that does show up mentions a different set of directories:
... rpc.mountd[10080]: Cannot export /proc, possibly unsupported filesystem or fsid= required
... rpc.mountd[10080]: Cannot export /dev, possibly unsupported filesystem or fsid= required
... rpc.mountd[10080]: Cannot export /sys, possibly unsupported filesystem or fsid= required
... rpc.mountd[10080]: Cannot export /tmp, possibly unsupported filesystem or fsid= required
... rpc.mountd[10080]: Cannot export /run, possibly unsupported filesystem or fsid= required
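The directories named in these messages are ones that are usually pseudo filesystems (procfs, sysfs, devtmpfs, tmpfs) on a Linux server. Such filesystems have no backing device, which is likely why rpc.mountd asks for an explicit fsid= when crossmnt makes it try to export them. As a rough sketch, the helper below (the name `list_pseudo_mounts` and the list of types are assumptions for illustration) prints the mount points of those types from a file in /proc/mounts layout:

```shell
#!/bin/sh
# list_pseudo_mounts [FILE]: print mount points whose filesystem type is one
# of a few common pseudo filesystems. FILE must use the /proc/mounts layout
# (device, mount point, type, options, ...) and defaults to /proc/mounts.
# (Hypothetical helper; the type list is illustrative, not exhaustive.)
list_pseudo_mounts() {
    awk '$3 ~ /^(proc|sysfs|devtmpfs|tmpfs)$/ { print $2 }' "${1:-/proc/mounts}"
}
```

Calling `list_pseudo_mounts` with no argument on the server should show whether /proc, /dev, /sys, /tmp and /run fall into this group there.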
Here's what /etc/exports currently looks like:
/ *(rw,subtree_check,no_root_squash,nohide,crossmnt,fsid=0,sync)
The server is running Arch Linux with kernel 4.10.3.
The client is running Slackware 14.1 with kernel 4.1.6.
... nfsd couldn't be straced as it's a kernel process. Maybe there's something I can do with BPF or similar...? Or perhaps Wireshark might come in handy? – i336_ Mar 06 '18 at 02:03
... nfsd, mount everything, then start nfsd again. So this at least restores access to my USB HDD, but doesn't explain why /home remains stale. – i336_ Mar 09 '18 at 01:31
... /home, /etc, etc in my exports file for months. A few days ago everything fell apart and I decided "alright let's see if we can fix this for good." I suspect the "fix" will be switching to SMB, heh... – i336_ Mar 09 '18 at 01:38
... / export to /srv/nfs4/root (or whatever) and exporting from there? This is often done (though for other reasons) and may possibly help here. – Ned64 May 18 '19 at 10:28