Bug 1120850

Summary:              unable to recover NFSv3 locks: NLM_DENIED_NOLOCK
Product:              Red Hat Enterprise Linux 7
Component:            kernel
Kernel sub component: NFS
Status:               CLOSED ERRATA
Severity:             urgent
Priority:             urgent
Version:              7.1
Keywords:             TestBlocker, ZStream
Target Milestone:     rc
Target Release:       ---
Hardware:             Unspecified
OS:                   Unspecified
Fixed In Version:     kernel-3.10.0-191.el7
Doc Type:             Bug Fix
Story Points:         ---
Reporter:             David Vossel <dvossel>
Assignee:             J. Bruce Fields <bfields>
QA Contact:           JianHong Yin <jiyin>
CC:                   bcodding, bfields, crose, dhoward, dvossel, eguan,
                      fdinitto, jkurik, mnovacek, sbonnevi, sreekanth_reddy,
                      swhiteho, tgummels
Clones:               1150889 (view as bug list)
Last Closed:          2015-03-05 12:30:10 UTC
Type:                 Bug
Regression:           ---
Mount Type:           ---
Documentation:        ---
Category:             ---
oVirt Team:           ---
Cloudforms Team:      ---
Bug Blocks:           1069341, 1091101, 1118356, 1134854, 1150889, 1168400
Description    David Vossel    2014-07-17 21:00:37 UTC
(In reply to David Vossel from comment #0)
> 4. Restart the nfs server on node1. Note that I am recreating the statd
> state file. In this scenario this is required for the client to attempt to
> reclaim the lock. Without creating a new state file different from the
> previous one, the nfs client will ignore the sm-notification.

This, by the way, is surprising: sm-notify has code to update the state file
itself (unless you tell it not to with the -n option), so the sm-notify at the
end here:

> exportfs -v -u 192.168.122.0/255.255.255.0:/root/testnfs
> systemctl stop nfs-server
> systemctl stop nfs-lock
> sleep 2
> dd if=/dev/urandom of=/var/lib/nfs/statd/state bs=1 count=4 &> /dev/null
> systemctl start nfs-lock
> systemctl start nfs-server
> exportfs -v -o fsid=0,rw,sync,no_root_squash 192.168.122.0/255.255.255.0:/root/testnfs
> sm-notify -f

should have done the job.

(In reply to J. Bruce Fields from comment #6)
> This, by the way, is surprising: sm-notify has code to update the state
> file itself (unless you tell it not to with the -n option), so the
> sm-notify at the end here [...] should have done the job.

Perhaps that is the case now. In RHEL 6, I know that if I didn't re-create the
state file after a stop/start of the nfs-server, the state file would not be
updated, and the SM_NOTIFY requests would be ignored by the clients because it
would look like the service instance had never changed.

-- Vossel

This still doesn't work.

After further testing, it turns out that using fcntl has exactly the same
behavior as flock. My previous tests showed that locks grabbed using fcntl
would in fact fail over between nodes, but this actually already worked with
flock as well. The original steps to reproduce this issue still fail in
exactly the same way regardless of whether flock or fcntl is used.

Below are the updated steps (using fcntl) required to reproduce this issue
standalone, outside of pacemaker or the HA resource agents. You need two
nodes: node1 is the NFS server, node2 is the NFS client.

1. Start the NFS server on node1. Note that the modprobe is necessary because
of another, unrelated bug in the nfs systemd unit files, which do not require
the /proc/fs/nfsd mount before the NFS server starts. I am using
STATDARG="--no-notify" to prevent nfs-lock from sending sm-notifications
automatically.

echo STATDARG="--no-notify" > /etc/sysconfig/nfs
modprobe nfsd
systemctl start nfs-lock
systemctl start nfs-server
mkdir /root/testnfs
touch /root/testnfs/file
exportfs -v -o fsid=0,rw,sync,no_root_squash 192.168.122.0/255.255.255.0:/root/testnfs

2. On node2, mount the export and grab a file lock; a sketch of what the
locker program does follows the commands. Note that you won't be able to
establish the lock until after the server's grace period expires (90s by
default).

mkdir /root/testmount
mount -v -o "vers=3" rhel7-auto1:/root/testnfs /root/testmount
wget https://raw.githubusercontent.com/davidvossel/phd/master/misc/fnctl_locker.c
gcc fnctl_locker.c
./a.out -f /root/testmount/file
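(fnctl_locker.c itself is not reproduced in this bug. The following is only a
minimal sketch of what such a locker program does, with the file taken as a
plain argument instead of via -f: it requests a blocking whole-file write lock
with fcntl() and holds it until killed.)

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* Whole-file exclusive (write) lock. */
    struct flock fl = {
        .l_type   = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,          /* 0 means "to end of file" */
    };
    int fd;

    if (argc < 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    fd = open(argv[1], O_RDWR);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    printf("Attempting to get write lock...\n");
    /* F_SETLKW blocks until the lock is granted or the call fails;
     * on ENOLCK this prints "fcntl: No locks available". */
    if (fcntl(fd, F_SETLKW, &fl) < 0) {
        perror("fcntl");
        return 1;
    }
    printf("Write lock acquired; holding until killed.\n");
    pause();
    return 0;
}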
3. Start wireshark on node2 to view the NLM lock traffic.

tshark -V -i eth0 -Y nlm

4. Restart the NFS server on node1. Note that I am recreating the statd state
file. In this scenario this is required for the client to attempt to reclaim
the lock: without creating a new state file different from the previous one,
the NFS client will ignore the sm-notification.

exportfs -v -u 192.168.122.0/255.255.255.0:/root/testnfs
systemctl stop nfs-server
systemctl stop nfs-lock
sleep 2
dd if=/dev/urandom of=/var/lib/nfs/statd/state bs=1 count=4 &> /dev/null
systemctl start nfs-lock
systemctl start nfs-server
exportfs -v -o fsid=0,rw,sync,no_root_squash 192.168.122.0/255.255.255.0:/root/testnfs
sm-notify -f

5. Watch the NLM traffic on node2. After node2 attempts to reclaim its lock,
instead of seeing the server grant the reclaim (because it occurred during the
server's grace period), you'll see NLM_DENIED_NOLOCK. From that point forward,
any attempt to lock the file on the client results in this message:

./a.out -f /root/testmount/file
Attempting to get write lock...
fcntl: No locks available

Note: in step 4 we discussed that updating the state file isn't actually
necessary. I've reproduced this problem both with and without the state file
update; the results are the same.

-- Vossel

I can reproduce this; apologies for being slow to get to it.

One other odd thing I notice about the directions is the exportfs ordering: we
want to set up the exports before starting the server. (See nfs-utils/README.)
(But the exportfs ordering is unrelated to this issue.)

(In reply to J. Bruce Fields from comment #9)
> One other odd thing I notice about the directions is the exportfs
> ordering: we want to set up the exports before starting the server. (See
> nfs-utils/README.)

Yes, we are aware of this. We're using a floating IP, though, so we have more
control over when requests are received than in the standalone use case. The
directions you're referring to:

"C/ exportfs -av ; rpc.mountd
   It is important that exportfs be run before mountd so that mountd is
   working from current information (in /var/lib/nfs/etab). It is also
   important that both of these are run before rpc.nfsd. If not, any NFS
   requests that arrive before mountd is started will get replied to with a
   'Stale NFS File handle' error."

In the HA NFS use cases, no NFS requests can arrive at mountd before an export
is started. We can be assured of this because the floating IP that clients use
to make requests to the server doesn't come up until after the nfs daemons and
exports have started. I didn't put the floating IP in the steps to reproduce
this issue because it is irrelevant. All the other steps reflect exactly what
we're doing in the HA NFS active/passive use case during startup and recovery.

-- Vossel

If I turn on NSM debugging with "rpcdebug -m nlm -s all", among other things I
see this logged:

lockd: NSM upcall RPC failed, status=-111

(111 == ECONNREFUSED), which seems suspicious.

From "rpcinfo" run on the server, before restarting:

100024 1 tcp 0.0.0.0.148.181 status 29

and after restarting:

100024 1 tcp 0.0.0.0.175.82 status 29

And capturing traffic on the server's "lo" interface, I see failed attempts to
connect to port 38069 (== 148.181: the last two octets of the universal
address encode the port, 148 * 256 + 181 = 38069).
So lockd is attempting to contact the restarted statd without making a new
portmap call, and is still trying statd's old port.

Created attachment 928949 [details]
[PATCH] lockd: allow rebinding to statd
It looks like this might do the job, but I haven't tested it yet.
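(The attached patch is not reproduced here. As a sketch only of the general
"autobind" idea, assuming the standard in-kernel rpc_create() interface, with
the surrounding setup elided: the NSM client would be created with the
RPC_CLNT_CREATE_AUTOBIND flag, so that the sunrpc client is expected to
re-query rpcbind for statd's current port instead of reusing a stale binding
after statd restarts.)

/* Fragment, for illustration only: create lockd's NSM (statd) client
 * with AUTOBIND so a statd that has restarted on a new port can, in
 * principle, be re-discovered via rpcbind. */
struct rpc_create_args args = {
        .net       = net,
        .protocol  = XPRT_TRANSPORT_TCP,
        .address   = (struct sockaddr *)&sin,   /* statd on loopback */
        .addrsize  = sizeof(sin),
        .flags     = RPC_CLNT_CREATE_NOPING | RPC_CLNT_CREATE_AUTOBIND,
};
struct rpc_clnt *clnt = rpc_create(&args);

(As the next comment notes, this flag alone did not behave as expected here.)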
That test kernel is still failing my test. I haven't figured out why yet.

Created attachment 932433 [details]
[PATCH] lockd: allow rebinding to statd

The RPC_CLNT_CREATE_AUTOBIND flag isn't working as I expected. The attached
patch appears to do the job, partly by just adding a force_rebind in one more
place in the client rpc code, but I'm not at all sure this is correct.

There are some known bugs in the rpc client's error handling in this area (see
https://bugzilla.redhat.com/show_bug.cgi?id=1134911), so it may be worth
retrying the AUTOBIND approach with those patches. But it might be simpler for
the nsm code just to deal with this by hand, somehow, which is what the nlm
code seems to do.

Trond had one more suggestion:
http://mid.gmane.org/<CAHQdGtRZ43npC---LK1-kZy91V9PCk63g_QAR_5DsrfFM-2u4Q.com>

I'll do some more testing.
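(As a sketch only of the "deal with this by hand" option, assuming the
standard in-kernel sunrpc helpers; this is not the attached patch. The idea is
that on a refused connection from the statd upcall, the client clears its
cached port binding and retries once.)

/* Fragment, for illustration only: if statd refused the connection
 * (it may have restarted on a new port), rpc_force_rebind() makes the
 * client re-query rpcbind before the retried call. */
status = rpc_call_sync(clnt, &msg, RPC_TASK_SOFTCONN);
if (status == -ECONNREFUSED) {
        rpc_force_rebind(clnt);
        status = rpc_call_sync(clnt, &msg, RPC_TASK_SOFTCONN);
}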
Created attachment 939392 [details]
[PATCH] lockd: allow rebinding to statd
I tried adding RPC_CLNT_CREATE_HARDRTRY (attached), but that doesn't seem to do it.
The later ENOLCKs are gone, but the reclaim still isn't happening and I haven't figured out why yet.
(My test is:
# One-time setup, on server:
mkdir /exports
chmod a+rwx /exports
touch /exports/file
echo "/exports $CLIENT(rw)" >/etc/exports
echo STATDARG="--no-notify" > /etc/sysconfig/nfs
# On client:
mount -v -o "vers=3" $SERVER:/exports /mnt
flock /mnt/file sleep 10000
# On server:
systemctl stop nfs-server
systemctl stop nfs-lock
sleep 2
systemctl start nfs-lock
systemctl start nfs-server
sm-notify -f
sleep 2
grep $(stat -c'%i' /exports/file) /proc/locks)
The final grep should show a line for the lock, something like (details may
vary):

5: POSIX ADVISORY WRITE 11307 fd:00:135822054 0 EOF

(The fields are: entry number, lock type, whether the lock is advisory or
mandatory, the access mode, the PID of the holder, the major:minor device
numbers and inode number of the locked file, and the start and end of the
locked byte range.)
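(For what it's worth, the same check can be made without /proc/locks: a second
process can ask the kernel whether anything would conflict with a write lock
on the file. This is only an illustrative sketch, with the test's
/exports/file path hard-coded; F_GETLK does not take the lock, it just reports
a conflicting one.)

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    /* Describe the lock we would like to take: a whole-file write lock. */
    struct flock fl = {
        .l_type   = F_WRLCK,
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 0,
    };
    int fd = open("/exports/file", O_RDWR);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    /* F_GETLK fills fl with the conflicting lock, if any. */
    if (fcntl(fd, F_GETLK, &fl) < 0) {
        perror("fcntl");
        return 1;
    }
    if (fl.l_type == F_UNLCK)
        printf("no conflicting lock held\n");
    else
        printf("conflicting lock held, reported pid %d\n", (int)fl.l_pid);
    return 0;
}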
Created attachment 940590 [details]
statd_port_change_recovery.patch

I think this should allow re-connection to statd. I've submitted it upstream
for review: http://marc.info/?l=linux-nfs&m=141148958603889&w=2

Patch(es) available on kernel-3.10.0-191.el7

Verified: bkr job: https://beaker.engineering.redhat.com/jobs/847291

test log:
-----------
[12:02:05 root@ ~~]# service_nfs stop
Redirecting to /bin/systemctl stop nfs.service
:: [ PASS ] :: Running 'service_nfs stop' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:02:05 root@ ~~]# service_nfslock stop
Redirecting to /bin/systemctl stop nfs-lock.service
:: [ PASS ] :: Running 'service_nfslock stop' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:02:05 root@ ~~]# sleep 2
:: [ PASS ] :: Running 'sleep 2' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:02:07 root@ ~~]# service_nfslock start
Redirecting to /bin/systemctl start nfs-lock.service
:: [ PASS ] :: Running 'service_nfslock start' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:02:08 root@ ~~]# service_nfs start
Redirecting to /bin/systemctl start nfs.service
{Info} nfs rpcinfo:
100005 3,2,1 tcp6,udp6,tcp,udp mountd superuser
100003 4,3 udp6,tcp6,udp,tcp nfs superuser
:: [ PASS ] :: Running 'service_nfs start' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:02:10 root@ ~~]# sm-notify -f
:: [ PASS ] :: Running 'sm-notify -f' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:02:10 root@ ~~]# sleep 80
:: [ PASS ] :: Running 'sleep 80' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:03:30 root@ ~~]# grep $(stat -c"%i" $expdir/testfile) /proc/locks
1: POSIX ADVISORY WRITE 15830 fd:00:1802629 0 EOF
:: [ PASS ] :: Running 'grep $(stat -c"%i" $expdir/testfile) /proc/locks' (Expected 0, got 0)
--------------------------------------------------------------------------------
[12:03:30 root@ ~~]# ls -i $expdir/testfile
1802629 /exportDir-bz1120850/testfile

Since the problem described in this bug report should be resolved in a recent
advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow
the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0290.html