kernel-2.6.40-4.fc15.x86_64

When I have more than one F15 system running Firefox on gnome-shell, I experience desktop freezes across various systems. For example, when I run one F15/gnome-shell desktop, everything is OK. If a second worker starts their computer, my desktop may freeze when they start using gnome-shell + Firefox.

All home folders are NFSv4 mounted. Authentication is via NIS.

I don't see a lot of useful log output, but I did catch the snippet below, which seems to confirm my theory that Something Bad is happening to the NFS system. It is reminiscent of bug 517629, but this kernel seems to have the patch that closed that bug.

- Mike

Aug 22 15:16:45 xena kernel: [ 9294.039202] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:16:55 xena kernel: [ 9303.364075] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:16:55 xena kernel: [ 9303.394724] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:16:55 xena kernel: [ 9303.579710] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:16:55 xena kernel: [ 9303.608663] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:17:05 xena kernel: [ 9313.465224] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:17:05 xena kernel: [ 9313.466475] NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801539f1420!
Aug 22 15:17:05 xena kernel: [ 9313.467178] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.467800] NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801129c3e20!
Aug 22 15:17:05 xena kernel: [ 9313.468429] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.469037] NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801129c2420!
Aug 22 15:17:05 xena kernel: [ 9313.469663] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.470814] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.471429] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.472077] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.472706] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.473337] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
Aug 22 15:17:05 xena kernel: [ 9313.473961] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
.... last line repeated many times ...
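For anyone triaging a similar log, here is a minimal sketch of how the messages could be tallied by type. It is not part of the original report: the names `SAMPLE`, `LINE_RE`, `classify`, and `tally` are made up for illustration, and the sample holds only three lines copied from the snippet above. The one outside fact used is that status 10026 is NFS4ERR_BAD_SEQID in the NFSv4 specification, which matches the "bad sequence-id" wording the kernel prints next to it.

```python
import re
from collections import Counter

# Three lines copied verbatim from the log in this report.
SAMPLE = """\
Aug 22 15:16:45 xena kernel: [ 9294.039202] nfs4_reclaim_open_state: Lock reclaim failed!
Aug 22 15:17:05 xena kernel: [ 9313.466475] NFS: v4 server returned a bad sequence-id error on an unconfirmed sequence ffff8801539f1420!
Aug 22 15:17:05 xena kernel: [ 9313.467178] nfs4_reclaim_open_state: unhandled error -10026. Zeroing state
"""

# Matches "kernel: [ <uptime seconds>] <message>" in a syslog line.
LINE_RE = re.compile(r"kernel: \[\s*(\d+\.\d+)\] (.*)$")

# NFSv4 status codes of interest; 10026 is NFS4ERR_BAD_SEQID.
NFS4_ERRORS = {10026: "NFS4ERR_BAD_SEQID"}


def classify(message):
    """Map one kernel message to a short category name (hypothetical labels)."""
    if message.endswith("Lock reclaim failed!"):
        return "lock_reclaim_failed"
    match = re.search(r"unhandled error -(\d+)", message)
    if match:
        code = int(match.group(1))
        return NFS4_ERRORS.get(code, "error_%d" % code)
    if "bad sequence-id" in message:
        return "bad_seqid_reported"
    return "other"


def tally(text):
    """Count kernel messages per category; non-kernel lines are ignored."""
    counts = Counter()
    for line in text.splitlines():
        match = LINE_RE.search(line)
        if match:
            counts[classify(match.group(2))] += 1
    return counts


if __name__ == "__main__":
    for category, count in sorted(tally(SAMPLE).items()):
        print(category, count)
```

Feeding it the full /var/log/messages text instead of `SAMPLE` would give a quick picture of how the reclaim failures and the bad sequence-id errors are distributed.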
This was so problematic for me that I switched to glusterfs. Everything works fine now. Better than fine, really. glusterfs rocks. Something is badly broken in F15 + gnome-shell + firefox + nfsv4, but I'm no longer able to help debug it. - Mike
Bumping to Fedora 16... running into the same problem.

My setup:

NFSv4 server: Fedora 16, kernel 3.3.2-6.fc16.x86_64
Filesystem: ext4 on top of md (RAID 1)
/etc/exports:
/g 10.10.20.0/24(rw,no_root_squash,fsid=0)

NFSv4 client: Fedora 16. Home directories are mounted via NFSv4.

Known working kernel:
kernel-3.2.9-2.fc16.x86_64

Known broken kernels:
kernel-3.3.2-6.fc16.x86_64
kernel-3.3.4-1.fc16.x86_64

Mount entry:
10.10.20.1:/ on /g type nfs4 (rw,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=0.0.0.0,minorversion=0,local_lock=none,addr=10.10.20.1)

Symptoms: On first access of a home dir, the following message is logged more than 1200 times per second!

May 4 14:50:26 bd kernel: [ 761.391786] nfs4_reclaim_open_state: Lock reclaim failed!
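When comparing setups between reporters, it can help to break a mount entry into its individual options. The following is only a rough sketch under the assumption that the line has the `device on mountpoint type fstype (options)` shape shown in the mount entry of this comment; `MOUNT_LINE`, `MOUNT_RE`, and `parse_mount_line` are hypothetical names, not part of any tool.

```python
import re

# The mount entry from this comment, copied verbatim.
MOUNT_LINE = (
    "10.10.20.1:/ on /g type nfs4 "
    "(rw,relatime,vers=4,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,"
    "timeo=600,retrans=2,sec=sys,clientaddr=0.0.0.0,minorversion=0,"
    "local_lock=none,addr=10.10.20.1)"
)

# "<device> on <mountpoint> type <fstype> (<comma-separated options>)"
MOUNT_RE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def parse_mount_line(line):
    """Split one mount entry into device, mountpoint, fstype, and options."""
    match = MOUNT_RE.match(line.strip())
    if match is None:
        raise ValueError("unexpected mount entry format: %r" % line)
    device, mountpoint, fstype, raw_options = match.groups()
    options = {}
    for option in raw_options.split(","):
        key, sep, value = option.partition("=")
        # Flags such as "rw" or "hard" carry no value; store them as True.
        options[key] = value if sep else True
    return {
        "device": device,
        "mountpoint": mountpoint,
        "fstype": fstype,
        "options": options,
    }


if __name__ == "__main__":
    entry = parse_mount_line(MOUNT_LINE)
    print(entry["fstype"], entry["options"]["vers"], entry["options"]["minorversion"])
```

Values are kept as strings so that nothing is lost or reinterpreted; two parsed entries can then be diffed key by key.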
Here is a related and probably useful upstream commit:

commit 96dcadc2fdd111dca90d559f189a30c65394451a
Author: William Dauchy <wdauchy>
Date:   Wed Mar 14 12:32:04 2012 +0100

    NFSv4: Rate limit the state manager for lock reclaim warning messages

    Adding rate limit on `Lock reclaim failed` messages since it could fill up system logs

    Signed-off-by: William Dauchy <wdauchy>
    Signed-off-by: Trond Myklebust <Trond.Myklebust>
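To illustrate what rate limiting a warning buys you, here is a small sketch of the general interval/burst idea. It is not the kernel's implementation and not the code from the commit above: the `RateLimiter` class is hypothetical, and the 5 second interval and burst of 10 are values picked for the example only.

```python
class RateLimiter:
    """Allow at most `burst` events per `interval` seconds and count the rest."""

    def __init__(self, interval, burst):
        self.interval = interval   # window length in seconds (assumed value)
        self.burst = burst         # events allowed per window (assumed value)
        self.window_start = None   # start time of the current window
        self.emitted = 0           # events let through in the current window
        self.suppressed = 0        # events dropped since creation

    def allow(self, now):
        """Return True if an event at time `now` (seconds) may be logged."""
        if self.window_start is None or now - self.window_start >= self.interval:
            # Start a fresh window and reset the per-window budget.
            self.window_start = now
            self.emitted = 0
        if self.emitted < self.burst:
            self.emitted += 1
            return True
        self.suppressed += 1
        return False


if __name__ == "__main__":
    limiter = RateLimiter(interval=5.0, burst=10)
    # Simulate 1200 warnings spread evenly over one second.
    logged = sum(1 for i in range(1200) if limiter.allow(i / 1200.0))
    print("logged:", logged, "suppressed:", limiter.suppressed)
```

With these example settings, a flood of 1200 warnings in one second would put 10 lines in the log instead of 1200. Note that this only keeps the log readable; the failing lock reclaim itself is untouched.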
Same issue on Fedora 15 after a kernel upgrade via "updates":

"Broken" kernel: 2.6.43.2-6.fc15.x86_64
Working kernel: 2.6.42.12-1.fc15.x86_64
Server runs: 2.6.32-220.4.2.el6.centos.plus.x86_64

My home directory is on an NFSv4 share and kmail stalls on creation of new messages. With the old kernel booted, it works instantly.

I'll try to reboot the NFS server in the evening; maybe it's a "protocol" upgrade incompatibility.
From the discussion around

http://mid.gmane.org/1334770614-10653-1-git-send-email-Trond.Myklebust@netapp.com

it looks like we probably need 55725513b5e "NFSv4: Ensure that we check lock exclusive/shared type against open modes" and 05ffe24f529 "NFSv4: Ensure that the LOCK code sets exception->inode".
(In reply to comment #5)
> From discussion around
>
> http://mid.gmane.org/1334770614-10653-1-git-send-email-Trond.Myklebust@netapp.com
>
> it looks like we probably need 55725513b5e "NFSv4: Ensure that we check lock
> exclusive/shared type against open modes" and 05ffe24f529 "NFSv4: Ensure that
> the LOCK code sets exception->inode".

Those are already applied on F16 and F17. They're part of the 3.3.5 stable queue that should be released today. F15 will pick them up at that point.

We'll grab the rate limit patch today too, though it looks to be mostly a paper-over "fix", if anything.
Patch has been applied.
kernel-2.6.43.5-2.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-2.6.43.5-2.fc15
kernel-3.3.5-2.fc16 has been submitted as an update for Fedora 16. https://admin.fedoraproject.org/updates/kernel-3.3.5-2.fc16
Package kernel-3.3.5-2.fc16:
* should fix your issue,
* was pushed to the Fedora 16 testing repository,
* should be available at your local mirror within two days.

Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.3.5-2.fc16'
as soon as you are able to, then reboot.

Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-7538/kernel-3.3.5-2.fc16
then log in and leave karma (feedback).
kernel-3.3.5-2.fc16 has been pushed to the Fedora 16 stable repository. If problems still persist, please make note of it in this bug report.
kernel-2.6.43.5-2.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.
Thanks for the fast resolution of this issue! It's highly appreciated. Everything back to normal with the latest updates.