Bug 402581
Summary: Deadlock while performing nfs operations.

Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.5
Hardware: All
OS: Linux
Severity: low
Priority: low
Status: CLOSED ERRATA
Fixed In Version: RHSA-2008-0665
Doc Type: Bug Fix
Last Closed: 2008-07-24 19:22:18 UTC

Reporter: Sachin Prabhu <sprabhu>
Assignee: Jeff Layton <jlayton>
QA Contact: Martin Jenner <mjenner>
CC: aleksey, staubach, steved, tao
Description (Sachin Prabhu, 2007-11-28 12:39:04 UTC)
Thanks. I think we may end up having to use a different workqueue for nfs4_close_state_work. I'm not sure we'll need an entirely separate one, but the kevents workqueue might not be suitable... I presume this is happening in a race between nfs4 closes and the unmounting of a different filesystem. Do they have a way to reproduce this at will?

Now that I've looked this over a bit, I think I see how it deadlocked. There was a race between a close on an nfs4 file and the teardown of an rpc_client. The rpc client was either a one-shot client or simply one that was dead; it could even have been completely unrelated to the nfs4 connection. Either way, rpciod called flush_scheduled_work() while an nfsv4 close was in progress, and this deadlock occurred. The easiest fix is to create a new workqueue and queue nfs4_close_state_work to it (a minimal sketch of that pattern follows below). We could also queue it to an existing workqueue that isn't subject to the same deadlock, though I don't know of any appropriate candidates. Another option is to back out the patch that added nfs4_close_state_work in the first place and do something closer to what upstream did to fix it, but that would be a significant change. A reproducer would also be nice: I ran quite a bit of stress testing with this patch and never hit this deadlock, so if they can offer a way to reproduce it more readily, that would be helpful.

Created attachment 271811 [details]
patch -- add new workqueue for handling nfs4 scheduled work

In some of my earlier passes in bug 228292, I had a dedicated workqueue for handling scheduled work. It worked well, but at the time I didn't see the need for it. That need now seems evident. This patch adds the dedicated workqueue back and changes nfs4_close_state to queue its work there. I've done some testing with fsstress and it seems to work as expected. Having a way to reproduce the customer's deadlock (or having the customer test the patch) would be ideal. I'll see about adding this patch to my test kernel in the near future.
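For illustration, the general pattern described above is sketched below: work that used to be queued with schedule_work() (and was therefore flushed by rpciod's flush_scheduled_work()) moves to a private workqueue. This is only a minimal sketch against the generic Linux workqueue API, not the attached RHEL 4 patch; the names nfs4_close_wq, nfs4_close_data and nfs4_schedule_close are hypothetical, and the two-argument INIT_WORK() shown here is the later upstream form of the API.

```c
/*
 * Sketch only -- not the actual patch from attachment 271811.
 * Give the deferred NFSv4 close its own workqueue so that
 * flush_scheduled_work() (which flushes the shared keventd queue)
 * never has to wait on an in-flight close.
 */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/slab.h>
#include <linux/workqueue.h>

static struct workqueue_struct *nfs4_close_wq;	/* hypothetical name */

struct nfs4_close_data {
	struct work_struct work;
	/* ... whatever state is needed to finish the close ... */
};

static void nfs4_close_state_work(struct work_struct *work)
{
	struct nfs4_close_data *data =
		container_of(work, struct nfs4_close_data, work);

	/* ... perform the deferred part of nfs4_close_state() ... */
	kfree(data);
}

static void nfs4_schedule_close(struct nfs4_close_data *data)
{
	INIT_WORK(&data->work, nfs4_close_state_work);
	/* Queue to the private workqueue, not the shared keventd queue. */
	queue_work(nfs4_close_wq, &data->work);
}

static int __init nfs4_close_wq_init(void)
{
	nfs4_close_wq = create_singlethread_workqueue("nfs4close");
	return nfs4_close_wq ? 0 : -ENOMEM;
}

static void nfs4_close_wq_exit(void)
{
	destroy_workqueue(nfs4_close_wq);
}
```

Because the close work no longer lives on the shared keventd workqueue, rpciod can call flush_scheduled_work() during rpc_client teardown without waiting on (and deadlocking against) an NFSv4 close that is still in progress.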
I'm building some test kernels with this patch now and should hopefully have them up on my people page a little later today. Once I do, it would be nice if they could test them and see whether they can still reproduce this deadlock (presuming it is easily reproducible). Once the kernels are up, I'll reset this to NEEDINFO...

I've built some kernels with the above patch and put them on my people page: http://people.redhat.com/jlayton/ ...if the customer is willing and able, can they test one someplace non-critical and let me know whether the problem is still reproducible? If this does fix it, I'm still not sure how acceptable this patch will be, but knowing whether it fixes the problem is a good way to sanity-check the diagnosis.

Thanks for the analysis, Frank. Actually, it's not reclaiming locks, it's reclaiming open state; the concept is similar, though. Something happened to make the state of open files get out of sync between the client and server, and they fell into state recovery mode. A server crash or reboot looks likely here. It seems the state reclaimer thread only gets spawned when allocating a new NFS client. A long list of open states doesn't necessarily mean the list was corrupt: IIRC, each open file has an open_state, so if the client happened to have a large number of open files, the whole list could be valid. Do we know what actually made it crash here?

This error is interesting:

nfs4_reclaim_open_state: unhandled error -116. Zeroing state

I think -116 is -ESTALE, so perhaps it's trying to reclaim a bunch of stale NFS file handles? Maybe the server rebooted and the fsid of the exported filesystem changed? Either way, I think this bug isn't likely to be related to the original deadlock; it should probably get a new BZ. If you open one, let me know the number and I'll plan to grab it.
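As a quick sanity check on reading -116 as -ESTALE: on Linux, ESTALE is defined as 116, and a trivial user-space program can confirm the mapping. This snippet is purely illustrative and is not part of the kernel code discussed in this bug.

```c
/* Illustrative only: confirm that errno 116 is ESTALE on Linux. */
#include <errno.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
	/* On Linux/glibc this typically prints: ESTALE = 116 (Stale file handle) */
	printf("ESTALE = %d (%s)\n", ESTALE, strerror(ESTALE));
	return 0;
}
```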
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Committed in 68.27.EL. RPMS are available at http://people.redhat.com/vgoyal/rhel4/

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2008-0665.html

(In reply to comment #9)
> This error is interesting:
>
> nfs4_reclaim_open_state: unhandled error -116. Zeroing state
>
> I think -116 is -ESTALE, so perhaps it's trying to reclaim a bunch of stale NFS
> file handles? Maybe the server rebooted and the fsid of the exported filesystem
> changed?
>
> Either way, I think this bug isn't likely to be related to the original
> deadlock. It should probably get a new BZ.

Jeff, have you opened an entry for this one? We are seeing it on RHEL5 with kernel-2.6.18-194.32.1.el5. Another potentially related trace:

BUG: soft lockup - CPU#15 stuck for 10s! [192.27.172.41-r:30868]
CPU 15:
Modules linked in: md5 autofs4 hidp nfs fscache nfs_acl rfcomm l2cap bluetooth rpcsec_gss_krb5 auth_rpcgss testmgr_cipher testmgr aead crypto_blkcipher crypto_algapi des lockd sunrpc ip_conntrack_netbios_ns iptable_filter ipt_MASQUERADE iptable_nat ip_nat ip_conntrack nfnetlink ip_tables ip6t_REJECT xt_tcpudp ip6table_filter ip6_tables x_tables cpufreq_ondemand acpi_cpufreq freq_table rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) ib_ipoib(U) ipoib_helper(U) ib_cm(U) ib_sa(U) ipv6 xfrm_nalgo crypto_api ib_uverbs(U) ib_umad(U) iw_cxgb3(U) cxgb3(U) mlx4_ib(U) ib_mthca(U) ib_mad(U) ib_core(U) dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport nvidia(PU) joydev shpchp sg mlx4_core(U) igb i2c_i801 8021q i2c_core serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_log dm_mod dm_mem_cache ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 30868, comm: 192.27.172.41-r Tainted: P 2.6.18-164.11.1.el5 #1
RIP: 0010:[<ffffffff89165575>]  [<ffffffff89165575>] :nfs:nfs4_reclaim_open_state+0x135/0x150
RSP: 0018:ffff81013a985e90  EFLAGS: 00000246
RAX: ffff810083c76bc0 RBX: ffff81011d7b9340 RCX: ffffffff80309c28
RDX: ffff8101664a10c8 RSI: ffff8101664a10c0 RDI: ffffffff89182f50
RBP: 0000000000000007 R08: ffffffff80309c28 R09: ffffffffffffff10
R10: ffff81013a985b30 R11: 0000000000000280 R12: ffff8102bfc27dc0
R13: 00000002676f2d3d R14: ffff81013a985b30 R15: ffff8103106fe5c0
FS:  0000000000000000(0000) GS:ffff81033fcc10c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002aaaab1e4000 CR3: 0000000000201000 CR4: 00000000000006e0
Call Trace:
 [<ffffffff89165734>] :nfs:reclaimer+0x1a4/0x2ac
 [<ffffffff89165590>] :nfs:reclaimer+0x0/0x2ac
 [<ffffffff80032950>] kthread+0xfe/0x132
 [<ffffffff8005dfb1>] child_rip+0xa/0x11
 [<ffffffff8009fe9f>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032852>] kthread+0x0/0x132
 [<ffffffff8005dfa7>] child_rip+0x0/0x11

Aleksey,

This particular issue seems to be closer to bz 526888, which should already be fixed in your version of the kernel. There have been other significant changes since 2.6.18-164.11.1.el5. Could you please check with a later version of the kernel? If you still see the problem there, please open a case with Red Hat Support.

Sachin Prabhu