Comment 2 - RHEL Program Management
2011-10-18 18:10:55 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products. This request is not yet committed for inclusion in an Update release.
Posted patch:
From: Andy Adamson <andros>
Date: Wed, 19 Oct 2011 10:47:43 -0400
Subject: [RHEL6.2 PATCH 1/1] pNFS can hang or oops on umounts.
This fix is part of upstream commit 9e3bd4e24, which
went into 3.0-rc5. The patch fixes an oops that can occur
after the connectathon special tests are run on a
pNFS mount and an umount is then done.
Signed-off-by: Steve Dickson <steved>
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=746861
---
fs/nfs/pnfs_dev.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/fs/nfs/pnfs_dev.c b/fs/nfs/pnfs_dev.c
index bee94a3..005e82d 100644
--- a/fs/nfs/pnfs_dev.c
+++ b/fs/nfs/pnfs_dev.c
@@ -239,9 +239,10 @@ _deviceid_purge_client(const struct nfs_client *clp, long hash)
 	synchronize_rcu();
 	while (!hlist_empty(&tmp)) {
+		d = hlist_entry(tmp.first, struct nfs4_deviceid_node, tmpnode);
+		hlist_del(&d->tmpnode);
 		if (atomic_dec_and_test(&d->ref))
 			d->ld->free_deviceid_node(d);
-		hlist_del_init(&d->tmpnode);
 	}
 }
(In reply to comment #6)
> Hi Andy,
>
> Will NetApp verify the fix once a test kernel is available?
I just talked to Andy and he said this patch was verified
at this year's Bakeathon (which happened last week).
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHSA-2011-1530.html
Description of problem:
Bug in nfs4_deviceid_purge_client that is fixed in 3.0-rc5 commit 9e3bd4e24.

Pid: 2731, comm: umount.nfs Not tainted 2.6.32-209.el6.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffffa053bab8>]  [<ffffffffa053bab8>] nfs4_deviceid_purge_client+0xe8/0x170 [nfs]
RSP: 0018:ffff88006a243dc8  EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff88006a243e08 RCX: 0000000000000050
RDX: ffff880066584a50 RSI: ffffffffa00f0c70 RDI: 0000000000000282
RBP: ffffffff8100bc0e R08: ffff88006a243d10 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88006a243d78
R13: ffff88006a243d58 R14: 0000000000000282 R15: dead000000200200
FS:  00007fbc5093d700(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f1d300949d0 CR3: 0000000053bca000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount.nfs (pid: 2731, threadinfo ffff88006a242000, task ffff88006a37f500)
Stack:
 ffff880066584a50 ffffffff81c00140 ffff88006a152400 ffff880069e7e000
<0> ffff880069e7e000 ffffffff81c00140 ffff88006a152400 ffff8800378ab9c0
<0> ffff88006a243e28 ffffffffa04fda3a ffffffff81c00140 ffff880069e7e000
Call Trace:
 [<ffffffffa04fda3a>] ? nfs_free_client+0x9a/0x120 [nfs]
 [<ffffffffa04fe04b>] ? nfs_put_client+0x7b/0xb0 [nfs]
 [<ffffffffa04fe143>] ? nfs_free_server+0xc3/0x130 [nfs]
 [<ffffffffa050b3a9>] ? nfs4_kill_super+0x49/0x90 [nfs]
 [<ffffffff81179650>] ? deactivate_super+0x70/0x90
 [<ffffffff811955cf>] ? mntput_no_expire+0xbf/0x110
 [<ffffffff8119606b>] ? sys_umount+0x7b/0x3a0
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b
Code: 00 00 00 4c 89 f8 c7 00 00 00 00 00 48 83 7d c0 00 74 70 e8 9b 18 b5 e0 49 8d 5c 24 48 eb 0e 0f 1f 40 00 49 8b 44 24 18 48 85 c0 <75> 26 48 83 7d c0 00 74 4f f0 ff 0b 0f 94 c0 84 c0 74 e5 49 8b
Call Trace:
 [<ffffffffa053bad6>] ? nfs4_deviceid_purge_client+0x106/0x170 [nfs]
 [<ffffffffa04fda3a>] ? nfs_free_client+0x9a/0x120 [nfs]
 [<ffffffffa04fe04b>] ? nfs_put_client+0x7b/0xb0 [nfs]
 [<ffffffffa04fe143>] ? nfs_free_server+0xc3/0x130 [nfs]
 [<ffffffffa050b3a9>] ? nfs4_kill_super+0x49/0x90 [nfs]
 [<ffffffff81179650>] ? deactivate_super+0x70/0x90
 [<ffffffff811955cf>] ? mntput_no_expire+0xbf/0x110
 [<ffffffff8119606b>] ? sys_umount+0x7b/0x3a0
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

Got error -10052 from the server on DESTROY_SESSION. Session has been destroyed regardless...

BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
IP: [<ffffffffa053bad3>] nfs4_deviceid_purge_client+0x103/0x170 [nfs]
PGD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/kernel/mm/ksm/run
CPU 1
Modules linked in: nfs_layout_nfsv41_files nfs lockd fscache nfs_acl auth_rpcgss nls_utf8 fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat xt_CHECKSUM iptable_mangle bridge stp llc autofs4 sunrpc ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan tun uinput ppdev parport_pc parport snd_ens1371 snd_rawmidi snd_ac97_codec ac97_bus snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc e1000 microcode vmware_balloon sg i2c_piix4 i2c_core shpchp ext4 mbcache jbd2 sd_mod crc_t10dif sr_mod cdrom mptspi mptscsih mptbase scsi_transport_spi pata_acpi ata_generic ata_piix dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Pid: 2731, comm: umount.nfs Not tainted 2.6.32-209.el6.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
RIP: 0010:[<ffffffffa053bad3>]  [<ffffffffa053bad3>] nfs4_deviceid_purge_client+0x103/0x170 [nfs]
RSP: 0018:ffff88006a243dc8  EFLAGS: 00010202
RAX: 0000000000000000 RBX: ffff880066584e08 RCX: 0000000000000050
RDX: ffff880066584a50 RSI: ffffffffa00f0c70 RDI: ffff880066584dc0
RBP: ffff88006a243e08 R08: ffff88006a243d10 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff880066584dc0
R13: ffff880069e7e000 R14: ffffffffa054e5e0 R15: ffffffffa054e5c0
FS:  00007fbc5093d700(0000) GS:ffff880002300000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fc6d59e7000 CR3: 0000000053bca000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process umount.nfs (pid: 2731, threadinfo ffff88006a242000, task ffff88006a37f500)
Stack:
 ffff880066584a50 ffffffff81c00140 ffff88006a152400 ffff880069e7e000
<0> ffff880069e7e000 ffffffff81c00140 ffff88006a152400 ffff8800378ab9c0
<0> ffff88006a243e28 ffffffffa04fda3a ffffffff81c00140 ffff880069e7e000
Call Trace:
 [<ffffffffa04fda3a>] nfs_free_client+0x9a/0x120 [nfs]
 [<ffffffffa04fe04b>] nfs_put_client+0x7b/0xb0 [nfs]
 [<ffffffffa04fe143>] nfs_free_server+0xc3/0x130 [nfs]
 [<ffffffffa050b3a9>] nfs4_kill_super+0x49/0x90 [nfs]
 [<ffffffff81179650>] deactivate_super+0x70/0x90
 [<ffffffff811955cf>] mntput_no_expire+0xbf/0x110
 [<ffffffff8119606b>] sys_umount+0x7b/0x3a0
 [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Code: 24 48 eb 0e 0f 1f 40 00 49 8b 44 24 18 48 85 c0 75 26 48 83 7d c0 00 74 4f f0 ff 0b 0f 94 c0 84 c0 74 e5 49 8b 44 24 20 4c 89 e7 <ff> 50 68 49 8b 44 24 18 48 85 c0 74 da 49 8b 54 24 10 48 85 d2
RIP  [<ffffffffa053bad3>] nfs4_deviceid_purge_client+0x103/0x170 [nfs]
 RSP <ffff88006a243dc8>
CR2: 0000000000000068
---[ end trace 7afe685c8e44198a ]---
Kernel panic - not syncing: Fatal exception
Pid: 2731, comm: umount.nfs Tainted: G      D    ---------------- 2.6.32-209.el6.x86_64 #1
Call Trace:
 [<ffffffff814ebd7b>] ? panic+0x78/0x143
 [<ffffffff814eff14>] ? oops_end+0xe4/0x100
 [<ffffffff810422eb>] ? no_context+0xfb/0x260
 [<ffffffff81042575>] ? __bad_area_nosemaphore+0x125/0x1e0
 [<ffffffff8104269e>] ? bad_area+0x4e/0x60
 [<ffffffff81042da3>] ? __do_page_fault+0x3c3/0x480
 [<ffffffff814ed305>] ? schedule_timeout+0x215/0x2e0
 [<ffffffff814eef5b>] ? _spin_unlock_bh+0x1b/0x20
 [<ffffffff814f1ece>] ? do_page_fault+0x3e/0xa0
 [<ffffffff814ef285>] ? page_fault+0x25/0x30
 [<ffffffffa053bad3>] ? nfs4_deviceid_purge_client+0x103/0x170 [nfs]
 [<ffffffffa053bad6>] ? nfs4_deviceid_purge_client+0x106/0x170 [nfs]
 [<ffffffffa04fda3a>] ? nfs_free_client+0x9a/0x120 [nfs]
 [<ffffffffa04fe04b>] ? nfs_put_client+0x7b/0xb0 [nfs]
 [<ffffffffa04fe143>] ? nfs_free_server+0xc3/0x130 [nfs]
 [<ffffffffa050b3a9>] ? nfs4_kill_super+0x49/0x90 [nfs]
 [<ffffffff81179650>] ? deactivate_super+0x70/0x90
 [<ffffffff811955cf>] ? mntput_no_expire+0xbf/0x110
 [<ffffffff8119606b>] ? sys_umount+0x7b/0x3a0
 [<ffffffff8100b0f2>] ? system_call_fastpath+0x16/0x1b

How reproducible:
Very.

Steps to Reproduce:
1. Run the connectathon Special tests on a pNFS mount
2. umount

Actual results:
umount hangs or oopses

Expected results:
umount succeeds

Additional info:

Here is the broken code:

static void
_deviceid_purge_client(const struct nfs_client *clp, long hash)
{
	.......
	while (!hlist_empty(&tmp)) {
		if (atomic_dec_and_test(&d->ref))
			d->ld->free_deviceid_node(d);
		hlist_del_init(&d->tmpnode);
	}
}

Here is the fixed code:

static void
_deviceid_purge_client(const struct nfs_client *clp, long hash)
{
	........
	while (!hlist_empty(&tmp)) {
		d = hlist_entry(tmp.first, struct nfs4_deviceid_node, tmpnode);
		hlist_del(&d->tmpnode);
		if (atomic_dec_and_test(&d->ref))
			d->ld->free_deviceid_node(d);
	}
}
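
The comparison above also shows why the original loop misbehaves: d is never re-read from tmp inside the loop, so once the node it happens to point at has been released by free_deviceid_node(), the remaining hlist_del_init() call (and any further iteration) operates on stale or freed memory, which is consistent with the hang and the NULL-pointer oops recorded above. As a rough illustration only, here is a minimal userspace sketch of the corrected drain pattern; devid_node, its plain int refcount, and purge() are hypothetical stand-ins for the kernel's nfs4_deviceid_node, atomic refcount, and hlist API, not the real interfaces:

#include <stdio.h>
#include <stdlib.h>

/* Stand-in for struct nfs4_deviceid_node: "ref" plays the role of the
 * atomic refcount and "next" the role of the tmpnode hlist link. */
struct devid_node {
	int ref;
	struct devid_node *next;
};

/* Drain the temporary list the way the fixed loop does: look the node up
 * and unlink it from the list *before* the reference is dropped, so a
 * freed node is never touched again. */
static void purge(struct devid_node **tmp)
{
	while (*tmp) {
		struct devid_node *d = *tmp;	/* fetch the head node first */
		*tmp = d->next;			/* unlink it from the list   */
		if (--d->ref == 0)		/* then drop the reference   */
			free(d);		/* node is never revisited   */
	}
}

int main(void)
{
	struct devid_node *tmp = NULL;

	/* Build a two-node temporary list, each node holding one reference. */
	for (int i = 0; i < 2; i++) {
		struct devid_node *d = calloc(1, sizeof(*d));
		d->ref = 1;
		d->next = tmp;
		tmp = d;
	}

	purge(&tmp);
	printf("list drained: %s\n", tmp == NULL ? "yes" : "no");
	return 0;
}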