Bug 596111
Summary: | nfs related kernel BUG | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | John Kacur <jkacur> | ||||
Component: | realtime-kernel | Assignee: | John Kacur <jkacur> | ||||
Status: | CLOSED CURRENTRELEASE | QA Contact: | David Sommerseth <davids> | ||||
Severity: | high | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 1.2 | CC: | bhu, johnstul, lgoncalv, ovasik, tglx, williams | ||||
Target Milestone: | 1.3 | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2010-09-15 10:13:00 UTC | Type: | --- | ||||
Description
John Kacur
2010-05-26 10:42:48 UTC
Created attachment 416760
nfs fix from John Stultz
The patch seems to have fixed the kernel panics, but it has now uncovered the next bug, observed in almost all of the tests:

======= I386
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=14359093
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=14359078
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=14359152
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=14359152
...

------------[ cut here ]------------
WARNING: at fs/namespace.c:1203 umount_tree+0xce/0x10b()
Hardware name: ProLiant DL320 G5
Modules linked in: nfs nfs_acl auth_rpcgss autofs4 i2c_dev i2c_core hidp rfcomm l2cap crc16 bluetooth rfkill lockd sunrpc ipv6 loop dm_multipath scsi_dh video output sbs sbshc battery ac parport_pc lp parport tg3 sg ipmi_devintf hpilo ipmi_si ipmi_msghandler hpwdt snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss snd_pcm serio_raw button snd_timer snd soundcore tpm_tis snd_page_alloc tpm tpm_bios iTCO_wdt iTCO_vendor_support pcspkr dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod pata_acpi ata_piix ata_generic libata sd_mod crc_t10dif scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: nf_defrag_ipv4]
Pid: 13811, comm: mount.nfs4 Tainted: G W 2.6.33.4-rt20.19.el5rt #1
Call Trace:
 [<c04e7ca7>] ? umount_tree+0xce/0x10b
 [<c043b03c>] warn_slowpath_common+0x7a/0x91
 [<c04e7ca7>] ? umount_tree+0xce/0x10b
 [<c043b06d>] warn_slowpath_null+0x1a/0x1c
 [<c04e7ca7>] umount_tree+0xce/0x10b
 [<c04e83fe>] put_mnt_ns+0x83/0xa7
 [<c04dc835>] ? vfs_path_lookup+0x7e/0x8d
 [<f84c8acb>] nfs_follow_remote_path+0x4b/0xf8 [nfs]
 [<c04cdd02>] ? slab_irq_enable+0x39/0x6c
 [<f84c8a76>] ? nfs_do_root_mount+0x70/0x7a [nfs]
 [<f84c8bdf>] nfs4_try_mount+0x67/0x9f [nfs]
 [<c04b86db>] ? strndup_user+0x45/0x64
 [<f84c8e46>] nfs4_get_sb+0x22f/0x2a4 [nfs]
 [<c04d1595>] ? __alloc_percpu+0xf/0x12
 [<c04d54ea>] vfs_kern_mount+0x8b/0x131
 [<c04e6ba4>] ? get_fs_type+0x39/0x8d
 [<c04d5661>] do_kern_mount+0x3c/0xbb
 [<c04e92a9>] do_mount+0x5f9/0x669
 [<c04bac08>] ? page_address+0x17/0x5b
 [<c04acdb9>] ? __get_free_pages+0x27/0x2d
 [<c04e7ae7>] ? copy_mount_options+0x2c/0xdd
 [<c04e9386>] sys_mount+0x6d/0x99
 [<c0402b93>] sysenter_do_call+0x12/0x22
---[ end trace d817ceec35a87c27 ]---

======= X86_64
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=14372046
http://rhts.redhat.com/cgi-bin/rhts/test_log.cgi?id=14371209
...

------------[ cut here ]------------
WARNING: at fs/namespace.c:1203 umount_tree+0xf3/0x13a()
Hardware name: ProLiant DL320 G5
Modules linked in: nfs nfs_acl auth_rpcgss autofs4 i2c_dev i2c_core hidp rfcomm l2cap crc16 bluetooth rfkill lockd sunrpc ipv6 loop dm_multipath scsi_dh video output sbs sbshc battery ac parport_pc lp parport tg3 sg ipmi_devintf ipmi_si ipmi_msghandler hpwdt hpilo snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device button snd_pcm_oss snd_mixer_oss tpm_tis tpm snd_pcm serio_raw tpm_bios snd_timer snd soundcore snd_page_alloc shpchp iTCO_wdt i3000_edac pcspkr iTCO_vendor_support edac_core dm_snapshot dm_zero dm_mirror dm_region_hash dm_log dm_mod pata_acpi ata_piix ata_generic libata sd_mod crc_t10dif scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd [last unloaded: nf_defrag_ipv4]
Pid: 23073, comm: mount.nfs4 Tainted: G W 2.6.33.4-rt20.19.el5rt #1
Call Trace:
 [<ffffffff8110cd74>] ? umount_tree+0xf3/0x13a
 [<ffffffff8104238b>] warn_slowpath_common+0x7c/0x94
 [<ffffffff810423b7>] warn_slowpath_null+0x14/0x16
 [<ffffffff8110cd74>] umount_tree+0xf3/0x13a
 [<ffffffff8110d674>] put_mnt_ns+0x9b/0xc5
 [<ffffffffa049b654>] nfs_follow_remote_path+0x5c/0x12f [nfs]
 [<ffffffff810eca71>] ? slab_irq_enable+0x47/0x93
 [<ffffffffa049b5e6>] ? nfs_do_root_mount+0x86/0x98 [nfs]
 [<ffffffff810d0497>] ? strndup_user+0x42/0x80
 [<ffffffffa049b7a1>] nfs4_try_mount+0x7a/0xb4 [nfs]
 [<ffffffffa049ba53>] nfs4_get_sb+0x278/0x302 [nfs]
 [<ffffffff810f7957>] vfs_kern_mount+0x9e/0x14f
 [<ffffffff810f7b06>] do_kern_mount+0x4c/0xdd
 [<ffffffff8110e7e0>] do_mount+0x6d3/0x759
 [<ffffffff8110e8ea>] sys_mount+0x84/0xbd
 [<ffffffff8108e447>] ? audit_syscall_entry+0x103/0x12f
 [<ffffffff81002d1b>] system_call_fastpath+0x16/0x1b
---[ end trace 1cec4b11ad46cb11 ]---

Just to confirm, it's just a MNT_MOUNTED related WARN_ON, and the tests completed successfully?

Did the kernel you were testing include commit e86825210e29c0be2691e7055f942555d02a4314 ?

Hrmm.. So looking at nfs_follow_remote_path():

    static int nfs_follow_remote_path(struct vfsmount *root_mnt,
                    const char *export_path, struct vfsmount *mnt_target)
    {
            struct mnt_namespace *ns_private;
            struct nameidata nd;
            struct super_block *s;
            int ret;

            ns_private = create_mnt_ns(root_mnt);
            ret = PTR_ERR(ns_private);
            if (IS_ERR(ns_private))
                    goto out_mntput;

            ret = vfs_path_lookup(root_mnt->mnt_root, root_mnt,
                            export_path, LOOKUP_FOLLOW, &nd);

            put_mnt_ns(ns_private);
            ...

The issue here is that we call create_mnt_ns(), which allocates the namespace, then seemingly never do anything with the new namespace, and then call put_mnt_ns(), which tries to unmount it. Trying to read over the history for the rationale here.

Commit 301933a0acfdec837fd8b4884093b3f0fff01d8a seems to be where this is introduced, and has the rationale. I'm still trying to figure out if it's correct or if it's an oddball case where we want to clean up the vfsmount even if it's not mounted.

Bah. Copied the wrong hash. c02d7adf8c5429727a98bad1d039bccad4c61c50 is where it was introduced.

(In reply to comment #3)
> Just to confirm, it's just a MNT_MOUNTED related WARN_ON, and the tests
> completed successfully?
>
> Did the kernel you were testing include commit
> e86825210e29c0be2691e7055f942555d02a4314 ?

Yes, it does include the above commit.
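For readers following the create_mnt_ns() / put_mnt_ns() discussion above, here is a minimal user-space sketch of the suspected sequence. It is not kernel code. It assumes, from the "MNT_MOUNTED related WARN_ON" remark only, that the check at fs/namespace.c:1203 fires when umount_tree() is handed a mount that was never flagged as mounted. Every `model_*` name and the `MODEL_MNT_MOUNTED` value are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's MNT_MOUNTED flag; the value is arbitrary. */
#define MODEL_MNT_MOUNTED 0x1

struct model_mount {
    int flags;
};

struct model_ns {
    struct model_mount *root;
    int refcount;
};

/* Counts how many times the modelled warning has fired. */
static int model_warnings;

/* Models create_mnt_ns(): wraps the mount in a namespace but does not flag it as mounted. */
static void model_create_ns(struct model_ns *ns, struct model_mount *root)
{
    assert(ns != NULL && root != NULL);
    ns->root = root;
    ns->refcount = 1;
}

/* Models the assumed check in umount_tree(): warn if the mount was never flagged as mounted. */
static void model_umount_tree(struct model_mount *mnt)
{
    if (!(mnt->flags & MODEL_MNT_MOUNTED)) {
        model_warnings++;
        fprintf(stderr, "WARNING: tearing down a mount that was never marked mounted\n");
    }
    mnt->flags &= ~MODEL_MNT_MOUNTED;
}

/* Models put_mnt_ns(): dropping the last reference tears down the tree. */
static void model_put_ns(struct model_ns *ns)
{
    assert(ns->refcount > 0);
    if (--ns->refcount == 0)
        model_umount_tree(ns->root);
}

/*
 * Mirrors the create / lookup / put sequence quoted from nfs_follow_remote_path().
 * Returns the number of warnings raised by this call.
 */
static int model_follow_remote_path(int mark_mounted)
{
    struct model_mount root = { 0 };
    struct model_ns ns;
    int before = model_warnings;

    if (mark_mounted)
        root.flags |= MODEL_MNT_MOUNTED;

    model_create_ns(&ns, &root);
    /* The path lookup would happen here; it does not touch the flag in this model. */
    model_put_ns(&ns);

    return model_warnings - before;
}
```

In this model, calling `model_follow_remote_path(0)` raises one warning because the private namespace's root is torn down without ever having been flagged, while `model_follow_remote_path(1)` raises none. Whether the real fix belongs in create_mnt_ns(), in the warning condition, or elsewhere is exactly the open question in the thread, and this sketch does not answer it.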