Description of problem:
Two identical SMP machines running drbd in master-master mode with a gfs2 filesystem experience repeated segmentation faults when removing a random directory on the gfs2 filesystem.

[12598.923148] ------------[ cut here ]------------
[12598.923233] kernel BUG at fs/dcache.c:232!
[12598.923303] invalid opcode: 0000 [#1] SMP
[12598.923381] Modules linked in: gfs2 ipt_CLUSTERIP drbd dlm lru_cache sctp libcrc32c bonding xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm microcode serio_raw lpc_ich mfd_core ipmi_si ipmi_msghandler hpwdt hpilo bnx2 i5000_edac edac_core e1000e i5k_amb ptp pps_core shpchp mperf hpsa radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi i2c_core cciss
[12598.924105] CPU: 2 PID: 24673 Comm: rm Not tainted 3.11.6-200.fc19.x86_64 #1
[12598.924105] Hardware name: HP ProLiant DL380 G5, BIOS P56 05/02/2011
[12598.924105] task: ffff8801e72a2620 ti: ffff8801e0c3a000 task.ti: ffff8801e0c3a000
[12598.924105] RIP: 0010:[<ffffffff811bd0d8>] [<ffffffff811bd0d8>] d_free+0x58/0x60
[12598.924105] RSP: 0018:ffff8801e0c3be48 EFLAGS: 00010286
[12598.924105] RAX: 0000000000002710 RBX: ffff8801c984e180 RCX: ffffffff81cecc00
[12598.924105] RDX: ffffffff81c50240 RSI: ffffffffa0543030 RDI: ffff8801c984e180
[12598.924105] RBP: ffff8801e0c3be50 R08: 5018000000000000 R09: 00000000ffffffff
[12598.924105] R10: fe1552de83592a03 R11: 0000000000014180 R12: 0000000000000000
[12598.924105] R13: ffff8801c984ec00 R14: ffff8801ccad9410 R15: 0000000000000000
[12598.924105] FS: 00007f0da8a0e740(0000) GS:ffff88022fa80000(0000) knlGS:0000000000000000
[12598.924105] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[12598.924105] CR2: 0000000000d0eab0 CR3: 00000001df46b000 CR4: 00000000000006e0
[12598.924105] Stack:
[12598.924105] ffff8801c984ec00 ffff8801e0c3be80 ffffffff811be05d ffff8801e0b89000
[12598.924105] 0000000000000000 0000000001ed0518 0000000000000004 ffff8801e0c3be90
[12598.924105] ffffffff811be169 ffff8801e0c3bf68 ffffffff811b57cc ffff8801c984e180
[12598.924105] Call Trace:
[12598.924105] [<ffffffff811be05d>] dput.part.15+0x18d/0x280
[12598.924105] [<ffffffff811be169>] dput+0x19/0x20
[12598.924105] [<ffffffff811b57cc>] do_rmdir+0x18c/0x1d0
[12598.924105] [<ffffffff81085324>] ? task_work_run+0xa4/0xe0
[12598.924105] [<ffffffff81012a21>] ? do_notify_resume+0x61/0xa0
[12598.924105] [<ffffffff811b85a5>] SyS_unlinkat+0x25/0x40
[12598.924105] [<ffffffff81656e99>] system_call_fastpath+0x16/0x1b
[12598.924105] Code: c0 74 02 ff d0 f6 03 80 48 8d bb 90 00 00 00 74 12 48 c7 c6 90 cb 1b 81 e8 26 03 f4 ff 5b 5d c3 0f 1f 00 e8 bb fa ff ff 5b 5d c3 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 48
[12598.924105] RIP [<ffffffff811bd0d8>] d_free+0x58/0x60
[12598.924105] RSP <ffff8801e0c3be48>
[12598.943179] ---[ end trace 19e19b7308605a25 ]---

Version-Release number of selected component (if applicable):
kernel-3.11.6-200.fc19.x86_64

Additional info:
This appeared after installing kernel 3.11.6-200. This evening I will reboot the machines with the previous kernel to check whether it is tied to this specific kernel version.
My machines died anyway, so I was able to test this on 3.11.4-201.fc19.x86_64. Same issue:

[ 2016.595360] ------------[ cut here ]------------
[ 2016.595444] kernel BUG at fs/dcache.c:232!
[ 2016.595512] invalid opcode: 0000 [#1] SMP
[ 2016.595584] Modules linked in: ipt_CLUSTERIP gfs2 dlm drbd lru_cache sctp libcrc32c bonding xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack iTCO_wdt iTCO_vendor_support coretemp kvm_intel kvm microcode lpc_ich mfd_core serio_raw ipmi_si ipmi_msghandler hpwdt hpilo bnx2 i5000_edac e1000e edac_core i5k_amb ptp pps_core shpchp mperf hpsa radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi i2c_core cciss
[ 2016.596319] CPU: 3 PID: 31825 Comm: rm Not tainted 3.11.4-201.fc19.x86_64 #1
[ 2016.596319] Hardware name: HP ProLiant DL380 G5, BIOS P56 05/02/2011
[ 2016.596319] task: ffff880222e2e320 ti: ffff880221632000 task.ti: ffff880221632000
[ 2016.596319] RIP: 0010:[<ffffffff811bd0b8>] [<ffffffff811bd0b8>] d_free+0x58/0x60
[ 2016.596319] RSP: 0018:ffff880221633e48 EFLAGS: 00010286
[ 2016.596319] RAX: 0000000000002710 RBX: ffff8801e2e16a80 RCX: ffffffff81cecc00
[ 2016.596319] RDX: ffffffff81c50240 RSI: ffffffffa04b7070 RDI: ffff8801e2e16a80
[ 2016.596319] RBP: ffff880221633e50 R08: a018000000000000 R09: 00000000ffffffff
[ 2016.596319] R10: fded8b80cd331403 R11: 0000000000014180 R12: 0000000000000000
[ 2016.596319] R13: ffff8801e2e98cc0 R14: ffff8801f474fbb8 R15: 0000000000000000
[ 2016.596319] FS: 00007fb7dcbc2740(0000) GS:ffff88022fac0000(0000) knlGS:0000000000000000
[ 2016.596319] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 2016.596319] CR2: 000000360ef421c0 CR3: 00000002236f3000 CR4: 00000000000006e0
[ 2016.596319] Stack:
[ 2016.596319] ffff8801e2e98cc0 ffff880221633e80 ffffffff811be03d ffff8801e12c1000
[ 2016.596319] 0000000000000000 0000000001eee518 0000000000000004 ffff880221633e90
[ 2016.596319] ffffffff811be149 ffff880221633f68 ffffffff811b57ac ffff8801e2e16a80
[ 2016.596319] Call Trace:
[ 2016.596319] [<ffffffff811be03d>] dput.part.15+0x18d/0x280
[ 2016.596319] [<ffffffff811be149>] dput+0x19/0x20
[ 2016.596319] [<ffffffff811b57ac>] do_rmdir+0x18c/0x1d0
[ 2016.596319] [<ffffffff81085314>] ? task_work_run+0xa4/0xe0
[ 2016.596319] [<ffffffff81012a21>] ? do_notify_resume+0x61/0xa0
[ 2016.596319] [<ffffffff811b8585>] SyS_unlinkat+0x25/0x40
[ 2016.596319] [<ffffffff81656a19>] system_call_fastpath+0x16/0x1b
[ 2016.596319] Code: c0 74 02 ff d0 f6 03 80 48 8d bb 90 00 00 00 74 12 48 c7 c6 70 cb 1b 81 e8 96 03 f4 ff 5b 5d c3 0f 1f 00 e8 bb fa ff ff 5b 5d c3 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 48
[ 2016.596319] RIP [<ffffffff811bd0b8>] d_free+0x58/0x60
[ 2016.596319] RSP <ffff880221633e48>
[ 2016.620942] ---[ end trace d822a6692175e384 ]---
Steve, any clues as to how to proceed on this one?
Well this may be VFS rather than GFS2 since we do not do anything special with the dentry ref counts in the GFS2 code at all. It is not something that I've seen before, so a bit of a mystery at the moment, but maybe all will become clear once we've looked into it for a bit.
Hmm - I wonder if it might be related to atomic_open as that has changed recently and I don't know if those kernels have all the fixes - quite possibly not, so that is one possible line of enquiry. Kuba, what are you doing to reproduce this? Can you suggest a simple way we can test to see if we can get the same result here?
Actually nothing special. I have a small script that creates a users directory structure with maildir and subfolders (3 levels deep). I ran it and found out that I had made a typo in the root of this tree, so I wanted to remove it and create it again. After rm -fr the seg fault popped up. I was able to reproduce it a few times: create a small tree of dirs, try to remove it, seg fault.
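For anyone trying to reproduce this, the create-then-remove cycle described above could look roughly like the sketch below. The original script was not posted, so the folder names, the user names and the exact layout are assumptions made only for illustration; shutil.rmtree stands in for rm -fr.

```python
import os
import shutil
import tempfile

# Hypothetical names: the reporter's actual layout was not posted.
SUBFOLDERS = ("Drafts", "Sent", "Trash")
MAILDIR_PARTS = ("cur", "new", "tmp")


def create_tree(root, users):
    """Create root/<user>/Maildir[/.<Folder>]/{cur,new,tmp} and return the leaf paths."""
    created = []
    for user in users:
        maildir = os.path.join(root, user, "Maildir")
        # "" is the inbox itself, the rest are dot-prefixed subfolders.
        for folder in ("",) + tuple("." + name for name in SUBFOLDERS):
            for part in MAILDIR_PARTS:
                path = os.path.join(maildir, folder, part)
                os.makedirs(path, exist_ok=True)
                created.append(path)
    return created


def create_and_remove(mount_point, rounds=5, users=("alice", "bob")):
    """Repeatedly build a small tree under mount_point and delete it again."""
    for i in range(rounds):
        root = os.path.join(mount_point, "repro-%d" % i)
        create_tree(root, users)
        shutil.rmtree(root)  # recursive delete, comparable to rm -fr
    return rounds
```

Calling create_and_remove() with the path of the gfs2 mount point (on a test cluster only) runs the cycle a few times. Whether this is enough to trigger the BUG is not known; it only mirrors the steps as described.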
Today I hit this error just by listing one of the directories on the gfs2 mountpoint:

Oct 28 12:09:18 dev-1 kernel: [ 9178.419900] ------------[ cut here ]------------
Oct 28 12:09:18 dev-1 kernel: [ 9178.419981] kernel BUG at fs/dcache.c:630!
Oct 28 12:09:18 dev-1 kernel: [ 9178.420048] invalid opcode: 0000 [#1] SMP
Oct 28 12:09:18 dev-1 kernel: [ 9178.420121] Modules linked in: ipt_CLUSTERIP gfs2 drbd lru_cache dlm sctp libcrc32c bonding xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support microcode serio_raw lpc_ich mfd_core ipmi_si ipmi_msghandler hpwdt hpilo i5000_edac edac_core bnx2 e1000e i5k_amb ptp pps_core shpchp mperf hpsa radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi i2c_core cciss
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] CPU: 3 PID: 9851 Comm: ls Not tainted 3.11.6-200.fc19.x86_64 #1
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] Hardware name: HP ProLiant DL380 G5, BIOS P56 05/02/2011
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] task: ffff8801f1a1cc40 ti: ffff8801db744000 task.ti: ffff8801db744000
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RIP: 0010:[<ffffffff811bcf09>] [<ffffffff811bcf09>] dget_parent+0x49/0x50
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RSP: 0018:ffff8801db745cb8 EFLAGS: 00010246
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RAX: 0000000000000000 RBX: ffff8801d6455840 RCX: 0000000000000002
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RDX: 00000000000000a1 RSI: 0000000000000003 RDI: ffff8801d6455898
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RBP: ffff8801db745cd0 R08: 8080808080808080 R09: fefefefefefefeff
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] R10: 2f2f2f2f2f2f2f2f R11: fefefefefeff2d2d R12: ffff8801d6455898
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] R13: ffff8801d6455c00 R14: 00000000ffffff9c R15: ffff8801db745ef8
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] FS: 00007ff367fdc800(0000) GS:ffff88022fac0000(0000) knlGS:0000000000000000
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] CR2: 0000000000b3e000 CR3: 0000000222cb9000 CR4: 00000000000006e0
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] Stack:
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] ffff8801db745de0 ffff8801d6455c00 ffff880223cbf6c0 ffff8801db745cf8
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] ffffffff811b32c8 0000000000000000 0000000000000000 ffff8801db745de0
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] ffff8801db745d10 ffffffff811b3408 00000000db745ef8 ffff8801db745d98
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] Call Trace:
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b32c8>] follow_dotdot+0x58/0x160
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b3408>] handle_dots+0x38/0x40
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b476d>] path_lookupat+0x1bd/0x7d0
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffffa04a9f0e>] ? gfs2_getxattr+0xee/0x130 [gfs2]
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff8118e045>] ? kmem_cache_alloc+0x35/0x210
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b34af>] ? getname_flags+0x4f/0x190
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b4dab>] filename_lookup+0x2b/0xd0
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b80a4>] user_path_at_empty+0x54/0x90
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b3432>] ? final_putname+0x22/0x50
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b363b>] ? putname+0x2b/0x40
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b80af>] ? user_path_at_empty+0x5f/0x90
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811b80f1>] user_path_at+0x11/0x20
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811acfc0>] vfs_fstatat+0x50/0xa0
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811ad202>] SYSC_newlstat+0x22/0x40
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811c5a36>] ? mntput+0x26/0x40
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff810e6496>] ? __audit_syscall_exit+0x1f6/0x2a0
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff811ad61e>] SyS_newlstat+0xe/0x10
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] [<ffffffff81656e99>] system_call_fastpath+0x16/0x1b
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] Code: 17 49 00 49 3b 5d 18 75 1c 8b 43 5c 85 c0 74 1b 83 c0 01 89 43 5c 41 80 04 24 01 48 89 d8 5b 41 5c 41 5d 5d c3 80 43 58 01 eb c8 <0f> 0b 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RIP [<ffffffff811bcf09>] dget_parent+0x49/0x50
Oct 28 12:09:18 dev-1 kernel: [ 9178.420803] RSP <ffff8801db745cb8>
Oct 28 12:09:18 dev-1 kernel: [ 9178.513266] ---[ end trace 71273e47c6f7e42d ]---

This caused the pacemaker cluster to break down, and a few lines further down:

Oct 28 12:09:47 dev-1 kernel: [ 9207.149405] ------------[ cut here ]------------
Oct 28 12:09:47 dev-1 kernel: [ 9207.154774] WARNING: CPU: 3 PID: 10782 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
Oct 28 12:09:47 dev-1 kernel: [ 9207.160039] list_del corruption. prev->next should be ffff8801dbdf05d0, but was 8948559066666666
Oct 28 12:09:47 dev-1 kernel: [ 9207.165375] Modules linked in: ipt_CLUSTERIP gfs2 drbd lru_cache dlm sctp libcrc32c bonding xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support microcode serio_raw lpc_ich mfd_core ipmi_si ipmi_msghandler hpwdt hpilo i5000_edac edac_core bnx2 e1000e i5k_amb ptp pps_core shpchp mperf hpsa radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi i2c_core cciss
Oct 28 12:09:47 dev-1 kernel: [ 9207.188529] CPU: 3 PID: 10782 Comm: umount Tainted: G D 3.11.6-200.fc19.x86_64 #1
Oct 28 12:09:47 dev-1 kernel: [ 9207.194359] Hardware name: HP ProLiant DL380 G5, BIOS P56 05/02/2011
Oct 28 12:09:47 dev-1 kernel: [ 9207.200211] 0000000000000009 ffff8801d6aabd48 ffffffff81647c7f ffff8801d6aabd90
Oct 28 12:09:47 dev-1 kernel: [ 9207.206126] ffff8801d6aabd80 ffffffff8106715d ffff8801dbdf05d0 ffff8801dbdf0540
Oct 28 12:09:47 dev-1 kernel: [ 9207.212026] ffff8801d6aabe58 ffff8801f5930300 0000000000000000 ffff8801d6aabde0
Oct 28 12:09:47 dev-1 kernel: [ 9207.217959] Call Trace:
Oct 28 12:09:47 dev-1 kernel: [ 9207.223765] [<ffffffff81647c7f>] dump_stack+0x45/0x56
Oct 28 12:09:47 dev-1 kernel: [ 9207.229812] [<ffffffff8106715d>] warn_slowpath_common+0x7d/0xa0
Oct 28 12:09:47 dev-1 kernel: [ 9207.235547] [<ffffffff810671cc>] warn_slowpath_fmt+0x4c/0x50
Oct 28 12:09:47 dev-1 kernel: [ 9207.241240] [<ffffffff813109a1>] __list_del_entry+0xa1/0xd0
Oct 28 12:09:47 dev-1 kernel: [ 9207.246888] [<ffffffff813109dd>] list_del+0xd/0x30
Oct 28 12:09:47 dev-1 kernel: [ 9207.252481] [<ffffffff811be986>] shrink_dentry_list+0x256/0x3b0
Oct 28 12:09:47 dev-1 kernel: [ 9207.258067] [<ffffffff811beb62>] shrink_dcache_sb+0x82/0xb0
Oct 28 12:09:47 dev-1 kernel: [ 9207.263615] [<ffffffffa04a7279>] gfs2_kill_sb+0x59/0x80 [gfs2]
Oct 28 12:09:47 dev-1 kernel: [ 9207.269173] [<ffffffff811aaa2d>] deactivate_locked_super+0x3d/0x60
Oct 28 12:09:47 dev-1 kernel: [ 9207.274776] [<ffffffff811aaa96>] deactivate_super+0x46/0x60
Oct 28 12:09:47 dev-1 kernel: [ 9207.280373] [<ffffffff811c59b5>] mntput_no_expire+0xc5/0x120
Oct 28 12:09:47 dev-1 kernel: [ 9207.285937] [<ffffffff811c6a41>] SyS_umount+0x91/0x3a0
Oct 28 12:09:47 dev-1 kernel: [ 9207.291489] [<ffffffff81656e99>] system_call_fastpath+0x16/0x1b
Oct 28 12:09:47 dev-1 kernel: [ 9207.297069] ---[ end trace 71273e47c6f7e42e ]---
Oct 28 12:09:47 dev-1 kernel: [ 9207.302665] ------------[ cut here ]------------
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] kernel BUG at fs/dcache.c:232!
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] invalid opcode: 0000 [#2] SMP
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] Modules linked in: ipt_CLUSTERIP gfs2 drbd lru_cache dlm sctp libcrc32c bonding xt_LOG nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack coretemp kvm_intel kvm iTCO_wdt iTCO_vendor_support microcode serio_raw lpc_ich mfd_core ipmi_si ipmi_msghandler hpwdt hpilo i5000_edac edac_core bnx2 e1000e i5k_amb ptp pps_core shpchp mperf hpsa radeon i2c_algo_bit drm_kms_helper ttm drm ata_generic pata_acpi i2c_core cciss
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] CPU: 3 PID: 10782 Comm: umount Tainted: G D W 3.11.6-200.fc19.x86_64 #1
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] Hardware name: HP ProLiant DL380 G5, BIOS P56 05/02/2011
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] task: ffff8801d612cc40 ti: ffff8801d6aaa000 task.ti: ffff8801d6aaa000
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RIP: 0010:[<ffffffff811bd0d8>] [<ffffffff811bd0d8>] d_free+0x58/0x60
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RSP: 0018:ffff8801d6aabe00 EFLAGS: 00010286
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RAX: dead000000200200 RBX: ffff8801dbdf0540 RCX: 0000000000000000
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RDX: ffff88022facfde0 RSI: 0000000000000000 RDI: ffff8801dbdf0540
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RBP: ffff8801d6aabe08 R08: 0000000000000000 R09: 00000000ffffffff
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] R10: 000000000000000f R11: 0000000007070707 R12: ffff8801f5930300
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] R13: ffff8801d6aabe58 R14: ffff8801f5930300 R15: 0000000000000000
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] FS: 00007fc6a8c16880(0000) GS:ffff88022fac0000(0000) knlGS:0000000000000000
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] CR2: 00007fc6a87e1750 CR3: 00000001d718e000 CR4: 00000000000006e0
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] Stack:
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] ffff8801dbdf0598 ffff8801d6aabe48 ffffffff811bea21 ffff8801f5a96f80
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] ffff8801f1ab3400 ffff8801f1ab34d0 ffff8801d6aabe58 ffff8801d6aabe58
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] ffff8801db6d5120 ffff8801d6aabe88 ffffffff811beb62 ffff8801d64ab380
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] Call Trace:
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff811bea21>] shrink_dentry_list+0x2f1/0x3b0
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff811beb62>] shrink_dcache_sb+0x82/0xb0
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffffa04a7279>] gfs2_kill_sb+0x59/0x80 [gfs2]
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff811aaa2d>] deactivate_locked_super+0x3d/0x60
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff811aaa96>] deactivate_super+0x46/0x60
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff811c59b5>] mntput_no_expire+0xc5/0x120
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff811c6a41>] SyS_umount+0x91/0x3a0
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] [<ffffffff81656e99>] system_call_fastpath+0x16/0x1b
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] Code: c0 74 02 ff d0 f6 03 80 48 8d bb 90 00 00 00 74 12 48 c7 c6 90 cb 1b 81 e8 26 03 f4 ff 5b 5d c3 0f 1f 00 e8 bb fa ff ff 5b 5d c3 <0f> 0b 66 0f 1f 44 00 00 66 66 66 66 90 55 48 89 e5 41 54 53 48
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RIP [<ffffffff811bd0d8>] d_free+0x58/0x60
Oct 28 12:09:47 dev-1 kernel: [ 9207.303649] RSP <ffff8801d6aabe00>
Oct 28 12:09:47 dev-1 kernel: [ 9207.521083] ---[ end trace 71273e47c6f7e42f ]---

which looks the same as the first one I reported. This really makes my machines unusable.
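With several traces in the thread, it can help to boil each one down to where the BUG fired and which command hit it. The sketch below is only a triage aid written against the log format pasted in this report; the function name and the regular expressions are my own and are not part of any kernel tooling.

```python
import re

# Patterns modelled on the oops text pasted in this report (an assumption,
# not a general parser for every kernel version's output).
BUG_RE = re.compile(r"kernel BUG at (\S+?):(\d+)!")
COMM_RE = re.compile(r"Comm: (\S+)")
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:\[<[0-9a-f]+>\]\s+\[<[0-9a-f]+>\]\s+([\w.]+)\+0x"
)
CUT_MARKER = "------------[ cut here ]------------"


def summarize_oopses(log_text):
    """Return one (file, line, comm, rip_symbol) tuple per 'kernel BUG at' report."""
    reports = []
    # Every report starts with the cut marker; text before the first one is ignored.
    for chunk in log_text.split(CUT_MARKER)[1:]:
        bug = BUG_RE.search(chunk)
        if bug is None:
            continue  # e.g. a WARNING report, which has no 'kernel BUG at' line
        comm = COMM_RE.search(chunk)
        rip = RIP_RE.search(chunk)
        reports.append((
            bug.group(1),
            int(bug.group(2)),
            comm.group(1) if comm else None,
            rip.group(1) if rip else None,
        ))
    return reports
```

Fed the messages above, this should list fs/dcache.c:630 under ls in dget_parent and fs/dcache.c:232 under umount in d_free, which makes it easier to see that the umount crash matches the earlier rm crashes.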
The lines on which my kernel crashes were changed by Linus/Waiman on the 29th of August. This change went into 3.11, so I rolled my kernel back to 3.10. Currently I'm on 3.10.9-200.fc19.x86_64 and so far everything has been stable for over 2 days.
I have a suspicion that this may relate to some recent fixes, of which: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2?id=ea0341e071527d5cec350917b01ab901af09d758 is the most important. Does this still occur if you use a kernel with this fix in it? It has gone to the -stable tree, so it should have appeared in recent distro kernels by now I think.
Since this report matches the bug we recently fixed, and there has been no further information, I'm closing this bug as current release. Please reopen if the problem still persists on kernels that include the fix described in comment #8.
I am currently on 3.13.9-100.fc19.x86_64 and this problem seems to be fixed. Thank you for your help.