Bug 464932
Summary: | GFS2: Recursive glock detected, io_submit to shrink_slab to gfs2_dinode_dealloc | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 5 | Reporter: | Nate Straz <nstraz> | ||||
Component: | kernel | Assignee: | Steve Whitehouse <swhiteho> | ||||
Status: | CLOSED DUPLICATE | QA Contact: | Cluster QE <mspqa-list> | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 5.3 | CC: | cluster-maint, edamato | ||||
Target Milestone: | rc | ||||||
Target Release: | --- | ||||||
Hardware: | All | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2008-10-07 10:20:41 UTC | Type: | --- | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
Description
Nate Straz
2008-10-01 14:41:57 UTC
With luck I hit it again, this time on dash-02. Tags running at the time dash-02 failed, genesis_reg and laio_rwransync. original: gfs2_rindex_hold+0x32/0x153 [gfs2] pid : 28542 lock type: 2 req lock state : 3 new: gfs2_rindex_hold+0x32/0x153 [gfs2] pid: 28542 lock type: 2 req lock state : 3 G: s:SH n:2/18402 f:s t:SH d:EX/0 l:0 a:0 r:5 H: s:SH f:H e:0 p:28542 [genesis] gfs2_rindex_hold+0x32/0x153 [gfs2] I: n:19/99330 t:8 f:0x00000010 ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at ...r/build/BUILD/gfs2-kmod-1.116/_kmod_build_/glock.c:957 invalid opcode: 0000 [1] SMP last sysfs file: /kernel/dlm/brawl0/control CPU 0 Modules linked in: sctp gnbd(U) lock_nolock gfs(U) lock_dlm gfs2(U) dlm configfs ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge autofs4 hidp rfcomm l2cap bluetooth dm_log_clustered(U) sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg i2c_i801 i2c_core qla2xxx scsi_transport_fc shpchp tg3 libphy ide_cd i3000_edac pcspkr edac_mc cdrom serio_raw dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 28542, comm: genesis Tainted: G 2.6.18-116.gfs2abhi.001 #1 RIP: 0010:[<ffffffff886059fd>] [<ffffffff886059fd>] :gfs2:gfs2_glock_nq+0x231/0x273 RSP: 0018:ffff810005f39898 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff810004933318 RCX: ffff81001201fb18 RDX: 00000000ffffffff RSI: 0000000000000046 RDI: ffffffff802feb5c RBP: ffff810004933f18 R08: 000000000000000d R09: 000000000000003f R10: ffffffff803e5520 R11: 0000000000000000 R12: ffff81001bf3e1a8 R13: ffff81001bf3e1a8 R14: ffff81000455e000 R15: ffff810004933e00 FS: 00002ad311b26220(0000) GS:ffffffff803b8000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000001c145068 CR3: 0000000025d3c000 CR4: 00000000000006e0 Process genesis (pid: 28542, threadinfo ffff810005f38000, task ffff81001201f860) Stack: ffff81000455e000 ffff81003cf46368 ffff81001b99c9a8 0000000000000000 ffff810004933f18 ffffffff88617fed ffff810005f39928 0000000000000000 ffff810005f39938 ffffffff80063aed ffff81001201fa50 ffff81003cf46610 Call Trace: [<ffffffff88617fed>] :gfs2:gfs2_rindex_hold+0x3a/0x153 [<ffffffff80063aed>] __wait_on_bit+0x60/0x6e [<ffffffff88604138>] :gfs2:just_schedule+0x0/0xe [<ffffffff80063b67>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff886077b9>] :gfs2:gfs2_dinode_dealloc+0x8e/0x1a7 [<ffffffff8861402e>] :gfs2:gfs2_delete_inode+0xf1/0x18d [<ffffffff88613f83>] :gfs2:gfs2_delete_inode+0x46/0x18d [<ffffffff8860bb15>] :gfs2:gfs2_meta_wait+0x18/0x83 [<ffffffff88613f3d>] :gfs2:gfs2_delete_inode+0x0/0x18d [<ffffffff8002f2b6>] generic_delete_inode+0xc6/0x143 [<ffffffff800e6fe7>] prune_one_dentry+0x4d/0x76 [<ffffffff8002e76f>] prune_dcache+0x10f/0x149 [<ffffffff800e7027>] shrink_dcache_memory+0x17/0x30 [<ffffffff8003ef9f>] shrink_slab+0xdc/0x153 [<ffffffff800c8e27>] try_to_free_pages+0x1c1/0x2b9 [<ffffffff8000f265>] __alloc_pages+0x1cb/0x2ce [<ffffffff8861b34d>] :gfs2:gfs2_do_trans_begin+0x102/0x144 [<ffffffff8860d7ae>] :gfs2:gfs2_write_begin+0x1a0/0x342 [<ffffffff8860f1ab>] :gfs2:gfs2_file_buffered_write+0x14b/0x2e5 [<ffffffff8008ac42>] enqueue_task+0x41/0x56 [<ffffffff8860f5e1>] :gfs2:__gfs2_file_aio_write_nolock+0x29c/0x2d4 [<ffffffff800e5c0d>] __posix_lock_file_conf+0x376/0x3be [<ffffffff8860f784>] :gfs2:gfs2_file_write_nolock+0xaa/0x10f [<ffffffff885e09f5>] :dlm:dlm_posix_lock+0x1ff/0x210 [<ffffffff8009e9e2>] autoremove_wake_function+0x0/0x2e [<ffffffff8009e9e2>] autoremove_wake_function+0x0/0x2e [<ffffffff8860ffab>] :gfs2:gfs2_open+0x0/0x13a [<ffffffff8860f8d4>] :gfs2:gfs2_file_write+0x49/0xa7 [<ffffffff80016723>] vfs_write+0xce/0x174 [<ffffffff80016ff0>] sys_write+0x45/0x6e [<ffffffff8005d28d>] tracesys+0xd5/0xe0 Code: 0f 0b 68 5d db 61 88 c2 bd 03 be 01 00 00 00 4c 89 ef e8 45 RIP [<ffffffff886059fd>] :gfs2:gfs2_glock_nq+0x231/0x273 RSP <ffff810005f39898> <0>Kernel panic - not syncing: Fatal exception Created attachment 319178 [details]
Proposed patch
After analyzing the call trace above, it appears that the "bad"
sequence of events was something like this:
gfs2_write_begin
gfs2_inplace_reserve
gfs2_rindex_hold
...
gfs2_trans_begin
try_to_free_pages
shrink_slab
gfs2_delete_inode
gfs2_dinode_dealloc
gfs2_rindex_hold
It's my fear that the call to shrink_slab can occur at unpredictable
places, but it will end with gfs2_dinode_dealloc getting called.
Therefore, it seems reasonable to me to say that when we get to function
gfs2_dinode_dealloc, we may sometimes already have the rindex locked.
Therefore, I concocted this patch that basically makes that function
first check to see if the rindex is already held.
The function goes on to grab an exclusive lock on the rgrp, which may
also be held. This patch doesn't cover that case. After that, the
function does a trans_begin which grabs an exclusive glock for the
transaction glock, then log reserve. These are also potential problems.
So this patch is not a final solution, just a beginning.
Hopefully Steve Whitehouse will have some input on this matter.
This looks like the bug might be elsewhere. The allocation in gfs2_do_trans_begin() very clearly has the GFP_NOFS flag set. So I rather think that maybe the memory subsystem is broken.... there is no way that we should be calling shrink slab for the dcache for such allocations, after all thats the whole point of GFP_NOFS. It would certainly explain why we had the problem after the other patch was applied. So I wonder what recent patches there have been which affect the slab relaim code. After further inspecting my configuration while trying to verify 462579, I discovered that I had an old kmod-gfs2 package installed and I was loading gfs2.ko built Sep 19 instead of the kmod from the 2.6.18-116.gfs2abhi.001 kernel built Sep 29. I need to rerun these tests. After re-running with kmod-gfs2 removed and confirming that GFS2 was built the same day as the kernel, my tests have all succeeded. *** This bug has been marked as a duplicate of bug 460218 *** |