Description of problem: While running d_io with plocks on GFS2 with 32MB RGs I hit the following panic while laio_rwssync and laio_lock1 were running locally on dash-03. original: gfs2_rindex_hold+0x32/0x153 [gfs2] pid : 27035 lock type: 2 req lock state : 3 new: gfs2_rindex_hold+0x32/0x153 [gfs2] pid: 27035 lock type: 2 req lock state : 3 G: s:SH n:2/18402 f:s t:SH d:EX/0 l:0 a:0 r:5 H: s:SH f:H e:0 p:27035 [xdoio] gfs2_rindex_hold+0x32/0x153 [gfs2] H: s:SH f:H e:0 p:27027 [xdoio] gfs2_rindex_hold+0x32/0x153 [gfs2] I: n:19/99330 t:8 f:0x00000010 ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at ...r/build/BUILD/gfs2-kmod-1.116/_kmod_build_/glock.c:957 invalid opcode: 0000 [1] SMP last sysfs file: /kernel/dlm/brawl0/id CPU 0 Modules linked in: sctp gnbd(U) lock_nolock gfs(U) lock_dlm gfs2(U) dlm configfs ipt_MASQUERADE iptable_n at ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge a utofs4 hidp rfcomm l2cap bluetooth dm_log_clustered(U) sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scs i_dh video backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg shpc hp tg3 libphy pcspkr qla2xxx scsi_transport_fc i3000_edac edac_mc serio_raw i2c_i801 i2c_core ide_cd cdro m dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 27035, comm: xdoio Tainted: G 2.6.18-116.gfs2abhi.001 #1 RIP: 0010:[<ffffffff886129fd>] [<ffffffff886129fd>] :gfs2:gfs2_glock_nq+0x231/0x273 RSP: 0018:ffff81000d3e9918 EFLAGS: 00010296 RAX: 0000000000000000 RBX: ffff8100301b5518 RCX: ffff810008da33b8 RDX: 00000000ffffffff RSI: 0000000000000046 RDI: ffffffff802feb5c RBP: ffff810030bc9318 R08: 000000000000000d R09: 000000000000003f R10: ffffffff803e5520 R11: 0000000000000000 R12: ffff810007d921a8 R13: ffff810007d921a8 R14: ffff81002fcc8000 R15: ffff810030bc9200 FS: 00002b1465044ce0(0000) GS:ffffffff803b8000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000018803c08 CR3: 00000000105de000 CR4: 00000000000006e0 Process xdoio (pid: 27035, threadinfo ffff81000d3e8000, task ffff810005dfd7e0) Stack: ffff81002fcc8000 ffff810017479cc8 ffff8100240e79a8 0000000000000000 ffff810030bc9318 ffffffff88624fed ffff81000d3e99a8 0000000000000000 ffff81000d3e99b8 ffffffff80063aed ffff810005dfd9d0 ffff810017479f70 Call Trace: [<ffffffff88624fed>] :gfs2:gfs2_rindex_hold+0x3a/0x153 [<ffffffff80063aed>] __wait_on_bit+0x60/0x6e [<ffffffff88611138>] :gfs2:just_schedule+0x0/0xe [<ffffffff80063b67>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff886147b9>] :gfs2:gfs2_dinode_dealloc+0x8e/0x1a7 [<ffffffff8862102e>] :gfs2:gfs2_delete_inode+0xf1/0x18d [<ffffffff88620f83>] :gfs2:gfs2_delete_inode+0x46/0x18d [<ffffffff88620f3d>] :gfs2:gfs2_delete_inode+0x0/0x18d [<ffffffff8002f2b6>] generic_delete_inode+0xc6/0x143 [<ffffffff800e6fe7>] prune_one_dentry+0x4d/0x76 [<ffffffff8002e76f>] prune_dcache+0x10f/0x149 [<ffffffff800e7027>] shrink_dcache_memory+0x17/0x30 [<ffffffff8003ef9f>] shrink_slab+0xdc/0x153 [<ffffffff800c8e27>] try_to_free_pages+0x1c1/0x2b9 [<ffffffff8000f265>] __alloc_pages+0x1cb/0x2ce [<ffffffff8862834d>] :gfs2:gfs2_do_trans_begin+0x102/0x144 [<ffffffff8861a7ae>] :gfs2:gfs2_write_begin+0x1a0/0x342 [<ffffffff8861c1ab>] :gfs2:gfs2_file_buffered_write+0x14b/0x2e5 [<ffffffff8861c5e1>] :gfs2:__gfs2_file_aio_write_nolock+0x29c/0x2d4 [<ffffffff8000f180>] __alloc_pages+0xe6/0x2ce [<ffffffff8861c67e>] :gfs2:gfs2_file_aio_write+0x65/0xc1 [<ffffffff800ea4a4>] aio_pwrite+0x2c/0x75 [<ffffffff800ead85>] aio_run_iocb+0xef/0x18a [<ffffffff800eb8ba>] io_submit_one+0x29d/0x2ea [<ffffffff800ebdf9>] sys_io_submit+0x9b/0x108 [<ffffffff8005d28d>] tracesys+0xd5/0xe0 Version-Release number of selected component (if applicable): kernel-2.6.18-116.gfs2abhi.001 How reproducible: Unknown Actual results: Expected results: Additional info: Tests running across the cluster at the time. dash-01: stuffed_rwssync laio_rwrandirect dash-02: mtfile_rwransync stuffed_lockseq dash-03: laio_lock1 laio_rwssync
With luck I hit it again, this time on dash-02. Tags running at the time dash-02 failed, genesis_reg and laio_rwransync. original: gfs2_rindex_hold+0x32/0x153 [gfs2] pid : 28542 lock type: 2 req lock state : 3 new: gfs2_rindex_hold+0x32/0x153 [gfs2] pid: 28542 lock type: 2 req lock state : 3 G: s:SH n:2/18402 f:s t:SH d:EX/0 l:0 a:0 r:5 H: s:SH f:H e:0 p:28542 [genesis] gfs2_rindex_hold+0x32/0x153 [gfs2] I: n:19/99330 t:8 f:0x00000010 ----------- [cut here ] --------- [please bite here ] --------- Kernel BUG at ...r/build/BUILD/gfs2-kmod-1.116/_kmod_build_/glock.c:957 invalid opcode: 0000 [1] SMP last sysfs file: /kernel/dlm/brawl0/control CPU 0 Modules linked in: sctp gnbd(U) lock_nolock gfs(U) lock_dlm gfs2(U) dlm configfs ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge autofs4 hidp rfcomm l2cap bluetooth dm_log_clustered(U) sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath scsi_dh video backlight sbs i2c_ec button battery asus_acpi acpi_memhotplug ac parport_pc lp parport sg i2c_i801 i2c_core qla2xxx scsi_transport_fc shpchp tg3 libphy ide_cd i3000_edac pcspkr edac_mc cdrom serio_raw dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd Pid: 28542, comm: genesis Tainted: G 2.6.18-116.gfs2abhi.001 #1 RIP: 0010:[<ffffffff886059fd>] [<ffffffff886059fd>] :gfs2:gfs2_glock_nq+0x231/0x273 RSP: 0018:ffff810005f39898 EFLAGS: 00010292 RAX: 0000000000000000 RBX: ffff810004933318 RCX: ffff81001201fb18 RDX: 00000000ffffffff RSI: 0000000000000046 RDI: ffffffff802feb5c RBP: ffff810004933f18 R08: 000000000000000d R09: 000000000000003f R10: ffffffff803e5520 R11: 0000000000000000 R12: ffff81001bf3e1a8 R13: ffff81001bf3e1a8 R14: ffff81000455e000 R15: ffff810004933e00 FS: 00002ad311b26220(0000) GS:ffffffff803b8000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000001c145068 CR3: 0000000025d3c000 CR4: 00000000000006e0 Process genesis (pid: 28542, threadinfo ffff810005f38000, task ffff81001201f860) Stack: ffff81000455e000 ffff81003cf46368 ffff81001b99c9a8 0000000000000000 ffff810004933f18 ffffffff88617fed ffff810005f39928 0000000000000000 ffff810005f39938 ffffffff80063aed ffff81001201fa50 ffff81003cf46610 Call Trace: [<ffffffff88617fed>] :gfs2:gfs2_rindex_hold+0x3a/0x153 [<ffffffff80063aed>] __wait_on_bit+0x60/0x6e [<ffffffff88604138>] :gfs2:just_schedule+0x0/0xe [<ffffffff80063b67>] out_of_line_wait_on_bit+0x6c/0x78 [<ffffffff886077b9>] :gfs2:gfs2_dinode_dealloc+0x8e/0x1a7 [<ffffffff8861402e>] :gfs2:gfs2_delete_inode+0xf1/0x18d [<ffffffff88613f83>] :gfs2:gfs2_delete_inode+0x46/0x18d [<ffffffff8860bb15>] :gfs2:gfs2_meta_wait+0x18/0x83 [<ffffffff88613f3d>] :gfs2:gfs2_delete_inode+0x0/0x18d [<ffffffff8002f2b6>] generic_delete_inode+0xc6/0x143 [<ffffffff800e6fe7>] prune_one_dentry+0x4d/0x76 [<ffffffff8002e76f>] prune_dcache+0x10f/0x149 [<ffffffff800e7027>] shrink_dcache_memory+0x17/0x30 [<ffffffff8003ef9f>] shrink_slab+0xdc/0x153 [<ffffffff800c8e27>] try_to_free_pages+0x1c1/0x2b9 [<ffffffff8000f265>] __alloc_pages+0x1cb/0x2ce [<ffffffff8861b34d>] :gfs2:gfs2_do_trans_begin+0x102/0x144 [<ffffffff8860d7ae>] :gfs2:gfs2_write_begin+0x1a0/0x342 [<ffffffff8860f1ab>] :gfs2:gfs2_file_buffered_write+0x14b/0x2e5 [<ffffffff8008ac42>] enqueue_task+0x41/0x56 [<ffffffff8860f5e1>] :gfs2:__gfs2_file_aio_write_nolock+0x29c/0x2d4 [<ffffffff800e5c0d>] __posix_lock_file_conf+0x376/0x3be [<ffffffff8860f784>] :gfs2:gfs2_file_write_nolock+0xaa/0x10f [<ffffffff885e09f5>] :dlm:dlm_posix_lock+0x1ff/0x210 [<ffffffff8009e9e2>] autoremove_wake_function+0x0/0x2e [<ffffffff8009e9e2>] autoremove_wake_function+0x0/0x2e [<ffffffff8860ffab>] :gfs2:gfs2_open+0x0/0x13a [<ffffffff8860f8d4>] :gfs2:gfs2_file_write+0x49/0xa7 [<ffffffff80016723>] vfs_write+0xce/0x174 [<ffffffff80016ff0>] sys_write+0x45/0x6e [<ffffffff8005d28d>] tracesys+0xd5/0xe0 Code: 0f 0b 68 5d db 61 88 c2 bd 03 be 01 00 00 00 4c 89 ef e8 45 RIP [<ffffffff886059fd>] :gfs2:gfs2_glock_nq+0x231/0x273 RSP <ffff810005f39898> <0>Kernel panic - not syncing: Fatal exception
Created attachment 319178 [details] Proposed patch After analyzing the call trace above, it appears that the "bad" sequence of events was something like this: gfs2_write_begin gfs2_inplace_reserve gfs2_rindex_hold ... gfs2_trans_begin try_to_free_pages shrink_slab gfs2_delete_inode gfs2_dinode_dealloc gfs2_rindex_hold It's my fear that the call to shrink_slab can occur at unpredictable places, but it will end with gfs2_dinode_dealloc getting called. Therefore, it seems reasonable to me to say that when we get to function gfs2_dinode_dealloc, we may sometimes already have the rindex locked. Therefore, I concocted this patch that basically makes that function first check to see if the rindex is already held. The function goes on to grab an exclusive lock on the rgrp, which may also be held. This patch doesn't cover that case. After that, the function does a trans_begin which grabs an exclusive glock for the transaction glock, then log reserve. These are also potential problems. So this patch is not a final solution, just a beginning. Hopefully Steve Whitehouse will have some input on this matter.
This looks like the bug might be elsewhere. The allocation in gfs2_do_trans_begin() very clearly has the GFP_NOFS flag set. So I rather think that maybe the memory subsystem is broken.... there is no way that we should be calling shrink slab for the dcache for such allocations, after all thats the whole point of GFP_NOFS. It would certainly explain why we had the problem after the other patch was applied. So I wonder what recent patches there have been which affect the slab relaim code.
After further inspecting my configuration while trying to verify 462579, I discovered that I had an old kmod-gfs2 package installed and I was loading gfs2.ko built Sep 19 instead of the kmod from the 2.6.18-116.gfs2abhi.001 kernel built Sep 29. I need to rerun these tests.
After re-running with kmod-gfs2 removed and confirming that GFS2 was built the same day as the kernel, my tests have all succeeded.
*** This bug has been marked as a duplicate of bug 460218 ***