Description of problem:
While running tests w/ plocks on a 1k block size GFS2 file system, one node hit the following BUG:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000004
 printing eip:
c04d631a
*pde = 31f03001
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:02.0/0000:01:1f.0/0000:03:02.1/irq
Modules linked in: sctp lock_dlm(U) gfs2(U) dlm configfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac lp parport_pc floppy pcspkr parport i2c_i801 sg ide_cd intel_rng e1000 e7xxx_edac i2c_core edac_mc cdrom dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU: 0
EIP: 0060:[<c04d631a>] Not tainted VLI
EFLAGS: 00010246 (2.6.18-68.el5PAE #1)
EIP is at generic_make_request+0x1b/0x258
eax: 00000000 ebx: cf090d80 ecx: 00000001 edx: f19dc000
esi: 00000002 edi: fffffffe ebp: cf090d80 esp: f19dce50
ds: 007b es: 007b ss: 0068
Process glock_workqueue (pid: 6186, ti=f19dc000 task=f2849aa0 task.ti=f19dc000)
Stack: 56e84000 00000000 00000002 00000000 00b42e62 00000000 00000096 f2ad37a4
       00000002 00000000 f2849aa0 00000000 00000003 01cc589c 00001ba7 0000000a
       00000100 00001ba7 00000246 00000010 f7c3ac40 f7c39620 c0455a18 f19dced8
Call Trace:
 [<c0455a18>] mempool_alloc+0x28/0xc9
 [<c04d829e>] submit_bio+0xbf/0xc5
 [<c0473933>] bio_alloc_bioset+0x9b/0xf3
 [<c04709fd>] submit_bh+0xe8/0x106
 [<f8d92d7d>] gfs2_log_flush+0x1a7/0x40c [gfs2]
 [<c0606331>] schedule+0x90d/0x9ba
 [<f8d8f927>] inode_go_sync+0x97/0xbe [gfs2]
 [<f8d8e54a>] gfs2_glock_drop_th+0x19/0xfb [gfs2]
 [<f8d8ea83>] run_queue+0xa6/0x237 [gfs2]
 [<f8d8f3e1>] glock_work_func+0x24/0x31 [gfs2]
 [<c0433918>] run_workqueue+0x78/0xb5
 [<f8d8f3bd>] glock_work_func+0x0/0x31 [gfs2]
 [<c04341cc>] worker_thread+0xd9/0x10d
 [<c0420713>] default_wake_function+0x0/0xc
 [<c04340f3>] worker_thread+0x0/0x10d
 [<c04365e5>] kthread+0xc0/0xeb
 [<c0436525>] kthread+0x0/0xeb
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
Code: f5 ff f0 0f ba 6f 10 02 83 c4 50 5b 5e 5f 5d c3 55 89 c5 57 56 53 83 ec 60 8b 40 20 c1 e8 09 89 44 24 20 e8 c7 00 13 00 8b 45 0c <8b> 40 04 8b 50 40 8b 40 3c 0f ac d0 09 c1 fa 09 89 d1 09 c1 74
EIP: [<c04d631a>] generic_make_request+0x1b/0x258 SS:ESP 0068:f19dce50
 <0>Kernel panic - not syncing: Fatal exception

Version-Release number of selected component (if applicable):
kernel-2.6.18-68.el5
kmod-gfs2-1.68-1.3

How reproducible:
Unknown
If I had to guess, I'd say slab memory corruption, but it's not at all obvious from looking at the code paths. A number of functions have obviously been inlined in that call path and are missing from the stack trace.
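For anyone retracing this, here is a rough sketch of how the faulting instruction can be read off the "Code:" line and checked against the register dump. It is only an illustration, not a real disassembler: the helper names (`faulting_bytes`, `decode_mov_load_disp8`) are made up for this comment, and the decoder handles just the one instruction form `mov r32, [r32 + disp8]` (opcode 0x8b, ModRM mod=01, no SIB byte). The byte string and register values are copied from the oops above.

```python
# Sketch only: decodes a single x86 instruction form, enough for this oops.
CODE = ("f5 ff f0 0f ba 6f 10 02 83 c4 50 5b 5e 5f 5d c3 55 89 c5 57 56 53 "
        "83 ec 60 8b 40 20 c1 e8 09 89 44 24 20 e8 c7 00 13 00 8b 45 0c "
        "<8b> 40 04 8b 50 40 8b 40 3c 0f ac d0 09 c1 fa 09 89 d1 09 c1 74")

# Register values as printed in the oops.
REGS = {
    "eax": 0x00000000, "ebx": 0xcf090d80, "ecx": 0x00000001, "edx": 0xf19dc000,
    "esi": 0x00000002, "edi": 0xfffffffe, "ebp": 0xcf090d80, "esp": 0xf19dce50,
}

# 32-bit register encoding order used by the ModRM byte.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def faulting_bytes(code_line):
    """Return the bytes starting at the <xx> marker (the faulting EIP)."""
    tokens = code_line.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in tokens[start:]]


def decode_mov_load_disp8(raw):
    """Decode `mov r32, [r32 + disp8]` only; reject anything else."""
    opcode, modrm, disp = raw[0], raw[1], raw[2]
    if opcode != 0x8B or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("not a plain mov r32, [r32 + disp8]")
    dest = REG_NAMES[(modrm >> 3) & 7]
    base = REG_NAMES[modrm & 7]
    disp8 = disp - 256 if disp >= 128 else disp  # sign-extend the displacement
    return dest, base, disp8


dest, base, disp8 = decode_mov_load_disp8(faulting_bytes(CODE))
fault_addr = (REGS[base] + disp8) & 0xFFFFFFFF
print(f"mov {dest}, [{base}{disp8:+#x}] -> access at {fault_addr:#010x}")
```

The computed address matches the "virtual address 00000004" in the BUG line, so the oops is a load through a NULL pointer held in eax. The instruction just before it (8b 45 0c) loaded eax from offset 0xc of whatever ebp points at, so the NULL came out of that structure. Which field that is depends on the struct layout of this particular kernel build, which I have not checked here.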
Need to see if this is reproducible; marking NEEDINFO for now.
Nate, have you seen this since? The 5.1 kernel is rather old anyway, so I'm tempted to close this if it's not been seen on 5.2+.
No further info/reports of this, so closing it.