Description of problem:
Three of four cluster nodes panicked in gfs2_block_map while running d_io, which are independent loads on the same file system.

Version-Release number of selected component (if applicable):
kernel-2.6.18-84.el5
kmod-gfs2-1.84-1.3

How reproducible:
Unknown

Actual results:

morph-02::
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000019
 printing eip:
f8d8732f
*pde = 77c6d067
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: lock_dlm(U) gfs2(U) dlm configfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac lp i2c_i801 i2c_core floppy e7xxx_edac ide_cd edac_mc intel_rng pcspkr e1000 cdrom parport_pc sg parport dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU: 1
EIP: 0060:[<f8d8732f>] Tainted: G VLI
EFLAGS: 00010246 (2.6.18-84.el5 #1)
EIP is at gfs2_block_map+0x8c0/0xb54 [gfs2]
eax: 00000001 ebx: 000000e8 ecx: 00000000 edx: 00000002
esi: f2cd4ccc edi: f26c7b58 ebp: f2cd4cf8 esp: f2cd4c4c
ds: 007b es: 007b ss: 0068
Process xdoio (pid: 8167, ti=f2cd4000 task=f7dd0000 task.ti=f2cd4000)
Stack: f20e3690 00000050 0000c34f 00000000 f533da00 f5d61134 0267df00 f23078b4
       c165ab60 00000001 f24c3f0c f24c3518 f213f000 00000000 00000000 00000001
       00000000 00000000 00000001 00000001 00000003 00000004 00000000 02b6c389
Call Trace:
 [<c0473352>] __block_prepare_write+0x1a2/0x433
 [<f8d94518>] gfs2_log_reserve+0x11a/0x171 [gfs2]
 [<c04735f9>] block_prepare_write+0x16/0x23
 [<f8d86a6f>] gfs2_block_map+0x0/0xb54 [gfs2]
 [<f8d97832>] gfs2_prepare_write+0x2ba/0x31e [gfs2]
 [<f8d86a6f>] gfs2_block_map+0x0/0xb54 [gfs2]
 [<c04565fb>] generic_file_buffered_write+0x226/0x5a2
 [<c042db0b>] __mod_timer+0x99/0xa3
 [<c042a131>] current_fs_time+0x4a/0x55
 [<c0456e1d>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<f8d904ff>] gfs2_glock_dq+0x9e/0xb2 [gfs2]
 [<f8d8f1d6>] gfs2_glock_put+0x1b/0x100 [gfs2]
 [<f8d903da>] gfs2_holder_uninit+0xb/0x1b [gfs2]
 [<c045707b>] generic_file_write+0x0/0x94
 [<c0456fd1>] __generic_file_write_nolock+0x86/0x9a
 [<c0435fc7>] autoremove_wake_function+0x0/0x2d
 [<c0471459>] remote_llseek+0xb1/0xbb
 [<f8d98e04>] gfs2_llseek+0x81/0x91 [gfs2]
 [<c06085ae>] mutex_lock+0xb/0x19
 [<c04570b5>] generic_file_write+0x3a/0x94
 [<c045707b>] generic_file_write+0x0/0x94
 [<c0470ff7>] vfs_write+0xa1/0x143
 [<c04715e9>] sys_write+0x3c/0x63
 [<c0404eff>] syscall_call+0x7/0xb
 =======================
Code: 00 66 8b 7d fe 8b 44 24 10 83 7c 24 58 02 66 89 7c 24 6a 8b b8 88 01 00 00 19 db 8b 06 81 e3 d0 00 00 00 83 c3 18 83 7c 24 58 00 <8b> 40 18 89 44 24 28 75 08 0f 0b 83 01 4d 54 da f8 83 7e 04 00
EIP: [<f8d8732f>] gfs2_block_map+0x8c0/0xb54 [gfs2] SS:ESP 0068:f2cd4c4c
 <0>Kernel panic - not syncing: Fatal exception

morph-03::
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000019
 printing eip:
f8d7e32f
*pde = 7f514067
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: lock_dlm(U) gfs2(U) dlm configfs sg autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac lp floppy intel_rng pcspkr parport_pc e7xxx_edac i2c_i801 edac_mc parport ide_cd e1000 i2c_core cdrom dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU: 0
EIP: 0060:[<f8d7e32f>] Tainted: G VLI
EFLAGS: 00010246 (2.6.18-84.el5 #1)
EIP is at gfs2_block_map+0x8c0/0xb54 [gfs2]
eax: 00000001 ebx: 000000e8 ecx: 00000000 edx: 00000002
esi: f22d0ccc edi: f1c95b58 ebp: f22d0cf8 esp: f22d0c4c
ds: 007b es: 007b ss: 0068
Process xdoio (pid: 8143, ti=f22d0000 task=f7dd8aa0 task.ti=f22d0000)
Stack: f25c4c10 00000050 0000c34f 00000000 f1c96424 f5b83134 0267df00 f25c58b4
       c16393a0 00000001 f25d6ea4 f25d6240 f1cb2000 00000000 00000000 00000001
       00000000 00000000 00000001 00000001 00000003 00000004 00000000 056e8501
Call Trace:
 [<c0473352>] __block_prepare_write+0x1a2/0x433
 [<f8d8b518>] gfs2_log_reserve+0x11a/0x171 [gfs2]
 [<c04735f9>] block_prepare_write+0x16/0x23
 [<f8d7da6f>] gfs2_block_map+0x0/0xb54 [gfs2]
 [<f8d8e832>] gfs2_prepare_write+0x2ba/0x31e [gfs2]
 [<f8d7da6f>] gfs2_block_map+0x0/0xb54 [gfs2]
 [<c04565fb>] generic_file_buffered_write+0x226/0x5a2
 [<c042db0b>] __mod_timer+0x99/0xa3
 [<c042a131>] current_fs_time+0x4a/0x55
 [<c0456e1d>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<f8d874ff>] gfs2_glock_dq+0x9e/0xb2 [gfs2]
 [<f8d861d6>] gfs2_glock_put+0x1b/0x100 [gfs2]
 [<f8d873da>] gfs2_holder_uninit+0xb/0x1b [gfs2]
 [<c045707b>] generic_file_write+0x0/0x94
 [<c0456fd1>] __generic_file_write_nolock+0x86/0x9a
 [<c0435fc7>] autoremove_wake_function+0x0/0x2d
 [<c0471459>] remote_llseek+0xb1/0xbb
 [<f8d8fe04>] gfs2_llseek+0x81/0x91 [gfs2]
 [<c06085ae>] mutex_lock+0xb/0x19
 [<c04570b5>] generic_file_write+0x3a/0x94
 [<c045707b>] generic_file_write+0x0/0x94
 [<c0470ff7>] vfs_write+0xa1/0x143
 [<c04715e9>] sys_write+0x3c/0x63
 [<c0404eff>] syscall_call+0x7/0xb
 =======================
Code: 00 66 8b 7d fe 8b 44 24 10 83 7c 24 58 02 66 89 7c 24 6a 8b b8 88 01 00 00 19 db 8b 06 81 e3 d0 00 00 00 83 c3 18 83 7c 24 58 00 <8b> 40 18 89 44 24 28 75 08 0f 0b 83 01 4d c4 d9 f8 83 7e 04 00
EIP: [<f8d7e32f>] gfs2_block_map+0x8c0/0xb54 [gfs2] SS:ESP 0068:f22d0c4c
 <0>Kernel panic - not syncing: Fatal exception

morph-04::
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000019
 printing eip:
f8d7c32f
*pde = 75af1067
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: lock_dlm(U) gfs2(U) dlm configfs autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api dm_multipath video sbs backlight i2c_ec button battery asus_acpi ac lp floppy ide_cd pcspkr i2c_i801 e7xxx_edac parport_pc i2c_core cdrom edac_mc parport intel_rng sg e1000 dm_snapshot dm_zero dm_mirror dm_mod qla2xxx scsi_transport_fc ata_piix libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU: 1
EIP: 0060:[<f8d7c32f>] Tainted: G VLI
EFLAGS: 00010246 (2.6.18-84.el5 #1)
EIP is at gfs2_block_map+0x8c0/0xb54 [gfs2]
eax: 00000001 ebx: 000000e8 ecx: 00000000 edx: 00000002
esi: f22b9ccc edi: f2551b58 ebp: f22b9cf8 esp: f22b9c4c
ds: 007b es: 007b ss: 0068
Process xdoio (pid: 8126, ti=f22b9000 task=f74aa550 task.ti=f22b9000)
Stack: f20e9e10 00000050 0000c34f 00000000 f56eca00 f5ccb134 0267df00 f20e8aa8
       c1657360 00000001 f4d01b30 f51133e0 f1c7e000 00000000 00000000 00000001
       00000000 00000000 00000001 00000001 00000003 00000004 00000000 08254690
Call Trace:
 [<c0473352>] __block_prepare_write+0x1a2/0x433
 [<f8d89518>] gfs2_log_reserve+0x11a/0x171 [gfs2]
 [<c04735f9>] block_prepare_write+0x16/0x23
 [<f8d7ba6f>] gfs2_block_map+0x0/0xb54 [gfs2]
 [<f8d8c832>] gfs2_prepare_write+0x2ba/0x31e [gfs2]
 [<f8d7ba6f>] gfs2_block_map+0x0/0xb54 [gfs2]
 [<c04565fb>] generic_file_buffered_write+0x226/0x5a2
 [<f8d853f9>] gfs2_glmutex_lock+0xf/0x77 [gfs2]
 [<c042db0b>] __mod_timer+0x99/0xa3
 [<c042a131>] current_fs_time+0x4a/0x55
 [<c0456e1d>] __generic_file_aio_write_nolock+0x4a6/0x52a
 [<f8d854ff>] gfs2_glock_dq+0x9e/0xb2 [gfs2]
 [<f8d841d6>] gfs2_glock_put+0x1b/0x100 [gfs2]
 [<f8d853da>] gfs2_holder_uninit+0xb/0x1b [gfs2]
 [<c045707b>] generic_file_write+0x0/0x94
 [<c0456fd1>] __generic_file_write_nolock+0x86/0x9a
 [<c0435fc7>] autoremove_wake_function+0x0/0x2d
 [<c0471459>] remote_llseek+0xb1/0xbb
 [<f8d8de04>] gfs2_llseek+0x81/0x91 [gfs2]
 [<c06085ae>] mutex_lock+0xb/0x19
 [<c04570b5>] generic_file_write+0x3a/0x94
 [<c045707b>] generic_file_write+0x0/0x94
 [<c0470ff7>] vfs_write+0xa1/0x143
 [<c04715e9>] sys_write+0x3c/0x63
 [<c0404eff>] syscall_call+0x7/0xb
 =======================
Code: 00 66 8b 7d fe 8b 44 24 10 83 7c 24 58 02 66 89 7c 24 6a 8b b8 88 01 00 00 19 db 8b 06 81 e3 d0 00 00 00 83 c3 18 83 7c 24 58 00 <8b> 40 18 89 44 24 28 75 08 0f 0b 83 01 4d a4 d9 f8 83 7e 04 00
EIP: [<f8d7c32f>] gfs2_block_map+0x8c0/0xb54 [gfs2] SS:ESP 0068:f22b9c4c
 <0>Kernel panic - not syncing: Fatal exception

Expected results:
Nodes should not panic.

Additional info:
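One way to read the three oopses, offered as a sketch and not as analysis from the reporter: the bytes marked `<8b> 40 18` on the Code: line are the faulting instruction. On 32-bit x86, opcode 0x8b is MOV r32, r/m32, and the ModRM byte 0x40 with displacement 0x18 selects a load from 0x18(%eax) into %eax. With eax holding 00000001 in every dump, that load targets 0x19, which matches the reported virtual address 00000019. The helper name `decode_mov_disp8` is made up for illustration and handles only this one encoding; which gfs2 structure field sits at offset 0x18 is not determined here.

```python
# Decode the faulting instruction bytes "8b 40 18" from the Code: line
# (32-bit x86) and compute the address that the load tried to read.

# Register numbering used by the ModRM byte in 32-bit mode.
REG_NAMES = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_mov_disp8(insn: bytes):
    """Decode a 3-byte `mov disp8(%base),%dest` instruction.

    Hypothetical helper: it only handles opcode 0x8b with a ModRM byte
    whose mod field is 01 (8-bit displacement) and no SIB byte.
    """
    opcode, modrm, disp = insn
    if opcode != 0x8B:
        raise ValueError("not MOV r32, r/m32")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm == 0b100:
        raise ValueError("unsupported addressing form")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REG_NAMES[reg], REG_NAMES[rm], disp


# Register values copied from the oops (identical on all three nodes).
registers = {"eax": 0x00000001, "ebx": 0x000000E8, "ecx": 0x0, "edx": 0x2}

dest, base, disp = decode_mov_disp8(bytes.fromhex("8b4018"))
fault_addr = (registers[base] + disp) & 0xFFFFFFFF
print(f"mov {disp:#x}(%{base}),%{dest} reads address {fault_addr:#010x}")
```

Under this reading, the value in eax was 1 where a valid pointer was expected, and the same value appears on morph-02, morph-03 and morph-04.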
Fixing the summary, since the panic appears to be in gfs2_block_map.
How reproducible:
Easily

Steps to reproduce:
1. d_io

The tags that were running on the nodes which failed first were rwransynclarge and rwrandirectsmall.
The command line for rwransynclarge is:

xiogen -S 24633 -f sync -i 30s -m random -s read,write,readv,writev -t 1b -T 40000b -F 400000b:/mnt/brawl/morph-01/rwransynclarge | xdoio -v

I ran it on one node and the node panicked right away.
This is due to the patch for bz #307091, which isn't yet in RHEL, so marking as a dup of the original bz.

*** This bug has been marked as a duplicate of 307091 ***