From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461)

Description of problem:
Last night one of our Compaq ML570 machines hung due to a kernel error. Please see additional information on the log file produced. I am not sure what caused this error, but it looks like a kernel bug somewhere. We have a qlogic 2200 Fiber Card, a Compaq Smart2 Array controller and a Compaq CISS Raid Controller. If you need any more configuration information, I will be happy to supply it.

Version-Release number of selected component (if applicable):

How reproducible:
Didn't try

Additional info:

Feb 6 17:57:34 www2 kernel: ------------[ cut here ]------------
Feb 6 17:57:34 www2 kernel: kernel BUG at /usr/src/build/48551-i686/BUILD/kernel-2.4.9/linux/include/asm/pci.h:145!
Feb 6 17:57:34 www2 kernel: invalid operand: 0000
Feb 6 17:57:34 www2 kernel: CPU: 0
Feb 6 17:57:34 www2 kernel: EIP: 0010:[e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1489448/96] Not tainted
Feb 6 17:57:34 www2 kernel: EIP: 0010:[<f88385d8>] Not tainted
Feb 6 17:57:34 www2 kernel: EFLAGS: 00010082
Feb 6 17:57:34 www2 kernel: eax: 00000058 ebx: 00000030 ecx: 00000001 edx: 00002ef2
Feb 6 17:57:34 www2 kernel: esi: 00000006 edi: 00000008 ebp: 00000001 esp: f7e3feb4
Feb 6 17:57:34 www2 kernel: ds: 0018 es: 0018 ss: 0018
Feb 6 17:57:34 www2 kernel: Process bdflush (pid: 6, stackpage=f7e3f000)
Feb 6 17:57:34 www2 kernel: Stack: f883b540 00000091 c0f8a000 00000000 f6d4cc00 00000000 00008000 00008000
Feb 6 17:57:34 www2 kernel: f6d4cd60 c0f8b160 f707007c 000080e0 f883407f f707007c f6d4cd60 00000001
Feb 6 17:57:34 www2 kernel: f7078160 000080e0 f8839dae f707007c c0f8b160 f7070080 00000000 00000000
Feb 6 17:57:34 www2 kernel: Call Trace: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1477312/96] qla2100_setup [qla2x00] 0x700
Feb 6 17:57:34 www2 kernel: Call Trace: [<f883b540>] qla2100_setup [qla2x00] 0x700
Feb 6 17:57:34 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1507201/96] qla2100_next [qla2x00] 0x4f
Feb 6 17:57:34 www2 kernel: [<f883407f>] qla2100_next [qla2x00] 0x4f
Feb 6 17:57:34 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1483346/96] qla2100_restart_queues [qla2x00] 0xde
Feb 6 17:57:34 www2 kernel: [<f8839dae>] qla2100_restart_queues [qla2x00] 0xde
Feb 6 17:57:34 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1512100/96] qla2100_queuecommand [qla2x00] 0x1dc
Feb 6 17:57:34 www2 kernel: [<f8832d5c>] qla2100_queuecommand [qla2x00] 0x1dc
Feb 6 17:57:35 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1602560/96] revalidate_scsidisk [sd_mod] 0xd20
Feb 6 17:57:35 www2 kernel: [<f881cc00>] revalidate_scsidisk [sd_mod] 0xd20
Feb 6 17:57:35 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1718729/96] scsi_release_command_R4808d578 [scsi_mod] 0x2b7
Feb 6 17:57:35 www2 kernel: [<f8800637>] scsi_release_command_R4808d578 [scsi_mod] 0x2b7
Feb 6 17:57:35 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1693616/96] scsi_sleep_R35962bf8 [scsi_mod] 0x1580
Feb 6 17:57:35 www2 kernel: [<f8806850>] scsi_sleep_R35962bf8 [scsi_mod] 0x1580
Feb 6 17:57:35 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1602560/96] revalidate_scsidisk [sd_mod] 0xd20
Feb 6 17:57:35 www2 kernel: [<f881cc00>] revalidate_scsidisk [sd_mod] 0xd20
Feb 6 17:57:35 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1687012/96] scsi_io_completion_R2bda9a0b [scsi_mod] 0x79c
Feb 6 17:57:35 www2 kernel: [<f880821c>] scsi_io_completion_R2bda9a0b [scsi_mod] 0x79c
Feb 6 17:57:35 www2 kernel: [e100:__insmod_e100_O/lib/modules/2.4.9-13/kernel/drivers/addon/e+-1602560/96] revalidate_scsidisk [sd_mod] 0xd20
Feb 6 17:57:35 www2 kernel: [<f881cc00>] revalidate_scsidisk [sd_mod] 0xd20
Feb 6 17:57:35 www2 kernel: [generic_unplug_device+30/48] generic_unplug_device [kernel] 0x1e
Feb 6 17:57:35 www2 kernel: [<c018119e>] generic_unplug_device [kernel] 0x1e
Feb 6 17:57:35 www2 kernel: [__run_task_queue+73/96] __run_task_queue [kernel] 0x49
Feb 6 17:57:35 www2 kernel: [<c011b8d9>] __run_task_queue [kernel] 0x49
Feb 6 17:57:35 www2 kernel: [bdflush+156/176] bdflush [kernel] 0x9c
Feb 6 17:57:35 www2 kernel: [<c013a83c>] bdflush [kernel] 0x9c
Feb 6 17:57:35 www2 kernel: [_stext+0/48] stext [kernel] 0x0
Feb 6 17:57:35 www2 kernel: [<c0105000>] stext [kernel] 0x0
Feb 6 17:57:35 www2 kernel: [_stext+0/48] stext [kernel] 0x0
Feb 6 17:57:35 www2 kernel: [<c0105000>] stext [kernel] 0x0
Feb 6 17:57:35 www2 kernel: [kernel_thread+38/48] kernel_thread [kernel] 0x26
Feb 6 17:57:35 www2 kernel: [<c0105726>] kernel_thread [kernel] 0x26
Feb 6 17:57:35 www2 kernel: [bdflush+0/176] bdflush [kernel] 0x0
Feb 6 17:57:35 www2 kernel: [<c013a7a0>] bdflush [kernel] 0x0
Feb 6 17:57:35 www2 kernel:
Feb 6 17:57:35 www2 kernel:
Feb 6 17:57:35 www2 kernel: Code: 0f 0b 58 5a 8b 14 24 8b 04 1a 85 c0 74 11 05 00 00 00 40 31
Hmmm, this oops seems half mutilated (yes, I know syslog sometimes does that, grrr). It seems to come from either the qlogic driver or the e100 driver you are using. I know both have some serious issues (for the qlogic driver, we've added a newer, fixed version in the 2.4.9-21 kernel we released). Please try using eepro100 instead of e100!
Does this still happen with eepro100 and qla2200 (from 2.4.9-21)?
The problem has not recurred since I changed the NIC module from e100 to eepro100. I did not change the qla2x00 driver, as it came from QLogic, nor did I upgrade the kernel. Since it appears that the NIC module was the cause, I am going to upgrade to the latest kernel. Thanks for your help.