Description of problem:
Server crashes sporadically.

Version-Release number of selected component (if applicable):
[root@koala ~]# uname -r
2.6.20-1.2952.fc6
[root@koala ~]# uname -a
Linux koala.horrynet.de 2.6.20-1.2952.fc6 #1 SMP Wed May 16 18:18:22 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

Dell PowerEdge 1800

How reproducible:

Steps to Reproduce:
1. seems to be a sporadic error

Actual results:
contents of /var/log/messages:

Jun 21 09:46:34 koala kernel: Unable to handle kernel paging request at ffff810034186688 RIP:
Jun 21 09:46:34 koala kernel:  [<ffffffff8033fb9a>] radix_tree_insert+0x10e/0x18c
Jun 21 09:46:34 koala kernel: PGD 8063 PUD 9063 PMD 80000000340001e3 PTE 700a0d62696c2f70
Jun 21 09:46:34 koala kernel: Oops: 0000 [1] SMP
Jun 21 09:46:34 koala kernel: last sysfs file: /devices/pci0000:00/0000:00:02.0/0000:01:00.2/0000:03:07.0/irq
Jun 21 09:46:34 koala kernel: CPU 3
Jun 21 09:46:34 koala kernel: Modules linked in: nfsd exportfs lockd nfs_acl autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror dm_multipath dm_mod video sbs i2c_ec i2c_core dock button battery asus_acpi backlight ac radeon drm ipv6 lp sg floppy e1000 e752x_edac serio_raw ide_cd iTCO_wdt edac_mc pcspkr parport_pc parport iTCO_vendor_support cdrom ata_piix libata mptspi mptscsih scsi_transport_spi mptbase shpchp aacraid sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Jun 21 09:46:34 koala kernel: Pid: 2508, comm: smbd Not tainted 2.6.20-1.2952.fc6 #1
Jun 21 09:46:34 koala kernel: RIP: 0010:[<ffffffff8033fb9a>]  [<ffffffff8033fb9a>] radix_tree_insert+0x10e/0x18c
Jun 21 09:46:34 koala kernel: RSP: 0018:ffff810040595b28  EFLAGS: 00010002
Jun 21 09:46:34 koala kernel: RAX: 2000000000000032 RBX: ffff8100cd6c9a28 RCX: 000000000003280c
Jun 21 09:46:34 koala kernel: RDX: ffff8100341864e0 RSI: 000000000003282e RDI: ffff8100cd6c9a28
Jun 21 09:46:34 koala kernel: RBP: ffff8100341864e0 R08: ffff81011fd19a90 R09: 0000000000000d73
Jun 21 09:46:34 koala kernel: R10: 0000000000000000 R11: 0000000000000002 R12: 0000000000000032
Jun 21 09:46:34 koala kernel: R13: 0000000000000002 R14: 0000000000000006 R15: ffff810002385d00
Jun 21 09:46:34 koala kernel: FS:  00002aaaae93edc0(0000) GS:ffff81011fd19940(0000) knlGS:0000000000000000
Jun 21 09:46:34 koala kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 21 09:46:34 koala kernel: CR2: ffff810034186688 CR3: 0000000041807000 CR4: 00000000000006e0
Jun 21 09:46:34 koala kernel: Process smbd (pid: 2508, threadinfo ffff810040594000, task ffff81010fd05040)
Jun 21 09:46:34 koala kernel: Stack:  000000000003282e ffff810002385d00 0000000000000000 ffff8100cd6c9a20
Jun 21 09:46:34 koala kernel:  000000000003282e 0000555555b7ea64 ffff810040595ee8 ffffffff8020c353
Jun 21 09:46:34 koala kernel:  0000000000000000 000000000003282e 0000000000000000 0000000000001000
Jun 21 09:46:34 koala kernel: Call Trace:
Jun 21 09:46:34 koala kernel:  [<ffffffff8020c353>] add_to_page_cache+0x3d/0x89
Jun 21 09:46:34 koala kernel:  [<ffffffff8020fcfc>] generic_file_buffered_write+0x1c4/0x6fd
Jun 21 09:46:34 koala kernel:  [<ffffffff8021606f>] __generic_file_aio_write_nolock+0x378/0x3eb
Jun 21 09:46:34 koala kernel:  [<ffffffff80260b4d>] lock_kernel+0x2c/0x48
Jun 21 09:46:34 koala kernel:  [<ffffffff80221559>] generic_file_aio_write+0x61/0xc1
Jun 21 09:46:34 koala kernel:  [<ffffffff8803718e>] :ext3:ext3_file_write+0x16/0x94
Jun 21 09:46:34 koala kernel:  [<ffffffff80217bae>] do_sync_write+0xc9/0x10c
Jun 21 09:46:34 koala kernel:  [<ffffffff802609ca>] _write_unlock_irq+0x9/0xc
Jun 21 09:46:34 koala kernel:  [<ffffffff80297bb4>] autoremove_wake_function+0x0/0x2e
Jun 21 09:46:34 koala kernel:  [<ffffffff8023894c>] fcntl_setlk+0x232/0x25f
Jun 21 09:46:34 koala kernel:  [<ffffffff8021649f>] vfs_write+0xce/0x177
Jun 21 09:46:34 koala kernel:  [<ffffffff80240e61>] sys_pwrite64+0x50/0x70
Jun 21 09:46:34 koala kernel:  [<ffffffff8022dc8b>] sys_fcntl+0x2da/0x2e6
Jun 21 09:46:34 koala kernel:  [<ffffffff8025a11e>] system_call+0x7e/0x83
Jun 21 09:46:34 koala kernel:
Jun 21 09:46:34 koala kernel:
Jun 21 09:46:34 koala kernel: Code: 48 8b 54 c2 18 45 85 ed 75 a6 48 85 d2 b8 ef ff ff ff 75 5e
Jun 21 09:46:34 koala kernel: RIP  [<ffffffff8033fb9a>] radix_tree_insert+0x10e/0x18c
Jun 21 09:46:35 koala kernel:  RSP <ffff810040595b28>
Jun 21 09:46:35 koala kernel: CR2: ffff810034186688
Jun 21 09:46:35 koala kernel:  <3>BUG: sleeping function called from invalid context at kernel/rwsem.c:20
Jun 21 09:46:35 koala kernel: in_atomic():0, irqs_disabled():1
Jun 21 09:46:35 koala kernel:
Jun 21 09:46:35 koala kernel: Call Trace:
Jun 21 09:46:35 koala kernel:  [<ffffffff80299d90>] down_read+0x15/0x23
Jun 21 09:46:35 koala kernel:  [<ffffffff802a6065>] acct_collect+0x42/0x18e
Jun 21 09:46:35 koala kernel:  [<ffffffff802151ea>] do_exit+0x20b/0x832
Jun 21 09:46:35 koala kernel:  [<ffffffff80262db8>] do_page_fault+0x74f/0x7ca
Jun 21 09:46:35 koala kernel:  [<ffffffff88021c3a>] :jbd:do_get_write_access+0x4d5/0x507
Jun 21 09:46:35 koala kernel:  [<ffffffff80260eed>] error_exit+0x0/0x84
Jun 21 09:46:35 koala kernel:  [<ffffffff8033fb9a>] radix_tree_insert+0x10e/0x18c
Jun 21 09:46:35 koala kernel:  [<ffffffff8020c353>] add_to_page_cache+0x3d/0x89
Jun 21 09:46:35 koala kernel:  [<ffffffff8020fcfc>] generic_file_buffered_write+0x1c4/0x6fd
Jun 21 09:46:35 koala kernel:  [<ffffffff8021606f>] __generic_file_aio_write_nolock+0x378/0x3eb
Jun 21 09:46:35 koala kernel:  [<ffffffff80260b4d>] lock_kernel+0x2c/0x48
Jun 21 09:46:35 koala kernel:  [<ffffffff80221559>] generic_file_aio_write+0x61/0xc1
Jun 21 09:46:35 koala kernel:  [<ffffffff8803718e>] :ext3:ext3_file_write+0x16/0x94
Jun 21 09:46:35 koala kernel:  [<ffffffff80217bae>] do_sync_write+0xc9/0x10c
Jun 21 09:46:35 koala kernel:  [<ffffffff802609ca>] _write_unlock_irq+0x9/0xc
Jun 21 09:46:35 koala kernel:  [<ffffffff80297bb4>] autoremove_wake_function+0x0/0x2e
Jun 21 09:46:35 koala kernel:  [<ffffffff8023894c>] fcntl_setlk+0x232/0x25f
Jun 21 09:46:35 koala kernel:  [<ffffffff8021649f>] vfs_write+0xce/0x177
Jun 21 09:46:35 koala kernel:  [<ffffffff80240e61>] sys_pwrite64+0x50/0x70
Jun 21 09:46:35 koala kernel:  [<ffffffff8022dc8b>] sys_fcntl+0x2da/0x2e6
Jun 21 09:46:35 koala kernel:  [<ffffffff8025a11e>] system_call+0x7e/0x83
Jun 21 09:46:35 koala kernel:
Jun 21 09:52:39 koala syslogd 1.4.1: restart.
49 63 c4          movslq %r12d,%rax
48 8b 54 c2 18    mov    0x18(%rdx,%rax,8),%rdx

[movslq is objdump's AT&T-syntax name for opcode 49 63; in Intel syntax the same instruction is movsxd]

r12 == 0000000000000032
rax == 2000000000000032

This looks like a broken CPU to me: the bottom 32 bits of r12 were moved with sign extension to rax, but one bit (bit 61) is now wrong in rax.
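The register comparison above can be checked with a few lines of arithmetic. The following is only a sketch in Python, using the register values from the oops; the helper names (sign_extend_32_to_64, effective_address) are made up for illustration and are not kernel or objdump functions. It models what a sign-extending 32-to-64-bit move should produce from r12, lists the bits in which the dumped rax differs, and evaluates the address of the following load, 0x18(%rdx,%rax,8), in 64-bit wrap-around arithmetic.

```python
MASK64 = (1 << 64) - 1

def sign_extend_32_to_64(value):
    """Model a sign-extending move of the low 32 bits into a 64-bit register."""
    low = value & 0xFFFFFFFF
    if low & 0x80000000:
        # Bit 31 set: the upper 32 bits are filled with ones.
        low |= 0xFFFFFFFF00000000
    return low

def effective_address(base, index):
    """Address of 0x18(base,index,8), truncated to 64 bits as the CPU would."""
    return (base + index * 8 + 0x18) & MASK64

# Register values copied from the oops.
r12 = 0x0000000000000032
rax_observed = 0x2000000000000032
rdx = 0xFFFF8100341864E0

rax_expected = sign_extend_32_to_64(r12)

# XOR exposes every bit position where the observed value is wrong.
diff = rax_expected ^ rax_observed
flipped_bits = [bit for bit in range(64) if (diff >> bit) & 1]

addr_observed = effective_address(rdx, rax_observed)
addr_expected = effective_address(rdx, rax_expected)

print("expected rax :", format(rax_expected, "016x"))
print("observed rax :", format(rax_observed, "016x"))
print("flipped bits :", flipped_bits)
print("load address :", format(addr_observed, "016x"))
```

Under these assumptions the two rax values differ in exactly one bit. Because the index is scaled by 8, that bit is shifted out of the 64-bit address, so the load address comes out the same for both values and equals the CR2 value reported in the oops.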
Yesterday I ran the full Dell hardware test for the PE 1800 (in extended mode) and everything was OK, no errors. Is there another tool I can test the CPUs with? Do you think the Dell test tools might not be able to detect such an error?

[root@koala proc]# cat cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 3
cpu MHz         : 2992.495
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips        : 5989.22
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 3
cpu MHz         : 2992.495
cache size      : 2048 KB
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips        : 5985.05
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 3
cpu MHz         : 2992.495
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips        : 5985.12
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Xeon(TM) CPU 3.00GHz
stepping        : 3
cpu MHz         : 2992.495
cache size      : 2048 KB
physical id     : 3
siblings        : 2
core id         : 0
cpu cores       : 1
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl cid cx16 xtpr
bogomips        : 5985.05
clflush size    : 64
cache_alignment : 128
address sizes   : 36 bits physical, 48 bits virtual
power management:
memtest86 is a good start for testing. You can run it by booting the Fedora install CD/DVD and selecting it.

Also:
http://www.ibm.com/developerworks/library/l-hw1/

Many people swear by this one, but it could overheat the system:
http://pages.sbcglobal.net/redelm/
(This is a mass-update to all current FC6 kernel bugs in NEW state)

Hello,

I'm reviewing this bug list as part of the kernel bug triage project, an attempt to isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug; however, this version of Fedora is no longer maintained.

Please attempt to reproduce this bug with a current version of Fedora (presently Fedora 8). If the bug no longer exists, please close the bug; I'll do so in a few days if no further information is lodged.

Thanks for using Fedora!
Per the previous comment in this bug, I am closing it as INSUFFICIENT_DATA, since no information has been lodged for over 30 days. Please re-open this bug or file a new one if you can provide the requested data, and thanks for filing the original report!