Created attachment 528604 [details]
Good boot '/var/log/messages' file

Description of problem:
Kernel 2.6.40.6-0.fc15.i686.PAE generates a massive number of kernel messages of the form '[Firmware Warn]: GHES: Failed to read error status block address for hardware error source', whereas kernel 2.6.38.6-26.rc1.fc15.i686.PAE does not.

Version-Release number of selected component (if applicable):
kernel-PAE-2.6.40.6-0.fc15.i686

How reproducible:
Always

Steps to Reproduce:
1. Install a minimal FC15 system. (Note: kernel-PAE-2.6.38.6-26.rc1.fc15.i686 is installed.)
2. Boot into the system.
3. Use yum to upgrade the system. (The kernel is upgraded to kernel-PAE-2.6.40.6-0.fc15.i686.)
4. Reboot into the new kernel.
5. # tail -f /var/log/messages

Actual results:
Massive numbers of '[Firmware Warn]: GHES: Failed to read error status block address for hardware error source' errors are logged.

Expected results:
No errors, as is the case when the system is booted under kernel-PAE-2.6.38.6-26.rc1.fc15.i686.

Additional info:
Comparing the 'messages' files from the two kernel boots (kernel-PAE-2.6.38.6-26.rc1.fc15.i686 is 'good.txt', kernel-PAE-2.6.40.6-0.fc15.i686 is 'bad.txt'), I see some trace output in the newer kernel boot. Please see the attached files.

Oct 17 12:38:53 www2 kernel: [ 0.008587] ------------[ cut here ]------------
Oct 17 12:38:53 www2 kernel: [ 0.008593] WARNING: at arch/x86/kernel/apic/apic.c:1237 setup_local_APIC+0xee/0x317()
Oct 17 12:38:53 www2 kernel: [ 0.008595] Hardware name: PowerEdge R310
Oct 17 12:38:53 www2 kernel: [ 0.008596] Modules linked in:
Oct 17 12:38:53 www2 kernel: [ 0.008598] Pid: 1, comm: swapper Not tainted 2.6.40.6-0.fc15.i686.PAE #1
Oct 17 12:38:53 www2 kernel: [ 0.008600] Call Trace:
Oct 17 12:38:53 www2 kernel: [ 0.008604] [<c07f548c>] ? printk+0x2d/0x2f
Oct 17 12:38:53 www2 kernel: [ 0.008608] [<c04436c5>] warn_slowpath_common+0x7c/0x91
Oct 17 12:38:53 www2 kernel: [ 0.008610] [<c07f05ad>] ? setup_local_APIC+0xee/0x317
Oct 17 12:38:53 www2 kernel: [ 0.008612] [<c07f05ad>] ? setup_local_APIC+0xee/0x317
Oct 17 12:38:53 www2 kernel: [ 0.008614] [<c04436fc>] warn_slowpath_null+0x22/0x24
Oct 17 12:38:53 www2 kernel: [ 0.008617] [<c07f05ad>] setup_local_APIC+0xee/0x317
Oct 17 12:38:53 www2 kernel: [ 0.008619] [<c07f548c>] ? printk+0x2d/0x2f
Oct 17 12:38:53 www2 kernel: [ 0.008622] [<c042244f>] ? bigsmp_setup_apic_routing+0x20/0x22
Oct 17 12:38:53 www2 kernel: [ 0.008627] [<c0aac5ee>] native_smp_prepare_cpus+0x22f/0x2d2
Oct 17 12:38:53 www2 kernel: [ 0.008630] [<c0a9e7eb>] kernel_init+0x5d/0x136
Oct 17 12:38:53 www2 kernel: [ 0.008633] [<c0a9e78e>] ? start_kernel+0x353/0x353
Oct 17 12:38:53 www2 kernel: [ 0.008636] [<c080303e>] kernel_thread_helper+0x6/0x10
Oct 17 12:38:53 www2 kernel: [ 0.008641] ---[ end trace a7919e7f17c0a725 ]---

and...

Oct 17 12:38:53 www2 kernel: [ 0.132101] ------------[ cut here ]------------
Oct 17 12:38:53 www2 kernel: [ 0.132109] WARNING: at arch/x86/kernel/apic/apic.c:1237 setup_local_APIC+0xee/0x317()
Oct 17 12:38:53 www2 kernel: [ 0.132110] Hardware name: PowerEdge R310
Oct 17 12:38:53 www2 kernel: [ 0.132111] Modules linked in:
Oct 17 12:38:53 www2 kernel: [ 0.132115] Pid: 0, comm: kworker/0:0 Tainted: G W 2.6.40.6-0.fc15.i686.PAE #1
Oct 17 12:38:53 www2 kernel: [ 0.132116] Call Trace:
Oct 17 12:38:53 www2 kernel: [ 0.132121] [<c07f548c>] ? printk+0x2d/0x2f
Oct 17 12:38:53 www2 kernel: [ 0.132125] [<c04436c5>] warn_slowpath_common+0x7c/0x91
Oct 17 12:38:53 www2 kernel: [ 0.132128] [<c07f05ad>] ? setup_local_APIC+0xee/0x317
Oct 17 12:38:53 www2 kernel: [ 0.132130] [<c07f05ad>] ? setup_local_APIC+0xee/0x317
Oct 17 12:38:53 www2 kernel: [ 0.132131] [<c04436fc>] warn_slowpath_null+0x22/0x24
Oct 17 12:38:53 www2 kernel: [ 0.132133] [<c07f05ad>] setup_local_APIC+0xee/0x317
Oct 17 12:38:53 www2 kernel: [ 0.132135] [<c07eb0ed>] ? fpu_init+0x77/0x95
Oct 17 12:38:53 www2 kernel: [ 0.132137] [<c07eccc3>] ? cpu_init+0x146/0x14e
Oct 17 12:38:53 www2 kernel: [ 0.132139] [<c07ef70a>] start_secondary+0x105/0x259
Oct 17 12:38:53 www2 kernel: [ 0.132141] ---[ end trace a7919e7f17c0a726 ]---

Lastly, I also see some ACPI errors in the 'bad.txt' file that I don't see in the 'good.txt' file.

Oct 17 12:38:53 www2 kernel: [ 36.395336] ACPI Error: No handler for Region [IPMI] (f242d540) [IPMI] (20110413/evregion-373)
Oct 17 12:38:53 www2 kernel: [ 36.395341] ACPI Error: Region IPMI (ID=7) has no handler (20110413/exfldio-292)
Oct 17 12:38:53 www2 kernel: [ 36.395345] ACPI Error: Method parse/execution failed [\_SB_.PMI0._GHL] (Node f2440768), AE_NOT_EXIST (20110413/psparse-536)
Oct 17 12:38:53 www2 kernel: [ 36.395353] ACPI Error: Method parse/execution failed [\_SB_.PMI0._PMC] (Node f2440708), AE_NOT_EXIST (20110413/psparse-536)
Oct 17 12:38:53 www2 kernel: [ 36.395363] ACPI Exception: AE_NOT_EXIST, Evaluating _PMC (20110413/power_meter-773)
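For anyone who wants to repeat the good/bad comparison on their own machine, here is a minimal sketch in Python. It is not the method used for this report; it assumes kernel lines look like the excerpts above (syslog prefix, then "kernel: [ seconds ]"), and the function names are made up for illustration. Lines that do not carry that prefix are compared verbatim, so non-kernel messages will mostly show up as differences.

```python
import re

# Assumed line shape, taken from the excerpts in this report:
#   "Oct 17 12:38:53 www2 kernel: [ 0.008587] <message>"
PREFIX_RE = re.compile(r"^.*?kernel: \[\s*\d+\.\d+\]\s*")


def normalize(line):
    """Drop the syslog prefix and kernel timestamp so boots can be compared."""
    return PREFIX_RE.sub("", line.strip())


def messages_only_in_bad(good_lines, bad_lines):
    """Return messages seen in the bad boot but not the good one.

    Order of first appearance is preserved and duplicates are dropped.
    """
    good = {normalize(line) for line in good_lines}
    seen = set()
    result = []
    for line in bad_lines:
        msg = normalize(line)
        if msg and msg not in good and msg not in seen:
            seen.add(msg)
            result.append(msg)
    return result
```

With the two attachments saved locally as good.txt and bad.txt, calling messages_only_in_bad(open("good.txt"), open("bad.txt")) should list the messages that appear only under the newer kernel.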
Created attachment 528605 [details] Bad boot '/var/log/messages' file
Have you found any solution to this problem? I have the same issue on my Dell PowerEdge after updating to Ubuntu Oneiric (3.0-based PAE kernel). https://bugs.launchpad.net/ubuntu/+bug/881164
Nothing... My issue is with Dell PowerEdge R310 servers. Still waiting for someone to acknowledge the bug report.
Any update on this????
Reported upstream, and to Dell.
Does the 2.6.41 update improve the situation at all? Two patches went in (9fb0bfe and b3b46d7) that might help.
Unfortunately no.
Similar problems running 2.6.41.1-1.fc15.i686.PAE on a Supermicro 5015B-MT. Built this system just this week. Now /var/log/messages is filling by the minute with these messages:

Nov 23 14:52:03 newparis kernel: [ 999.468008] [Firmware Warn]: GHES: Failed to read error status block address for hardware error source: 9.
Nov 23 14:52:03 newparis kernel: [ 999.476010] [Firmware Warn]: GHES: Failed to read error status block address for hardware error source: 10.

Hoping there's a solution (or an acceptable workaround) soon. More details (logs, dmidecode output, etc.) can be provided if that helps.
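To put a number on how fast the log is filling, a small Python sketch like the following can tally the warnings per hardware error source. It assumes the message text is exactly as quoted in this report; the function name is hypothetical.

```python
import re
from collections import Counter

# Message text as quoted in this report; the trailing number is the
# hardware error source id.
GHES_RE = re.compile(
    r"\[Firmware Warn\]: GHES: Failed to read error status block address "
    r"for hardware error source: (\d+)"
)


def count_ghes_warnings(lines):
    """Return a Counter mapping error source id -> number of warnings."""
    counts = Counter()
    for line in lines:
        match = GHES_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

For example, count_ghes_warnings(open("/var/log/messages", errors="replace")) gives one count per source id, which is handy when comparing kernels or confirming that a workaround has silenced the messages.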
We have rolled back to kernel-PAE-2.6.38.6-26.rc1.fc15, as this kernel doesn't generate these log entries.
(In reply to comment #9)
> We have rolled back to kernel-PAE-2.6.38.6-26.rc1.fc15, as this kernel doesn't
> generate these log entries.

You should be able to work around this problem by adding "ghes.disable=1" to the kernel boot options.
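After editing the boot options and rebooting, it is worth confirming that the option actually reached the kernel. A minimal Python sketch, assuming the running kernel's command line is readable from /proc/cmdline; the helper names are hypothetical.

```python
def has_kernel_param(cmdline, param="ghes.disable=1"):
    """Return True if param appears as a whole whitespace-separated token."""
    return param in cmdline.split()


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as fh:
        return fh.read()
```

Calling has_kernel_param(read_cmdline()) on the rebooted system should return True if the option was picked up. Matching whole tokens avoids a false positive on a longer value that merely starts with the same text.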
Both rolling back to 2.6.38.6 and running 2.6.41.1 with "ghes.disable=1" suffice as workarounds. It would be nice to see a real solution. I also understand this may require firmware update(s) from the vendor (Supermicro). Thanks.
Using "ghes.disable=1" has proven to be a successful workaround for this issue. Thanks.