The issue here isn't the call to dmi_scan_*, as mentioned in jkarhune's comment #5 in the original BZ. The problem is that one of the args to strncmp is NULL, and that leads to a NULL dereference. P.
Created attachment 303533 [details] RHEL5 fix for this issue [1/2]
This bugzilla has Keywords: Regression. Since no regressions are allowed between releases, it is also being proposed as a blocker for this release. Please resolve ASAP.
Created attachment 303534 [details] RHEL5 fix for this issue [2/2]
*** Bug 448937 has been marked as a duplicate of this bug. ***
Panic seen on systems that do not support DMI or have busted DMI tables. Panic occurs during powernowk8 driver load, and occurs on both AMD and Intel systems.

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
 printing eip:
c041041c
*pde = 00000000
Oops: 0000 [#1]
SMP
last sysfs file:
Modules linked in:
CPU:    3
EIP:    0060:[<c041041c>]    Not tainted VLI
EFLAGS: 00010202   (2.6.18-88.el5 #1)
EIP is at powernowk8_init+0x5e/0x1c2
eax: 00000000   ebx: 00000000   ecx: 0000000e   edx: 00000020
esi: 00000000   edi: c06242c3   ebp: 00000000   esp: dfaa0fa0
ds: 007b   es: 007b   ss: 0068
Process swapper (pid: 1, ti=dfaa0000 task=dfa9faa0 task.ti=dfaa0000)
Stack: 00000000 c071bb38 00000000 c06ec5a8 c06e7fd8 c0404dee 00000202 c06ec42b
       00000000 00000000 00000000 00000000 00000000 00000000 c06ec42b 00000000
       00000000 c0405c3b 00000000 00000000 00000000 00000000 00000000 00000000
Call Trace:
 [<c06ec5a8>] init+0x17d/0x24a
 [<c0404dee>] ret_from_fork+0x6/0x1c
 [<c06ec42b>] init+0x0/0x24a
 [<c06ec42b>] init+0x0/0x24a
 [<c0405c3b>] kernel_thread_helper+0x7/0x10
 =======================
Code: 83 3d 20 41 67 c0 01 75 40 83 3d 84 d4 76 c0 00 75 37 b8 01 00 00 00 bf c3 42 62 c0 e8 96 11 19 00 b9 0f 00 00 00 89 c6 49 78 08 <ac> ae 75 08 84 c0 75 f5 31 c0 eb 04 19 c0 0c 01 85 c0 75 0a c7
EIP: [<c041041c>] powernowk8_init+0x5e/0x1c2 SS:ESP 0068:dfaa0fa0
 <0>Kernel panic - not syncing: Fatal exception

The panic is due to a missing NULL check on the return from a dmi_scan_* call. The original patch did not include this check. A fix for this issue is patch 1/2 attached to this BZ.

Patch 2/2, also attached to this BZ, prevents Intel boxes from running any part of the powernowk8 code, other than the check to see if the box is an AMD box or an Intel box. This diverges significantly from upstream; however, we have already diverged significantly from upstream in the init function and other areas of this driver.

Both patches have been POSTed for review and will likely be included in RHEL5.3.

P.
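For anyone following along, the kind of check described for patch 1/2 can be sketched in plain userspace C. This is only an illustration, not the attached patch: lookup_dmi_string() and dmi_string_has_prefix() are hypothetical names, and the vendor string is made up. The point is simply that the lookup result is tested for NULL before it reaches strncmp.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a dmi_scan_* lookup: returns NULL when the
 * machine has no usable DMI tables, as on the affected systems. */
static const char *lookup_dmi_string(int dmi_available)
{
    return dmi_available ? "ExampleVendor Board" : NULL;
}

/* Returns 1 only when a string was found and it starts with the prefix.
 * The NULL test comes first, so strncmp never sees a NULL argument. */
static int dmi_string_has_prefix(const char *value, const char *prefix)
{
    if (value == NULL || prefix == NULL)
        return 0;
    return strncmp(value, prefix, strlen(prefix)) == 0;
}
```

With this shape, a box without DMI data takes the "no match" path instead of dereferencing NULL during driver load.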
We are seeing this error on a lot of older hardware (usually PIIs and PIIIs), as reported here: http://bugs.centos.org/view.php?id=2912. Is there any chance this will get fixed in a kernel update during 5.2? This bug prevents people from applying kernel security updates, since they have to stay at 2.6.18-53.1.21 (the last working kernel).
This also affects fresh installs from the install CDs. On my DecTOP with an AMD GEODE chip, I would have to install 5.1, upgrade to 5.2 and then fall back to the 2.6.18-53.1.21 kernel. So not only do we need an updated kernel, we need new install CDs. The message I get from the kernel panic is: <7> spurious 8259A interrupt: IRQ7
in kernel-2.6.18-95.el5

You can download this test kernel from http://people.redhat.com/dzickus/el5
I can confirm that on my affected hardware (c.f. bug #439292) 2.6.18-95.el5 boots just fine.
Linux koala.lan 2.6.18-95.el5 #1 SMP Thu Jul 3 20:54:13 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux

Boots and runs fine on this machine (which was unable to boot the -92.1.6.el5).
One side effect of this kernel (2.6.18-95.el5) is that my vmware-server setup is shot to hell: a virtual machine that previously took under 2 minutes to boot is now taking up to 15 minutes. I've checked the physical drive I/O rate on the host and things seem fine. I am running 2.6.18-95.el5 on the host machine; the VM is EL4 (both x86_64).

I am running:

[kbsingh@koala ~]$ rpm -q VMware-server
VMware-server-1.0.4-56528

[kbsingh@koala ~]$ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 37
model name      : AMD Opteron(tm) Processor 250
stepping        : 1
cpu MHz         : 1000.000
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips        : 2009.84
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

processor       : 1
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 37
model name      : AMD Opteron(tm) Processor 250
stepping        : 1
cpu MHz         : 2411.146
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm
bogomips        : 4821.51
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp

[kbsingh@koala ~]$ free
             total       used       free     shared    buffers     cached
Mem:       6093108     879044    5214064          0      32488     495524
-/+ buffers/cache:     351032    5742076
Swap:      4192956          0    4192956

And I am seeing things like this in top:

 3807 kbsingh    5 -10  240m 118m 110m S  178  2.0   3:28.29 vmware-vmx

If it matters, this machine is built around an MS-9620 MicroStar motherboard (dmidecode dump attached with bug report).

Not sure if this is an issue with VMware itself, and something that needs to be reported there, but if any other info is required I'd be happy to provide feedback.
Created attachment 311493 [details] dmidecode output from MS9620 MoBo
Karanbir, could you open up a *new* bugzilla with that information please and add me to the cc list? That seems like a completely new issue. Thanks, P.
This patch works on my DECtop AMD Geode based systems. Thanks. Now can we have an ISO for the 1st CD built with this kernel, so I can do straight 5.2 installs on problem hardware?
(In reply to comment #21)
> This patch works on my DECtop AMD Geode based systems.
>
> Thanks. Now can we have an ISO for the 1st CD built with this kernel, so I can do
> straight 5.2 installs on problem hardware?

Red Hat does not respin ISOs in between releases. You can install an earlier version of RHEL5 and upgrade to the latest kernel.

P.
We have created a kernel that fixes this issue here for CentOS users: http://people.centos.org/hughesjr/kernel/5/bz_pre53/ The goal of this kernel is to keep a kernel as close as possible to the "released version" while fixing major issues for CentOS users that will be rolled into the 5.3 kernel. We will keep this version up to date with security patches as they are released.
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team. New Contents: (x86) The powernowk8 driver was not performing sufficient checks on the number of running CPUs. Consequently, when the driver was started, a kernel oops error message may have been reported. In this update the powernowk8 driver verifies that the number of supported CPUs (supported_cpus) equals the number of online CPUs (num_online_cpus), which resolves this issue.
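The check described in the release note can be sketched roughly as follows. This is a userspace illustration under stated assumptions, not the driver code: the per-CPU support flags are passed in as a plain array, count_supported_cpus() and powernow_precheck() are hypothetical helper names, and -1 stands in for whatever error code the driver actually returns.

```c
#include <stddef.h>

/* Count how many of the online CPUs the (hypothetical) probe marked as
 * supported. cpu_supported[i] is nonzero when CPU i passed the probe. */
static size_t count_supported_cpus(const int *cpu_supported, size_t num_online)
{
    size_t supported = 0;
    for (size_t i = 0; i < num_online; i++) {
        if (cpu_supported[i])
            supported++;
    }
    return supported;
}

/* Returns 0 when every online CPU is supported, -1 otherwise, so the
 * caller can bail out before touching any per-CPU state. */
static int powernow_precheck(const int *cpu_supported, size_t num_online)
{
    if (count_supported_cpus(cpu_supported, num_online) != num_online)
        return -1;
    return 0;
}
```

Refusing to load when the two counts differ means the driver never runs on a partially supported set of CPUs.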
I understand the patch(es) that fix the bug reported here were added to kernel 2.6.18-92.1.7.el5. That was not clear from just reading this bugzilla thread. Should the status "ON_QA" be updated to reflect the fact that the fix is already in 5.2?
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2009-0225.html