Bug 472844 - kernel panic when modprobe -r acpi_cpufreq on centrino platform with kernel newer than 2.6.18-118
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.3
Hardware: i386
OS: Linux
Priority: medium
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Prarit Bhargava
QA Contact: Martin Jenner
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-11-25 01:17 UTC by Zhang Kexin
Modified: 2009-01-20 20:18 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 20:18:59 UTC
Target Upstream Version:
Embargoed:


Attachments
dmesg (30.00 KB, application/octet-stream), 2008-11-25 03:25 UTC, Zhang Kexin
cpuinfo (2.32 KB, application/octet-stream), 2008-11-25 03:27 UTC, Zhang Kexin
content of "cat sys/devices/system/cpu/cpu*/cpufreq/*" (916 bytes, application/octet-stream), 2008-11-25 03:30 UTC, Zhang Kexin
RHEL5 fix for this issue (868 bytes, patch), 2008-12-01 14:47 UTC, Prarit Bhargava


Links
System: Red Hat Product Errata
ID: RHSA-2009:0225
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Enterprise Linux 5.3 kernel security and bug fix update
Last Updated: 2009-01-20 16:06:24 UTC

Description Zhang Kexin 2008-11-25 01:17:49 UTC
Description of problem:


Version-Release number of selected component (if applicable):
2.6.18-119.el5PAE

How reproducible:
always

Steps to Reproduce:
1. modprobe -r acpi_cpufreq
  
Actual results:
kernel panic

Expected results:
The module is unloaded correctly, or modprobe reports that it cannot be unloaded because it is in use.

Additional info:
The machine is a Centrino machine. On kernels up to and including 2.6.18-118, speedstep-centrino is used and is compiled into the kernel, so acpi_cpufreq is not used on these kernels. If I deliberately run "modprobe acpi_cpufreq", it fails with "FATAL: Error inserting acpi_cpufreq (/lib/modules/2.6.18-119.el5PAE/kernel/arch/i386/kernel/cpu/cpufreq/acpi_cpufreq.ko): Device or resource busy".

On kernels newer than 2.6.18-118, the acpi_cpufreq driver was updated to support most Centrino systems and the centrino driver is compiled as a module, so acpi_cpufreq is used as the cpufreq driver by default. See bz 449787.
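
To confirm which driver a given kernel actually bound, the cpufreq sysfs tree can be read directly. The following is only a sketch: the function name and its optional root argument are hypothetical (the argument exists so the function can be pointed at a copy of the tree), and it assumes the usual scaling_driver attribute is present under each cpu*/cpufreq directory.

```shell
#!/bin/sh
# Print the cpufreq driver bound to each CPU, one "cpuN: driver" per line.
# $1 is a hypothetical override for the sysfs mount point (default /sys).
cpufreq_drivers() {
    root="${1:-/sys}"
    for f in "$root"/devices/system/cpu/cpu*/cpufreq/scaling_driver; do
        # Skip CPUs without a cpufreq directory (or an unmatched glob).
        [ -r "$f" ] || continue
        cpu=${f%/cpufreq/scaling_driver}
        printf '%s: %s\n' "${cpu##*/}" "$(cat "$f")"
    done
}
```

Running `cpufreq_drivers` with no argument reads the live /sys tree; no output means no CPU has a cpufreq driver bound.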

Comment 1 Zhang Kexin 2008-11-25 01:26:32 UTC
The machine I used is an RHTS machine named hp-xw4800-01.rhts.bos.redhat.com. Kernel 2.6.18-124 has the same problem.

Comment 2 Zhang Kexin 2008-11-25 01:27:22 UTC
panic info:

[root@hp-xw4800-01 ~]# BUG: unable to handle kernel paging request at virtual address 076129ff
 printing eip:
c046d45c
*pde = 00000000
Oops: 0000 [#1]
SMP 
last sysfs file: /devices/pci0000:3f/0000:3f:00.0/irq
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 xfrm_nalgo crypto_api cpufreq_ondemand acpi_cpufreq dm_multipath scsi_dh video backlight sbs i2c_ec i2c_core button battery asus_acpi ac parport_pc lp parport floppy sr_mod cdrom snd_hda_intel snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device tg3 libphy snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_hwdep snd sg soundcore pcspkr serio_raw dm_snapshot dm_zero dm_mirror dm_log dm_mod ahci libata sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    3
EIP:    0060:[<c046d45c>]    Not tainted VLI
EFLAGS: 00010297   (2.6.18-119.el5PAE #1) 
EIP is at free_percpu+0x12/0x36
eax: 00000000   ebx: 00000000   ecx: 00000202   edx: 00000000
esi: 076129ff   edi: 00000000   ebp: f6856000   esp: f6856f54
ds: 007b   es: 007b   ss: 0068
Process modprobe (pid: 3373, ti=f6856000 task=f6ec4000 task.ti=f6856000)
Stack: f89ec380 00000020 c043d86f 69706361 7570635f 71657266 c0448000 00000000 
       080f10e0 00000081 40000003 f6ec4000 f6eed800 492b4e8a 1d173065 f6856fbc 
       00000000 000f10e0 f89ec380 00000880 f6856fa8 00000000 080f10e0 00000000 
Call Trace:
 [<c043d86f>] sys_delete_module+0x192/0x1bb
 [<c0448000>] audit_syscall_entry+0xb4/0x17d
 [<c0404f17>] syscall_call+0x7/0xb
 =======================
Code: 0b 89 c8 89 da e8 01 ff ff ff 8b 03 89 7c 83 14 40 89 03 56 9d 5b 5e 5f c3 56 89 c6 53 b8 40 b5 77 c0 f7 d6 e8 c2 78 07 00 eb 14 <8b> 04 9e e8 7a ff ff ff ba 40 b5 77 c0 89 d8 e8 c6 78 07 00 83 
EIP: [<c046d45c>] free_percpu+0x12/0x36 SS:ESP 0068:f6856f54
 <0>Kernel panic - not syncing: Fatal exception
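
A note on reading the oops above, offered as an interpretation rather than something confirmed in this report: in the Code: bytes the faulting instruction `8b 04 9e` is a load from [esi+ebx*4], and it is preceded by `f7 d6` (not %esi). That is consistent with free_percpu() taking the bitwise complement of its argument before dereferencing it, which is how the 2.6.18 slab per-CPU code disguises its pointers. Under that assumption, the value originally passed in can be recovered from the faulting address in esi. The helper name is hypothetical.

```shell
#!/bin/sh
# Undo the assumed bitwise-complement disguise on a 32-bit pointer value.
undisguise32() {
    printf '0x%08x\n' $(( ~$1 & 0xffffffff ))
}

undisguise32 0x076129ff   # esi / faulting address from the oops; prints 0xf89ed600
```

The result is close to the f89ec380 value that appears on the stack, but this report does not say what either address refers to.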

Comment 3 Zhang Kexin 2008-11-25 01:29:47 UTC
[root@hp-xw4800-01 ~]# lsmod | grep acpi
acpi_cpufreq           14025  0

Comment 4 Zhang Kexin 2008-11-25 03:25:21 UTC
Created attachment 324558 [details]
dmesg

Comment 5 Zhang Kexin 2008-11-25 03:27:05 UTC
Created attachment 324559 [details]
cpuinfo

Comment 6 Zhang Kexin 2008-11-25 03:30:10 UTC
Created attachment 324560 [details]
content of "cat sys/devices/system/cpu/cpu*/cpufreq/*"

Comment 7 Song, Youquan 2008-11-25 04:14:46 UTC
Can you reproduce it on the x86_64 version?

It is weird, because the acpi_cpufreq reference count should not be "0".

[root@hp-xw4800-01 ~]# lsmod | grep acpi
acpi_cpufreq           14025  0
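
The same use count can be read without lsmod, for example from a script that checks it before attempting an unload. This is a minimal sketch: the function name and the optional file argument are hypothetical, and it assumes the usual /proc/modules column layout (name, size, use count, ...).

```shell
#!/bin/sh
# Print the use count of module $1, or nothing if it is not loaded.
# $2 is a hypothetical override for the module list (default /proc/modules).
module_usecount() {
    awk -v m="$1" '$1 == m { print $3 }' "${2:-/proc/modules}"
}
```

For example, `module_usecount acpi_cpufreq` prints the third column of the matching line.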

Comment 8 Zhang Kexin 2008-11-25 04:51:19 UTC
Yes, the kernel panic also happens on x86_64 on the same machine.

[root@hp-xw4800-01 ~]# lsmod | grep acpi
acpi_cpufreq           14025  0

Comment 9 Zhang Kexin 2008-11-25 04:56:43 UTC
But another machine, intel-s3ea2-03.rhts.bos.redhat.com, has the same CPU family and model as hp-xw4800-01.rhts.bos.redhat.com:

cpu family      : 6
model           : 26

It does not have the problem, and its use count for the module is 1:

[root@hp-xw4800-01 ~]# lsmod | grep acpi
acpi_cpufreq           14025  1

Comment 10 Song, Youquan 2008-11-25 06:19:01 UTC
From the dmesg, hp-xw4800-01.rhts.bos.redhat.com has some ACPI issues.
You could update to a newer BIOS.

Judging from "acpi_cpufreq           14025  0", the cpufreq subsystem is not correctly set up.
You can try running /etc/init.d/cpuspeed stop, then modprobe -r for cpufreq_ondemand and the other governors, then modprobe acpi-cpufreq etc.

ACPI Error (evgpe-0711): No handler or method for GPE[ 0], disabling event [20060707]
ACPI Error (evgpe-0711): No handler or method for GPE[ 1], disabling event [20060707]
ACPI Error (evgpe-0711): No handler or method for GPE[ 2], disabling event [20060707]
ACPI Error (evgpe-0711): No handler or method for GPE[ 6], disabling event [20060707]
ACPI Error (evgpe-0711): No handler or method for GPE[ 7], disabling event [20060707]
ACPI Error (evgpe-0711): No handler or method for GPE[ A], disabling event [20060707]
ACPI Error (evgpe-0711): No handler or method for GPE[ F], disabling event [20060707]
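
To pull just the affected GPE indices out of a longer dmesg, a filter along these lines may help. It is a sketch that assumes the message format shown above; the function name is hypothetical.

```shell
#!/bin/sh
# Read kernel log text on stdin and print each GPE index reported as
# having no handler or method, one per line.
gpe_without_handler() {
    sed -n 's/.*No handler or method for GPE\[ *\([0-9A-Fa-f][0-9A-Fa-f]*\)\].*/\1/p'
}
```

Typical use would be `dmesg | gpe_without_handler`.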

Comment 11 Zhang Kexin 2008-11-25 09:38:03 UTC
Thanks.
I have filed a ticket asking for a BIOS update.
With the old BIOS, I did the following:
[root@hp-xw4800-01 ~]# /etc/init.d/cpuspeed stop
[root@hp-xw4800-01 ~]# modprobe -r cpufreq_ondemand
[root@hp-xw4800-01 ~]# modprobe -r cpufreq_powersave
[root@hp-xw4800-01 ~]# modprobe -r cpufreq_conservative
[root@hp-xw4800-01 ~]# modprobe -r acpi_cpufreq
The kernel still panics.

Comment 12 Prarit Bhargava 2008-11-25 19:54:17 UTC
(In reply to comment #10)
> From the dmesg, hp-xw4800-01.rhts.bos.redhat.com have some ACPI issue. 
> You can update the new BIOS.   

<snip>

> ACPI Error (evgpe-0711): No handler or method for GPE[ 0], disabling event
> [20060707]
> ACPI Error (evgpe-0711): No handler or method for GPE[ 1], disabling event
> [20060707]
> ACPI Error (evgpe-0711): No handler or method for GPE[ 2], disabling event
> [20060707]
> ACPI Error (evgpe-0711): No handler or method for GPE[ 6], disabling event
> [20060707]
> ACPI Error (evgpe-0711): No handler or method for GPE[ 7], disabling event
> [20060707]
> ACPI Error (evgpe-0711): No handler or method for GPE[ A], disabling event
> [20060707]
> ACPI Error (evgpe-0711): No handler or method for GPE[ F], disabling event
> [20060707]

This is a known issue on the HP xw series systems.  A BIOS update currently will not resolve the GPE register errors.

A workaround (WAR) for the GPE error has been submitted to RHKL and will ship in 5.4.

Comment 13 Prarit Bhargava 2008-11-26 14:15:01 UTC
Tracked this down to RHEL5 commit e12bfee9772aa06e05e40d08918ca838e21db7f0.

Must be some weird module ordering issue?

P.

Comment 14 Song, Youquan 2008-12-01 08:30:46 UTC
What's the status of this bug?
Can you reproduce it when building an upstream kernel such as 2.6.28-rc4?

Comment 15 Prarit Bhargava 2008-12-01 11:09:30 UTC
Hi Youquan, I'm working this issue right now -- it does not happen upstream AFAICT.

P.

Comment 16 Prarit Bhargava 2008-12-01 14:47:14 UTC
Created attachment 325237 [details]
RHEL5 fix for this issue

Comment 19 Don Zickus 2008-12-09 21:05:29 UTC
in kernel-2.6.18-126.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 23 errata-xmlrpc 2009-01-20 20:18:59 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2009-0225.html

