Bug 443354

Summary: cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed gets BUG on Transmeta Crusoe CPU
Product: [Fedora] Fedora
Reporter: CHIKAMA Masaki <masaki.chikama>
Component: kernel
Assignee: Dave Jones <davej>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: low
Docs Contact:
Priority: low
Version: 9
CC: kernel-maint, pfrields
Target Milestone: ---
Target Release: ---
Hardware: i386
OS: Linux
Whiteboard:
Fixed In Version: 2.6.25.6-55.fc9
Doc Type: Bug Fix
Last Closed: 2008-07-15 06:51:12 UTC

Description CHIKAMA Masaki 2008-04-21 02:57:20 UTC
Description of problem:
 Reading from /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
triggers a kernel BUG (NULL pointer dereference) on a Transmeta(tm) Crusoe(tm) Processor TM5600.

Version-Release number of selected component (if applicable):
kernel-2.6.25-1.fc9

How reproducible:
always

Steps to Reproduce:
1. cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed 
Actual results:
 Segmentation fault

Expected results:
 600000 (current CPU frequency in kHz)


Additional info:

BUG: unable to handle kernel NULL pointer dereference at 00000014
IP: [<c059c473>] show_scaling_setspeed+0x9/0x28
*pde = 0dfdc067 *pte = 00000000
Oops: 0000 [#2] SMP
Modules linked in: autofs4 nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4
xt_state nf_conntrack xt_tcpudp iptable_filter ip_tables x_tables ipv6
snd_ali5451 snd_ac97_codec ac97_bus snd_seq_dummy pcspkr snd_seq_oss
snd_seq_midi_event i2c_ali15x3 snd_seq battery ac button snd_seq_device
i2c_ali1535 snd_pcm_oss i2c_core snd_mixer_oss 8139too firewire_ohci
firewire_core crc_itu_t snd_pcm 8139cp mii alim1535_wdt snd_timer snd soundcore
snd_page_alloc sg dm_snapshot dm_zero dm_mirror dm_mod pata_ali pata_acpi
ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd ehci_hcd
[last unloaded: scsi_wait_scan]
 
Pid: 2021, comm: cat Tainted: G      D  (2.6.25-1.fc9.i686 #1)
EIP: 0060:[<c059c473>] EFLAGS: 00010286 CPU: 0
EIP is at show_scaling_setspeed+0x9/0x28
EAX: 00000000 EBX: ce894300 ECX: c059c46a EDX: cdfc0000
ESI: ce894300 EDI: c072eda0 EBP: cdfe4f30 ESP: cdfe4f2c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cat (pid: 2021, ti=cdfe4000 task=cdfe0000 task.ti=cdfe4000)
Stack: fffffffb cdfe4f48 c059de48 cdfc0000 c072ecd0 cdfc1100 ce3ae6c0 cdfe4f70
       c04bb588 00001000 08052000 cdfc1114 c072ecd0 ce89434c cdf9a960 c04bb505
       00001000 cdfe4f90 c0482d7c cdfe4f9c 08052000 00001000 cdf9a960 fffffff7
Call Trace:
 [<c059de48>] ? show+0x45/0x5e
 [<c04bb588>] ? sysfs_read_file+0x83/0xe0
 [<c04bb505>] ? sysfs_read_file+0x0/0xe0
 [<c0482d7c>] ? vfs_read+0x87/0x12b
 [<c0482eb9>] ? sys_read+0x3b/0x60
 [<c0405bf2>] ? syscall_call+0x7/0xb
 =======================
Code: c4 0c 48 75 0f 8b 4b 28 89 d8 8b 55 f4 ff 51 18 89 f1 eb 05 b9 ea ff ff ff
8d 65 f8 89 c8 5b 5e 5d c3 55 89 e5 53 89 c3 8b 40 28 <8b> 48 14 85 c9 75 0f 68
f2 dd 6e c0 52 e8 9c 7f f5 ff 5a 59 eb
EIP: [<c059c473>] show_scaling_setspeed+0x9/0x28 SS:ESP 0068:cdfe4f2c
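
The faulting address 00000014 can be read off the Code: line: the bytes "8b 40 28 <8b> 48 14" decode to two loads, the second of which reads through a NULL pointer held in EAX. The following is a minimal sketch of that arithmetic. The struct layout is an assumption about the 2.6.25 i386 build (a 16-byte name followed by function pointers), and the names governor_i386 and faulting_address are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CPUFREQ_NAME_LEN 16

/* ASSUMED i386 layout of the start of struct cpufreq_governor in 2.6.25.
 * 32-bit pointers are modelled as uint32_t so the offsets are the same
 * on any host. */
struct governor_i386 {
    char     name[CPUFREQ_NAME_LEN]; /* 0x00 */
    uint32_t governor;               /* 0x10: int (*governor)(...) */
    uint32_t show_setspeed;          /* 0x14 */
    uint32_t store_setspeed;         /* 0x18 */
};

/* Address touched when show_setspeed is loaded through governor_ptr.
 * From the Code: line:
 *   8b 40 28   mov 0x28(%eax),%eax   ; eax = policy->governor
 *   8b 48 14   mov 0x14(%eax),%ecx   ; ecx = governor->show_setspeed (faults)
 */
uint32_t faulting_address(uint32_t governor_ptr)
{
    return governor_ptr +
           (uint32_t)offsetof(struct governor_i386, show_setspeed);
}
```

With governor_ptr == 0 (EAX in the register dump), faulting_address returns 0x14, which matches the address in the BUG line.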

# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineTMx86
cpu family      : 5
model           : 4
model name      : Transmeta(tm) Crusoe(tm) Processor TM5600
stepping        : 3
cpu MHz         : 600.000
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr cx8 cmov mmx longrun constant_tsc up
bogomips        : 1196.15
clflush size    : 32

Comment 1 Chuck Ebbert 2008-04-21 20:34:02 UTC
This:

Oops: 0000 [#2] SMP
            ^^

means there was an earlier oops. What did that one say?

Comment 2 CHIKAMA Masaki 2008-04-22 00:54:10 UTC
Oh, sorry. Here is the first oops, from boot time.

BUG: unable to handle kernel NULL pointer dereference at 00000014
IP: [<c059c473>] show_scaling_setspeed+0x9/0x28
*pde = 0e027067 *pte = 00000000
Oops: 0000 [#1] SMP
Modules linked in: ipv6 snd_ali5451 snd_ac97_codec ac97_bus snd_seq_dummy pcspkr
snd_seq_oss snd_seq_midi_event snd_seq i2c_ali15x3 battery ac snd_seq_device
button i2c_ali1535 snd_pcm_oss i2c_core snd_mixer_oss firewire_ohci
firewire_core 8139too snd_pcm crc_itu_t 8139cp mii alim1535_wdt snd_timer snd
soundcore snd_page_alloc sg dm_snapshot dm_zero dm_mirror dm_mod pata_ali
pata_acpi ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd
ehci_hcd [last unloaded: scsi_wait_scan]
                                                                                
Pid: 1265, comm: cpuspeed Not tainted (2.6.25-1.fc9.i686 #1)
EIP: 0060:[<c059c473>] EFLAGS: 00010286 CPU: 0
EIP is at show_scaling_setspeed+0x9/0x28
EAX: 00000000 EBX: ce891300 ECX: c059c46a EDX: ce23d000
ESI: ce891300 EDI: c072eda0 EBP: ce3e5f30 ESP: ce3e5f2c
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process cpuspeed (pid: 1265, ti=ce3e5000 task=ce236000 task.ti=ce3e5000)
Stack: fffffffb ce3e5f48 c059de48 ce23d000 c072ecd0 ce2d18c0 ce3b06c0 ce3e5f70
       c04bb588 00001000 b7ffb000 ce2d18d4 c072ecd0 ce89134c ce2b6a00 c04bb505
       00001000 ce3e5f90 c0482d7c ce3e5f9c b7ffb000 ce9a69f8 ce2b6a00 fffffff7
Call Trace:
 [<c059de48>] ? show+0x45/0x5e
 [<c04bb588>] ? sysfs_read_file+0x83/0xe0
 [<c04bb505>] ? sysfs_read_file+0x0/0xe0
 [<c0482d7c>] ? vfs_read+0x87/0x12b
 [<c0482eb9>] ? sys_read+0x3b/0x60
 [<c0405bf2>] ? syscall_call+0x7/0xb
 =======================
Code: c4 0c 48 75 0f 8b 4b 28 89 d8 8b 55 f4 ff 51 18 89 f1 eb 05 b9 ea ff ff ff
8d 65 f8 89 c8 5b 5e 5d c3 55 89 e5 53 89 c3 8b 40 28 <8b> 48 14 85 c9 75 0f 68
f2 dd 6e c0 52 e8 9c 7f f5 ff 5a 59 eb
EIP: [<c059c473>] show_scaling_setspeed+0x9/0x28 SS:ESP 0068:ce3e5f2c
---[ end trace a012c7af20d2cc0a ]---


This oops seems to occur only on Transmeta CPUs.
In linux-2.6.git/drivers/cpufreq/cpufreq.c,

static ssize_t show_scaling_setspeed(struct cpufreq_policy *policy, char *buf)
{
        if (!policy->governor->show_setspeed)
                return sprintf(buf, "<unsupported>\n");
                                                                                
        return policy->governor->show_setspeed(policy, buf);
}
                                                                               
but if the CPU uses the longrun driver, no governor is set in the function below.
 
static int cpufreq_parse_governor (char *str_governor, unsigned int *policy,
                                struct cpufreq_governor **governor)
...

        if (cpufreq_driver->setpolicy) {
                if (!strnicmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
                        *policy = CPUFREQ_POLICY_PERFORMANCE;
                        err = 0;
                } else if (!strnicmp(str_governor, "powersave",
                                                CPUFREQ_NAME_LEN)) {
                        *policy = CPUFREQ_POLICY_POWERSAVE;
                        err = 0;
                }
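
The effect of the setpolicy branch above can be sketched in user space. This is a simplified model, not the kernel code: parse_governor_model, struct governor_model and the MODEL_* constants are hypothetical names, strncasecmp stands in for the kernel's strnicmp, and the non-setpolicy branch (elided in the excerpt) is replaced by an assumed single-governor lookup.

```c
#include <stddef.h>
#include <strings.h>

#define CPUFREQ_NAME_LEN 16

enum { MODEL_POLICY_POWERSAVE = 1, MODEL_POLICY_PERFORMANCE = 2 };

struct governor_model { const char *name; };

/* Returns 0 on success, -1 (stand-in for -EINVAL) otherwise. */
int parse_governor_model(int driver_has_setpolicy, const char *str,
                         unsigned int *policy,
                         struct governor_model **governor)
{
    static struct governor_model userspace = { "userspace" };

    if (driver_has_setpolicy) {
        /* Only *policy is written here; *governor is never touched. */
        if (!strncasecmp(str, "performance", CPUFREQ_NAME_LEN)) {
            *policy = MODEL_POLICY_PERFORMANCE;
            return 0;
        }
        if (!strncasecmp(str, "powersave", CPUFREQ_NAME_LEN)) {
            *policy = MODEL_POLICY_POWERSAVE;
            return 0;
        }
        return -1;
    }

    /* ASSUMPTION: stand-in for the elided governor lookup by name. */
    if (!strncasecmp(str, userspace.name, CPUFREQ_NAME_LEN)) {
        *governor = &userspace;
        return 0;
    }
    return -1;
}
```

Calling parse_governor_model(1, "performance", &p, &g) with g initialised to NULL succeeds and leaves g as NULL, which is the state show_scaling_setspeed later dereferences.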


Comment 3 CHIKAMA Masaki 2008-05-01 06:00:38 UTC
If a CPU-specific cpufreq driver (e.g. longrun) provides a "setpolicy" function,
no governor object is set in the cpufreq_policy object by the
"__cpufreq_set_policy" function in drivers/cpufreq/cpufreq.c.

This causes a NULL pointer dereference in the "store_scaling_setspeed" and
"show_scaling_setspeed" functions in drivers/cpufreq/cpufreq.c
when reading or writing through the /sys interface
(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed).

Here is a proposed patch against vanilla 2.6.25 that checks the governor.
It works for me.


--- cpufreq.c.org       2008-05-01 13:55:19.000000000 +0900
+++ cpufreq.c   2008-05-01 13:59:34.000000000 +0900
@@ -607,7 +607,7 @@ static ssize_t store_scaling_setspeed(st
        unsigned int freq = 0;
        unsigned int ret;
  
-       if (!policy->governor->store_setspeed)
+       if (!policy->governor || !policy->governor->store_setspeed)
                return -EINVAL;
  
        ret = sscanf(buf, "%u", &freq);
@@ -621,7 +621,7 @@ static ssize_t store_scaling_setspeed(st
  
 static ssize_t show_scaling_setspeed(struct cpufreq_policy *policy, char *buf)
 {
-       if (!policy->governor->show_setspeed)
+       if (!policy->governor || !policy->governor->show_setspeed)
                return sprintf(buf, "<unsupported>\n");
  
        return policy->governor->show_setspeed(policy, buf);
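
The guarded check from the patch can be exercised outside the kernel with a small model. This is a sketch under stated assumptions: the structs are cut down to the fields the check uses, and policy_model, governor_model, show_scaling_setspeed_model and show_cur are hypothetical names, not kernel symbols.

```c
#include <stddef.h>
#include <stdio.h>

struct policy_model;

struct governor_model {
    int (*show_setspeed)(struct policy_model *policy, char *buf);
};

struct policy_model {
    struct governor_model *governor; /* NULL for setpolicy drivers */
    unsigned int cur;                /* current frequency in kHz */
};

/* Same test order as the patch: check the pointer before its member. */
int show_scaling_setspeed_model(struct policy_model *policy, char *buf)
{
    if (!policy->governor || !policy->governor->show_setspeed)
        return sprintf(buf, "<unsupported>\n");

    return policy->governor->show_setspeed(policy, buf);
}

/* Example callback a governor might install (hypothetical). */
int show_cur(struct policy_model *policy, char *buf)
{
    return sprintf(buf, "%u\n", policy->cur);
}
```

With governor set to NULL the model writes "<unsupported>" instead of faulting; with a governor whose show_setspeed is show_cur it writes the frequency.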


Comment 4 Bug Zapper 2008-05-14 09:49:38 UTC
Changing version to '9' as part of upcoming Fedora 9 GA.
More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping