Created attachment 704059 [details]
panic.log

Description of problem:
kernel 3.9.0-0.rc0.git11.1.fc19.x86_64 panics on boot on VMware Fusion.

Version-Release number of selected component (if applicable):
3.9.0-0.rc0.git11.1.fc19.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Boot Fedora rawhide on VMware Fusion.

Actual results:
kernel panic

Expected results:
no kernel panic

Additional info:
[ 5.740065] Intel P-state driver initializing.
[ 5.741503] Intel pstate controlling: cpu 0
[ 5.742983] divide error: 0000 [#1] SMP
[ 5.744284] Modules linked in:
[ 5.745225] CPU 0
[ 5.745612] Pid: 1, comm: swapper/0 Not tainted 3.9.0-0.rc0.git11.1.fc19.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 5.750186] RIP: 0010:[<ffffffff8158562d>] [<ffffffff8158562d>] intel_pstate_set_policy+0x6d/0x130
[ 5.752686] RSP: 0018:ffff880033535bb0 EFLAGS: 00010246
[ 5.754478] RAX: 0000000000000000 RBX: ffff880033535c10 RCX: 0000000000000000
[ 5.757091] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 5.759356] RBP: ffff880033535bc0 R08: a3d70a3d70a3d70b R09: 0000000000000064
[ 5.761265] R10: 0000000000000000 R11: 0000000000000064 R12: ffff88002ace2760
[ 5.763102] R13: 0000000000000000 R14: 0000000000000040 R15: 0000000000000040
[ 5.765004] FS: 0000000000000000(0000) GS:ffff880036600000(0000) knlGS:0000000000000000
[ 5.767496] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.769486] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000407f0
[ 5.771679] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5.773942] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 5.775722] Process swapper/0 (pid: 1, threadinfo ffff880033534000, task ffff880033538000)
[ 5.778419] Stack:
[ 5.779495] 0000000000000000 ffff880033535c10 ffff880033535bf0 ffffffff8157d773
[ 5.781799] ffff88002ace2808 ffff88002ace2760 0000000000000000 0000000000000040
[ 5.784320] ffff880033535d88 ffffffff8157dd74 ffff880000000000 0000000000000282
[ 5.786624] Call Trace:
[ 5.787644] [<ffffffff8157d773>] __cpufreq_set_policy+0x103/0x280
[ 5.789789] [<ffffffff8157dd74>] cpufreq_add_dev_interface+0x294/0x350
[ 5.791991] [<ffffffff8157daa0>] ? cpufreq_update_policy+0x1b0/0x1b0
[ 5.793740] [<ffffffff8157ec4f>] cpufreq_add_dev+0x4af/0x5e0
[ 5.795496] [<ffffffff8146ed49>] subsys_interface_register+0xa9/0xf0
[ 5.797269] [<ffffffff8157c63f>] cpufreq_register_driver+0x9f/0x200
[ 5.799054] [<ffffffff81f5ea0c>] ? intel_pstate_setup+0x2f/0x2f
[ 5.800932] [<ffffffff81f5ea9c>] intel_pstate_init+0x90/0x143
[ 5.802796] [<ffffffff8100210a>] do_one_initcall+0x10a/0x160
[ 5.804657] [<ffffffff81f1f09c>] kernel_init_freeable+0x17c/0x202
[ 5.806830] [<ffffffff81f1e88b>] ? do_early_param+0x88/0x88
[ 5.808404] [<ffffffff816fb6f0>] ? rest_init+0x150/0x150
[ 5.809862] [<ffffffff816fb6fe>] kernel_init+0xe/0x190
[ 5.811507] [<ffffffff8172502c>] ret_from_fork+0x7c/0xb0
[ 5.813352] [<ffffffff816fb6f0>] ? rest_init+0x150/0x150
[ 5.814812] Code: 41 ba 64 00 00 00 8b 7b 1c 31 d2 41 b9 64 00 00 00 49 b8 0b d7 a3 70 3d 0a d7 a3 41 bb 64 00 00 00 41 0f af c2 45 31 d2 44 89 d1 <f7> f7 85 c0 41 0f 48 c2 48 63 f0 83 fe 64 49 0f 4f f1 89 35 b3
Happens on MS Hyper-V, too.
Reported upstream. http://thread.gmane.org/gmane.linux.kernel.cpufreq/9515 If it doesn't get resolved relatively soon, we'll just turn the driver off.
Follow-up to comment #0.

3.9.0-0.rc0.git11.1.fc19.x86_64 - bad
3.9.0-0.rc0.git7.1.fc19.x86_64 - bad
3.9.0-0.rc0.git6.1.fc19.x86_64 - bad
3.9.0-0.rc0.git3.1.fc19.x86_64 - bad
3.8.0-0.rc7.git3.1.fc19.x86_64 - good
The oops is coming from a new driver. The new driver was turned on in 3.9.0-rc0.git2.1.fc19. Anything after that will have the issue.
The cause seems to be the VM returning a bad value for the highest pstate (frequency). I do not have either of the VMs at the moment to test with, but the patch below should fix the issue. If someone could try it while I acquire one of the VMs to test against, I would appreciate it.

commit 8f93b00ec8a5851e46c7244591a5a6c85ba91bb8
Author: Dirk Brandewie <dirk.brandewie>
Date:   Mon Mar 4 10:14:42 2013 -0800

    cpufreq/intel_pstate: Do not load on VM that do not report max P state.

    It seems some VMs support the P state MSRs but return zeros. Fail
    gracefully if we are running in this environment.

    https://bugzilla.redhat.com/show_bug.cgi?id=916833

    Reported-by: jwboyer
    Signed-off-by: Dirk Brandewie <dirk.j.brandewie>
---
 drivers/cpufreq/intel_pstate.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 096fde0..2bfd083 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -662,6 +662,9 @@ static int intel_pstate_set_policy(struct cpufreq_policy *policy)
 
 	cpu = all_cpu_data[policy->cpu];
 
+	if (!policy->cpuinfo.max_freq)
+		return -ENODEV;
+
 	intel_pstate_get_min_max(cpu, &min, &max);
 
 	limits.min_perf_pct = (policy->min * 100) / policy->cpuinfo.max_freq;
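For readers following along, the guard in the patch above can be tried outside the kernel. This is a minimal user-space sketch, not the driver code: `struct demo_policy` and `demo_min_perf_pct()` are hypothetical names, and the clamp to 0..100 is an assumption about what the driver does with the result. It only shows why a zero `max_freq` has to be rejected before the percentage is computed.

```c
#include <errno.h>

/* Hypothetical stand-in for the two cpufreq_policy fields the patch uses. */
struct demo_policy {
	unsigned int min;      /* requested minimum frequency, kHz */
	unsigned int max_freq; /* reported maximum frequency, kHz; 0 on the affected VMs */
};

/*
 * Compute the minimum performance percentage for a policy.
 * Returns 0 and stores the result in *pct on success, or -ENODEV
 * (leaving *pct untouched) when no maximum frequency was reported.
 */
int demo_min_perf_pct(const struct demo_policy *policy, int *pct)
{
	unsigned long long val;

	/* Same check as the patch: refuse instead of dividing by zero. */
	if (!policy->max_freq)
		return -ENODEV;

	/* 64-bit intermediate so that min * 100 cannot overflow. */
	val = (policy->min * 100ULL) / policy->max_freq;

	/* Assumed clamp: keep the percentage within 0..100. */
	*pct = val > 100 ? 100 : (int)val;
	return 0;
}
```

With `min = 1200000` and `max_freq = 2400000` the function stores 50; with `max_freq = 0`, as reported by the affected VMs, it returns `-ENODEV` without performing the division.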
I added the patch to the 3.9-rc1 build going out today. Vinson and Tomasz, please test tomorrow's rawhide build.
Created attachment 705325 [details]
kernel-3.9.0-0.rc1.git0.1.fc19 in kvm

This does not fix the problem. The oops happens a bit later, but still inside the same driver. See the attached screenshot.

BTW, it happens in KVM too, so it should be easy to reproduce. The guest CPU model must be set to match the host's as closely as possible.
That looks like a different trace. It looks like it's trying to remove sysfs nodes that weren't set up.
[ 2.215291] Intel P-state driver initializing.
[ 2.216657] Intel pstate controlling: cpu 0
[ 2.217986] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
[ 2.220395] IP: [<ffffffff8120f667>] sysfs_remove_group+0x17/0xf0
[ 2.222119] PGD 0
[ 2.223479] Oops: 0000 [#1] SMP
[ 2.224655] Modules linked in:
[ 2.225950] CPU 0
[ 2.226400] Pid: 1, comm: swapper/0 Not tainted 3.9.0-0.rc1.git0.1.fc19.x86_64 #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 2.230217] RIP: 0010:[<ffffffff8120f667>] [<ffffffff8120f667>] sysfs_remove_group+0x17/0xf0
[ 2.232925] RSP: 0018:ffff88003da43e40 EFLAGS: 00010286
[ 2.234939] RAX: 00000000ffffffed RBX: 0000000000000000 RCX: 0000000000000006
[ 2.236687] RDX: 0000000000000000 RSI: ffffffff81cb37a0 RDI: 0000000000000000
[ 2.238422] RBP: ffff88003da43e58 R08: ffffffff81cb2a70 R09: ffff88003d9cb188
[ 2.240167] R10: 0000000000003472 R11: 0000000000000000 R12: ffffffff81d44112
[ 2.242110] R13: ffffffff81cb37a0 R14: 0000000000000000 R15: 0000000000000000
[ 2.244157] FS: 0000000000000000(0000) GS:ffff88003f600000(0000) knlGS:0000000000000000
[ 2.246695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2.248150] CR2: 0000000000000030 CR3: 0000000001c0c000 CR4: 00000000000407f0
[ 2.249898] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2.251614] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 2.253378] Process swapper/0 (pid: 1, threadinfo ffff88003da42000, task ffff88003da48000)
[ 2.255831] Stack:
[ 2.256962] ffffffff81ce4420 ffffffff81d44112 0000000000000000 ffff88003da43e80
[ 2.259576] ffffffff814ef401 ffffffff81ce4420 ffffffff81d44112 0000000000000000
[ 2.261982] ffff88003da43e98 ffffffff81d4423e ffffffff81e2c7d0 ffff88003da43ec8
[ 2.264378] Call Trace:
[ 2.265262] [<ffffffff81d44112>] ? intel_pstate_setup+0x2f/0x2f
[ 2.267048] [<ffffffff814ef401>] intel_pstate_exit+0x21/0xc0
[ 2.268934] [<ffffffff81d44112>] ? intel_pstate_setup+0x2f/0x2f
[ 2.270679] [<ffffffff81d4423e>] intel_pstate_init+0x12c/0x141
[ 2.272286] [<ffffffff8100210a>] do_one_initcall+0x10a/0x160
[ 2.273863] [<ffffffff81d0704f>] kernel_init_freeable+0x177/0x1fa
[ 2.275768] [<ffffffff81d06887>] ? do_early_param+0x88/0x88
[ 2.277614] [<ffffffff816296e0>] ? rest_init+0x80/0x80
[ 2.279284] [<ffffffff816296ee>] kernel_init+0xe/0x190
[ 2.280596] [<ffffffff8164e0ec>] ret_from_fork+0x7c/0xb0
[ 2.281922] [<ffffffff816296e0>] ? rest_init+0x80/0x80
[ 2.283239] Code: e0 d7 ff ff 5b 41 5c 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 55 49 89 f5 41 54 53 48 8b 16 48 89 fb <4c> 8b 67 30 48 85 d2 0f 84 7c 00 00 00 31 f6 4c 89 e7 e8 d2 e2
[ 2.291231] RIP [<ffffffff8120f667>] sysfs_remove_group+0x17/0xf0
[ 2.293064] RSP <ffff88003da43e40>
[ 2.294126] CR2: 0000000000000030
[ 2.295294] hpet1: lost 4 rtc interrupts
[ 2.296486] ---[ end trace 7af6af4f7a24259e ]---
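The fault address in this trace (CR2 = 0x30 with RDI = 0) is consistent with a field being read through a NULL kobject pointer. The sketch below illustrates that arithmetic. `struct demo_kobject` is a hypothetical mirror of the leading fields of the kernel's `struct kobject`, written from memory rather than copied from the 3.9 headers, so treat the layout as an assumption. The offset also only works out to 0x30 on a platform with 8-byte pointers.

```c
#include <stddef.h>

/* Hypothetical mirror of the kernel's doubly linked list node. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/* Assumed leading fields of struct kobject; not taken from the real headers. */
struct demo_kobject {
	const char *name;
	struct demo_list_head entry;
	struct demo_kobject *parent;
	void *kset;
	void *ktype;
	void *sd; /* sysfs directory entry, which sysfs_remove_group() is assumed to read */
};

/* Address that a load of kobj->sd touches for a given kobj pointer value. */
size_t demo_fault_address(size_t kobj_addr)
{
	return kobj_addr + offsetof(struct demo_kobject, sd);
}
```

Under these assumptions `demo_fault_address(0)` is 0x30 on x86_64, which matches the reported CR2 and fits the reading that `sysfs_remove_group()` was handed a kobject that had never been created.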
Here is the patch that fixes the error path problem exposed by the fix to the divide error. Both patches have been sent to Rafael for merge. Sorry for the two-step process for getting this fixed.

commit 98dbb39cd642e8e34bf773aacba464e219a1a718
Author: Dirk Brandewie <dirk.brandewie>
Date:   Tue Mar 5 08:19:02 2013 -0800

    cpufreq/intel_pstate: Fix intel_pstate_init() error path

    If cpufreq_register_driver() fails just free memory that has been
    allocated and return.

    intel_pstate_exit() function is removed since we are built-in only
    now there is no reason for a module exit procedure.

    Reported-by: Konrad Rzeszutek Wilk <konrad.wilk>
    Signed-off-by: Dirk Brandewie <dirk.j.brandewie>
---
 drivers/cpufreq/intel_pstate.c |   39 +++++++++++----------------------------
 1 files changed, 11 insertions(+), 28 deletions(-)

diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 2bfd083..f6dd1e7 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -750,37 +750,11 @@ static struct cpufreq_driver intel_pstate_driver = {
 	.owner		= THIS_MODULE,
 };
 
-static void intel_pstate_exit(void)
-{
-	int cpu;
-
-	sysfs_remove_group(intel_pstate_kobject,
-				&intel_pstate_attr_group);
-	debugfs_remove_recursive(debugfs_parent);
-
-	cpufreq_unregister_driver(&intel_pstate_driver);
-
-	if (!all_cpu_data)
-		return;
-
-	get_online_cpus();
-	for_each_online_cpu(cpu) {
-		if (all_cpu_data[cpu]) {
-			del_timer_sync(&all_cpu_data[cpu]->timer);
-			kfree(all_cpu_data[cpu]);
-		}
-	}
-
-	put_online_cpus();
-	vfree(all_cpu_data);
-}
-module_exit(intel_pstate_exit);
-
 static int __initdata no_load;
 
 static int __init intel_pstate_init(void)
 {
-	int rc = 0;
+	int cpu, rc = 0;
 	const struct x86_cpu_id *id;
 
 	if (no_load)
@@ -805,7 +779,16 @@ static int __init intel_pstate_init(void)
 	intel_pstate_sysfs_expose_params();
 	return rc;
 out:
-	intel_pstate_exit();
+	get_online_cpus();
+	for_each_online_cpu(cpu) {
+		if (all_cpu_data[cpu]) {
+			del_timer_sync(&all_cpu_data[cpu]->timer);
+			kfree(all_cpu_data[cpu]);
+		}
+	}
+
+	put_online_cpus();
+	vfree(all_cpu_data);
 	return -ENODEV;
 }
 device_initcall(intel_pstate_init);
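The idea behind this second patch, that a failed init should undo only what it has already set up instead of calling a full teardown routine, can be sketched in user space. Everything here is hypothetical: the `demo_*` names, the fixed CPU count, and the `fail_register` switch that stands in for a failing `cpufreq_register_driver()`. The real driver also allocates its per-CPU data at a different point, so this is a simplification.

```c
#include <errno.h>
#include <stdlib.h>

#define DEMO_NR_CPUS 4 /* hypothetical fixed CPU count */

struct demo_cpudata {
	int cpu;
};

static struct demo_cpudata **demo_all_cpu_data;

/* Hypothetical stand-in for driver registration; fails on request. */
static int demo_register(int fail_register)
{
	return fail_register ? -ENODEV : 0;
}

/* Nonzero once demo_init() has succeeded. */
int demo_is_initialized(void)
{
	return demo_all_cpu_data != NULL;
}

int demo_init(int fail_register)
{
	int cpu;

	demo_all_cpu_data = calloc(DEMO_NR_CPUS, sizeof(*demo_all_cpu_data));
	if (!demo_all_cpu_data)
		return -ENOMEM;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		demo_all_cpu_data[cpu] = calloc(1, sizeof(**demo_all_cpu_data));
		if (!demo_all_cpu_data[cpu])
			goto out;
		demo_all_cpu_data[cpu]->cpu = cpu;
	}

	if (demo_register(fail_register))
		goto out;

	/* Later setup steps (sysfs, debugfs in the real driver) would go here. */
	return 0;

out:
	/* Undo only what exists at this point; nothing later was created. */
	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		free(demo_all_cpu_data[cpu]); /* free(NULL) is a no-op */
	free(demo_all_cpu_data);
	demo_all_cpu_data = NULL;
	return -ENODEV;
}
```

Calling `demo_init(1)` returns `-ENODEV` and leaves nothing allocated, while `demo_init(0)` succeeds. The error path never touches objects that are only created after registration, which is the mistake the second oops exposed.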
(In reply to comment #10)
> Here is the patch that fixes the error path problem exposed by the fix to
> the divide error. Both patches have been sent to Rafael for merge. Sorry
> for the two step process for getting this fixed,

Applied to rawhide as well. Will be in tomorrow's rawhide.
kernel-3.9.0-0.rc1.git0.3.fc19 boots successfully on my Hyper-V and KVM.
Thanks again for testing, everyone.