Bug 690928

Summary: kernel-2.6.39-0.rc0.git11.0.fc16.x86_64 fails to boot, tripped by microcode.ko
Product: [Fedora] Fedora
Reporter: Michal Jaegermann <michal>
Component: microcode_ctl
Assignee: Anton Arapov <anton>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: rawhide
CC: anton, freddy, gansalmon, itamar, john.ellson, jonathan, karo1170, kernel-maint, madhu.chinakonda, me, nobody, robatino, stevenward666, usdanskys, vietzke
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: microcode_ctl-1.17-15.fc16
Doc Type: Bug Fix
Last Closed: 2011-06-30 07:18:59 UTC

Description Michal Jaegermann 2011-03-25 20:22:13 UTC
Description of problem:

An attempt to boot kernel-2.6.39-0.rc0.git11.0.fc16.x86_64 ends up in what looks like an infinite loop of failures, probably while attempting to start udev. The screen is flooded with hard-to-read messages that scroll and jump constantly, and I could not find a way to pause the output even briefly.

AFAICT there are complaints about attempts to use module microcode.ko with "Invalid argument" followed on the next line by something like:
"CPU0: Family 15 is not supported".

I can also see that in earlier boot stages there are at least two
oopses, or maybe warnings, with backtraces.  I have no way to tell what they are, because they disappear from the screen before I can read them.  Nothing like that shows up with 2.6.38-0.rc7.git2.3.fc16.x86_64.

This is what /proc/cpuinfo shows on a test machine:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 5
model name      : AMD Opteron(tm) Processor 142
stepping        : 1
cpu MHz         : 1600.062
cache size      : 1024 KB
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext lm 3dnowext 3dnow up nopl
bogomips        : 3200.12
TLB size        : 1024 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 40 bits physical, 48 bits virtual
power management: ts ttp



Version-Release number of selected component (if applicable):
kernel-2.6.39-0.rc0.git11.0.fc16.x86_64

How reproducible:
always
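
The /proc/cpuinfo dump above identifies the CPU by its vendor_id and "cpu family" fields, the same family number that appears in the "Family 15 is not supported" message. As a minimal sketch for checking those two fields on another machine (the function name cpu_identity is hypothetical, not part of any tool mentioned in this bug):

```python
def cpu_identity(cpuinfo_text):
    """Return (vendor_id, cpu_family) for the first processor entry.

    cpuinfo_text is the text of /proc/cpuinfo. Fields that are
    missing from the text are returned as None.
    """
    vendor = None
    family = None
    for line in cpuinfo_text.splitlines():
        # Each line looks like "key<padding>: value"; skip blank lines.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "vendor_id" and vendor is None:
            vendor = value
        elif key == "cpu family" and family is None:
            family = int(value)
    return vendor, family
```

On a live system the input would come from reading /proc/cpuinfo, for example cpu_identity(open("/proc/cpuinfo").read()).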

Comment 1 Vinzenz Vietzke 2011-03-31 17:40:19 UTC
I can confirm this bug on a Lenovo Thinkpad Edge 13 (AMD).

Comment 2 Karsten Roch 2011-04-06 21:01:53 UTC
The same problem appears here on an AMD Athlon(tm) 64 X2 Dual Core Processor 4200+ with the latest kernel, 2.6.39-0.rc1.git5.0.fc16.

Comment 3 Michal Jaegermann 2011-04-07 20:03:07 UTC
kernel-2.6.39-0.rc2.git0.0.fc16.x86_64 is equally broken.

Comment 4 John Ellson 2011-04-14 13:16:28 UTC
See also: #694390 and #690930 -- possible dups

Comment 5 STEVEN WARD 2011-04-16 22:57:45 UTC
I want to report the same problem with my processor as well.

Here is the information for the processor that I use:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 10
model name      : AMD Athlon(tm) XP 3000+
stepping        : 0
cpu MHz         : 2100.349
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 mmx fxsr sse syscall mmxext 3dnowext 3dnow up
bogomips        : 4200.69
clflush size    : 32
cache_alignment : 32
address sizes   : 34 bits physical, 32 bits virtual
power management: ts

The error message that scrolls on the screen is:

modprobe: FATAL: Error inserting microcode (/lib/modules/2.6.39-0.rc2.git3.0.fc16.i686/kernel/arch/x86/kernel/microcode.ko): Invalid argument

I have tried compiling various kernel release candidates and the latest patches from kernel.org, but to no avail.

Regards,
       STEVE555
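
The modprobe message quoted above carries the module name, the full module path (and with it the kernel release), and the failure reason. A minimal sketch for pulling those fields out of a captured log line, assuming the line has exactly the shape quoted in this comment (the function name parse_modprobe_failure and the returned dictionary keys are hypothetical):

```python
import re

# Matches the message shape quoted in this comment:
# "modprobe: FATAL: Error inserting <module> (<path>): <reason>"
_MODPROBE_ERROR = re.compile(
    r"modprobe: FATAL: Error inserting (?P<module>\S+) "
    r"\((?P<path>[^)]+)\): (?P<reason>.+)"
)
# The kernel release is the directory directly under /lib/modules.
_RELEASE = re.compile(r"/lib/modules/(?P<release>[^/]+)/")


def parse_modprobe_failure(line):
    """Return a dict with module, path, reason and release, or None."""
    match = _MODPROBE_ERROR.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["reason"] = info["reason"].strip()
    release = _RELEASE.search(info["path"])
    info["release"] = release.group("release") if release else None
    return info
```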

Comment 6 Tobias Florek 2011-04-18 06:30:41 UTC
*** Bug 694390 has been marked as a duplicate of this bug. ***

Comment 7 Steven Usdansky 2011-04-20 12:28:02 UTC
~$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 107
model name	: AMD Athlon(tm) 64 X2 Dual Core Processor 5400+
stepping	: 2
cpu MHz		: 1000.000
cache size	: 512 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good nopl extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy 3dnowprefetch lbrv
bogomips	: 2000.03
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 107
model name	: AMD Athlon(tm) 64 X2 Dual Core Processor 5400+
stepping	: 2
cpu MHz		: 1000.000
cache size	: 512 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good nopl extd_apicid pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy 3dnowprefetch lbrv
bogomips	: 2000.03
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc 100mhzsteps

Comment 8 Steven Usdansky 2011-04-20 12:30:48 UTC
My apologies for the noise above - still broken with kernel-2.6.39-0.rc3.git2.0.fc16.x86_64

Comment 9 Steven Usdansky 2011-04-20 15:38:47 UTC
In an attempt to work around the problem, I booted to an older kernel (2.6.38-0.rc7.git2.3.fc16) and renamed 2.6.39-0.rc3.git2.0.fc16's microcode.ko file. Rebooted with 2.6.39-0.rc3.git2.0.fc16, and everything seems to be working properly.

Comment 10 STEVEN WARD 2011-04-20 23:43:59 UTC
I can confirm that the workaround in Steven Usdansky's comment above mine does work.

I booted into my 2.6.38.2-9.fc15 kernel, renamed microcode.ko in /lib/modules/2.6.39-0.rc3.git2.0.fc16.i686/kernel/arch/x86/kernel/ to microcode.ko.old, and then rebooted into the 2.6.39 kernel.

The good news is the kernel booted fine.

Regards,
        STEVE555
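
The rename workaround from comments 9 and 10 can be sketched as follows. This is only an illustration under stated assumptions: the module is assumed to live at the path quoted in comment 10, the function names and the modules_root parameter are hypothetical, and the rename needs root privileges. As in the comments, it is meant to be run from an older kernel that still boots.

```python
import os


def module_path(kernel_release, modules_root="/lib/modules"):
    """Build the microcode.ko path for a kernel release (layout as in comment 10)."""
    return os.path.join(
        modules_root, kernel_release, "kernel", "arch", "x86", "kernel", "microcode.ko"
    )


def disable_microcode_module(kernel_release, modules_root="/lib/modules"):
    """Rename microcode.ko to microcode.ko.old so modprobe no longer finds it.

    Returns the new path. Raises OSError if the module file does not exist
    or the caller lacks permission to rename it.
    """
    src = module_path(kernel_release, modules_root)
    dst = src + ".old"
    os.rename(src, dst)
    return dst
```

For the kernel in comment 10 the call would be disable_microcode_module("2.6.39-0.rc3.git2.0.fc16.i686"). Renaming dst back to the original name undoes the change.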

Comment 11 Itamar Reis Peixoto 2011-04-20 23:49:32 UTC
The problem seems to be related to AMD processors.

I have the same problem with this motherboard:


http://www.msi.com/product/mb/K8MM-V.html

Comment 12 Michal Jaegermann 2011-04-21 01:50:13 UTC
(In reply to comment #10)
> I can confirm with Steven Usdansky's comment above mine for the workaround does
> work.

With the catch that 2.6.39 kernels resurrect the "inconsistent lock state" bug 537697. See https://bugzilla.redhat.com/show_bug.cgi?id=537697#c12 and a new attachment there for more.

This is actually what comment 0 noted, but it was impossible to read on the screen at the time.

Comment 13 Freddy Willemsen 2011-06-28 15:44:46 UTC
I can confirm this bug with all the release candidates of kernel 3.0 so far (tried kernel-3.0-0.rc5.git0.1 just now). Using microcode.ko from a 2.6.38 kernel with the 3.0.0 kernel works.

cat /proc/cpuinfo 
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 107
model name	: AMD Athlon(tm) X2 Dual Core Processor BE-2400


kernel-3.0-0.rc5.git0.1 works fine on my Dell laptop (Intel Inside).

Comment 14 Freddy Willemsen 2011-06-28 16:43:25 UTC
Using the workaround mentioned in #690930 (removing microcode_ctl), kernel-3.0-0.rc5.git0.1 works on my F15 install.

Comment 15 Michal Jaegermann 2011-07-24 17:58:50 UTC
Apparently "a fix" for Intel, as discussed in bug 690930, infected rawhide as well, and none of the current kernels actually boots on AMD machines.  This makes the problem not that easy to work around.

It seems that the real discussion is happening in the comments to bug 690930, and this is clearly the same problem, so let's make this a duplicate.

*** This bug has been marked as a duplicate of bug 690930 ***