Bug 1353103

Summary: Kernel 4.5.7-202 doesn't boot on Samsung Notebook 9
Product: [Fedora] Fedora
Reporter: Andy Grover <agrover>
Component: microcode_ctl
Assignee: Anton Arapov <anton>
Status: CLOSED ERRATA
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 23
CC: agrover, andreas.tunek, anton, bugzilla77, chorn, coquelin.max, craig, edgar.hoch, el, erik-fedora, gansalmon, gilani, hugh, imbacen, itamar, jhocutt, j.kylmala, jonathan, kernel-maint, madhu.chinakonda, mail, marco.laverdiere, mchehab, mhorauer, poros, qguo, rc556677, richard.c.jankowski, sbonazzo, sergio, xaviblas, yjog, zukka77
Hardware: Unspecified
OS: Unspecified
Fixed In Version: microcode_ctl-2.1-13.fc24 microcode_ctl-2.1-13.fc23
Last Closed: 2016-07-22 18:21:36 UTC
Type: Bug
Bug Blocks: 1361183

Attachments:
Description                                       Flags
dmidecode output                                  none
dnf history since working kernel was installed    none

Description Andy Grover 2016-07-06 06:26:00 UTC
Created attachment 1176738
dmidecode output

Description of problem:
Doesn't boot. I just get a blinking cursor even after removing 'rhgb quiet' from kernel cmdline.

This laptop is skylake with intel graphics, see dmi info.

4.5.7-200 works.

Version-Release number of selected component (if applicable):
kernel-4.5.7-202.x64

How reproducible:
always

Additional info:
UEFI issue? (Some failure mode where there's no boot/kernel output at all)

Comment 1 Andy Grover 2016-07-06 06:39:07 UTC
installed 4.6.3-200 from koji and it doesn't work either.

Comment 2 Josh Boyer 2016-07-06 12:02:36 UTC
If 4.5.7-200 works, that makes very little sense.  The changes between -200 and -202 are very very limited in scope to CVE fixes only.

What else was updated between the time -200 and -202 were installed?  It should be in your update log.

Comment 3 Andy Grover 2016-07-06 17:04:44 UTC
This looks suspicious to me:

initramfs-4.5.6-200.fc23.x86_64.img:                     gzip compressed data, max compression, from Unix
initramfs-4.5.7-200.fc23.x86_64.img:                     gzip compressed data, max compression, from Unix
initramfs-4.5.7-202.fc23.x86_64.img:                     ASCII cpio archive (SVR4 with no CRC)
initramfs-4.6.3-200.fc23.x86_64.img:                     ASCII cpio archive (SVR4 with no CRC)

why would this have changed?

Comment 4 Andy Grover 2016-07-06 17:06:34 UTC
Created attachment 1176950
dnf history since working kernel was installed

Comment 5 Josh Boyer 2016-07-06 17:12:22 UTC
(In reply to Andy Grover from comment #3)
> This looks suspicious to me:
> 
> initramfs-4.5.6-200.fc23.x86_64.img:                     gzip compressed
> data, max compression, from Unix
> initramfs-4.5.7-200.fc23.x86_64.img:                     gzip compressed
> data, max compression, from Unix
> initramfs-4.5.7-202.fc23.x86_64.img:                     ASCII cpio archive
> (SVR4 with no CRC)
> initramfs-4.6.3-200.fc23.x86_64.img:                     ASCII cpio archive
> (SVR4 with no CRC)
> 
> why would this have changed?

That's actually a good find.  The file type reported for 4.5.6 and 4.5.7-200 is indicative of an initramfs that lacks early microcode updates tacked on.  The latter two indicate that it does have microcode tacked on.

Looking at your dnf log, we find that microcode_ctl was updated between 4.5.7-200 and 4.5.7-202.

    Upgraded microcode_ctl-2:2.1-10.fc23.x86_64                       @updates

I'm now wondering if that is explicitly the problem here and the ucode that is loaded early (and it is very early in the boot process) is causing the issues.
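
The file-type check above can be sketched as a small helper. This is only a rough heuristic, and the function names are made up for illustration: it assumes that an uncompressed cpio archive at the start of the image means an early microcode cpio was prepended, which is the usual case with dracut but not guaranteed (an initramfs built without compression would look the same).

```shell
# Classify an initramfs from the description printed by file(1).
# "early-cpio"      : image starts with an uncompressed cpio archive,
#                     which usually means early microcode was prepended
# "compressed-only" : image starts directly with compressed data
classify_initramfs() {
    case "$1" in
        *"cpio archive"*)    echo "early-cpio" ;;
        *"compressed data"*) echo "compressed-only" ;;
        *)                   echo "unknown" ;;
    esac
}

# Hypothetical helper: classify every image in a directory (default /boot).
scan_initramfs_dir() {
    dir=${1:-/boot}
    for img in "$dir"/initramfs-*.img; do
        [ -f "$img" ] || continue
        printf '%s: %s\n' "$img" "$(classify_initramfs "$(file -b "$img")")"
    done
}
```

Running scan_initramfs_dir on the affected machine should flag the 4.5.7-202 and 4.6.3-200 images as "early-cpio" and the two older ones as "compressed-only".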

Comment 6 Josh Boyer 2016-07-06 17:15:38 UTC
Anton, have you heard anything about the new Intel microcode update causing boot issues on skylake platforms?

Comment 7 Josh Boyer 2016-07-06 17:29:02 UTC
Andy, can you try booting with 'dis_ucode_ldr' added to the kernel command line?

Comment 8 Andy Grover 2016-07-06 18:07:34 UTC
(In reply to Josh Boyer from comment #7)
> Andy, can you try booting with 'dis_ucode_ldr' added to the kernel command
> line?

Works!

Comment 9 Josh Boyer 2016-07-06 18:14:58 UTC
(In reply to Andy Grover from comment #8)
> (In reply to Josh Boyer from comment #7)
> > Andy, can you try booting with 'dis_ucode_ldr' added to the kernel command
> > line?
> 
> Works!

Well, that's both good and bad.  It's good because we know the cause.  It's bad because if I'm understanding the bugs I found in Arch and Debian, the only way to fix it is via a BIOS/UEFI update for your machine and that has to come from your vendor.

So dis_ucode_ldr is the workaround, but I'm not sure there's going to be a solution beyond "update your firmware when the vendor fixes it."

Comment 11 Andy Grover 2016-07-06 19:00:00 UTC
OK thanks for the info. FWIW my vendor did have a bios update, and it now reports:

microcode: CPU0 sig=0x406e3, pf=0x80, revision=0x88

whereas before revision was 0x82. Still needs the dis_ucode_ldr to boot.

OK I'll stay on top of future vendor updates.
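
For anyone comparing revisions before and after a firmware update, here is a minimal sketch that pulls the revision out of kernel log lines in the format shown above. The function name is made up for illustration, and the pattern only matches this exact "microcode: CPUn sig=..., pf=..., revision=..." layout, which may differ on other kernel versions.

```shell
# Read kernel log lines on stdin and print the distinct microcode
# revisions reported by the per-CPU "microcode: CPUn ..." lines.
ucode_revisions() {
    sed -n 's/.*microcode: CPU[0-9][0-9]* sig=[^,]*, pf=[^,]*, revision=\(0x[0-9a-fA-F]*\).*/\1/p' | sort -u
}

# Typical use on a running system:
#   dmesg | ucode_revisions
```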

Comment 12 Anton Arapov 2016-07-07 09:00:05 UTC
Andy, Josh, ... there is indeed no way to fix this until Intel fixes it in the microcode. Do we want to revert this change? Or can we temporarily blacklist/disable it?

Comment 13 Josh Boyer 2016-07-07 11:25:22 UTC
I'm not aware of any way to blacklist it.  The only ways I know of to disable it are per-machine solutions, like the dis_ucode_ldr cmdline option or rebuilding the initramfs so that the microcode is not included.
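
As a sketch of the per-machine cmdline workaround: the helper below (a made-up name, not part of any package) appends an argument to a kernel command-line string only if it is not already there. Making the change persistent is a separate step; on Fedora, grubby is one way to do it, shown here as a comment under the assumption that grubby manages the boot entries on the machine.

```shell
# Append a kernel argument to a command-line string unless it is
# already present as a whole word.
add_karg() {
    cmdline=$1
    arg=$2
    case " $cmdline " in
        *" $arg "*) printf '%s\n' "$cmdline" ;;
        *)          printf '%s\n' "${cmdline:+$cmdline }$arg" ;;
    esac
}

# Persistent variant (assumes grubby is in use; run as root):
#   grubby --update-kernel=ALL --args=dis_ucode_ldr
# and to undo it once a firmware or microcode fix is installed:
#   grubby --update-kernel=ALL --remove-args=dis_ucode_ldr
```

For a one-off test it is enough to edit the entry in the GRUB menu and add dis_ucode_ldr to the end of the linux line by hand.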

Comment 14 Joonas Kylmälä 2016-07-07 13:05:38 UTC
I had this same problem on a Lenovo ThinkPad x260 laptop (the same processor: Skylake i5-6200U) with the 4.6.3-300.fc24.x86_64 kernel. Upgrading the BIOS/UEFI and reinstalling Fedora 24 in UEFI mode let me boot the system normally. I'm not sure whether the BIOS/UEFI upgrade or switching Fedora to UEFI mode did the trick, but according to you it seems to have been the BIOS/UEFI upgrade. Also, to note, with the 4.5.5-300.fc24.x86_64 kernel Fedora booted normally with the old BIOS/UEFI version, so some regression has happened between 4.5.5-300 and 4.6.3-300.

Comment 15 Josh Boyer 2016-07-07 13:10:29 UTC
(In reply to Joonas Kylmälä from comment #14)
> I had this same problem on Lenovo ThinkPad x260 laptop (the same processor:
> skylake i5-6200U) with 4.6.3-300.fc24.x86_64 kernel. Upgrading BIOS/UEFI and
> reinstalling Fedora 24 with UEFI mode let me boot the system normally. Not
> sure if the BIOS/UEFI upgrade did the trick or changing the Fedora to UEFI
> mode but according to you it seems like the BIOS/UEFI upgrade did the trick.
> Also, to note, with the 4.5.5-300.fc24.x86_64 kernel Fedora booted normally
> with the old BIOS/UEFI version, so some regression has happened between
> 4.5.5-300 and 4.6.3-300.

No, this is not a kernel problem.  What happened in your case is that you reinstalled.  The installation media uses an initramfs that does not contain the problematic microcode.

There is nothing we can do in the kernel to fix this.

Comment 16 ell1e 2016-07-07 15:38:16 UTC
Lenovo Thinkpad Yoga 260 is also affected. What do I need to do to fix this? Upgrade the UEFI/BIOS firmware?

Comment 17 Josh Boyer 2016-07-07 15:38:45 UTC
*** Bug 1353586 has been marked as a duplicate of this bug. ***

Comment 18 Josh Boyer 2016-07-07 15:53:20 UTC
(In reply to Jonas Thiem from comment #16)
> Lenovo Thinkpad Yoga 260 is also affected. What do I need to do to fix this?
> Upgrade the UEFI/BIOS firmware?

If one is available, it is certainly worth a try updating it.

Comment 19 Andy Grover 2016-07-07 17:09:06 UTC
ok let me get this straight:

1. CPUs have bugs
2. Intel fixes some of these bugs with microcode updates that need to be reloaded every time after poweroff
3. The system firmware can install a microcode update
4. microcode_ctl tries to install some (more recent?) microcode update

Josh, Anton: You're saying the reason for this problem is the microcode we're installing in step 4 is bad? Or the new version is assuming some precondition is met by the firmware so that I need a new firmware rev to work with the most current microcode rev?

The current solutions are to either
a. Add 'dis_ucode_ldr' to the kernel command line so #4 is skipped
b. uninstall microcode_ctl and rebuild initrd with 'dracut -f --kver 4.5.7-202.fc23.x86_64'

with the understanding that we now might not have the latest, greatest microcode (we're solely relying on our firmware to install it, which it may not)

Yes?

Comment 20 Josh Boyer 2016-07-07 17:16:59 UTC
(In reply to Andy Grover from comment #19)
> ok let me get this straight:

Pretty close.  Some clarifications.

> 1. CPUs have bugs
> 2. Intel fixes some of these bugs with microcode updates that need to be
> reloaded every time after poweroff
> 3. The system firmware can install a microcode update
> 4. microcode_ctl tries to install some (more recent?) microcode update

Should be:

3. The system firmware is often shipped with microcode included, and loads it during system initialization before even starting the bootloader, etc.
4. Future microcode can be released stand-alone, which can then be loaded by the kernel very early in kernel boot which (normally) allows a machine to get the latest microcode without having to do a full system firmware update.
5. microcode_ctl is the package that distributes said microcode releases
6. Dracut will include microcode in the initramfs if it is present at the time of initramfs creation, and the kernel will load it from that extremely early in boot.

> Josh, Anton: You're saying the reason for this problem is the microcode
> we're installing in step 4 is bad? Or the new version is assuming some
> precondition is met by the firmware so that I need a new firmware rev to
> work with the most current microcode rev?

It is difficult to tell which one of those scenarios is true.  It might be a bit of both, but the latter is given more credence since the new ucode seems to work with some firmware updates from some vendors. 

> The current solutions are to either
> a. Add 'dis_ucode_ldr' to the kernel command line so #4 is skipped
> b. uninstall microcode_ctl and rebuild initrd with 'dracut -f --kver
> 4.5.7-202.fc23.x86_64'

(or downgrade microcode_ctl rather than uninstall it)

and

c. find a system firmware update from your vendor and apply that to see if it works.

> with the understanding that we now might not have the latest, greatest
> microcode (we're solely relying on our firmware to install it, which it may
> not)
> 
> Yes?

You essentially have that all correct, yes.
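
To make option (b) concrete, here is a dry-run sketch that only prints the commands instead of running them. The function name is made up for illustration. The "downgrade" mode follows what is described above; the "skip" mode is an assumption that goes beyond this thread: it relies on dracut's --no-early-microcode option to leave the microcode out of the image while keeping the package installed.

```shell
# Print (do not run) the commands for the initramfs workaround.
#   downgrade KVER : downgrade microcode_ctl, then rebuild the initramfs
#   skip KVER      : ASSUMPTION, not from this thread: keep the package
#                    and tell dracut not to prepend early microcode
plan_initramfs_rebuild() {
    mode=$1
    kver=$2
    case "$mode" in
        downgrade)
            printf 'dnf downgrade microcode_ctl\n'
            printf 'dracut -f --kver %s\n' "$kver"
            ;;
        skip)
            printf 'dracut -f --no-early-microcode --kver %s\n' "$kver"
            ;;
        *)
            echo "usage: plan_initramfs_rebuild downgrade|skip KVER" >&2
            return 1
            ;;
    esac
}
```

For example, plan_initramfs_rebuild skip 4.5.7-202.fc23.x86_64 prints the single dracut command to review and then run as root. Either way the machine is left with whatever microcode the firmware loads, as noted above.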

Comment 21 Erik van Pienbroek 2016-07-07 18:46:36 UTC
I can confirm that this issue also exists on the HP Elitebook 850 G3.
BIOS versions 1.04 and 1.05 are affected by this bug. Updating to BIOS version 1.07 resolves the issue.

Comment 22 Adam Williamson 2016-07-07 21:15:46 UTC
*** Bug 1352700 has been marked as a duplicate of this bug. ***

Comment 23 Martin Horauer 2016-07-08 08:03:23 UTC
A BIOS update for my Lenovo T460s fixed this issue.

http://thinkwiki.de/BIOS-Update_ohne_optisches_Laufwerk_unter_Linux

Comment 24 mlaverdiere 2016-07-08 12:23:39 UTC
On an Asus UX305CA, upgrading the BIOS to version 300 has solved the non-booting problem with kernel 4.6.3 on Fedora 24 (I have always been able to boot with kernel 4.5.7 though).

Comment 25 Josh Boyer 2016-07-08 12:32:42 UTC
*** Bug 1351943 has been marked as a duplicate of this bug. ***

Comment 26 Josh Boyer 2016-07-15 13:20:26 UTC
*** Bug 1353061 has been marked as a duplicate of this bug. ***

Comment 27 Richard Chan 2016-07-18 02:47:29 UTC
*** Bug 1357317 has been marked as a duplicate of this bug. ***

Comment 28 Richard Chan 2016-07-18 03:14:11 UTC
On an Asus UX305UA, removing "load_video" from grub menu works.
Curious: why does load_video trigger the failure?

UX305UA does not have an updated BIOS :-(

BOOT_IMAGE=/vmlinuz-4.6.4-301.fc24.x86_64 root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap video=1920x1080 LANG=en_US.UTF-8


microcode_ctl-2.1-12.fc24.x86_64


[    0.000000] microcode: microcode updated early to revision 0x8a, date = 2016-04-06
[    0.724234] microcode: CPU0 sig=0x406e3, pf=0x80, revision=0x8a
[    0.724608] microcode: CPU1 sig=0x406e3, pf=0x80, revision=0x8a
[    0.724976] microcode: CPU2 sig=0x406e3, pf=0x80, revision=0x8a
[    0.725475] microcode: CPU3 sig=0x406e3, pf=0x80, revision=0x8a
[    0.725857] microcode: Microcode Update Driver: v2.01 <tigran.co.uk>, Peter Oruba

Comment 29 Richard Chan 2016-07-18 04:28:45 UTC
Sorry for the noise; load_video was a red herring.

The "success" actually came from booting into 4.5.7-300 first and then warm booting into 4.6.4-301. In that case the early microcode update seemed to work.

Comment 30 Rich Jankowski 2016-07-19 00:24:17 UTC
The BIOS update fixed this issue on my 2016 X1 Carbon.

Comment 31 Josh Boyer 2016-07-19 14:08:00 UTC
*** Bug 1357862 has been marked as a duplicate of this bug. ***

Comment 32 Sandro Bonazzola 2016-07-20 19:27:18 UTC
(In reply to Martin Horauer from comment #23)
> A BIOS update for my Lenovo T460s fixed this issue.
> 
> http://thinkwiki.de/BIOS-Update_ohne_optisches_Laufwerk_unter_Linux

Can you provide English instructions?
I've a T460s and have the same exact issue.

BIOS Information
        Vendor: LENOVO
        Version: N1CET37W (1.05 )
        Release Date: 01/15/2016

Comment 33 Martin Horauer 2016-07-20 19:47:01 UTC
If you have Windows on your T460s go to the official Lenovo page and download the latest BIOS update along with their update utility.

If not (as in my case) you can use the commands listed on the German wiki page above. The steps are:

(1) Download the latest BIOS Update, e.g.: 

https://download.lenovo.com/pccbbs/mobiles/n1cur06w.iso

(2) Obtain geteltorito, a Perl script that extracts the El Torito boot image from an ISO:

wget https://userpages.uni-koblenz.de/~krienke/ftp/noarch/geteltorito/geteltorito.pl

(3) Create a bootable image:

perl geteltorito.pl -o thinkpadbios.img n1cur06w.iso

(4) Plug an empty USB stick into your computer and run the following commands (replace sdX with the device name your USB stick shows up under; everything on it will be overwritten).

sudo dd if=thinkpadbios.img of=/dev/sdX bs=1M
sync

(5) Boot from the USB stick and do the BIOS update.

Cross your fingers and you are (hopefully) done.
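
Since a wrong device name in step (4) destroys whatever is on that disk, a small guard in front of dd may be worth having. The sketch below uses made-up function names and only prints the commands. It assumes the stick shows up as a /dev/sdX device, and it cannot tell a USB stick from an internal disk, so the name should still be checked with lsblk first.

```shell
# Accept only whole-disk names such as /dev/sdb; reject partitions
# (/dev/sdb1) and anything that is not /dev/sd<letter>.
is_whole_sd_disk() {
    case "$1" in
        /dev/sd[a-z]) return 0 ;;
        *)            return 1 ;;
    esac
}

# Print (do not run) the write commands from step (4) for an image
# and a target device, refusing obviously wrong targets.
write_image_cmds() {
    img=$1
    dev=$2
    if ! is_whole_sd_disk "$dev"; then
        echo "refusing: $dev is not a whole /dev/sdX disk" >&2
        return 1
    fi
    printf 'sudo dd if=%s of=%s bs=1M\nsync\n' "$img" "$dev"
}
```

For example, write_image_cmds thinkpadbios.img /dev/sdb prints the two commands from step (4), while /dev/sdb1 is refused.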

Comment 34 Martin Horauer 2016-07-20 19:57:53 UTC
(In reply to Sandro Bonazzola from comment #32)
> (In reply to Martin Horauer from comment #23)
> > A BIOS update for my Lenovo T460s fixed this issue.
> > 
> > http://thinkwiki.de/BIOS-Update_ohne_optisches_Laufwerk_unter_Linux
> 
> Can you provide english instructions?
> I've a T460s and have the same exact issue.
> 
> BIOS Information
>         Vendor: LENOVO
>         Version: N1CET37W (1.05 )
>         Release Date: 01/15/2016

Sorry I should have replied. See my comment 33.

Comment 35 Stefan Midjich 2016-07-21 07:32:10 UTC
I had the same problem on a Thinkpad x260, but I followed Martin Horauer's instructions, replacing the Lenovo BIOS update ISO with the one for my own laptop model.

After the BIOS upgrade I deleted the dis_ucode_ldr workaround from the boot params and everything worked fine, even with the latest kernel.

Comment 36 Christian Horn 2016-07-21 08:04:14 UTC
(In reply to Martin Horauer from comment #23)
> A BIOS update for my Lenovo T460s fixed this issue.

+1
Installing the currently available BIOS 1.13 on the T460s fixes the issue.  The T460s here was shipped just 2 weeks ago, but still with a BIOS from March which had the issue.

Comment 37 Fedora Update System 2016-07-21 10:17:29 UTC
microcode_ctl-2.1-13.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-17e40fd8da

Comment 38 Fedora Update System 2016-07-21 10:17:54 UTC
microcode_ctl-2.1-13.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a596f3c268

Comment 39 Fedora Update System 2016-07-21 18:48:28 UTC
microcode_ctl-2.1-13.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a596f3c268

Comment 40 Fedora Update System 2016-07-21 18:52:16 UTC
microcode_ctl-2.1-13.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-17e40fd8da

Comment 41 Andreas Tunek 2016-07-22 06:39:54 UTC
microcode_ctl-2.1-13.fc24 and Linux 4.6.4-301.fc24 work together.

Comment 42 Andreas Tunek 2016-07-22 06:40:57 UTC
On my Asus UX305CA with old bios.

Comment 43 Fedora Update System 2016-07-22 18:21:24 UTC
microcode_ctl-2.1-13.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.

Comment 44 imbacen 2016-07-23 10:51:11 UTC
Just to clarify, can I safely update via dnf now or do I need to apply the microcode first?

Comment 45 Edgar Hoch 2016-07-23 23:20:09 UTC
(In reply to imbacen from comment #44)
> Just to clarify, can I safely update via dnf now or do I need to apply the
> microcode first?

Yes, you can safely update via dnf.

Package microcode_ctl-2.1-13.fc24 contains no package-specific scriptlet, so an update only changes the files on disk. Microcode is only loaded at boot, so installing the package changes nothing on the CPU.

To use the new microcode, the initrd for the kernel you boot needs to be (re)created after the new microcode_ctl package is installed. This is done automatically when a new kernel package is installed, or you can do it manually using dracut (see "man dracut"). For example, for the currently running kernel, run

sudo dracut --force
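
If several kernels are installed, each one has its own initrd. A minimal sketch (the function name is made up for illustration) that prints one dracut command per kernel version found under /lib/modules, so the list can be reviewed before running anything as root:

```shell
# Print (do not run) one "dracut --force --kver" command for every
# kernel version directory found under the given modules directory.
dracut_cmds_for_all() {
    moddir=${1:-/lib/modules}
    for d in "$moddir"/*/; do
        [ -d "$d" ] || continue
        printf 'dracut --force --kver %s\n' "$(basename "$d")"
    done
}
```

This assumes every directory under /lib/modules belongs to an installed kernel, which may not hold if leftovers from removed kernels are still around.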

Comment 46 Sergio Basto 2016-07-30 18:50:38 UTC
microcode_ctl-2.1-13.fc23 did not fix my issue; it went away only after downgrading to microcode_ctl-2.1-10.fc23.x86_64

Comment 47 Sergio Basto 2016-08-15 02:14:57 UTC
(In reply to Sergio Monteiro Basto from comment #46)
> microcode_ctl-2.1-13.fc23 did not fix my issue; it went away only after
> downgrading to microcode_ctl-2.1-10.fc23.x86_64

After many attempts, my case does not change with a microcode_ctl upgrade or downgrade.

Comment 48 Fedora Update System 2016-08-18 00:51:25 UTC
microcode_ctl-2.1-13.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.

Comment 49 xavi 2018-07-23 12:27:00 UTC
Hello all, I have a T430u and I can boot on:

4.9.0-0
4.9.0-4

But I cannot boot on:

4.9.0-6
4.9.0-7

I noticed I can boot 4.9.0-7 in recovery mode. The only difference seems to be the "single" parameter on the kernel command line. With that parameter I can boot.

I tried installing microcode and adding dis_unicode_ldr to the command line, but had no success; it still cannot boot.

I also can't find a BIOS update ISO for my model, which I would need to update the BIOS without Windows. Would it be possible using VirtualBox?

My system runs Debian; here is the relevant data:

$ uname -a
Linux d2015 4.9.0-7-amd64 #1 SMP Debian 4.9.110-1 (2018-07-05) x86_64 GNU/Linux

# dmidecode -s bios-version
H6ET24WW (1.05 )

$ dpkg -l |grep microcode
ii  amd64-microcode                                             3.20160316.3                                                amd64        Processor microcode firmware for AMD CPUs
ii  intel-microcode                                             3.20180425.1~deb9u1                                         amd64        Processor microcode firmware for Intel CPUs
ii  iucode-tool                                                 2.1.1-1                                                     amd64        Intel processor microcode tool
ii  microcode.ctl                                               1.18~0+nmu2                                                 amd64        Intel IA32/IA64 CPU Microcode Utility (transitional package)


Thanks.

Comment 50 xavi 2018-07-23 12:46:38 UTC
(In reply to xavi from comment #49)
> Hello all, I have a T430u and I can boot on:
> 
> 4.9.0-0
> 4.9.0-4
> 
> But I cannot boot on:
> 
> 4.9.0-6
> 4.9.0-7
> 
> I noticed I can boot in 4.9.0-7 on recovery mode. The only difference seem
> to be the "single" param in linux kernel boot. With that param I can boot.
> 
> I tried installing microcode and writing the dis_unicode_ldr but have no
> success, it cannot boot.
> 
> I also don't find the ISO bios update of my model for updating bios without
> having windows. Is it possible using virtualbox?
> 
> My system is a Debian and there's relevant data:
> 
> $ uname -a
> Linux d2015 4.9.0-7-amd64 #1 SMP Debian 4.9.110-1 (2018-07-05) x86_64
> GNU/Linux
> 
> # dmidecode -s bios-version
> H6ET24WW (1.05 )
> 
> $ dpkg -l |grep microcode
> ii  amd64-microcode                                             3.20160316.3
> amd64        Processor microcode firmware for AMD CPUs
> ii  intel-microcode                                            
> 3.20180425.1~deb9u1                                         amd64       
> Processor microcode firmware for Intel CPUs
> ii  iucode-tool                                                 2.1.1-1     
> amd64        Intel processor microcode tool
> ii  microcode.ctl                                               1.18~0+nmu2 
> amd64        Intel IA32/IA64 CPU Microcode Utility (transitional package)
> 
> 
> Thanks.

Wow! I noticed now that just removing the "quiet" from kernel boot line, it works.