Bug 1353103 - Kernel 4.5.7-202 doesn't boot on Samsung Notebook 9
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: microcode_ctl
Version: 23
Hardware: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Anton Arapov
QA Contact: Fedora Extras Quality Assurance
Duplicates: 1351943 1352700 1353061 1353586 1357317 1357862
Blocks: 1361183
Reported: 2016-07-06 02:26 EDT by Andy Grover
Modified: 2016-08-17 20:51 EDT
Fixed In Version: microcode_ctl-2.1-13.fc24 microcode_ctl-2.1-13.fc23
Clones: 1361183
Last Closed: 2016-07-22 14:21:36 EDT
Type: Bug

Attachments
dmidecode output (23.18 KB, text/plain)
2016-07-06 02:26 EDT, Andy Grover
dnf history since working kernel was installed (26.85 KB, text/plain)
2016-07-06 13:06 EDT, Andy Grover

Description Andy Grover 2016-07-06 02:26:00 EDT
Created attachment 1176738 [details]
dmidecode output

Description of problem:
Doesn't boot. I just get a blinking cursor even after removing 'rhgb quiet' from kernel cmdline.

This laptop is Skylake with Intel graphics; see the attached dmidecode output.

4.5.7-200 works.

Version-Release number of selected component (if applicable):
kernel-4.5.7-202.x64

How reproducible:
always

Additional info:
UEFI issue? (Some failure mode where there's no boot/kernel output at all)
Comment 1 Andy Grover 2016-07-06 02:39:07 EDT
installed 4.6.3-200 from koji and it doesn't work either.
Comment 2 Josh Boyer 2016-07-06 08:02:36 EDT
If 4.5.7-200 works, that makes very little sense.  The changes between -200 and -202 are very very limited in scope to CVE fixes only.

What else was updated between the time -200 and -202 were installed?  It should be in your update log.
Comment 3 Andy Grover 2016-07-06 13:04:44 EDT
This looks suspicious to me:

initramfs-4.5.6-200.fc23.x86_64.img:                     gzip compressed data, max compression, from Unix
initramfs-4.5.7-200.fc23.x86_64.img:                     gzip compressed data, max compression, from Unix
initramfs-4.5.7-202.fc23.x86_64.img:                     ASCII cpio archive (SVR4 with no CRC)
initramfs-4.6.3-200.fc23.x86_64.img:                     ASCII cpio archive (SVR4 with no CRC)

why would this have changed?
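
The listing above can be turned into a quick check. What follows is only a minimal sketch: the function name is made up for illustration, and it looks at nothing but the text that `file` prints. An image that starts with an uncompressed cpio archive has something prepended to the compressed payload; the sketch does not prove that the prepended part is microcode.

```shell
# Hypothetical helper (not part of any package): classify one line of
# `file` output for an initramfs image.
classify_initramfs() {
    case "$1" in
        *"cpio archive"*)    echo "early-cpio" ;;       # uncompressed cpio comes first
        *"compressed data"*) echo "compressed-only" ;;  # image starts with compressed data
        *)                   echo "unknown" ;;
    esac
}

# Possible use on a live system (the /boot path is an assumption):
#   for img in /boot/initramfs-*.img; do
#       printf '%s: %s\n' "$img" "$(classify_initramfs "$(file -b "$img")")"
#   done
```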
Comment 4 Andy Grover 2016-07-06 13:06 EDT
Created attachment 1176950 [details]
dnf history since working kernel was installed
Comment 5 Josh Boyer 2016-07-06 13:12:22 EDT
(In reply to Andy Grover from comment #3)
> This looks suspicious to me:
> 
> initramfs-4.5.6-200.fc23.x86_64.img:                     gzip compressed
> data, max compression, from Unix
> initramfs-4.5.7-200.fc23.x86_64.img:                     gzip compressed
> data, max compression, from Unix
> initramfs-4.5.7-202.fc23.x86_64.img:                     ASCII cpio archive
> (SVR4 with no CRC)
> initramfs-4.6.3-200.fc23.x86_64.img:                     ASCII cpio archive
> (SVR4 with no CRC)
> 
> why would this have changed?

That's actually a good find.  The file type reported for 4.5.6 and 4.5.7-200 indicates an initramfs that lacks early microcode updates tacked on.  The latter two indicate that it does have microcode tacked on.

Looking at your dnf log, we find that microcode_ctl was updated between 4.5.7-200 and 4.5.7-202.

    Upgraded microcode_ctl-2:2.1-10.fc23.x86_64                       @updates

I'm now wondering if that is explicitly the problem here and the ucode that is loaded early (and it is very early in the boot process) is causing the issues.
Comment 6 Josh Boyer 2016-07-06 13:15:38 EDT
Anton, have you heard anything about the new Intel microcode update causing boot issues on skylake platforms?
Comment 7 Josh Boyer 2016-07-06 13:29:02 EDT
Andy, can you try booting with 'dis_ucode_ldr' added to the kernel command line?
Comment 8 Andy Grover 2016-07-06 14:07:34 EDT
(In reply to Josh Boyer from comment #7)
> Andy, can you try booting with 'dis_ucode_ldr' added to the kernel command
> line?

Works!
Comment 9 Josh Boyer 2016-07-06 14:14:58 EDT
(In reply to Andy Grover from comment #8)
> (In reply to Josh Boyer from comment #7)
> > Andy, can you try booting with 'dis_ucode_ldr' added to the kernel command
> > line?
> 
> Works!

Well, that's both good and bad.  It's good because we know the cause.  It's bad because if I'm understanding the bugs I found in Arch and Debian, the only way to fix it is via a BIOS/UEFI update for your machine and that has to come from your vendor.

So dis_ucode_ldr is the workaround, but I'm not sure there's going to be a solution beyond "update your firmware when the vendor fixes it."
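
For anyone applying the workaround, here is a rough sketch of what "adding dis_ucode_ldr to the kernel command line" amounts to. The helper name is hypothetical and it only edits a string. The grubby commands in the comments are the usual Fedora way to make the change persistent, assuming grubby is installed on your system.

```shell
# Hypothetical helper: append a kernel argument to a command-line string
# unless it is already present as a whole word.
add_kernel_arg() {
    cmdline=$1
    arg=$2
    if [ -z "$cmdline" ]; then
        printf '%s\n' "$arg"
        return
    fi
    case " $cmdline " in
        *" $arg "*) printf '%s\n' "$cmdline" ;;       # already there, leave unchanged
        *)          printf '%s\n' "$cmdline $arg" ;;  # append at the end
    esac
}

# One-off test: at the GRUB menu press 'e', append dis_ucode_ldr to the
# line that starts with "linux", then boot the edited entry.
#
# Persistent change (run as root; assumes grubby is available):
#   grubby --update-kernel=ALL --args="dis_ucode_ldr"
# Undo it later, once firmware or microcode has been fixed:
#   grubby --update-kernel=ALL --remove-args="dis_ucode_ldr"
```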
Comment 11 Andy Grover 2016-07-06 15:00:00 EDT
OK, thanks for the info. FWIW my vendor did have a BIOS update, and the kernel now reports:

microcode: CPU0 sig=0x406e3, pf=0x80, revision=0x88

whereas before the revision was 0x82. It still needs dis_ucode_ldr to boot.

OK I'll stay on top of future vendor updates.
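
To compare revisions before and after a firmware update, a small sketch like the following may help. Both function names are invented for illustration. The parser only understands per-CPU lines of the form shown above (with `revision=0x..`), not other microcode messages.

```shell
# Hypothetical helper: print the revision field of a kernel log line such as
#   microcode: CPU0 sig=0x406e3, pf=0x80, revision=0x88
# Prints nothing if the line has no "revision=0x..." field.
microcode_revision() {
    printf '%s\n' "$1" | sed -n 's/.*revision=\(0x[0-9a-fA-F]*\).*/\1/p'
}

# Hypothetical helper: succeed if revision $1 is numerically newer than $2.
revision_newer() {
    [ "$(( $1 ))" -gt "$(( $2 ))" ]
}

# Possible use on a live system:
#   dmesg | grep 'microcode: CPU0'
```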
Comment 12 Anton Arapov 2016-07-07 05:00:05 EDT
Andy, Josh, ... there is indeed no way to fix this until Intel fixes it in the microcode. Do we want to revert this change? Or can we temporarily blacklist/disable it?
Comment 13 Josh Boyer 2016-07-07 07:25:22 EDT
I'm not aware of any way to blacklist it.  The only ways I know to disable it are per-machine solutions, like the dis_ucode_ldr cmdline option or rebuilding the initramfs without the microcode included.
Comment 14 Joonas Kylmälä 2016-07-07 09:05:38 EDT
I had this same problem on a Lenovo ThinkPad x260 laptop (the same processor: Skylake i5-6200U) with the 4.6.3-300.fc24.x86_64 kernel. Upgrading the BIOS/UEFI and reinstalling Fedora 24 in UEFI mode let me boot the system normally. I'm not sure whether the BIOS/UEFI upgrade or switching Fedora to UEFI mode did the trick, but according to you it seems the BIOS/UEFI upgrade did. Also, note that with the 4.5.5-300.fc24.x86_64 kernel Fedora booted normally with the old BIOS/UEFI version, so some regression has happened between 4.5.5-300 and 4.6.3-300.
Comment 15 Josh Boyer 2016-07-07 09:10:29 EDT
(In reply to Joonas Kylmälä from comment #14)
> I had this same problem on Lenovo ThinkPad x260 laptop (the same processor:
> skylake i5-6200U) with 4.6.3-300.fc24.x86_64 kernel. Upgrading BIOS/UEFI and
> reinstalling Fedora 24 with UEFI mode let me boot the system normally. Not
> sure if the BIOS/UEFI upgrade did the trick or changing the Fedora to UEFI
> mode but according to you it seems like the BIOS/UEFI upgrade did the trick.
> Also, to note, with the 4.5.5-300.fc24.x86_64 kernel Fedora booted normally
> with the old BIOS/UEFI version, so some regression has happened between
> 4.5.5-300 and 4.6.3-300.

No, this is not a kernel problem.  What happened in your case is that you reinstalled.  The installation media uses an initramfs that does not contain the problematic microcode.

There is nothing we can do in the kernel to fix this.
Comment 16 Jonas Thiem 2016-07-07 11:38:16 EDT
Lenovo Thinkpad Yoga 260 is also affected. What do I need to do to fix this? Upgrade the UEFI/BIOS firmware?
Comment 17 Josh Boyer 2016-07-07 11:38:45 EDT
*** Bug 1353586 has been marked as a duplicate of this bug. ***
Comment 18 Josh Boyer 2016-07-07 11:53:20 EDT
(In reply to Jonas Thiem from comment #16)
> Lenovo Thinkpad Yoga 260 is also affected. What do I need to do to fix this?
> Upgrade the UEFI/BIOS firmware?

If one is available, updating it is certainly worth a try.
Comment 19 Andy Grover 2016-07-07 13:09:06 EDT
ok let me get this straight:

1. CPUs have bugs
2. Intel fixes some of these bugs with microcode updates that need to be reloaded every time after poweroff
3. The system firmware can install a microcode update
4. microcode_ctl tries to install some (more recent?) microcode update

Josh, Anton: You're saying the reason for this problem is the microcode we're installing in step 4 is bad? Or the new version is assuming some precondition is met by the firmware so that I need a new firmware rev to work with the most current microcode rev?

The current solutions are to either
a. Add 'dis_ucode_ldr' to the kernel command line so #4 is skipped
b. uninstall microcode_ctl and rebuild initrd with 'dracut -f --kver 4.5.7-202.fc23.x86_64'

with the understanding that we now might not have the latest, greatest microcode (we're solely relying on our firmware to install it, which it may not)

Yes?
Comment 20 Josh Boyer 2016-07-07 13:16:59 EDT
(In reply to Andy Grover from comment #19)
> ok let me get this straight:

Pretty close.  Some clarifications.

> 1. CPUs have bugs
> 2. Intel fixes some of these bugs with microcode updates that need to be
> reloaded every time after poweroff
> 3. The system firmware can install a microcode update
> 4. microcode_ctl tries to install some (more recent?) microcode update

Should be:

3. The system firmware is often shipped with microcode included, and loads it during system initialization before even starting the bootloader, etc.
4. Future microcode can be released stand-alone, which can then be loaded by the kernel very early in kernel boot which (normally) allows a machine to get the latest microcode without having to do a full system firmware update.
5. microcode_ctl is the package that distributes said microcode releases
6. Dracut will include microcode in the initramfs if it is present at the time of initramfs creation, and the kernel will load it from that extremely early in boot.

> Josh, Anton: You're saying the reason for this problem is the microcode
> we're installing in step 4 is bad? Or the new version is assuming some
> precondition is met by the firmware so that I need a new firmware rev to
> work with the most current microcode rev?

It is difficult to tell which one of those scenarios is true.  It might be a bit of both, but the latter is given more credence since the new ucode seems to work with some firmware updates from some vendors. 

> The current solutions are to either
> a. Add 'dis_ucode_ldr' to the kernel command line so #4 is skipped
> b. uninstall microcode_ctl and rebuild initrd with 'dracut -f --kver
> 4.5.7-202.fc23.x86_64'

(or downgrade microcode_ctl rather than uninstall it)

and

c. find a system firmware update from your vendor and apply that to see if it works.

> with the understanding that we now might not have the latest, greatest
> microcode (we're solely relying on our firmware to install it, which it may
> not)
> 
> Yes?

You essentially have that all correct, yes.
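
As a sketch of option (b), the helper below only prints the dracut command rather than running it. The function name and the "no-ucode" mode are made up for illustration. The `--no-early-microcode` option is taken from the dracut man page; check `man dracut` on your release before relying on it.

```shell
# Hypothetical helper: print (not run) the dracut command that rebuilds the
# initramfs for one kernel version. With "no-ucode" as the second argument
# it adds --no-early-microcode, so the image is built without prepended
# microcode even while microcode_ctl is still installed.
rebuild_cmd() {
    kver=$1
    mode=${2:-default}
    if [ "$mode" = "no-ucode" ]; then
        echo "dracut -f --no-early-microcode --kver $kver"
    else
        echo "dracut -f --kver $kver"
    fi
}

# Possible use (run the printed command as root):
#   rebuild_cmd "$(uname -r)" no-ucode
# Downgrading instead of uninstalling, as suggested above:
#   dnf downgrade microcode_ctl
```

Either way, the firmware's own microcode remains the only one loaded until the initramfs is rebuilt again with a working microcode package.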
Comment 21 Erik van Pienbroek 2016-07-07 14:46:36 EDT
I can confirm that this issue also exists on the HP Elitebook 850 G3.
BIOS versions 1.04 and 1.05 are affected by this bug. Updating to BIOS version 1.07 resolves the issue.
Comment 22 Adam Williamson 2016-07-07 17:15:46 EDT
*** Bug 1352700 has been marked as a duplicate of this bug. ***
Comment 23 Martin Horauer 2016-07-08 04:03:23 EDT
A BIOS update for my Lenovo T460s fixed this issue.

http://thinkwiki.de/BIOS-Update_ohne_optisches_Laufwerk_unter_Linux
Comment 24 mlaverdiere 2016-07-08 08:23:39 EDT
On an Asus UX305CA, upgrading the BIOS to version 300 has solved the non-booting problem with kernel 4.6.3 on Fedora 24 (I have always been able to boot with kernel 4.5.7 though).
Comment 25 Josh Boyer 2016-07-08 08:32:42 EDT
*** Bug 1351943 has been marked as a duplicate of this bug. ***
Comment 26 Josh Boyer 2016-07-15 09:20:26 EDT
*** Bug 1353061 has been marked as a duplicate of this bug. ***
Comment 27 Richard Chan 2016-07-17 22:47:29 EDT
*** Bug 1357317 has been marked as a duplicate of this bug. ***
Comment 28 Richard Chan 2016-07-17 23:14:11 EDT
On an Asus UX305UA, removing "load_video" from grub menu works.
Curious: why does load_video trigger the failure?

UX305UA does not have an updated BIOS :-(

BOOT_IMAGE=/vmlinuz-4.6.4-301.fc24.x86_64 root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap video=1920x1080 LANG=en_US.UTF-8


microcode_ctl-2.1-12.fc24.x86_64


[    0.000000] microcode: microcode updated early to revision 0x8a, date = 2016-04-06
[    0.724234] microcode: CPU0 sig=0x406e3, pf=0x80, revision=0x8a
[    0.724608] microcode: CPU1 sig=0x406e3, pf=0x80, revision=0x8a
[    0.724976] microcode: CPU2 sig=0x406e3, pf=0x80, revision=0x8a
[    0.725475] microcode: CPU3 sig=0x406e3, pf=0x80, revision=0x8a
[    0.725857] microcode: Microcode Update Driver: v2.01 <tigran@aivazian.fsnet.co.uk>, Peter Oruba
Comment 29 Richard Chan 2016-07-18 00:28:45 EDT
Sorry for the noise; load_video was a red herring.

The "success" actually came from booting into 4.5.7-300 and then warm booting into 4.6.4-301. After that, the early microcode seemed to work.
Comment 30 Rich Jankowski 2016-07-18 20:24:17 EDT
The BIOS update fixed this issue on my 2016 X1 Carbon.
Comment 31 Josh Boyer 2016-07-19 10:08:00 EDT
*** Bug 1357862 has been marked as a duplicate of this bug. ***
Comment 32 Sandro Bonazzola 2016-07-20 15:27:18 EDT
(In reply to Martin Horauer from comment #23)
> A BIOS update for my Lenovo T460s fixed this issue.
> 
> http://thinkwiki.de/BIOS-Update_ohne_optisches_Laufwerk_unter_Linux

Can you provide English instructions?
I have a T460s and have the exact same issue.

BIOS Information
        Vendor: LENOVO
        Version: N1CET37W (1.05 )
        Release Date: 01/15/2016
Comment 33 Martin Horauer 2016-07-20 15:47:01 EDT
If you have Windows on your T460s, go to the official Lenovo page and download the latest BIOS update along with their update utility.

If not (as in my case), you can use the commands listed on the German wiki page above. The steps are:

(1) Download the latest BIOS Update, e.g.: 

https://download.lenovo.com/pccbbs/mobiles/n1cur06w.iso

(2) Obtain the update tool for Linux:

wget https://userpages.uni-koblenz.de/~krienke/ftp/noarch/geteltorito/geteltorito.pl

(3) Create a bootable image:

perl geteltorito.pl -o thinkpadbios.img n1cur06w.iso

(4) Insert an empty USB stick and run the following commands, replacing sdX with the device name your USB stick shows up as.

sudo dd if=thinkpadbios.img of=/dev/sdX bs=1M
sync

(5) Boot from the USB stick and do the BIOS update.

Cross your fingers and you are (hopefully) done.
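
Because the dd step overwrites whatever device it is pointed at, a small guard like the one below can catch the most obvious mistakes first. This is only a sketch: the function name is invented, and treating /dev/sda as the internal disk is an assumption that may not hold on your machine. Confirm the real device name with `lsblk`.

```shell
# Hypothetical guard: accept only whole-disk names /dev/sdb .. /dev/sdz.
# Rejects the literal placeholder /dev/sdX, partitions such as /dev/sdb1,
# and /dev/sda (ASSUMED here to be the internal disk).
is_plausible_usb_target() {
    case "$1" in
        /dev/sdX|/dev/sda) return 1 ;;
        /dev/sd[b-z])      return 0 ;;
        *)                 return 1 ;;
    esac
}

# Possible use before step (4):
#   lsblk                                   # identify the USB stick
#   is_plausible_usb_target /dev/sdb && echo "looks like a whole disk"
```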
Comment 34 Martin Horauer 2016-07-20 15:57:53 EDT
(In reply to Sandro Bonazzola from comment #32)
> (In reply to Martin Horauer from comment #23)
> > A BIOS update for my Lenovo T460s fixed this issue.
> > 
> > http://thinkwiki.de/BIOS-Update_ohne_optisches_Laufwerk_unter_Linux
> 
> Can you provide english instructions?
> I've a T460s and have the same exact issue.
> 
> BIOS Information
>         Vendor: LENOVO
>         Version: N1CET37W (1.05 )
>         Release Date: 01/15/2016

Sorry I should have replied. See my comment 33.
Comment 35 Stefan Midjich 2016-07-21 03:32:10 EDT
I had the same problem on a ThinkPad x260, but I followed Martin Horauer's instructions, replacing the Lenovo BIOS update ISO with the one for my own laptop model.

After the BIOS upgrade I deleted the workaround dis_ucode_ldr from the boot params and everything worked fine even with latest kernel.
Comment 36 Christian Horn 2016-07-21 04:04:14 EDT
(In reply to Martin Horauer from comment #23)
> A BIOS update for my Lenovo T460s fixed this issue.

+1
Installing the currently available BIOS 1.13 on the T460s fixes the issue.  The T460s here was shipped just 2 weeks ago, but still with a BIOS from March which had the issue.
Comment 37 Fedora Update System 2016-07-21 06:17:29 EDT
microcode_ctl-2.1-13.fc24 has been submitted as an update to Fedora 24. https://bodhi.fedoraproject.org/updates/FEDORA-2016-17e40fd8da
Comment 38 Fedora Update System 2016-07-21 06:17:54 EDT
microcode_ctl-2.1-13.fc23 has been submitted as an update to Fedora 23. https://bodhi.fedoraproject.org/updates/FEDORA-2016-a596f3c268
Comment 39 Fedora Update System 2016-07-21 14:48:28 EDT
microcode_ctl-2.1-13.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-a596f3c268
Comment 40 Fedora Update System 2016-07-21 14:52:16 EDT
microcode_ctl-2.1-13.fc24 has been pushed to the Fedora 24 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-17e40fd8da
Comment 41 Andreas Tunek 2016-07-22 02:39:54 EDT
microcode_ctl-2.1-13.fc24 and Linux 4.6.4-301.fc24 work together.
Comment 42 Andreas Tunek 2016-07-22 02:40:57 EDT
On my Asus UX305CA with the old BIOS.
Comment 43 Fedora Update System 2016-07-22 14:21:24 EDT
microcode_ctl-2.1-13.fc24 has been pushed to the Fedora 24 stable repository. If problems still persist, please make note of it in this bug report.
Comment 44 imbacen 2016-07-23 06:51:11 EDT
Just to clarify, can I safely update via dnf now or do I need to apply the microcode first?
Comment 45 Edgar Hoch 2016-07-23 19:20:09 EDT
(In reply to imbacen from comment #44)
> Just to clarify, can I safely update via dnf now or do I need to apply the
> microcode first?

Yes, you can safely update via dnf.

Package microcode_ctl-2.1-13.fc24 contains no package-specific scriptlet, so an update only changes the files on disk. Microcode is only loaded at boot, so nothing changes on the CPU at package installation.

To use the new microcode, the initrd for the kernel to boot needs to be (re)created after the new microcode_ctl package is installed. This is done automatically by a newly installed kernel package, or you can do it manually using dracut (see "man dracut"). For example, for the currently running kernel, run

sudo dracut --force
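
To confirm that a rebuilt image actually carries the Intel microcode, the listing from dracut's lsinitrd tool can be searched for the early-microcode path. This is a minimal sketch and the helper name is hypothetical. It only looks for the file name `kernel/x86/microcode/GenuineIntel.bin` in whatever text it is given.

```shell
# Hypothetical helper: read an initramfs file listing on stdin and report
# whether it mentions the Intel early-microcode blob.
has_intel_ucode() {
    if grep -q 'kernel/x86/microcode/GenuineIntel\.bin'; then
        echo yes
    else
        echo no
    fi
}

# Possible use on a live system (assumes lsinitrd from the dracut package):
#   sudo lsinitrd "/boot/initramfs-$(uname -r).img" | has_intel_ucode
# Inspect the package scriptlets yourself:
#   rpm -q --scripts microcode_ctl
```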
Comment 46 Sergio Monteiro Basto 2016-07-30 14:50:38 EDT
microcode_ctl-2.1-13.fc23 does not fix my issue; it only went away after a downgrade to microcode_ctl-2.1-10.fc23.x86_64
Comment 47 Sergio Monteiro Basto 2016-08-14 22:14:57 EDT
(In reply to Sergio Monteiro Basto from comment #46)
> microcode_ctl-2.1-13.fc23 does not fixed my issue just after downgrade to
> microcode_ctl-2.1-10.fc23.x86_64

After many attempts, my case does not change with a microcode_ctl upgrade or downgrade.
Comment 48 Fedora Update System 2016-08-17 20:51:25 EDT
microcode_ctl-2.1-13.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
