Bug 1413191 - Updating Fedora 25 breaks UEFI - MOK - drive can't boot
Status: NEW
Product: Fedora
Classification: Fedora
Component: shim
Version: 25
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Assigned To: Matthew Garrett
QA Contact: Fedora Extras Quality Assurance
 
Reported: 2017-01-13 16:14 EST by tux4me
Modified: 2017-06-14 21:36 EDT
CC: 14 users

Type: Bug

Attachments: None
Description tux4me 2017-01-13 16:14:56 EST
Description of problem:

Updating a clean install of Fedora 25 Workstation, with Secure Boot enabled the whole time, breaks the shim setup.
I get a MOK screen, where I have tried all of the .efi files on the EFI device.
The system never boots at all; I only get the MOK screen and then the BIOS.

dosfsck says the .efi files are damaged; I let it try to fix them and rebooted, without luck.

In the MOK screen, loading grubx64.efi reported a "corrupted volume".
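
For anyone wanting to run the same check, a minimal sketch of verifying the EFI System Partition with dosfsck; /dev/nvme0n1p1 is an assumed device name and must be replaced with the ESP that findmnt reports.

  # Find the EFI System Partition (mounted at /boot/efi on a default install)
  findmnt /boot/efi

  # Read-only check of the ESP; /dev/nvme0n1p1 is an assumption, use the
  # device that findmnt reported above
  sudo umount /boot/efi
  sudo dosfsck -n /dev/nvme0n1p1

  # Drop -n (or use -r) to let dosfsck attempt repairs, then remount
  sudo mount /boot/efi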

Version-Release number of selected component (if applicable):



How reproducible:

I did a clean install, rebooted several times, and then ran
"sudo dnf update"; approximately 1000 packages were installed, including a new kernel,
without problems.
Then I rebooted and also powered off and on, and still got the MOK -> BIOS screens.

I went through all of this twice, with exactly the same problem.
The installs went fine, but the update broke it all.

A third attempt, without Secure Boot, completed without any problems.

Steps to Reproduce:
1. Install Fedora 25 with Secure Boot enabled
2. sudo dnf update
3. reboot

Actual results:

Dead computer: the system never reaches the OS, only the MOK screen and then the BIOS.

Expected results:

An alive and kicking computer: the system still boots normally after the update.

Additional info:

Using an M.2 Intel Pro 6000p 128 GB as the boot drive.

Motherboard: Asus Z170I PRO GAMING WIFI (Mini ITX)
Comment 1 Doug Alfonso 2017-02-23 23:01:11 EST
I am seeing the same results with an Intel NUC6i5SYK and an Intel SSD 600p 256 GB M.2 PCIe as the boot drive.

Steps to reproduce:

1. Install Fedora 25 from live USB stick.
2. Run the Software Updater and let it "Reboot and Install" all of the packages it deems in need of update.
3. Wait for the update to finish and reboot again automatically at end of update installation.

Result:

It always boots into MOK management after the updates are installed. Selecting to continue the boot at any point after that (either directly or after trying to enroll keys/hashes) returns a message that no boot device can be found. This happens whether Secure Boot is enabled or disabled in the UEFI BIOS.


If I stop at step 1 and never run the software update, I can shut down and reboot as often as I like without this happening.
Comment 2 Phil Rak 2017-04-02 12:21:03 EDT
Getting the same results on a Dell Precision 5510 with an Intel SSD 600p as well. I follow the same steps to reproduce. I can successfully reboot and install software before running dnf update. After dnf update, it boots to the MOK manager.
Comment 3 Maksim Zubkov 2017-04-10 16:01:38 EDT
Getting the same results on an Acer Swift 3 (Intel SSDPEKKW256G7).
After the install everything is OK. After dnf update the system goes to MokManager.
Comment 4 Gregory 2017-04-18 09:52:37 EDT
I seem to be running into the same issue.

System: Intel i5-7600 + ASRock H270ac/n motherboard + Intel 6-series NVMe SSD.

Installed Fedora with default partition settings.

Original installation uses grub* packages with fc24 suffix (grub-efi, grub, grub-tools). These install and work properly.

Updating these packages to version 2.02-0.38.fc25 breaks the system.
The boot process reports a corrupt \BOOT\EFI\fedora\grubx64.efi file.

Current workaround: running dnf upgrade -x 'grub*' to exclude these packages from the upgrade (see the sketch below).
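
For reference, a minimal sketch of that workaround; it assumes the 'grub*' glob from comment 4 matches the packages to hold back, which is worth confirming with dnf list installed 'grub*' first.

  # One-off: upgrade everything except the grub packages (comment 4's workaround)
  sudo dnf upgrade -x 'grub*'

  # Optionally make the exclusion persistent until the bug is fixed by adding
  # an exclude line to the [main] section of /etc/dnf/dnf.conf:
  #   exclude=grub*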
Comment 5 Jarkko Torvinen 2017-04-27 13:08:20 EDT
Had the same problem with an Intel 600p NVMe SSD; switched to a Samsung 960 EVO and it worked fine.
Comment 6 Gregory 2017-05-06 12:26:26 EDT
In the case of the Intel SSD, this ticket tracks a filesystem corruption issue caused by a bug in the firmware: https://bugzilla.redhat.com/show_bug.cgi?id=1402533
Comment 7 Lukas Zapletal 2017-06-07 08:24:31 EDT
Hit hard by this on my custom Ryzen 1700 build with an Intel 600p 256 GB NVMe. My EFI partition is corrupt and boot loading randomly crashes and misbehaves (not booting, black screens); this was hard to track down. Terrible experience, thumbs down, Intel. /CCed
Comment 8 songmn 2017-06-09 11:32:22 EDT
I know that the new 121 FW for the 600p has resolved the following issue for me:
https://bugzilla.redhat.com/show_bug.cgi?id=1402533

Have you all tried this new FW for this issue? Does the issue still occur? I have not seen this issue myself, so I am just wondering.

https://downloadcenter.intel.com/download/26491?v=t
Comment 9 Scott Bauer 2017-06-09 13:21:52 EDT
Can everyone that is having trouble confirm the FW revision they're running?


You can do so by grabbing a copy of nvme-cli (it's in the Fedora packages) or from https://github.com/linux-nvme/nvme-cli

and running (as root) `nvme list` and pasting the output here. I ask because I am currently testing Fedora/Debian installs on 121 and I can't reproduce it. The issues everyone is having seem similar to bug 1402533, so I want to confirm we're all operating on the same firmware.
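
For anyone unsure how to check, a minimal sketch; /dev/nvme0 is an assumed device name, and the firmware revision appears in the FW Rev column of nvme list.

  # Install the tooling
  sudo dnf install nvme-cli

  # List NVMe devices; the FW Rev column is the firmware revision being asked for
  sudo nvme list

  # Alternatively, query the controller directly; /dev/nvme0 is an assumption
  sudo nvme id-ctrl /dev/nvme0 | grep '^fr '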
Comment 10 Geoffrey Mills 2017-06-11 05:26:45 EDT
The original bug 1402533 still exists in the latest 121C firmware; my CentOS VMs corrupted their filesystems within several days of applying the 121C FW.

FWIW, running ESXi 6.0 U3 on an Intel 600p, all CentOS VMs corrupt the superblock within a week. The Windows VMs work fine though.
I raised this issue on the Intel community forums several months ago and then found the Red Hat bug report. I can't help thinking this is the same issue.
Comment 11 Scott Bauer 2017-06-12 12:22:45 EDT
Geoffrey,

What's your VM setup? I'm looking to try to reproduce the issue. I'm going to set up what you have and see if I can get the corruption.
Comment 12 Geoffrey Mills 2017-06-13 01:34:17 EDT
Scott,
Nothing special: a Supermicro X10SDV-TLN4F with 64 GB of Supermicro ECC RAM running VMware ESXi 6.0 U3. It has multiple drives attached: an Intel NVMe 600p 512 GB, a Crucial MX300 SSD, and a Western Digital Red 1 TB SATA HDD. I am using CentOS 6 and 7 x64 VMs. In February I moved the CentOS 7 x64 VMs onto the Intel 600p and they ran very well, but within 7 days it started corrupting them, so they have been moved back to the Crucial SSD and have been running OK since then. Interestingly, the sole Windows 10 VM continues to run OK on the Intel 600p.
FWIW, ESXi boots from the Intel 600p and runs well. FWIW I also installed CentOS plus Docker in February and it died the same way. The VMs are configured with the VMware paravirtual or LSI controller; it makes no difference.

I think the 121C firmware is improved; the new CentOS VM I made last week is still running. But the CentOS VM running Graylog died within 4 days.
Note that you need to reboot the VMs to test whether the filesystem is corrupt (see the sketch below).
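
As one concrete way to run that check, a minimal sketch of a read-only filesystem check inside a CentOS VM; /dev/sda1 is an assumed device name, and whether the VM uses ext4 or XFS is also an assumption.

  # Read-only checks; /dev/sda1 is an assumption, use the VM's actual device.
  # A mounted root filesystem needs a rescue boot, or fsck.mode=force on the
  # kernel command line, before it can be checked safely.
  e2fsck -fn /dev/sda1        # ext2/3/4
  xfs_repair -n /dev/sda1     # XFS (the CentOS 7 default), dry-run only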
Comment 13 Randy 2017-06-14 21:36:11 EDT
Still happens with a Dell Vostro 3700
