Bug 1524685
Summary: | grub interprets Apple file system (APFS) wrongly and goes into interactive mode (Bad for an SNO node where an MCP might cause a reboot) | |
---|---|---|---
Product: | Red Hat Enterprise Linux 8 | Reporter: | basilicum <web02>
Component: | grub2 | Assignee: | Bootloader engineering team <bootloader-eng-team>
Status: | CLOSED WONTFIX | QA Contact: | Release Test Team <release-test-team-automation>
Severity: | high | Docs Contact: |
Priority: | medium | |
Version: | 8.8 | CC: | extras-orphan, jaredzq, jaredz, jonathan, lkundrak, martinrsssf, pjones, vcojot
Target Milestone: | rc | Keywords: | Reopened
Target Release: | --- | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | Environment: |
Last Closed: | 2023-06-15 03:03:09 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Description
basilicum
2017-12-11 20:47:13 UTC
This message is a reminder that Fedora 26 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 26. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '26'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 26 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

This bug is independent of the Fedora version, as it is in the boot loader. Obviously it can depend on the version of GRUB that ships with a given Fedora version.

*** Bug 1619042 has been marked as a duplicate of this bug. ***

This bug exists in the Fedora 27 grub2 boot loader that is installed during the anaconda install. Here is my observation on my MacBook Pro with OSX and Linux (2 hard disks, reported as 3 disks): at the grub menu, press 'c' to enter the grub command line, and type 'ls'. It lists the following:

(hd0) (hd1) (hd1,gpt3) (hd1,gpt2) (hd1,gpt1) (hd2) (hd2,gpt8) (hd2,gpt7) ...

This shows 3 disks, hd0, hd1 and hd2, even though there is no 3rd disk, and no partition is listed for hd0. All partitions show hd1 as the first disk and hd2 as the second disk. The Linux that is on the second disk (hd1,gpt6) boots fine, but OSX that is on the first disk (hd0,gpt3) does not.
When I modify the entry to boot (hd1,gpt3), it does boot OSX fine.

One more observation: I installed OSX High Sierra on the 2nd disk, and now the first disk is not detected at all, so 'ls' at the grub prompt shows (hd0) (hd0,gpt8) (hd0,gpt7).. It looks like OS X changed something that stops grub from detecting the first disk. I didn't change the partition scheme at all. Now I can only boot OSX on the 2nd disk and cannot boot OSX on the 1st disk. Linux on the second disk still boots perfectly fine.

This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.
If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.

I am re-opening this bug because it impacts GRUB on OpenShift SNO where one of the disks has a MacOS APFS partition. The consequence of this bug is that SNO nodes that should reboot without human interaction go into interactive mode at the GRUB menu and require human intervention.

I am testing SNO 4.12 (based on RHEL 8.6) on a Mac Pro x86_64 machine. The machine has 3 SSDs, which are as follows:

/dev/sda (Apple SSD)
/dev/nvme0n1 (TopoLVM - RedHat LVM storage Operator)
/dev/nvme1n1 (OCP SNO 4.12.19)

Everything works fine and I can reboot/switch from OCP to MacOS with efibootmgr:

[root@neraka ~]# efibootmgr
BootCurrent: 0000
BootOrder: 0000,0001
Boot0000* Red Hat Enterprise Linux
Boot0001* rEFInd Boot Manager
Boot0080* Mac OS X
Boot0081* Mac OS X

Here's the disk config:

[root@neraka ~]# lsblk
NAME                               MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda                                  8:0    0 931.9G  0 disk
|-sda1                               8:1    0   200M  0 part
|-sda2                               8:2    0 238.6G  0 part
`-sda3                               8:3    0   693G  0 part
sr0                                 11:0    1  1024M  0 rom
nvme0n1                            259:0    0   1.8T  0 disk
|-datavg-thin--pool--1_tmeta       253:0    0   840M  0 lvm
| `-datavg-thin--pool--1-tpool     253:2    0   1.7T  0 lvm
|   |-datavg-thin--pool--1         253:3    0   1.7T  1 lvm
|   |-datavg-af11ef7c--d568--42a3--830c--6d7e2102bd15
|   |                              253:4    0    40G  0 lvm  /var/lib/kubelet/pods/7c785320-a7a6-4765-8f5b-671d3d33baeb/volume-subpaths/pvc-ecdeed7a-78ca-
|   `-datavg-bdfefdc5--ba97--40ce--84b3--81d31a4ec402
|                                  253:5    0    10G  0 lvm  /var/lib/kubelet/pods/b56bb23f-22c3-41f3-a208-9638747ac29e/volume-subpaths/pvc-e1a0d307-92cf-
`-datavg-thin--pool--1_tdata       253:1    0   1.7T  0 lvm
  `-datavg-thin--pool--1-tpool     253:2    0   1.7T  0 lvm
    |-datavg-thin--pool--1         253:3    0   1.7T  1 lvm
    |-datavg-af11ef7c--d568--42a3--830c--6d7e2102bd15
    |                              253:4    0    40G  0 lvm  /var/lib/kubelet/pods/7c785320-a7a6-4765-8f5b-671d3d33baeb/volume-subpaths/pvc-ecdeed7a-78ca-
    `-datavg-bdfefdc5--ba97--40ce--84b3--81d31a4ec402
                                   253:5    0    10G  0 lvm  /var/lib/kubelet/pods/b56bb23f-22c3-41f3-a208-9638747ac29e/volume-subpaths/pvc-e1a0d307-92cf-
nvme1n1                            259:1    0 931.5G  0 disk
|-nvme1n1p1                        259:2    0     1M  0 part
|-nvme1n1p2                        259:3    0   127M  0 part
|-nvme1n1p3                        259:4    0   384M  0 part /boot
`-nvme1n1p4                        259:5    0   931G  0 part /sysroot

The -PROBLEM- is that unless I 'wipe' /dev/sda and MacOS, GRUB from /dev/nvme1n1p2 (OCP 4.12) barfs on the APFS partition on /dev/sda and goes into interactive mode:

error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd0'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd0'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd1'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd1'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd2'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd2'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd3'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd3'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd4'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd4'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd5'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd5'.

After that, pressing 'q' resumes normal boot and the system boots fine into OCP. I only have 3 drives in this machine, so why is GRUB complaining about hd4, hd5 and the rest?
Furthermore, if I ask GRUB to enter a command shell, I see this:

grub> ls
(proc) (hd0) (hd1) (hd2) (hd3) (hd4) (hd5) (hd6) (hd6,msdos1) (hd7) (hd7,gpt3) (hd7,gpt2) (hd7,gpt1) (hd8) (hd9) (hd9,gpt4) (hd9,gpt3) (hd9,gpt2) (hd9,gpt1)
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd0'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd0'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd1'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd1'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd2'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd2'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd3'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd3'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd164f0 from `hd4'.
error: ../../grub-core/disk/efi/efidisk.c:612: failure reading sector 0x1dd16480 from `hd4'.
--MORE--

GRUB is completely confused by the Apple APFS partition on /dev/sda. I can accept that hd6, hd7 and hd9 must be my flash drives (they show partitions), but where are hd0, hd1, hd2, hd3, hd4, hd5 and hd8 coming from?
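To make the question about the extra disks concrete, here is a minimal Python sketch that sorts the `ls` listing quoted above into disks that expose partitions and disks that do not. The function name `disks_without_partitions` is hypothetical and the parsing is only an illustration of the observation; it is not part of GRUB or of any tool mentioned in this report.

```python
import re

# The `ls` line as quoted in the comment above (spacing inside the
# parentheses varies in the transcript, so the pattern tolerates it).
LS_OUTPUT = (
    "(proc) (hd0) (hd1) (hd2) (hd3) (hd4) (hd5) (hd6) (hd6,msdos1) "
    "(hd7) (hd7, gpt3) (hd7,gpt2) (hd7, gpt1) (hd8) (hd9) (hd9, gpt4) "
    "(hd9, gpt3) (hd9, gpt2) (hd9,gpt1)"
)


def disks_without_partitions(ls_output):
    """Return the hdN devices for which GRUB listed no partition."""
    partitions = {}
    # Matches "(hd7)" as well as "(hd7,gpt3)" and "(hd7, gpt3)".
    for disk, part in re.findall(r"\((hd\d+)(?:,\s*(\w+))?\)", ls_output):
        partitions.setdefault(disk, [])
        if part:
            partitions[disk].append(part)
    return [disk for disk, parts in partitions.items() if not parts]


if __name__ == "__main__":
    print(disks_without_partitions(LS_OUTPUT))
```

Run against the listing above, this singles out the same seven devices the comment asks about, which is more than the three physical drives in the machine.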
This is what fdisk shows:

[root@neraka ~]# fdisk -l /dev/sda
Disk /dev/sda: 931.9 GiB, 1000555581440 bytes, 1954210120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 8394576E-EF09-4FD1-8BBE-4DF0182F5102

Device         Start        End    Sectors   Size Type
/dev/sda1         40     409639     409600   200M EFI System
/dev/sda2     409640  500671783  500262144 238.6G unknown
/dev/sda3  500671784 1953947935 1453276152   693G Apple HFS/HFS+

(This is MacOS 12.6.6, aka Monterey.)

At this point, if I run 'sgdisk -Z /dev/sda', then OpenShift's GRUB no longer complains and boot proceeds normally without user interaction.

Can someone please fix GRUB so that unknown data does not trick it into thinking there are many more disks than there actually are in the machine?

[root@neraka ~]# efibootmgr -v
BootCurrent: 0000
BootOrder: 0000,0001
Boot0000* Red Hat Enterprise Linux HD(2,GPT,d36bfc93-9920-4346-9c56-bd7c57bdb0bb,0x1000,0x3f800)/File(\EFI\redhat\shimx64.efi)
Boot0001* rEFInd Boot Manager PciRoot(0x0)/Pci(0x1c,0x4)/Pci(0x0,0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/Sata(0,0,0)/HD(1,GPT,403d1eeb-03d6-4a97-940b-034d7b8c5950,0x28,0x64000)/File(\EFI\refind\refind_x64.efi)..
Boot0080* Mac OS X PciRoot(0x0)/Pci(0x1c,0x4)/Pci(0x0,0x0)/Pci(0x2,0x0)/Pci(0x0,0x0)/Sata(0,0,0)/HD(2,GPT,47056646-5996-469c-b7c5-269e306d85f1,0x64028,0x1dd16500)/VenMedia(be74fcf7-0b7c-49f3-9147-01f4042e6842,f7d858ec229b9242b3ba7d0677ff2f50)/File(\D65D8AEE-85B7-4276-8E6D-2198B0B8A76E\System\Library\CoreServices\boot.efi)
Boot0081* Mac OS X Ata(0,1,0)/HD(2,GPT,9cbd5a47-e8e6-44ad-83b2-14ab83db3b2d,0x64028,0x55b7c0)

Created attachment 1970911
grub2 error from /dev/nvme1n1p2 when MacOS is present on /dev/sda
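The numbers quoted in this thread line up in a way that is easy to miss. The following Python sketch only does the arithmetic: it compares the two failing sector numbers from the GRUB errors with the length of /dev/sda2 as reported by fdisk and by the `HD(2,GPT,...,0x64028,0x1dd16500)` device path. The constant and function names are hypothetical, and the interpretation at the end is an assumption, not something the report confirms.

```python
# Values copied from the report; nothing here is measured independently.
SDA2_SECTORS = 500262144        # "Sectors" column for /dev/sda2 in fdisk -l
EFI_PART_LENGTH = 0x1dd16500    # length field of HD(2,GPT,...) for Boot0080
FAILING_SECTORS = [0x1dd164f0, 0x1dd16480]  # from the efidisk.c errors


def distance_from_end(sector, length=EFI_PART_LENGTH):
    """How many sectors before the end of a `length`-sector device is `sector`?"""
    return length - sector


if __name__ == "__main__":
    # The EFI device path and fdisk agree on the size of sda2.
    print(EFI_PART_LENGTH == SDA2_SECTORS)
    for sector in FAILING_SECTORS:
        print(hex(sector), distance_from_end(sector))
```

Both failing reads sit just short of 0x1dd16500 sectors, that is, 16 and 128 sectors before the end of something exactly as long as /dev/sda2. That would be consistent with each phantom disk being a view of the sda2 container and GRUB probing near the end of it, but this reading is an assumption and does not by itself identify the root cause.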
I've found that changing the type of /dev/sda2 (type 'unknown' but holding MacOS) makes GRUB happy (but breaks MacOS, which then becomes unbootable), so here's the (ugly) workaround I've found:

1) Take a backup copy of the GPT partition table on /dev/sda (4 sectors should be enough) using Linux. /dev/sda is the MacOS disk.

# dd if=/dev/sda of=/root/bootsect.bin count=4

2) Change the type of /dev/sda2 from 'unknown' to anything else using fdisk:

Command (m for help): p
Disk /dev/sda: 931.9 GiB, 1000555581440 bytes, 1954210120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 8394576E-EF09-4FD1-8BBE-4DF0182F5102

Device         Start        End    Sectors   Size Type
/dev/sda1         40     409639     409600   200M EFI System
/dev/sda2     409640  500671783  500262144 238.6G unknown
/dev/sda3  500671784 1953947935 1453276152   693G Apple HFS/HFS+

Command (m for help): t
Partition number (1-3, default 3): 2
Partition type (type L to list all types): 42
Changed type of partition 'unknown' to 'Apple boot'.

Command (m for help): p
Disk /dev/sda: 931.9 GiB, 1000555581440 bytes, 1954210120 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 8394576E-EF09-4FD1-8BBE-4DF0182F5102

Device         Start        End    Sectors   Size Type
/dev/sda1         40     409639     409600   200M EFI System
/dev/sda2     409640  500671783  500262144 238.6G Apple boot
/dev/sda3  500671784 1953947935 1453276152   693G Apple HFS/HFS+

Command (m for help): w
The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

This makes GRUB happy on the next OCP reboot but makes MacOS unbootable.

3) To boot MacOS, simply restore the GPT partition table on your MacOS disk from Linux and reboot:

# dd if=/root/bootsect.bin of=/dev/sda

This is quite ugly, but until GRUB is fixed it will be good enough.
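Before relying on a four-sector backup like the one in step 1, it can help to check what it actually contains. The sketch below, in Python, reads the primary GPT header and whatever partition entries fit in such a backup and prints each entry's type. It assumes 512-byte sectors and the standard GPT header layout; the function name `read_gpt_entries` and the small type table are illustrative, and the report itself never shows which type GUID fdisk reported as 'unknown'.

```python
import struct
import uuid

# Partition type GUIDs for the types named in this report, plus Apple APFS.
KNOWN_TYPES = {
    "C12A7328-F81F-11D2-BA4B-00A0C93EC93B": "EFI System",
    "7C3457EF-0000-11AA-AA11-00306543ECAC": "Apple APFS",
    "48465300-0000-11AA-AA11-00306543ECAC": "Apple HFS/HFS+",
    "426F6F74-0000-11AA-AA11-00306543ECAC": "Apple boot",
}
EMPTY_GUID = "00000000-0000-0000-0000-000000000000"


def read_gpt_entries(data, sector_size=512):
    """List (number, type, first LBA, last LBA) for entries present in `data`."""
    header = data[sector_size:2 * sector_size]
    if header[:8] != b"EFI PART":
        raise ValueError("no GPT header signature at LBA 1")
    (entries_lba,) = struct.unpack_from("<Q", header, 72)
    count, entry_size = struct.unpack_from("<II", header, 80)
    if entry_size < 48:
        raise ValueError("implausible partition entry size")
    found = []
    for index in range(count):
        offset = entries_lba * sector_size + index * entry_size
        raw = data[offset:offset + entry_size]
        if len(raw) < entry_size:
            break  # the backup is shorter than the full entry array
        type_guid = str(uuid.UUID(bytes_le=raw[:16])).upper()
        if type_guid == EMPTY_GUID:
            continue  # unused slot
        first_lba, last_lba = struct.unpack_from("<QQ", raw, 32)
        name = KNOWN_TYPES.get(type_guid, type_guid)
        found.append((index + 1, name, first_lba, last_lba))
    return found
```

A possible use, assuming the backup file from step 1 exists, is `read_gpt_entries(open("/root/bootsect.bin", "rb").read())`. With the usual 128-byte entries starting at LBA 2, four sectors hold the protective MBR, the primary header and the first eight entries, so the three partitions shown by fdisk are covered. The backup GPT at the end of the disk is not part of such a copy.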
This is not supported hardware for RHEL. If you can reproduce this on a supported RHEL configuration, feel free to reopen.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days.