Bug 1393846
Summary: | no Fedora boot menu in Mac OS X dual boot install
---|---
Product: | [Fedora] Fedora
Reporter: | Kamil Páral <kparal>
Component: | python-blivet
Assignee: | Blivet Maintenance Team <blivet-maint-list>
Status: | CLOSED ERRATA
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Severity: | unspecified
Docs Contact: |
Priority: | unspecified
Version: | 25
CC: | abdel.g.martinez.l, awilliam, blivet-maint-list, bugzilla, jan.public, jones.peter.busi, mcatanzaro+wrong-account-do-not-cc, mjg59, pbrobinson, pjones, pschindl, randy, robatino, sgallagh
Target Milestone: | ---
Target Release: | ---
Hardware: | Unspecified
OS: | Unspecified
Whiteboard: | AcceptedBlocker
Fixed In Version: | python-blivet-2.1.6-4.fc25
Doc Type: | If docs needed, set a value
Doc Text: |
Story Points: | ---
Clone Of: |
Environment: |
Last Closed: | 2016-11-18 08:24:01 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
CRM: |
Verified Versions: |
Category: | ---
oVirt Team: | ---
RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | ---
Target Upstream Version: |
Embargoed: |
Bug Depends On: |
Bug Blocks: | 1277289
Attachments: |
Description
Kamil Páral
2016-11-10 12:50:11 UTC
Created attachment 1219380 [details]
anaconda.log
Created attachment 1219381 [details]
journal.log
Created attachment 1219382 [details]
program.log
Created attachment 1219383 [details]
storage.log
Here's the current disk layout. EFI files seem to be present on both sda1 (created by OS X) and sda4 (created by anaconda):

NAME | KNAME | MAJ:MIN | FSTYPE | LABEL | UUID | PARTTYPE | PARTLABEL | PARTUUID | SIZE
---|---|---|---|---|---|---|---|---|---
sda | sda | 8:0 | | | | | | | 465.8G
├─sda4 | sda4 | 8:4 | ext4 | | 49e3c90e-1b3a-4683-b524-f5c1fe4799e5 | 0fc63daf-8483-4772-8e79-3d69d8477de4 | | 02929385-0ba4-4ef5-9cdc-346d57644266 | 1G
├─sda2 | sda2 | 8:2 | hfsplus | Macintosh HD | 22d105d1-8ca7-3db7-9344-c049f6395d31 | 48465300-0000-11aa-aa11-00306543ecac | Linux HFS+ ESP | 00002421-6ff9-0000-8653-0000d3570000 | 201.6G
├─sda5 | sda5 | 8:5 | LVM2_member | | oNYsn8-I6rz-UFKi-nugM-Kh93-gF7S-yGFS8B | e6d6d379-f507-44c2-a23c-238f2a3df928 | | c6adf901-a85f-4fc2-bdeb-d7403e9ab38f | 262.4G
│ ├─fedora-home | dm-1 | 253:1 | ext4 | | 068c289e-bb9a-47f1-a39b-f642fae45069 | | | | 208.6G
│ ├─fedora-root | dm-2 | 253:2 | ext4 | | 3b3727d1-f827-4171-b1bf-789a500caa2d | | | | 50G
│ └─fedora-swap | dm-0 | 253:0 | swap | | 75189c2d-6785-45b1-aff1-1b6cb1348c4e | | | | 3.9G
├─sda3 | sda3 | 8:3 | hfsplus | Recovery HD | 2184e460-197b-373b-8a3f-082bc91e7981 | 426f6f74-0000-11aa-aa11-00306543ecac | Recovery HD | cb168b2b-fe91-4a39-a2d3-56ab52c63ad5 | 619.9M
└─sda1 | sda1 | 8:1 | vfat | EFI | 2860-11F4 | c12a7328-f81f-11d2-ba4b-00a0c93ec93b | EFI system partition | 0000758e-50ea-0000-cd18-00006d180000 | 200M

Created attachment 1219384 [details]
files present on sda1
Created attachment 1219385 [details]
files present on sda4
Proposing as a blocker: "The installer must be able to install into free space alongside an existing OS X installation, install and configure a bootloader that will boot Fedora." https://fedoraproject.org/wiki/Fedora_25_Final_Release_Criteria#OS_X_dual_boot

It would be nice if somebody could try this with a more recent version of OS X.

Chris, any chance you can try this?

I'd be inclined to vote not a blocker as long as the OS X boot isn't destroyed. It's unfortunate, certainly. But it's not wrecking their existing system.

I'm only willing to go with that if we either agree to remove the release criterion or get further testing that indicates this is system-specific.

I don't really like that we have a specific criterion for one set of hardware that is actively hostile to Linux, honestly.

I forgot to mention it was a Mac Mini I tested with. Fairly old, as you can tell by the OS X version (bought during 2011-2012, most probably).

https://lists.fedoraproject.org/pipermail/test/2014-August/122496.html was the proposal and discussion for the criterion.

I can't test this, I'm traveling without a Mac.

From the attached log:

13:12:57,814 INFO program: rsync: failed to set times on "/mnt/sysimage/boot/efi": Read-only file system (30)

For whatever reason, the file system is being mounted read-only. Also, there is no mkfs.hfsplus call in the log, which means the installer is using a pre-existing hfsplus volume. Can you do something like 'dmesg | grep hfs' and let's see if it's finding a journal? The kernel doesn't have hfsplus journal support, so by default it will mount hfsplus volumes that have a journal read-only. For a very long time now, Mac OS has only created hfsplus volumes with journals. The installer mkfs command is supposed to create one without a journal, so that the default mount is read-write.
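To check for the read-only mount described above, something like the following sketch could be run against the contents of /proc/mounts from the installer environment. The function name and the sample text are illustrative assumptions, not taken from the attached logs; it only looks for hfsplus entries whose mount options include `ro`.

```python
def readonly_hfsplus_mounts(mounts_text):
    """Return (device, mountpoint) pairs for hfsplus filesystems mounted read-only.

    mounts_text is expected to be in /proc/mounts format:
    device mountpoint fstype options dump pass
    """
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        device, mountpoint, fstype, options = fields[:4]
        if fstype == "hfsplus" and "ro" in options.split(","):
            hits.append((device, mountpoint))
    return hits


# Made-up sample in /proc/mounts format, for illustration only.
SAMPLE = """\
/dev/sda2 /mnt/sysimage/boot/efi hfsplus ro,relatime,umask=22 0 0
/dev/mapper/fedora-root /mnt/sysimage ext4 rw,relatime 0 0
"""

for device, mountpoint in readonly_hfsplus_mounts(SAMPLE):
    print(f"{device} is mounted read-only at {mountpoint}")
```

On a live system the same function could be fed `open("/proc/mounts").read()`. A read-only hfsplus entry at /boot/efi would be consistent with the rsync error in the log, but it does not by itself prove a journal is the cause.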
I'd say it's not entirely clear it's a blocker because the requirement is that we're installing into free space, which could be interpreted as meaning reuse of an existing hfsplus ESP isn't supported - even though that's what the installer does by default. "The installer must be able to install into free space alongside an existing OS X installation, install and configure a bootloader that will boot Fedora."

(In reply to Stephen Gallagher from comment #12)
> I don't really like that we have specific criterion for one set of hardware
> that is actively hostile to Linux, honestly.

This logic doesn't work for me. Everything is actively hostile to everything else. Fedora is actively hostile to even its own installations, e.g. bug 825236. So if this is going to be a metric for wiping away the Mac criteria, fine, then wipe away the Windows criteria too; and continue to put our head in the sand when the Fedora installer, without asking or informing the user, obliterates the ability to boot previously installed Linux installations.

So I *think* I know what the actual bug is here. It was introduced by this commit, and specifically relates to the highlighted line:

https://github.com/rhinstaller/blivet/commit/368a4db6141c7fdcb31ed45fe6be207ccc08ad30#diff-c0cef2bf2f989e2f94b5d1cca9c8115eL1112

that commit changed up how we do format type detection. In the previous code it was in one giant handleUdevDeviceFormat() function, whose conditions for marking a device as 'macefi' were:

    elif format_type == "hfsplus":
        if isinstance(device, PartitionDevice):
            macefi = formats.getFormat("macefi")
            if macefi.minSize <= device.size <= macefi.maxSize and \
               device.partedPartition.name == macefi.name:
                format_designator = "macefi"

that is, if it's got an HFS+ filesystem and it's bigger than 50MiB (the macefi format has no maxSize) and the partition's GPT 'name' is "Linux HFS+ ESP" (that's the MacEFIFS format's 'name'), then we decide it's a macefi device.
In the new code, we use the 'populator helpers' approach. Code is:

https://github.com/rhinstaller/blivet/blob/2.1-devel/blivet/populator/helpers/boot.py#L53-L55

MacEFIFormatPopulator inherits from BootFormatPopulator, and specifies a _type_specifier and _base_type_specifier for the parent classmethod match(), which is what actually decides if a format populator 'matches' and hence we decide the device is of that format. Note the match() logic:

    return (udev.device_get_format(data) == cls._base_type_specifier and
            isinstance(device, PartitionDevice) and
            (device.bootable or not cls._bootable) and
            fmt.min_size <= device.size <= fmt.max_size)

and note that the MacEFIFormatPopulator class does *not* set _bootable to True. So that means we've still got the 'is it HFS+' and 'is it in the valid size range' conditions, but that's all. Importantly, we've lost the condition about the GPT 'name' of the partition from the old code.

If I'm right, this means blivet will decide absolutely *any* HFS+ partition it sees that's larger than 50MiB is a 'macefi' partition, try to use it as /boot/efi for the install, and go badly wrong.

We're actually very lucky that HFS+ partitions with journals only mount read-only, otherwise we'd be copying files and stuff to people's OS X partitions when they try to install. We'd better hope to God no-one tries an install on a system with an HFS+ partition that happens to have journalling disabled.

(In reply to Adam Williamson from comment #17)
> So I *think* I know what the actual bug is here. It was introduced by this
> commit, and specifically relates to the highlighted line:
>
> https://github.com/rhinstaller/blivet/commit/
> 368a4db6141c7fdcb31ed45fe6be207ccc08ad30#diff-
> c0cef2bf2f989e2f94b5d1cca9c8115eL1112
>
> that commit changed up how we do format type detection.

Why is something this significant changing between beta and final?
> We're actually very lucky that HFS+ partitions with journals only mount
> read-only, otherwise we'd be copying files and stuff to people's OS X
> partitions when they try to install. We'd better hope to God no-one tries an
> install on a system with an HFS+ partition that happens to have journalling
> disabled.

mactel-boot writes a bunch of identically named dummy files to trick the firmware into seeing the ESP as if it's actually a macOS installation. It'll overwrite things like the kernel, for example. This seems bad, but journaled HFS Plus is mandatory for a macOS boot volume; there has been no way to create a non-journaled HFS Plus volume via the GUI for many years now, although the journal can be removed via the CLI. I can't think of a reason why anyone would do this. So it'd be a rare case indeed. Nevertheless it would be a data loss bug, however hypothetical - but then we have a real macOS data loss bug that isn't hypothetical, and in that case upstream blames the user for it, so I'm just going to shrug at the concern in this bug.

"Why is something this significant changing between beta and final?"

It wasn't. It changed between 24 and 25. No-one tested OS X dual boot at all until today, it's been broken since whenever blivet 2.x landed, I think.

(In reply to Adam Williamson from comment #19)
> "Why is something this significant changing between beta and final?"
>
> It wasn't. It changed between 24 and 25. No-one tested OS X dual boot at all
> until today, it's been broken since whenever blivet 2.x landed, I think.

I've done multiple installations of Fedora 25 alongside OS X into free space and never ran into this bug, although I can't tell right now what compose I last tried because the evidence is 1000 miles away.

Huhh. Well. That's interesting, and makes me doubt my assessment of the bug a bit. But it's really hard to tell without logs.
When you do have time, could you please see if you can reproduce the bug with RC-1.2 and also see when's the last time it worked, so we can triage? And post some logs, and stuff? You know the drill...

Also, if that's the case, why didn't you file results in the matrix? I checked, and testcase_stats shows zero runs of the OSX dual boot test before RC-1.2.

(In reply to Adam Williamson from comment #21)
> Huhh. Well. That's interesting, and makes me doubt my assessment of the bug
> a bit. But it's really hard to tell without logs. When you do have time,
> could you please see if you can reproduce the bug with RC-1.2 and also see
> when's the last time it worked, so we can triage? And post some logs, and
> stuff? You know the drill...

I probably won't have a Mac to test on for two or three weeks.

> Also, if that's the case, why didn't you file results in the matrix? I
> checked, and testcase_stats shows zero runs of the OSX dual boot test before
> RC-1.2.

Dunno. Maybe I was being lazy, or possibly I tested with a nightly that wasn't a current test.

https://www.happyassassin.net/updates/1393846.0.img is my attempt to fix this, assuming I'm right about the problem. PR is https://github.com/rhinstaller/blivet/pull/523

When you did your attempts, were you starting from a clean OS X install with no previous Fedora install alongside it? Or was there an existing Fedora install?

Additionally, a couple of images to hopefully make testing this possible on non-Macs, if you fake up a partition layout:

https://www.happyassassin.net/updates/1393846.0.fakemac-nofix.img
https://www.happyassassin.net/updates/1393846.0.fakemac-fix.img

They both make blivet always decide the system is a Mactel (Intel Mac). The former just does that, otherwise it matches RC-1.2. The latter also includes my patch.

OK, so yeah, I did my best to test this with a 'fake Mac' setup.
In a UEFI VM, I completely wiped the hard disk, then did:

fdisk /dev/vda
g # new gpt label
n # new partition
1 # number
<ret> # default first sector
+200M # size
t # set type
1 # EFI system partition
n # new partition
2 # number
<ret> # default first sector
+2G # size
t # set type
2 # partition number (now there's more than one)
38 # Apple HFS/HFS+
w # write
q # quit

Then I did:

mkfs.vfat /dev/vda1
mkfs.hfsplus /dev/vda2

The idea being to fake up something like a macOS install: a regular EFI system partition, then a biggish HFS+ partition to represent the macOS partition.

I booted my fakemac-nofix.img and verified that it detected vda2 as a 'macefi' partition, and ran the install process and indeed it mounted it as /boot/efi and tried to write to it. In fact it succeeded - I guess because I created the filesystem with 'mkfs.hfsplus' it didn't have journalling, so it was mounted read-write.

Then I repeated the entire setup process, and booted my fakemac-fix.img image. Now it detects vda2 as 'hfsplus'. I ran the install process, and this time it left vda2 alone and created a new 200MB hfsplus vda3 and mounted *that* as /boot/efi and wrote to it.

After the install process completed, I tried booting the nofix image and checking the detection again: it detects both HFS+ partitions as macefi. Then I did the same with the fix image: it detects vda2 as hfsplus and vda3 as macefi, as expected.

So, that backs up both my theory and my patch, so far as I can manage.

Changing component to the identified cause.
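The detection difference seen in the fake-Mac test can be illustrated with a small standalone sketch. This is not blivet code: the `Partition` class and both functions are hypothetical stand-ins that only model the conditions discussed in this bug (HFS+ filesystem, the 50MiB minimum size, and the GPT partition name "Linux HFS+ ESP"). The partition names in the layout are assumptions; the actual change in the PR may be implemented differently.

```python
from dataclasses import dataclass

MIN_SIZE_MIB = 50               # macefi minimum size mentioned in the report
MACEFI_NAME = "Linux HFS+ ESP"  # GPT partition name the old code checked for


@dataclass
class Partition:
    """Hypothetical stand-in for a detected partition (not a blivet class)."""
    device: str
    fstype: str
    size_mib: float
    gpt_name: str = ""


def detect_without_name_check(part):
    """Model of the regressed rule: filesystem type and size only."""
    if part.fstype == "hfsplus" and part.size_mib >= MIN_SIZE_MIB:
        return "macefi"
    return part.fstype


def detect_with_name_check(part):
    """Model of the old rule: also require the GPT partition name."""
    if (part.fstype == "hfsplus"
            and part.size_mib >= MIN_SIZE_MIB
            and part.gpt_name == MACEFI_NAME):
        return "macefi"
    return part.fstype


# Layout resembling the fake-Mac VM after a successful install
# (GPT names here are assumed for illustration).
layout = [
    Partition("vda1", "vfat", 200),
    Partition("vda2", "hfsplus", 2048),              # stands in for macOS
    Partition("vda3", "hfsplus", 200, MACEFI_NAME),  # created by the installer
]

for part in layout:
    print(part.device,
          detect_without_name_check(part),
          detect_with_name_check(part))
```

Without the name check both HFS+ partitions come out as macefi; with it, only the partition named "Linux HFS+ ESP" does, which matches the behaviour reported for the nofix and fix images.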
The original Mac I reported this from has this parted output:

Model: ATA TOSHIBA MK5065GS (scsi)
Disk /dev/sda: 500GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number | Start | End | Size | File system | Name | Flags
---|---|---|---|---|---|---
1 | 20.5kB | 210MB | 210MB | fat32 | EFI system partition | boot, esp
2 | 210MB | 217GB | 216GB | hfs+ | Linux HFS+ ESP |
3 | 217GB | 217GB | 650MB | hfs+ | Recovery HD |
4 | 217GB | 218GB | 1074MB | ext4 | |
5 | 218GB | 500GB | 282GB | | | lvm

So the theories about "boot" flag placed on MacOS main partition were not correct, it seems.

(In reply to Adam Williamson from comment #23)
> https://www.happyassassin.net/updates/1393846.0.img is my attempt to fix this,

It fixed the problem on my Mac Mini. Installation now succeeds. The final layout is this:

Number | Start | End | Size | File system | Name | Flags
---|---|---|---|---|---|---
1 | 20.5kB | 210MB | 210MB | fat32 | EFI system partition | boot, esp
2 | 210MB | 283GB | 282GB | hfs+ | Customer |
3 | 283GB | 283GB | 650MB | hfs+ | Recovery HD |
4 | 283GB | 283GB | 210MB | hfs+ | Linux HFS+ ESP |
5 | 283GB | 285GB | 1074MB | ext4 | |
6 | 285GB | 500GB | 216GB | | | lvm

I'm surprised it creates another ESP as sda4 with hfs+, when ESP already existed as sda1 with fat32. But it works. The only problem is that MacOS can't be booted from grub (errors printed), I have to use one-time boot menu to boot into MacOS.

Discussed at mini blocker review during go/no-go meeting [1]. This bug was accepted as Final Blocker - by adamw's analysis of the cause, this seems like a clear violation of "The installer must be able to install into free space alongside an existing OS X installation, install and configure a bootloader that will boot Fedora."
that will affect all dual-boot OS X installs

[1] https://meetbot.fedoraproject.org/fedora-meeting-2/2016-11-10/f25-final-gono-go-meeting.2016-11-10-17.00.html

"So the theories about "boot" flag placed on MacOS main partition were not correct, it seems"

Yeah, I realized that in the middle of the discussion: it was just a mis-read on my part, I was looking in the wrong class :)

"I'm surprised it creates another ESP as sda4 with hfs+, when ESP already existed as sda1 with fat32."

This is actually intended: it's the trick we use to get the Apple firmware to show Fedora in the graphical boot menu. The fact that we label the partition as an 'ESP' is a bit misleading, but I think we have to do that for the Linux tools to work with it. It's not really an ESP at all. The firmware doesn't boot from it. What it is, basically, is a fake macOS partition.

As I understand it (thanks pjones), the Apple firmware has this wacky setup where it looks for HFS+ partitions whose filesystems include the expected directories for an ESP and have some key files in the places where macOS puts them, then it takes the relevant files from them and 'blesses' them into the actual ESP. If we just install our files directly into the *real* ESP, then you can boot Fedora somehow or other (press a special key on boot or something), but you won't see it in the nice graphical boot menu.

So we're basically being great politicians, lying out of both sides of our mouths: we lie to the Linux tools that the partition is an ESP (by marking it as such and mounting it at /boot/efi), and we lie to the firmware that it's a macOS partition (by formatting it as HFS+ and dumping files into all the special locations that it looks for). And by doing that, it works. Isn't technology great?

That's my outsider's understanding of it, btw, I might have it slightly garbled, but that's basically it. It's all as designed.
One thing that would be great if you could check - can you try installing again over the top of the successful install, again with my fix? When you do that, it should re-use the existing fake-ESP, not create another new one.

(In reply to Adam Williamson from comment #30)
> One thing that would be great if you could check - can you try installing
> again over the top of the successful install, again with my fix? When you do
> that, it should re-use the existing fake-ESP, not create another new one.

Jan Sedlak tried it and it re-used that partition (sda4, from comment 28), and did not create a new one.

python-blivet-2.1.6-4.fc25 has been submitted as an update to Fedora 25. https://bodhi.fedoraproject.org/updates/FEDORA-2016-80f65e5670

Can you please re-confirm this with RC-1.3? Thanks. (Most importantly from a state *without* Fedora already installed, just OS X, but also testing install over existing Fedora alongside OS X is handy).

Installed F25 alongside a default Mac OS X installation, worked fine.

I installed F25 in a dual-boot schema along with macOS Sierra (MacBook Pro Retina, 15-inch, Mid 2014) and it worked fine.

python-blivet-2.1.6-4.fc25 has been pushed to the Fedora 25 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-80f65e5670

Setting back to verified. Can people also karma the update, if the fix is good and it still works fine in other ways? Thanks.

python-blivet-2.1.6-4.fc25 has been pushed to the Fedora 25 stable repository. If problems still persist, please make note of it in this bug report.