Bug 2137803 - latest f36 kernels fail to boot on macOS hosts
Summary: latest f36 kernels fail to boot on macOS hosts
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-10-26 09:26 UTC by Christophe Fergeau
Modified: 2022-11-09 15:42 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments
logs for the amd64 crash (42.30 KB, text/plain), 2022-10-26 11:33 UTC, Christophe Fergeau
logs for the arm64 crash (22.76 KB, text/plain), 2022-10-26 11:34 UTC, Christophe Fergeau

Description Christophe Fergeau 2022-10-26 09:26:19 UTC
1. Please describe the problem:

I'm running f36 virtual machines on macOS using Apple's virtualization framework ( https://developer.apple.com/documentation/virtualization ).
Everything was working fine both on x86_64 and M1 macbooks until recently.
Starting with kernel 5.19 on aarch64 (M1), and kernel 6.0 on x86_64, my VMs fail to start very early in the boot process.

And unfortunately, this is about all the information I can provide in this bug report. With the virtualization framework, I only start getting logs fairly late in the boot process (around the time udev enumerates devices and loads modules), and the VM dies long before this. The only information I'm getting from Apple's APIs is "vz.VirtualizationError" with no additional logs.

I'll also open a ticket with Apple, but maybe there are recent kernel/build changes which could explain this sudden failure?

Fwiw, the same kernel boots fine using qemu + Apple's hypervisor framework (which is different - lower level - from Apple's virtualization framework).

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Reproducing the issue requires a MacBook. You can then get a kernel/initrd from a Fedora VM, and use https://github.com/evansm7/vftool to try to start a VM using this kernel/initrd (preferably on x86_64; on an M1 you need an uncompressed kernel).


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

I tried vmlinuz-6.1.0-0.rc2.21.fc38.x86_64 and the failure also happens with this kernel version.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

I'm not using any additional modules

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

As explained above, I unfortunately could not get any logs :-/

Comment 1 Christophe Fergeau 2022-10-26 11:31:13 UTC
I managed to find some logs in the macOS Console application.
The failure on Intel macbooks is:

Exception Type:                EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes:              0x0000000000000001, 0x0000000000000000
Exception Note:                EXC_CORPSE_NOTIFY

Termination Signal:        Illegal instruction: 4
Termination Reason:        Namespace SIGNAL, Code 0x4
Terminating Process:      exc handler [2265]

Thread 5 crashed with X86 Thread State (64-bit):
    rax: 0x0000000000000000    rbx: 0x000070000044a758    rcx: 0x0000000000000000    rdx: 0x0000000000000000
    rdi: 0x0000000000000000    rsi: 0x000000001f08000c    rbp: 0x000070000044a670    rsp: 0x000070000044a660
      r8: 0x0000000000000000      r9: 0x0000000000000000    r10: 0x0000000000000000    r11: 0x0000000000000000
    r12: 0x0000000000000002    r13: 0x00007f853480e000    r14: 0x00007f8533f06e70    r15: 0x000000000000ffff
    rip: 0x0000000107bf5fa7    rfl: 0x0000000000010206    cr2: 0x0000000109c3f000
    
Logical CPU:          2
Error Code:            0x00000000
Trap Number:          6

Thread 5 instruction stream:
    01 74 09 48 8b 7d f0 e8-a4 04 00 00 48 89 df e8    .t.H.}......H...
    40 03 00 00 0f 0b 66 2e-0f 1f 84 00 00 00 00 00    @.....f.........
    90 90 90 90 55 48 89 e5-41 56 53 48 89 fb 0f b6    ....UH..AVSH....
    17 4c 8d 77 01 f6 c2 01-74 0a 48 8b 73 10 48 8b    .L.w....t.H.s.H.
    53 08 eb 06 48 d1 ea 4c-89 f6 48 8b 3d 9f 80 01    S...H..L..H.=...
    00 e8 3a fe f5 ff f6 03-01 74 04 4c 8b 73 10 4c    ..:......t.L.s.L
    89 f7 e8 a9 04 00 00[0f]0b 55 48 89 e5 53 48 81    .........UH..SH.	<==
    ec 08 01 00 00 49 89 fa-84 c0 74 2c 0f 29 85 40    .....I....t,.).@
    ff ff ff 0f 29 8d 50 ff-ff ff 0f 29 95 60 ff ff    ....).P....).`..
    ff 0f 29 9d 70 ff ff ff-0f 29 65 80 0f 29 6d 90    ..).p....)e..)m.
    0f 29 75 a0 0f 29 7d b0-48 8d 85 10 ff ff ff 48    .)u..)}.H......H
    89 70 08 48 89 50 10 48-89 48 18 4c 89 40 20 4c    .p.H.P.H.H.L.@ L
    
Thread 5 last branch register state not available.
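
A note on reading the instruction stream above: the bracketed byte appears to mark the faulting address (this is an assumption about the crash report layout, not something the report states). The minimal sketch below extracts the bytes at that marker and checks whether they are `0f 0b`, the x86 UD2 opcode, which always raises an invalid-opcode exception (vector 6). That would be consistent with "Trap Number: 6" and EXC_BAD_INSTRUCTION, and would suggest a deliberate abort in the host process rather than a stray jump. The helper name `bytes_at_fault` is hypothetical.

```python
import re

# Line copied from the crash report above; "[0f]" is assumed to mark the faulting byte.
FAULT_LINE = "89 f7 e8 a9 04 00 00[0f]0b 55 48 89 e5 53 48 81"


def bytes_at_fault(line, count=2):
    """Return `count` bytes starting at the bracketed byte of a hex dump line."""
    # Normalise separators: "[0f]" becomes " *0f ", and "-" becomes a space.
    marked = re.sub(r"\[([0-9a-fA-F]{2})\]", r" *\1 ", line).replace("-", " ")
    tokens = marked.split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("*"))
    return bytes(int(tok.lstrip("*"), 16) for tok in tokens[start:start + count])


UD2 = bytes([0x0F, 0x0B])  # x86 UD2 opcode: always raises #UD (vector 6)

fault = bytes_at_fault(FAULT_LINE)
print(fault.hex(), "is UD2" if fault == UD2 else "is not UD2")
```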

Comment 2 Christophe Fergeau 2022-10-26 11:33:07 UTC
The arm64 crash is slightly different:

Exception Type:        EXC_BREAKPOINT (SIGTRAP)
Exception Codes:       0x0000000000000001, 0x0000000100acfef8
Exception Note:        EXC_CORPSE_NOTIFY

Termination Reason:    Namespace SIGNAL, Code 5 Trace/BPT trap: 5
Terminating Process:   exc handler [4064]

Thread 9 crashed with ARM Thread State (64-bit):
    x0: 0x0000000000000000   x1: 0x0000000000000000   x2: 0x0000000000000000   x3: 0x0000000000000000
    x4: 0x0000000000000000   x5: 0x0000000000000000   x6: 0x0000000000000000   x7: 0x0000000000000000
    x8: 0xedfbad7054b9004c   x9: 0xedfbad7054b9004c  x10: 0xaf957ffe418fd437  x11: 0x0000000000400000
   x12: 0x0000000000400000  x13: 0x0000000000c00001  x14: 0x0000000028cd9729  x15: 0x0000000000f9ef58
   x16: 0xfffffffffffffff4  x17: 0x0000000218a17f10  x18: 0x0000000000000000  x19: 0x000000016f946bd8
   x20: 0x000000016f946c38  x21: 0x000000013df17f10  x22: 0x000000000000c025  x23: 0x0000000000000020
   x24: 0x000000000000e801  x25: 0x0000000000074009  x26: 0x000000013df17e98  x27: 0x000000016f946dd0
   x28: 0x00000000623a0049   fp: 0x000000016f946bc0   lr: 0xad27800100acfef8
    sp: 0x000000016f946ba0   pc: 0x0000000100acfef8 cpsr: 0x60001000
   far: 0x0000000100d50000  esr: 0xf2000001 (Breakpoint) brk 1

Still no kernel-side details :-/
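
For reference, the `esr` value above can be decoded by hand. The sketch below assumes the standard AArch64 ESR layout (exception class in bits 31:26, instruction-length bit 25, syndrome in bits 24:0) and the exception class 0x3C for a BRK instruction executed in AArch64 state. The function name `decode_esr` is hypothetical. The result matches the "(Breakpoint) brk 1" annotation in the report, which again points at an intentional trap in the host-side process and says nothing about the guest kernel itself.

```python
ESR = 0xF2000001  # "esr" value copied from the crash report above

EC_BRK_AARCH64 = 0x3C  # exception class: BRK instruction, AArch64 state


def decode_esr(esr):
    """Split an AArch64 ESR value into (exception class, IL bit, ISS)."""
    ec = (esr >> 26) & 0x3F   # bits 31:26
    il = (esr >> 25) & 0x1    # bit 25
    iss = esr & 0x1FFFFFF     # bits 24:0
    return ec, il, iss


ec, il, iss = decode_esr(ESR)
if ec == EC_BRK_AARCH64:
    # For BRK, the low 16 bits of the ISS hold the instruction's immediate.
    print(f"BRK #{iss & 0xFFFF}")
else:
    print(f"exception class {ec:#x}, ISS {iss:#x}")
```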

Comment 3 Christophe Fergeau 2022-10-26 11:33:48 UTC
Created attachment 1920483 [details]
logs for the amd64 crash

Comment 4 Christophe Fergeau 2022-10-26 11:34:16 UTC
Created attachment 1920484 [details]
logs for the arm64 crash

Comment 5 Christophe Fergeau 2022-11-04 11:52:16 UTC
I bisected this problem to https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6cd514e58f12b211d638dbf6f791fa18d854f09c , at least for x86_64/macOS 11 machines. If I revert this patch and rebuild the latest Fedora kernel, my virtual machine boots successfully.

Comment 6 Kai-Heng Feng 2022-11-07 02:02:05 UTC
Can you please attach `lspci -vv` without the offending commit?

Comment 7 Christophe Fergeau 2022-11-07 16:49:02 UTC
I built upstream commit 8f71a2b3f435 with "PCI: Clear PCI_STATUS when setting up device" reverted, and started a VM on my x86_64 macbook. lspci -vv is:

00:00.0 Host bridge: Apple Inc. Device f020
	Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)
	Subsystem: Red Hat, Inc. Device 0041
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at c0000100 (32-bit, non-prefetchable) [size=64]
	Region 1: Memory at c00003c0 (32-bit, non-prefetchable) [size=32]
	Region 2: Memory at c00003e0 (32-bit, non-prefetchable) [size=16]
	Region 3: Memory at c00003f0 (32-bit, non-prefetchable) [size=16]
	Region 4: Memory at c0000140 (32-bit, non-prefetchable) [size=64]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci

00:05.0 Communication controller: Red Hat, Inc. Virtio console (rev 01)
	Subsystem: Red Hat, Inc. Device 0043
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at c0000180 (32-bit, non-prefetchable) [size=64]
	Region 1: Memory at c0000400 (32-bit, non-prefetchable) [size=16]
	Region 2: Memory at c0000410 (32-bit, non-prefetchable) [size=16]
	Region 3: Memory at c0000420 (32-bit, non-prefetchable) [size=16]
	Region 4: Memory at c0000000 (32-bit, non-prefetchable) [size=128]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci

00:06.0 Mass storage controller: Red Hat, Inc. Virtio block device (rev 01)
	Subsystem: Red Hat, Inc. Device 0042
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at c00001c0 (32-bit, non-prefetchable) [size=64]
	Region 1: Memory at c0000200 (32-bit, non-prefetchable) [size=64]
	Region 2: Memory at c0000430 (32-bit, non-prefetchable) [size=16]
	Region 3: Memory at c0000440 (32-bit, non-prefetchable) [size=16]
	Region 4: Memory at c0000240 (32-bit, non-prefetchable) [size=64]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci

00:07.0 Communication controller: Red Hat, Inc. Virtio socket (rev 01)
	Subsystem: Red Hat, Inc. Device 0053
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at c0000280 (32-bit, non-prefetchable) [size=64]
	Region 1: Memory at c0000450 (32-bit, non-prefetchable) [size=16]
	Region 2: Memory at c0000460 (32-bit, non-prefetchable) [size=16]
	Region 3: Memory at c0000470 (32-bit, non-prefetchable) [size=16]
	Region 4: Memory at c0000080 (32-bit, non-prefetchable) [size=128]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci

00:08.0 Network and computing encryption device: Red Hat, Inc. Virtio RNG (rev 01)
	Subsystem: Red Hat, Inc. Device 0044
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at c00002c0 (32-bit, non-prefetchable) [size=64]
	Region 1: Memory at c0000480 (32-bit, non-prefetchable) [size=16]
	Region 2: Memory at c0000490 (32-bit, non-prefetchable) [size=16]
	Region 3: Memory at c00004a0 (32-bit, non-prefetchable) [size=16]
	Region 4: Memory at c0000300 (32-bit, non-prefetchable) [size=64]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci

00:09.0 Memory controller: Red Hat, Inc. Virtio memory balloon (rev 01)
	Subsystem: Red Hat, Inc. Device 0045
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 64
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at c0000340 (32-bit, non-prefetchable) [size=64]
	Region 1: Memory at c00004b0 (32-bit, non-prefetchable) [size=16]
	Region 2: Memory at c00004c0 (32-bit, non-prefetchable) [size=16]
	Region 3: Memory at c00004d0 (32-bit, non-prefetchable) [size=16]
	Region 4: Memory at c0000380 (32-bit, non-prefetchable) [size=64]
	Capabilities: <access denied>
	Kernel driver in use: virtio-pci

00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller
	Subsystem: Intel Corporation Device 8086
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Kernel modules: lpc_ich
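
To compare the Status line across devices without reading the whole listing, a small parser can help. This is a minimal sketch that assumes the `lspci -vv` layout shown above (a slot line followed by tab-indented fields); the function name `status_flags` is hypothetical and the sample is an excerpt of the output above. It reports, per slot, which status flags are set (`+`).

```python
import re

# Excerpt of the lspci -vv output above (two devices only).
SAMPLE = (
    "00:01.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev 01)\n"
    "\tStatus: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-\n"
    "\n"
    "00:1f.0 ISA bridge: Intel Corporation 82801IR (ICH9R) LPC Interface Controller\n"
    "\tStatus: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-\n"
)

SLOT_RE = re.compile(r"^([0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")


def status_flags(lspci_text):
    """Map each PCI slot to the set of Status flags reported as set ('+')."""
    result = {}
    slot = None
    for line in lspci_text.splitlines():
        match = SLOT_RE.match(line)
        if match:
            slot = match.group(1)
        elif slot and line.strip().startswith("Status:"):
            words = line.split(":", 1)[1].split()
            # Keep only the first Status line per device (the config-space one).
            result.setdefault(slot, {w[:-1] for w in words if w.endswith("+")})
    return result


for slot, flags in status_flags(SAMPLE).items():
    print(slot, sorted(flags) if flags else "no flags set")
```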

Comment 8 Kai-Heng Feng 2022-11-08 06:27:45 UTC
So "Cap" is flagged, can you please try to change the line to the following:
"pci_write_config_word(dev, PCI_STATUS, 0xffef);"

Comment 9 Christophe Fergeau 2022-11-08 13:09:00 UTC
The hypervisor still exits with an error with this change. I tried multiple variations of "pci_write_config_word(dev, PCI_STATUS, $CONST);" (0xff00, 0x00ff, 0x0007, 0x000c, ...), and they all caused the same failure. Only `pci_write_config_word(dev, PCI_STATUS, 0x0);` allows me to start a VM.
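
To make the tested constants easier to compare, the sketch below splits each value into the PCI_STATUS bits that are write-1-to-clear error flags and everything else (read-only or reserved bits such as Cap, bit 4). The bit values are taken from the `PCI_STATUS_*` definitions in Linux's `pci_regs.h`; the helper name `split_write` is hypothetical. Note that 0x00ff, 0x0007 and 0x000c touch no write-1-to-clear bit at all, so on conforming hardware they would be expected to have no effect, which fits the observation that any nonzero write seems to upset the hypervisor.

```python
# Write-1-to-clear error bits of PCI_STATUS (values as in Linux pci_regs.h).
PCI_STATUS_PARITY = 0x0100            # master data parity error
PCI_STATUS_SIG_TARGET_ABORT = 0x0800
PCI_STATUS_REC_TARGET_ABORT = 0x1000
PCI_STATUS_REC_MASTER_ABORT = 0x2000
PCI_STATUS_SIG_SYSTEM_ERROR = 0x4000
PCI_STATUS_DETECTED_PARITY = 0x8000

RW1C_MASK = (
    PCI_STATUS_PARITY
    | PCI_STATUS_SIG_TARGET_ABORT
    | PCI_STATUS_REC_TARGET_ABORT
    | PCI_STATUS_REC_MASTER_ABORT
    | PCI_STATUS_SIG_SYSTEM_ERROR
    | PCI_STATUS_DETECTED_PARITY
)


def split_write(value):
    """Return (write-1-to-clear bits, read-only or reserved bits) of a 16-bit value."""
    return value & RW1C_MASK, value & ~RW1C_MASK & 0xFFFF


# Values mentioned in this bug: 0xffff (setpci test), 0xffef, and the variations tried.
for value in (0xFFFF, 0xFFEF, 0xFF00, 0x00FF, 0x0007, 0x000C, 0x0000):
    rw1c, other = split_write(value)
    print(f"{value:#06x}: clears {rw1c:#06x}, other bits written {other:#06x}")
```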

Comment 10 Kai-Heng Feng 2022-11-09 12:34:46 UTC
Hmm, it feels like it's a bug in the VM stack? Seems like writing anything to PCI_STATUS is prohibited? 

Does `sudo setpci -s 00:1f.0 STATUS=0xffff` crash the VM too? Also please try all the other devices too.

Have you got any reply from Apple's bug tracker?

Comment 11 Christophe Fergeau 2022-11-09 15:09:29 UTC
> Does `sudo setpci -s 00:1f.0 STATUS=0xffff` crash the VM too?

Yes that also kills the VM :-/

> Also please try all the other devices too.

`setpci STATUS=0xffff` worked fine on the other devices

> Have you got any reply from Apple's bug tracker?

Not yet. Last time I did that, it took them a few weeks to answer. However, in the meantime I tested macOS 13, which was released a few weeks ago, and on this version I cannot reproduce this kernel issue, so they must have fixed something in their hypervisor.
I don't expect everyone will upgrade to macOS 13 right away, so it would still be nice to avoid this kernel regression for macOS 12 users.

Comment 12 Kai-Heng Feng 2022-11-09 15:42:17 UTC
OK, so seems like it's really a bug in their VM. Hopefully Apple can fix it in macOS 12 so the patch can be reinstated...

