Bug 1630443

Summary: Armv7 guest fails to boot on AArch64 host with 4.18.x
Product: [Fedora] Fedora
Reporter: Paul Whalen <pwhalen>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 28
CC: airlied, bskeggs, ewk, hdegoede, ichavero, itamar, jarodwilson, jglisse, john.j5live, jonathan, josef, kernel-maint, labbott, linville, mchehab, mjg59, pbrobinson, steved
Target Milestone: ---
Target Release: ---
Hardware: aarch64
OS: Linux
Whiteboard:
Fixed In Version: kernel-4.18.9-300.fc29.aarch64
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-26 15:17:55 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
boot log with kernel ratelimiting disabled (flags: none)

Description Paul Whalen 2018-09-18 16:24:00 UTC
Description of problem:
Attempting to boot an existing armv7 guest on an aarch64 host with a 4.18.x kernel fails with:

[  144.344787] systemd-journald[493]: Failed to send WATCHDOG=1 notification message: Connection refused
[  214.344533] systemd-journald[493]: Failed to send WATCHDOG=1 notification message: Transport endpoint is not connected

Version-Release number of selected component (if applicable):
kernel-4.18.x

How reproducible:
Every time

Steps to Reproduce:

On an aarch64 host with a 4.18.x kernel:

1. curl -O https://dl.fedoraproject.org/pub/fedora/linux/releases/28/Spins/armhfp/images/Fedora-Minimal-armhfp-28-1.1-sda.raw.xz
2. unxz Fedora-Minimal-armhfp-28-1.1-sda.raw.xz
3. virt-builder --get-kernel Fedora-Minimal-armhfp-28-1.1-sda.raw
4. sudo mv Fedora-Minimal-armhfp-28-1.1-sda.raw initramfs-4.16.3-301.fc28.armv7hl.img vmlinuz-4.16.3-301.fc28.armv7hl /var/lib/libvirt/images/
5. sudo virt-install --name Fedora-Minimal-armhfp-28-1.1-sda.raw --ram 4096 --arch armv7l --import --os-variant fedora22 \
                     --disk /var/lib/libvirt/images/Fedora-Minimal-armhfp-28-1.1-sda.raw \
                     --boot kernel=/var/lib/libvirt/images/vmlinuz-4.16.3-301.fc28.armv7hl,initrd=/var/lib/libvirt/images/initramfs-4.16.3-301.fc28.armv7hl.img,kernel_args="console=ttyAMA0 rw root=LABEL=_/ rootwait"

Actual results:
[  144.344787] systemd-journald[493]: Failed to send WATCHDOG=1 notification message: Connection refused
[  214.344533] systemd-journald[493]: Failed to send WATCHDOG=1 notification message: Transport endpoint is not connected


Additional info:

Working as expected with 4.17.x

Comment 1 Paul Whalen 2018-09-18 16:42:36 UTC
Adding systemd.log_level=debug to the kernel args ended in a kernel panic:

..
[    3.932954] Checked W+X mappings: passed, no W+X pages found
[    3.935144] rodata_test: all tests were successful
[    3.963237] systemd[1]: systemd 238 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid)
[    3.972250] systemd[1]: Detected virtualization qemu.
[    3.974204] systemd[1]: Detected architecture arm.
[    3.975991] systemd[1]: Running in initial RAM disk.

Welcome to Fedora 28 (Twenty Eight) dracut-047-8.git20180305.fc28 (Initramfs)!

[    3.994367] Core dump to |/bin/false pipe failed
[  OK  ] Reached target Initrd Root Device.
[  OK  ] Listening on Journal Socket.
[    4.027361] Core dump to |/bin/false pipe failed
[    4.038244] systemd: 128 output lines suppressed due to ratelimiting
[    4.040676] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[    4.040676] 
[    4.044280] CPU: 0 PID: 1 Comm: systemd Not tainted 4.16.3-301.fc28.armv7hl #1
[    4.047343] Hardware name: Generic DT based system
[    4.049396] [<c0311d9c>] (unwind_backtrace) from [<c030c57c>] (show_stack+0x18/0x1c)
[    4.052584] [<c030c57c>] (show_stack) from [<c0aa6960>] (dump_stack+0x80/0xa0)
[    4.055617] [<c0aa6960>] (dump_stack) from [<c0351c10>] (panic+0xc8/0x260)
[    4.058529] [<c0351c10>] (panic) from [<c0356814>] (do_exit+0x5c8/0xac8)
[    4.061315] [<c0356814>] (do_exit) from [<c0356db4>] (do_group_exit+0x64/0xe0)
[    4.064341] [<c0356db4>] (do_group_exit) from [<c0361624>] (get_signal+0x60c/0x640)
[    4.067454] [<c0361624>] (get_signal) from [<c030b994>] (do_signal+0x80/0x3cc)
[    4.070447] [<c030b994>] (do_signal) from [<c030be6c>] (do_work_pending+0x68/0xc8)
[    4.072900] [<c030be6c>] (do_work_pending) from [<c030106c>] (slow_work_pending+0xc/0x20)
[    4.074833] Exception stack(0xee0f7fb0 to 0xee0f7ff8)
[    4.076039] 7fa0:                                     014f41f0 014f41f8 00000008 0000002c
[    4.077967] 7fc0: 014bed5c 00000011 00000000 014f41f0 00000012 014f41f8 00000008 b6f72aa4
[    4.079898] 7fe0: 014f41f0 bec29640 b6e25794 b6a4fd08 800e0010 ffffffff
[    4.081490] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004
[    4.081490]

Comment 2 Paul Whalen 2018-09-18 17:16:44 UTC
Created attachment 1484456
boot log with kernel ratelimiting disabled

boot log with kernel ratelimiting disabled using "systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on"
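
For anyone wanting to capture a similar log, the options quoted above could be appended to the kernel_args from step 5 of the reproduction steps. This is only a sketch: the variable names (base_args, debug_args, kernel_args) are illustrative and not part of the original report.

```shell
# Kernel arguments used in step 5 of the reproduction steps.
base_args='console=ttyAMA0 rw root=LABEL=_/ rootwait'

# Debug options quoted in this comment: verbose systemd logging sent to kmsg,
# a 1M kernel log buffer, and /dev/kmsg ratelimiting turned off.
debug_args='systemd.log_level=debug systemd.log_target=kmsg log_buf_len=1M printk.devkmsg=on'

# Combined string (hypothetical variable name) to pass as kernel_args="..."
# in the --boot option of virt-install.
kernel_args="$base_args $debug_args"
printf '%s\n' "$kernel_args"
```

The resulting string would replace the kernel_args value in the --boot option of the virt-install command from step 5; the rest of that command stays unchanged.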

Comment 3 Paul Whalen 2018-09-21 20:22:40 UTC
This is working again with kernel-4.18.9-300.fc29.
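
A quick way to check whether a host kernel falls in the range discussed here is to compare its release string against the fixed version. The sketch below assumes that every 4.18 release before 4.18.9 is affected, which the report suggests ("4.18.x" fails, 4.17.x and 4.18.9 work) but does not state for each point release. The function name kvm_armv7_guest_status and its output strings are hypothetical.

```shell
# Hypothetical helper, not from the bug report: classify a kernel release
# string such as "4.18.5-300.fc28.aarch64".
# Assumption: 4.18.0 through 4.18.8 are affected; anything else is outside
# the range described in this report.
kvm_armv7_guest_status() {
    release=${1:-$(uname -r)}
    version=${release%%-*}        # drop the "-300.fc29.aarch64" style suffix
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    case "$rest" in
        *.*) patch=${rest#*.}; patch=${patch%%.*} ;;
        *)   patch=0 ;;           # no patch level present, e.g. "4.18"
    esac

    # Refuse to guess if any component is empty or not purely numeric.
    case "$major" in ''|*[!0-9]*) echo "unknown"; return 1 ;; esac
    case "$minor" in ''|*[!0-9]*) echo "unknown"; return 1 ;; esac
    case "$patch" in ''|*[!0-9]*) echo "unknown"; return 1 ;; esac

    if [ "$major" -eq 4 ] && [ "$minor" -eq 18 ] && [ "$patch" -lt 9 ]; then
        echo "affected"
    else
        echo "outside-reported-range"
    fi
}
```

Called with no argument, the function inspects the running host via uname -r; passing a release string explicitly, for example kvm_armv7_guest_status 4.18.5-300.fc28.aarch64, checks that version instead. It says nothing about kernels other than those mentioned in this report.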

Comment 4 Paul Whalen 2018-09-26 15:16:05 UTC
Looking at the changelog, perhaps this fixed it:

arm64: KVM: Only force FPEXC32_EL2.EN if trapping FPSIMD
commit 7d14919c0d475a795c0127631ac8ecb2b0f31831 upstream.

I think it can be closed now.

Comment 5 Laura Abbott 2018-09-26 15:17:55 UTC
Thanks for looking. I'll close this and it can be reopened if it shows up again.

Comment 6 Peter Robinson 2018-09-26 15:19:10 UTC
Yes, that commit fixes the following upstream stable change: e6b673b741ea. Looking at the original, it looks about right for the problems seen.

    KVM: arm64: Optimise FPSIMD handling to reduce guest/host thrashing
    
    This patch refactors KVM to align the host and guest FPSIMD
    save/restore logic with each other for arm64.  This reduces the
    number of redundant save/restore operations that must occur, and
    reduces the common-case IRQ blackout time during guest exit storms
    by saving the host state lazily and optimising away the need to
    restore the host state before returning to the run loop.