Bug 2314079 - Fedora 33+ aarch64 QEMU emulation slowdown
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: rawhide
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2024-09-22 10:39 UTC by nucleo
Modified: 2024-10-01 21:55 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2024-09-25 16:55:39 UTC
Type: ---
Embargoed:



Description nucleo 2024-09-22 10:39:29 UTC
Fedora aarch64 QEMU emulation slowed down starting from Fedora 33.

Reproducible: Always

Steps to Reproduce:
Compare the Fedora-Cloud 32 and 33 aarch64 images by booting them in qemu-system-aarch64 (any version, including 9.1) and by running a command under qemu-aarch64-static.
Actual Results:  
Fedora 32: Startup finished in 4.518s (kernel) + 9.704s (initrd) + 13.090s (userspace) = 27.313s
Fedora 33: Startup finished in 4.755s (kernel) + 19.132s (initrd) + 17.163s (userspace) = 41.051s

The slowdown is even more significant under qemu-aarch64-static when running 'yum repolist':

F32: real    0m1.455s
F33: real    0m10.371s
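
For reference, the ratios implied by the timings above can be worked out with a short script. This is only a sketch: the `slowdown` helper is a hypothetical name, and the numbers are copied verbatim from the report.

```shell
#!/bin/sh
# Ratio of two durations in seconds, printed to two decimal places.
# "slowdown" is a hypothetical helper name used only in this sketch.
slowdown() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", new / old }'
}

# Timings copied from the report (seconds, F32 first, then F33).
echo "kernel:       $(slowdown 4.518 4.755)x"
echo "initrd:       $(slowdown 9.704 19.132)x"
echo "userspace:    $(slowdown 13.090 17.163)x"
echo "total boot:   $(slowdown 27.313 41.051)x"
echo "yum repolist: $(slowdown 1.455 10.371)x"
```

On these figures the full boot is about 1.5x slower and 'yum repolist' about 7x slower, with the initrd stage accounting for most of the boot difference.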


Could the F33 feature "Pointer Authentication & Branch Target Enablement" be the cause of the significant slowdown in Fedora 33+ aarch64 QEMU emulation?

https://bugzilla.redhat.com/show_bug.cgi?id=1847148

Comment 1 Peter Robinson 2024-09-22 10:45:08 UTC
> Compare Fedora-Cloud 32 and 33 aarch64 images in qemu-system-aarch64 (any
> version including 9.1) and running command in qemu-aarch64-static.

So, just to be clear: running an F-32 VM is fast and an F-33 VM is slow on an F-41/QEMU 9.1 host? As in, the slowdown is in the VM? Can you add the commands etc. for easier reproduction?

Comment 2 Peter Robinson 2024-09-22 10:46:04 UTC
Adding jlinton for PAC/BTI clarification.

Comment 3 Richard W.M. Jones 2024-09-22 11:00:12 UTC
What's the exact version of qemu?

Comment 4 nucleo 2024-09-22 11:07:55 UTC
Steps to reproduce (qemu-user-static-aarch64 only, because the steps for booting a VM are more complicated):

# uname -a
Linux rawhide 6.11.0-63.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Sep 15 17:14:12 UTC 2024 x86_64 GNU/Linux

# rpm -q qemu-user-static-aarch64 
qemu-user-static-aarch64-9.1.0-2.fc42.x86_64
(the slowdown also occurs with earlier QEMU versions)

Download aarch64 images
https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/32/Cloud/aarch64/images/Fedora-Cloud-Base-32-1.6.aarch64.raw.xz
https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/33/Cloud/aarch64/images/Fedora-Cloud-Base-33-1.2.aarch64.raw.xz


# xz -kd Fedora-Cloud-Base-32-1.6.aarch64.raw.xz
# losetup /dev/loop0  Fedora-Cloud-Base-32-1.6.aarch64.raw
# kpartx -va /dev/loop0
# mount /dev/mapper/loop0p2 /mnt/
# chroot /mnt
# time yum repolist
repo id                                                             repo name
fedora                                                              Fedora 32 - aarch64
fedora-cisco-openh264                                               Fedora 32 openh264 (From Cisco) - aarch64
fedora-modular                                                      Fedora Modular 32 - aarch64
updates                                                             Fedora 32 - aarch64 - Updates
updates-modular                                                     Fedora Modular 32 - aarch64 - Updates

real    0m1.677s
user    0m1.561s
sys     0m0.063s
# exit
# umount /mnt 
# kpartx -d /dev/loop0
# losetup -d /dev/loop0 




# xz -kd Fedora-Cloud-Base-33-1.2.aarch64.raw.xz
# losetup /dev/loop0  Fedora-Cloud-Base-33-1.2.aarch64.raw
# kpartx -va /dev/loop0
# mount /dev/mapper/loop0p2 /mnt/
# chroot /mnt
# time yum repolist
repo id                                                             repo name
fedora                                                              Fedora 33 - aarch64
fedora-cisco-openh264                                               Fedora 33 openh264 (From Cisco) - aarch64
fedora-modular                                                      Fedora Modular 33 - aarch64
updates                                                             Fedora 33 - aarch64 - Updates
updates-modular                                                     Fedora Modular 33 - aarch64 - Updates

real    0m11.415s
user    0m11.243s
sys     0m0.067s
# exit
# umount /mnt
# kpartx -d /dev/loop0
# losetup -d /dev/loop0
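
The two runs above differ only in the image name, so they can be folded into one parameterized helper. The following is a sketch, not part of the original report: `step` and `time_repolist` are hypothetical names, and it assumes the same /dev/loop0 and /mnt as the steps above. By default it only prints the commands; setting RUN=1 (as root) executes them.

```shell
#!/bin/sh
# Print each command by default; execute it only when RUN=1.
# "step" and "time_repolist" are hypothetical helper names for this sketch.
step() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "+ $*"; fi
}

# Mount an already decompressed .raw image, time 'yum repolist' in it, clean up.
time_repolist() {
    img=$1
    step losetup /dev/loop0 "$img"
    step kpartx -va /dev/loop0
    step mount /dev/mapper/loop0p2 /mnt
    step chroot /mnt bash -c 'time yum repolist'
    step umount /mnt
    step kpartx -d /dev/loop0
    step losetup -d /dev/loop0
}

time_repolist Fedora-Cloud-Base-32-1.6.aarch64.raw
time_repolist Fedora-Cloud-Base-33-1.2.aarch64.raw
```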

Comment 5 Jeremy Linton 2024-09-24 18:06:21 UTC
Arm architectural features are usually designed with an eye towards efficient implementation in hardware, so emulating them in software can carry significant overhead. This is why, for example, the Arm software models have flags that enable or disable emulation of individual features; the PAC algorithm, for instance, can be changed to one that is friendlier to software emulation.

The more features are enabled, the more behavior the software has to emulate. Picking a simpler v8.0 CPU target is frequently considerably faster than picking one with all the architectural features enabled and fully emulated: the additional PAC computations, the checks of page properties and BTI landing pads, and other security checks (e.g. MTE!), which are largely transparent in hardware, now need additional software validation that slows the overall emulation.

So end users who aren't running in a hardware-accelerated environment such as KVM have to decide whether these security features are worth the emulation overhead, and enable or disable them as needed. I don't believe there is a 2-10x slowdown going from F32 to F33 on real hardware, so that should help narrow down the problem.
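
The comment above refers to the Arm software models, but QEMU's TCG emulation has comparable knobs on its Arm CPU models. The sketch below lists a few candidate `-cpu` strings. The profile names and the `cpu_arg` helper are made up for illustration; `pauth` and `pauth-impdef` are CPU properties from QEMU's Arm CPU feature documentation, and whether a given QEMU build accepts them should be checked against that build. The same strings are generally usable as a QEMU_CPU value for qemu-user, though that is an assumption to verify locally.

```shell
#!/bin/sh
# Map a short profile name to a QEMU "-cpu" argument (TCG emulation only).
# Profile names are hypothetical; model and property names come from QEMU's
# Arm CPU feature documentation and may vary between QEMU versions.
cpu_arg() {
    case "$1" in
        full)     echo "max" ;;                  # all emulated features, default PAC algorithm
        fast-pac) echo "max,pauth-impdef=on" ;;  # keep PAC, use QEMU's cheaper algorithm
        no-pac)   echo "max,pauth=off" ;;        # disable pointer authentication
        v8.0)     echo "cortex-a57" ;;           # Armv8.0 core without PAC/BTI/MTE
        *)        echo "unknown profile: $1" >&2; return 1 ;;
    esac
}

for profile in full fast-pac no-pac v8.0; do
    printf '%-9s -cpu %s\n' "$profile" "$(cpu_arg "$profile")"
done
```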

Comment 6 Jeremy Linton 2024-09-25 16:54:19 UTC
Since this is binfmt, you have to adjust the CPU/emulation selection via an environment variable: by design, it selects the emulation for maximum compatibility. If you adjust that, your problem here will go away.

As this is all by design, I think this bug should be closed as NOTABUG.
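
A minimal sketch of what this can look like in practice, assuming the variable in question is QEMU_CPU (the one qemu-user reads, and the one used later in this thread). `with_qemu_cpu` is a hypothetical helper name. A binfmt_misc-launched interpreter inherits the environment of the process that executed the binary, so setting the variable for a single command is enough.

```shell
#!/bin/bash
# Run a command with QEMU_CPU set for that command only.
# "with_qemu_cpu" is a hypothetical helper name used in this sketch.
with_qemu_cpu() {
    local cpu=$1
    shift
    env QEMU_CPU="$cpu" "$@"
}

# Intended use inside the aarch64 chroot from comment 4 (not run here):
#   time with_qemu_cpu cortex-a57 yum repolist
#   time with_qemu_cpu max        yum repolist
```

Comparing the two timings shows how much of the 'yum repolist' slowdown comes from the emulated CPU feature set rather than from the Fedora release itself.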

Comment 7 nucleo 2024-09-26 12:23:00 UTC
With QEMU_CPU=cortex-a76, binfmt runs much faster, but when the Fedora-Cloud image is booted in a VM with a cortex-a76 CPU, startup of F33 is two times slower than F32.

Comment 8 Jeremy Linton 2024-10-01 21:52:04 UTC
The a76 is a v8.2 core; you might try something like a cortex-a57. That said, while the existence of PAC/BTI might add some extra overhead, it's possible that something else in the boot sequence is causing the slowdown. With an a57, is the F32/F33 repolist time closer to the original 1.67s?

Comment 9 Jeremy Linton 2024-10-01 21:55:49 UTC
Oh! libvirt isn't going to use an environment variable for CPU selection; that needs to be set up using methods specific to libvirt (or whatever else manages the VM).
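
For libvirt, the CPU model lives in the domain XML rather than in the environment. As a rough sketch, the helper below prints a `<cpu>` element that pins a TCG guest to a named model. `cpu_xml` is a hypothetical name, and cortex-a57 is just the model suggested in comment 8.

```shell
#!/bin/sh
# Print a libvirt <cpu> element that pins the guest to a named CPU model.
# "cpu_xml" is a hypothetical helper name used only in this sketch.
cpu_xml() {
    cat <<EOF
<cpu mode='custom' match='exact'>
  <model fallback='forbid'>$1</model>
</cpu>
EOF
}

cpu_xml cortex-a57
```

The printed element would replace the existing `<cpu>` element of the domain definition, for example through `virsh edit` on the guest in question.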

