Bug 1651348 - Armv7 guest fails on AArch64 - "cpu.c:906: arm_cpu_realizefn: Assertion no_aa32"
Summary: Armv7 guest fails on AArch64 - "cpu.c:906: arm_cpu_realizefn: Assertion no_aa32"
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Fedora Virtualization Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: ARMTracker
Reported: 2018-11-19 19:08 UTC by Paul Whalen
Modified: 2018-12-12 12:50 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-12-12 12:50:31 UTC
Type: Bug
Embargoed:



Description Paul Whalen 2018-11-19 19:08:23 UTC
Description of problem:
Attempting to boot an armv7 guest on an aarch64 host with aarch32 capabilities (Mustang xgene) fails with:

qemu-system-aarch64: /builddir/build/BUILD/qemu-3.1.0-rc1/target/arm/cpu.c:906: arm_cpu_realizefn: Assertion `no_aa32 || cpu_isar_feature(arm_div, cpu)' failed.
2018-11-19 17:04:36.735+0000: shutting down, reason=failed


Version-Release number of selected component (if applicable):
qemu-system-aarch64-3.1.0-0.1.rc1.fc30.aarch64

How reproducible:
Every time.

Steps to Reproduce:

sudo virt-install \
    --name f30-armhfp --ram 4096 --arch armv7l --os-variant fedora22 \
    --disk /var/lib/libvirt/images/f30-armhfp.raw,bus=virtio,format=raw,size=8 \
    --location=https://kojipkgs.fedoraproject.org/compose//rawhide/Fedora-Rawhide-20181118.n.0/compose/Everything/armhfp/os/ \
    --extra-args="console=ttyAMA0 rw"

WARNING  Couldn't configure UEFI: Did not find any UEFI binary path for arch 'armv7l'
WARNING  Your VM may not boot successfully.

Starting install...
Retrieving file vmlinuz...                                                                                                                                             | 7.3 MB  00:00:01     
Retrieving file initrd.img...                                                                                                                                          |  54 MB  00:00:04     
Allocating 'f30-armhfp.raw'                                                                                                                                            | 8.0 GB  00:00:00     
ERROR    internal error: qemu unexpectedly closed the monitor: qemu-system-aarch64: /builddir/build/BUILD/qemu-3.1.0-rc1/target/arm/cpu.c:906: arm_cpu_realizefn: Assertion `no_aa32 || cpu_isar_feature(arm_div, cpu)' failed.

Comment 1 Jeremy Linton 2018-12-07 22:49:32 UTC
Well, I haven't hit this problem, although I have hit a bunch of other ones. I'm on a mustang now with kernel 4.20.0-0.rc5.git2, qemu 3.0.91 (qemu-3.1.0-0.1.rc1.fc30).

I'm on a b0 though; I wonder if that has something to do with it? There are a couple of problems with the A3s (AFAIK, and I don't think Fedora is carrying any of the workarounds). Let me see if I can reproduce it on an older mustang.

Comment 2 Jeremy Linton 2018-12-07 23:16:14 UTC
Actually, I have reproduced it on the mustang now.

Comment 3 Jeremy Linton (ARM) 2018-12-10 23:02:04 UTC
So, this is almost assuredly caused by https://github.com/qemu/qemu/commit/0f8d06f16c9d1041d728d09d464462ebe713c662.

Which is odd, as I've been struggling to understand how it works, given that on all the machines I've tested, aa64pfr0_el1 is trapped by the kernel and emulated for userspace, with the single detail that it's _NOT_ indicating aarch32 support at any exception level. That is because the kernel is sanitizing the feature registers, but because it's not marked FTR_VISIBLE, the default value being exported is aarch64-only support.
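
For readers following the register discussion, here is a minimal Python sketch of how the low fields of ID_AA64PFR0_EL1 encode AArch32 support per exception level. The layout (four bits each for EL0 to EL3, with 1 meaning AArch64 only and 2 meaning AArch64 plus AArch32) is taken from the Arm architecture; the helper names are hypothetical, and the sample values 0x122 and 0x11 are the ones quoted later in this bug.

```python
# Sketch: decode the EL0..EL3 fields of ID_AA64PFR0_EL1.
# Helper names are illustrative, not taken from QEMU or the kernel.

EL_FIELD_MEANING = {
    0: "not implemented",
    1: "AArch64 only",
    2: "AArch64 and AArch32",
}

def decode_el_fields(id_aa64pfr0):
    """Return a dict mapping 'EL0'..'EL3' to the raw 4-bit field value."""
    return {f"EL{n}": (id_aa64pfr0 >> (4 * n)) & 0xF for n in range(4)}

def el0_supports_aarch32(id_aa64pfr0):
    """True if the EL0 field advertises AArch32 (field value 2 or higher)."""
    return decode_el_fields(id_aa64pfr0)["EL0"] >= 2

if __name__ == "__main__":
    for value in (0x122, 0x11):
        fields = decode_el_fields(value)
        described = {el: EL_FIELD_MEANING.get(v, "reserved")
                     for el, v in fields.items()}
        print(hex(value), described,
              "AArch32 at EL0:", el0_supports_aarch32(value))
```

Under this reading, 0x122 advertises AArch32 at EL0 and EL1, while 0x11 advertises AArch64 only, which is the distinction the comments here turn on.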

Comment 4 Richard Henderson 2018-12-11 17:06:55 UTC
Well, that isn't happening here:

(gdb) fin
Run till exit from #0  kvm_arm_get_host_cpu_features (
    ahcf=ahcf@entry=0xaaaaab804870 <arm_host_cpu_features>)
    at /home/rth/qemu/qemu/target/arm/kvm64.c:483
0x0000aaaaaaec76b8 in kvm_arm_set_cpu_features_from_host (cpu=0xffffbc529010)
    at /home/rth/qemu/qemu/target/arm/kvm.c:155
155	            return;
Value returned is $3 = true
(gdb) p/x arm_host_cpu_features.isar.id_aa64pfr0
$4 = 0x122

However, 

$ grep VERSION /etc/os-release 
VERSION="18.10 (Cosmic Cuttlefish)"
$ uname -a
Linux chuckanut 4.18.0-11-generic #12-Ubuntu SMP Tue Oct 23 19:24:51 UTC 2018 aarch64 aarch64 aarch64 GNU/Linux
$ head -8 /proc/cpuinfo 
processor	: 0
BogoMIPS	: 100.00
Features	: fp asimd evtstrm cpuid
CPU implementer	: 0x50
CPU architecture: 8
CPU variant	: 0x0
CPU part	: 0x000
CPU revision	: 1

I'll see about trying a kernel newer than 4.18, and see if that's where the change came in.

We have missed the boat for fixing this for the qemu-3.1 release.  Which is annoying.
This should probably get an upstream launchpad bug report too.

Comment 5 Jeremy Linton 2018-12-11 21:32:06 UTC
Hmm (I'm not a qemu expert), so some of what this was based on was dumping the values from arm_cpu_realizefn, where it turns out ARM_FEATURE_AARCH64 isn't even set because -cpu host,aarch64=off disables it. So that code isn't even executing!

Hence when I saw cpu_isar_feature equal to 0, I assumed that was because id_aa64pfr0 was 0x11, which is what you get if it's read from userspace due to the HIDDEN/STRICT settings for the emulation. But dumping it shows it's also 0, so right before the assert() we have:

arm_feature(ARM_FEATURE_AARCH64)=0
cpu_isar_feature = 0
id_aa64pfr0=0

if dumped like:

"arm_feature(ARM_FEATURE_AARCH64)=%d cpu_isar_feature = %d , id_aa64pfr0=%lX\n",arm_feature(&cpu->env, ARM_FEATURE_AARCH64), cpu_isar_feature(aa64_aa32, cpu), cpu->isar.id_aa64pfr0

and run with qemu-system-aarch64 -machine virt-3.1,accel=kvm -cpu host,aarch64=off

Comment 6 Richard Henderson 2018-12-11 21:57:20 UTC
Ok, knowing the command line options helps some,
but I still don't replicate either the assert or
what you're seeing.


(gdb) run -machine virt-3.1,accel=kvm -cpu host,aarch64=off
Thread 1 "qemu-system-aar" hit Breakpoint 2, arm_cpu_realizefn (
    dev=0xffffbc256010, errp=0xffffffffe670)
    at /home/rth/qemu/qemu/target/arm/cpu.c:906
906	        assert(no_aa32 || cpu_isar_feature(arm_div, cpu));
(gdb) p/x cpu->isar
$3 = {id_isar0 = 0x2101110, id_isar1 = 0x13112111, id_isar2 = 0x21232042, 
  id_isar3 = 0x1112131, id_isar4 = 0x10142, id_isar5 = 0x1, id_isar6 = 0x0, 
  mvfr0 = 0x10110222, mvfr1 = 0x12111111, mvfr2 = 0x43, id_aa64isar0 = 0x0, 
  id_aa64isar1 = 0x0, id_aa64pfr0 = 0x122, id_aa64pfr1 = 0x0}
(gdb) call isar_feature_arm_div(&cpu->isar)
$4 = true


The contents of id_aa64pfr0 should be completely ignored for aarch64=off.

I do still see kvm_arm_set_cpu_features_from_host being called in
order to grab the other id registers to fill in -cpu host.
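
To make the asserted condition concrete, here is a minimal Python sketch of the check quoted from cpu.c:906. It assumes the Arm architecture's definition of the ID_ISAR0 Divide field (bits [27:24], where a value of 2 or more means SDIV/UDIV exist in the ARM instruction set); the function names are hypothetical stand-ins rather than QEMU's API, and the zero-valued register in the second call is an assumption about what an unpopulated ID register would look like.

```python
# Sketch of the condition asserted at cpu.c:906, as quoted in this bug:
#     assert(no_aa32 || cpu_isar_feature(arm_div, cpu));
# Function names are illustrative stand-ins, not QEMU's API.

def isar0_divide_field(id_isar0):
    """Extract the Divide field, bits [27:24] of ID_ISAR0."""
    return (id_isar0 >> 24) & 0xF

def has_arm_div(id_isar0):
    """SDIV/UDIV in the ARM (A32) instruction set need Divide >= 2."""
    return isar0_divide_field(id_isar0) >= 2

def realize_check_passes(no_aa32, id_isar0):
    """Mirror the asserted condition: no AArch32 at all, or ARM-mode divide."""
    return no_aa32 or has_arm_div(id_isar0)

if __name__ == "__main__":
    # id_isar0 value from the gdb dump above.
    print(realize_check_passes(False, 0x02101110))
    # Assumption: an ID register that was never filled in reads as zero.
    print(realize_check_passes(False, 0x0))
```

With the id_isar0 value from the dump, the Divide field is 2 and the condition holds; with an all-zero register and no_aa32 false, it does not.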

Comment 7 Jeremy Linton 2018-12-11 22:20:55 UTC
Ok, I've been really confused about where id_aa64isar0 has been coming from, and now I know what's going on (part of the clue was your line numbers/printing).

Upstream qemu head has a bunch of code to set the isar registers that (AFAIK) is missing from the release Fedora is running. Looks like your commit was incompletely backported?

Comment 8 Richard Henderson 2018-12-11 22:25:46 UTC
OMG, now I feel foolish.  The clue was right here:

qemu-system-aarch64-3.1.0-0.1.rc1.fc30.aarch64
                              ^^^

The fix for the bug that you're seeing was included in 3.1.0-rc2.
The final 3.1.0 release came out today, fwiw.
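
A small Python sketch of the check that would have caught this sooner: pull the release-candidate tag out of the package string and compare it against rc2. The regular expression assumes the ".rcN." naming seen in the package name above, the function names are hypothetical, and the result is only meaningful for 3.1.0 builds.

```python
import re

def prerelease_rc(package):
    """Return the rc number embedded in a package string, or None if absent."""
    match = re.search(r"\.rc(\d+)\.", package)
    return int(match.group(1)) if match else None

def has_rc2_fix(package):
    """Assumes a 3.1.0 build: final builds (no rc tag) and rc2+ carry the fix."""
    rc = prerelease_rc(package)
    return rc is None or rc >= 2

if __name__ == "__main__":
    # Package string reported in this bug.
    print(has_rc2_fix("qemu-system-aarch64-3.1.0-0.1.rc1.fc30.aarch64"))
```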

Comment 9 Jeremy Linton 2018-12-11 23:14:43 UTC
I should probably have pulled qemu forward earlier to see if the problem still exists. Anyway, it seems that it's using KVM_GET_ONE_REG, which returns the correct value for aa64pfr0 too.
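
For reference, a minimal Python sketch of how a register ID for KVM_GET_ONE_REG is composed for an arm64 system register. The constants and shifts mirror the arm64 KVM uapi header as I understand it, so check them against your kernel headers; the function name is a hypothetical stand-in for the header's ARM64_SYS_REG macro, and the encoding op0=3, op1=0, CRn=0, CRm=4, op2=0 is the architectural encoding of ID_AA64PFR0_EL1. The sketch only builds the ID and does not issue the ioctl.

```python
# Constants assumed to match the arm64 KVM uapi header; verify before use.
KVM_REG_ARM64 = 0x6000000000000000
KVM_REG_SIZE_U64 = 0x0030000000000000
KVM_REG_ARM64_SYSREG = 0x0013 << 16

def arm64_sys_reg(op0, op1, crn, crm, op2):
    """Compose a KVM register ID for a 64-bit arm64 system register."""
    return (KVM_REG_ARM64 | KVM_REG_SIZE_U64 | KVM_REG_ARM64_SYSREG
            | (op0 << 14) | (op1 << 11) | (crn << 7) | (crm << 3) | op2)

if __name__ == "__main__":
    # ID_AA64PFR0_EL1: op0=3, op1=0, CRn=0, CRm=4, op2=0
    print(hex(arm64_sys_reg(3, 0, 0, 4, 0)))
```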

Comment 10 Cole Robinson 2018-12-12 12:50:31 UTC
3.1.0 GA is built for rawhide now

