Bug 2184324 - [aarch64]Stable Guest ABI failed between rhel 9.0 and rhel 9.2
Summary: [aarch64]Stable Guest ABI failed between rhel 9.0 and rhel 9.2
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.2
Hardware: aarch64
OS: Unspecified
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Min Deng
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-04-04 09:04 UTC by Min Deng
Modified: 2023-04-28 02:44 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-26 11:12:56 UTC
Type: ---
Target Upstream Version:
Embargoed:



Links:
System: Red Hat Issue Tracker
ID: RHELPLAN-153983
Private: 0
Priority: None
Status: None
Summary: None
Last Updated: 2023-04-04 09:05:20 UTC

Description Min Deng 2023-04-04 09:04:29 UTC
Description of problem:
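The stable guest ABI test failed between RHEL 9.0 and RHEL 9.2: migration from the 9.0 host to the 9.2 host works, but migrating the guest back to the 9.0 host fails.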

Version-Release number of selected component (if applicable):
SRC:		
kernel-5.14.0-70.49.1.el9_0.aarch64		
qemu-kvm-6.2.0-11.el9_0.7.aarch64		
DST		
kernel-5.14.0-289.el9.aarch64		
qemu-kvm-7.2.0-14.el9_2.aarch64		

hostname:
ampere-hr330a-13.khw4.lab.eng.bos.redhat.com
ampere-hr330a-14.khw4.lab.eng.bos.redhat.com

How reproducible:
5/5

Steps to Reproduce:
1. Boot a guest on the RHEL 9.0 host:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
-blockdev node-name=file_aavmf_vars,driver=file,filename=avocado-vt-vm1_rhel900-aarch64-virtio-scsi_qcow2_filesystem_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
-machine virt-rhel9.0.0,gic-version=host,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \
-m 8192 \
-object '{"size": 8589934592, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
-smp 8,maxcpus=8,cores=4,threads=1,sockets=2  \
-cpu 'host' \
-chardev socket,server=on,wait=off,path=/tmp/t11,id=qmp_id_qmpmonitor1  \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,server=on,wait=off,path=/tmp/t11,id=qmp_id_catch_monitor  \
-mon chardev=qmp_id_catch_monitor,mode=control  \
-serial unix:'/tmp/serialaarch64',server=on,wait=off \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-3", "addr": "0x0"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "rhel900-aarch64-virtio-scsi.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
-device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
-device virtio-net-pci,mac=9a:0a:71:f3:69:7d,rombar=0,id=net0,netdev=tap0,bus=pcie-root-port-4,addr=0x0 \
-netdev tap,id=tap0,vhost=on \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-no-shutdown \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x2,chassis=6 \
-device pcie-root-port,id=pcie_extra_root_port_1,addr=0x2.0x1,bus=pcie.0,chassis=7 \
-monitor stdio 

On the destination host, use the same command line as on the source side, plus `-incoming defer`.

2. Migrate the guest:
  src -> dst (RHEL 9.0 to RHEL 9.2): test passed.
  Boot the VM with incoming defer again on src, then migrate the guest back from RHEL 9.2 to RHEL 9.0.
  dst -> src: migration failed with the following error.

Actual results:
[root@ampere-hr330a-13 rhel900]# sh boot_incoming.sh 
QEMU 6.2.0 monitor - type 'help' for more information
(qemu) migrate_incoming tcp:[::]:4000
(qemu) 
(qemu) 
(qemu) qemu-kvm: Invalid value 241 expecting positive value <= 237
qemu-kvm: Failed to load cpu:cpreg_vmstate_array_len
qemu-kvm: error while loading state for instance 0x0 of device 'cpu'
qemu-kvm: load of migration failed: Invalid argument
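
For reference, the HMP command used in the session above (`migrate_incoming`) has a QMP counterpart, and the command line in step 1 already opens QMP monitor sockets. The following is a minimal sketch that only builds the JSON messages for the backward migration; it does not connect to any socket. `SRC_HOST` is a placeholder for the RHEL 9.0 host name, and the helper names are made up for illustration.

```python
import json


def qmp_command(name, **arguments):
    """Serialize one QMP command as a JSON string."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)


def backward_migration_commands(listen_uri, connect_uri):
    """Return (incoming_side, outgoing_side) QMP command lists.

    incoming_side is sent to the QEMU started with `-incoming defer`
    (the RHEL 9.0 host here); outgoing_side is sent to the QEMU that
    currently runs the guest (the RHEL 9.2 host).
    """
    incoming_side = [
        qmp_command("qmp_capabilities"),
        qmp_command("migrate-incoming", uri=listen_uri),
    ]
    outgoing_side = [
        qmp_command("qmp_capabilities"),
        qmp_command("migrate", uri=connect_uri),
        qmp_command("query-migrate"),
    ]
    return incoming_side, outgoing_side


# Port 4000 is taken from the log above; SRC_HOST is a placeholder.
incoming, outgoing = backward_migration_commands(
    "tcp:[::]:4000", "tcp:SRC_HOST:4000"
)
for line in incoming + outgoing:
    print(line)
```

Each string would be written to the corresponding QMP socket, one message per line, after reading the server greeting.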

Expected results:
No error

Additional info:

Comment 1 Cornelia Huck 2023-04-05 09:13:43 UTC
Do the host CPU flags match exactly between the two hosts? I.e. does `cat /proc/cpuinfo` yield the same flags on both machines?
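
One way to make that comparison less error-prone than reading the output by eye is a small script. This is a sketch only: it assumes the `/proc/cpuinfo` output of each host has been saved as text, and the sample strings below are made up for illustration, not taken from the actual machines.

```python
def cpu_features(cpuinfo_text):
    """Return the set of CPU feature flags listed in /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # aarch64 kernels label the line "Features"; x86 kernels use "flags".
        if sep and key.strip() in ("Features", "flags"):
            features.update(value.split())
    return features


def compare_features(src_text, dst_text):
    """Return (only_on_src, only_on_dst) as sorted lists of flag names."""
    src, dst = cpu_features(src_text), cpu_features(dst_text)
    return sorted(src - dst), sorted(dst - src)


# Made-up sample data for illustration.
SRC_SAMPLE = "processor\t: 0\nFeatures\t: fp asimd evtstrm aes\n"
DST_SAMPLE = "processor\t: 0\nFeatures\t: fp asimd evtstrm aes sha1\n"

only_src, only_dst = compare_features(SRC_SAMPLE, DST_SAMPLE)
print("only on src:", only_src)
print("only on dst:", only_dst)
```

Two empty lists mean the flag sets are identical. Note that this compares only what the host kernel reports in `/proc/cpuinfo`, not what KVM exposes to the guest.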

Comment 4 Min Deng 2023-04-06 02:03:06 UTC
I compared the two hosts and found no differences between them. Please also refer to the attachments named src.out and dst.output, and let me know if there are any issues. Thank you!
ampere-hr330a-13.khw4.lab.eng.bos.redhat.com
ampere-hr330a-14.khw4.lab.eng.bos.redhat.com

Comment 5 Min Deng 2023-04-06 07:54:48 UTC
edk2 info:
SRC:
edk2-aarch64-20220126gitbb1bba3d77-3.el9_0.1.noarch
DST:
edk2-aarch64-20230301gitf80f052277c8-1.el9.noarch

Comment 6 Cornelia Huck 2023-04-06 11:17:18 UTC
Thanks for the output, the cpus on the two machines do indeed match, but the kernel versions obviously do not, and I think that's the source of the problem.

What the code is complaining about is that the number of registers available via the ONE_REG interface decreases when migrating from the 9.2 host to the 9.0 host. This is not surprising, as new kernel versions may expose more registers (and the code is fine with migrating to a system where that number actually increased.)
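
The asymmetry can be illustrated with a simplified model of the length check behind the error message. This is a sketch, not QEMU's actual code: the function name is made up, the counts 237 and 241 are the ones from the reported error, and any further per-register validation QEMU performs is not modeled.

```python
def check_cpreg_array_len(incoming_len, local_len):
    """Model the load-time check on the register count in the migration stream.

    Returns None if the incoming count is acceptable, otherwise an error
    string in the same shape as the one in the report.
    """
    if 0 < incoming_len <= local_len:
        return None
    return (
        f"Invalid value {incoming_len} expecting positive value <= {local_len}"
    )


# Forward migration (9.0 -> 9.2): the stream carries 237 registers and the
# destination knows 241, so the check passes.
print(check_cpreg_array_len(237, 241))

# Backward migration (9.2 -> 9.0): the stream carries 241 registers but the
# destination only knows 237, so the check fails.
print(check_cpreg_array_len(241, 237))
```

The second call produces the same message as the first error line in the report.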

The unfortunate situation is that the number of registers is directly taken from whatever kernel version the host is running with, and not covered by any compatibility handling at all (neither upstream nor downstream). I.e. depending on the kernel versions used in different releases, there's a good chance that backwards migration will break.

What we would need is kind of an extended CPU model that not only covers whatever the host exposes, but also whatever KVM exposes, so that it can filter some features. Unfortunately, we do not even have more basic CPU models yet that could insulate us from small changes in the host feature set...

I do not see any quick way to fix this -- unless someone else has a good idea?

Comment 7 Sebastian Ott 2023-04-06 12:40:06 UTC
Hm, the only thing that comes to mind is to let migration tooling warn about or prevent a version mismatch (maybe just for downgrades).
The same should be true for a host kernel downgrade without migration where the VM's state is preserved (e.g. VM in suspend, or host kernel downgrade via kexec).
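
A rough sketch of what such a downgrade check could look like is shown below. It is an assumption-laden illustration: the function names are hypothetical, the parsing only handles release strings shaped like the ones in this report, and it is not a full RPM version comparison. The kernel release is also only a proxy for the register set KVM actually exposes.

```python
import re


def kernel_key(release):
    """Split e.g. '5.14.0-70.49.1.el9_0.aarch64' into comparable tuples.

    Only the leading numeric version and the numeric part of the build
    before '.el' are used.
    """
    m = re.match(r"^(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version, build = m.groups()
    return (
        tuple(int(p) for p in version.split(".")),
        tuple(int(p) for p in build.split(".")),
    )


def is_kernel_downgrade(src_release, dst_release):
    """True if the migration destination runs an older kernel than the source."""
    return kernel_key(dst_release) < kernel_key(src_release)


# Kernel releases from this report: migrating from the 9.2 host to the 9.0 host.
if is_kernel_downgrade("5.14.0-289.el9.aarch64", "5.14.0-70.49.1.el9_0.aarch64"):
    print("warning: destination kernel is older; migration may fail")
```

Tooling could run such a check before starting the migration and either warn or refuse, depending on policy.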

Comment 8 Min Deng 2023-04-10 11:01:19 UTC
The issue can also be reproduced between RHEL 9.1 and RHEL 9.2:
qemu-kvm-7.0.0-13.el9_1.1.aarch64
5.14.0-162.23.1.el9_1.aarch64
edk2-aarch64-20220526git16779ede2d36-3.el9.noarch
qemu-kvm-7.2.0-14.el9_2.aarch64
5.14.0-162.23.1.el9_1.aarch64
edk2-aarch64-20220526git16779ede2d36-3.el9.noarch

Comment 9 Cornelia Huck 2023-04-11 09:30:48 UTC
(In reply to Min Deng from comment #8)
> The issue can also be reproduced between RHEL 9.1 and RHEL 9.2:
> qemu-kvm-7.0.0-13.el9_1.1.aarch64
> 5.14.0-162.23.1.el9_1.aarch64
> edk2-aarch64-20220526git16779ede2d36-3.el9.noarch
> qemu-kvm-7.2.0-14.el9_2.aarch64
> 5.14.0-162.23.1.el9_1.aarch64
> edk2-aarch64-20220526git16779ede2d36-3.el9.noarch

Does that fail with the same error? (I'm surprised that the kernel version seems to match exactly, shouldn't it be an el9_2 version on the second host?)

Comment 10 Min Deng 2023-04-11 13:11:13 UTC
> 
> Does that fail with the same error? (I'm surprised that the kernel version
> seems to match exactly, shouldn't it be an el9_2 version on the second host?)

Hi Cornelia,
I've pasted the details of the two hosts below. Thank you!
ampere-hr350a-08.khw4.lab.eng.bos.redhat.com
ampere-hr350a-09.khw4.lab.eng.bos.redhat.com
SRC:
[root@ampere-hr350a-08 ~]# uname -r
5.14.0-162.23.1.el9_1.aarch64
[root@ampere-hr350a-08 ~]# rpm -qa|grep qemu-kvm
qemu-kvm-common-7.0.0-13.el9_1.1.aarch64
qemu-kvm-audio-pa-7.0.0-13.el9_1.1.aarch64
qemu-kvm-device-display-virtio-gpu-7.0.0-13.el9_1.1.aarch64
qemu-kvm-device-display-virtio-gpu-gl-7.0.0-13.el9_1.1.aarch64
qemu-kvm-device-display-virtio-gpu-pci-7.0.0-13.el9_1.1.aarch64
qemu-kvm-device-display-virtio-gpu-pci-gl-7.0.0-13.el9_1.1.aarch64
qemu-kvm-device-usb-host-7.0.0-13.el9_1.1.aarch64
qemu-kvm-tools-7.0.0-13.el9_1.1.aarch64
qemu-kvm-docs-7.0.0-13.el9_1.1.aarch64
qemu-kvm-core-7.0.0-13.el9_1.1.aarch64
qemu-kvm-block-rbd-7.0.0-13.el9_1.1.aarch64
qemu-kvm-7.0.0-13.el9_1.1.aarch64
qemu-kvm-tests-7.0.0-13.el9_1.1.aarch64
qemu-kvm-block-curl-7.0.0-13.el9_1.1.aarch64

edk2-aarch64-20220526git16779ede2d36-3.el9.noarch

DST:
[root@ampere-hr350a-09 ~]# uname -r
5.14.0-284.8.1.el9_2.aarch64
[root@ampere-hr350a-09 ~]# rpm -qa|grep qemu-kvm
qemu-kvm-common-7.2.0-14.el9_2.aarch64
qemu-kvm-device-display-virtio-gpu-7.2.0-14.el9_2.aarch64
qemu-kvm-device-display-virtio-gpu-pci-7.2.0-14.el9_2.aarch64
qemu-kvm-audio-pa-7.2.0-14.el9_2.aarch64
qemu-kvm-device-usb-host-7.2.0-14.el9_2.aarch64
qemu-kvm-tools-7.2.0-14.el9_2.aarch64
qemu-kvm-docs-7.2.0-14.el9_2.aarch64
qemu-kvm-core-7.2.0-14.el9_2.aarch64
qemu-kvm-block-rbd-7.2.0-14.el9_2.aarch64
qemu-kvm-7.2.0-14.el9_2.aarch64
qemu-kvm-tests-7.2.0-14.el9_2.aarch64
qemu-kvm-block-curl-7.2.0-14.el9_2.aarch64

edk2-aarch64-20221207gitfff6d81270b5-9.el9_2.noarch

Comment 11 Cornelia Huck 2023-04-11 15:42:07 UTC
Thanks; with the two different kernel versions, this looks like the same problem as for 9.0 <-> 9.2.

Unfortunately, I think this needs to be addressed in the general context of CPU models, which means it will take some time -- we first need to build some kind of consensus upstream, and I doubt there will be major progress before (northern hemisphere) summer...

Comment 12 Cornelia Huck 2023-04-26 11:33:04 UTC
So, this is why we decided to close this as CANTFIX:

We currently see a breakage when migrating from 9.2 to anything older. This is caused by a change in the KVM kernel code, which triggers a visible change in the guest ABI exposed by QEMU. To fix this, we need a bigger change in QEMU, which needs to be done in the context of Arm CPU models. This is a non-trivial development item which will need some time to complete.

It won't make sense to try to backport any solution we come up with to 9.2 and older versions. Migrations from older versions to newer versions are unaffected.

