Created attachment 1946691
program to detect bug

Description of problem:
RHEL supports use of QEMU with the Tiny Code Generator (TCG) instead of KVM in specific cases, such as running the libguestfs appliances in a VM without nested virtualization enabled. In this case, libguestfs uses the "-cpu max" CPU model which uses all the instructions that TCG is able to emulate.

QEMU 7.2 included a rewrite of SSE and BMI instructions that unfortunately had a few bugs. Because of the "-cpu max" CPU model, these instructions are always enabled. The most common symptom is failure to do cryptographic operations, for example (without libguestfs) "curl" of an https URL will fail due to a bug in the ADCX and ADOX instructions.

Because the issue is in the translation of x86 instructions, it is impossible to provide a full set of affected programs; on the other hand, the patch is very safe because it only affects TCG and more specifically only the buggy instructions.

This is a regression.

Version-Release number of selected component (if applicable):
7.2.0-10.el9

How reproducible:
100%

Steps to Reproduce:
1. start a RHEL9 VM *without* using KVM and with the "-cpu max" command line option
2. compile the attached program and run it in the VM

Actual results:
The program detects an incorrect implementation of ADCX and ADOX

Expected results:
No error message.
Seems like you're intending to fix qemu, but we might also change libguestfs (RHEL 9.2 only) to use a different -cpu flag. However I'd also note that TCG is used by OpenStack and CNV for testing and is "soft supported" for those cases too.
> I met one error while compiling it inside guest. Did I miss some parameters?
>
> # gcc test.c
> test.c: In function ‘test_adox_adcx’:
> test.c:27:5: error: inconsistent operand constraints in an ‘asm’
>    27 |     asm("push %0; popf;"
>       |     ^~~

Yes, please add "-O2" (same on the host in fact).
(In reply to Paolo Bonzini from comment #6)
> > I met one error while compiling it inside guest. Did I miss some parameters?
> >
> > # gcc test.c
> > test.c: In function ‘test_adox_adcx’:
> > test.c:27:5: error: inconsistent operand constraints in an ‘asm’
> >    27 |     asm("push %0; popf;"
> >       |     ^~~
>
> Yes, please add "-O2" (same on the host in fact).

Thanks. Reproduced this issue now while running the executable file with qemu-kvm-7.2.0-10.el9.x86_64.

# gcc test.c -O2 a.out test.c
# ./a.out
0 1
a.out: test.c:34: test_adox_adcx: Assertion `out_adcx == in_c + adcx_operand - 1' failed.
Aborted (core dumped)
Test PASS with qemu-kvm-7.2.0-14.el9_2.x86_64.

Test Env:
host: kernel-5.14.0-284.2.1.el9_2.x86_64
guest: kernel-5.14.0-284.2.1.el9_2.x86_64

Test results.
# ./a.out
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Mark Verified:Tested.
Hi Yongkui,

Could you please run some sanity tests on the libguestfs side? Thanks.

Best regards
Nana
Tested with the following packages on RHEL9.2 host:
libguestfs-1.50.1-3.el9.x86_64
qemu-kvm-7.2.0-14.el9_2.x86_64
kernel-5.14.0-289.el9.x86_64

Steps:
1. $ LIBGUESTFS_BACKEND=direct LIBGUESTFS_BACKEND_SETTINGS=force_tcg libguestfs-test-tool
...
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -display none
libguestfs: command: run: \ -help
libguestfs: command: run: /usr/libexec/qemu-kvm
libguestfs: command: run: \ -display none
libguestfs: command: run: \ -machine q35,accel=kvm:tcg
libguestfs: command: run: \ -device ?
libguestfs: command: run: echo '{ "execute": "qmp_capabilities" }' '{ "execute": "query-qmp-schema" }' '{ "execute": "quit" }' | QEMU_AUDIO_DRV=none "/usr/libexec/qemu-kvm" -display none -machine "q35,accel=kvm:tcg" -qmp stdio
libguestfs: command: run: echo '{ "execute": "qmp_capabilities" }' '{ "execute": "query-kvm" }' '{ "execute": "quit" }' | QEMU_AUDIO_DRV=none "/usr/libexec/qemu-kvm" -display none -machine "q35,accel=kvm:tcg" -qmp stdio
libguestfs: saving test results
libguestfs: qemu version: 7.2
libguestfs: qemu mandatory locking: yes
libguestfs: qemu KVM: enabled
libguestfs: finished testing qemu features
libguestfs: command: run: dmesg | grep -Eoh 'lpj=[[:digit:]]+'
libguestfs: read_lpj_from_dmesg: external command exited with error status 1
libguestfs: read_lpj_from_files: no boot messages files are readable
/usr/libexec/qemu-kvm \
    -global virtio-blk-pci.scsi=off \
    -no-user-config \
    -nodefaults \
    -display none \
    -machine q35,accel=tcg,graphics=off \
    -cpu max,la57=off \
    -m 1280 \
    -no-reboot \
    -rtc driftfix=slew \
    -no-hpet \
    -global kvm-pit.lost_tick_policy=discard \
    -kernel /var/tmp/.guestfs-1002/appliance.d/kernel \
    -initrd /var/tmp/.guestfs-1002/appliance.d/initrd \
    -object rng-random,filename=/dev/urandom,id=rng0 \
    -device virtio-rng-pci,rng=rng0 \
    -device virtio-scsi-pci,id=scsi \
    -drive file=/tmp/libguestfsjB8ZPO/scratch1.img,cache=unsafe,format=raw,id=hd0,if=none \
    -device scsi-hd,drive=hd0 \
    -drive file=/var/tmp/.guestfs-1002/appliance.d/root,snapshot=on,id=appliance,cache=unsafe,if=none \
    -device scsi-hd,drive=appliance \
    -device virtio-serial-pci \
    -serial stdio \
    -chardev socket,path=/run/user/0/libguestfsfk99Id/guestfsd.sock,id=channel0 \
    -device virtserialport,chardev=channel0,name=org.libguestfs.channel.0 \
    -append "panic=1 console=ttyS0 edd=off udevtimeout=6000 udev.event-timeout=6000 no_timer_check printk.time=1 cgroup_disable=memory usbcore.nousb cryptomgr.notests tsc=reliable 8250.nr_uarts=1 root=UUID=3f0fe7b3-4257-4eb7-a56f-d1714f8d62bf selinux=0 guestfs_verbose=1 TERM=xterm-256color"
\x1bc\x1b[?7l\x1b[2J\x1b[0mSeaBIOS (version 1.16.1-1.el9)
Booting from ROM..\x1bc\x1b[?7l\x1b[2J[ 0.000000] Linux version 5.14.0-289.el9.x86_64 (mockbuild.eng.bos.redhat.com) (gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), GNU ld version 2.35.2-37.el9) #1 SMP PREEMPT_DYNAMIC Sun Mar 19 06:09:51 EDT 2023
...
===== TEST FINISHED OK =====

Sanity check passed on libguestfs side.
Thanks! Move this bug to verified according to Comment 19 and Comment 23.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:2162