Bug 2150760

Summary: Restarting virtqemud and then saving a running VM crashes virtqemud
Product: Red Hat Enterprise Linux 9 Reporter: yalzhang <yalzhang>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
libvirt sub component: General QA Contact: Yanqiu Zhang <yanqzhan>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: jdenemar, lmen, meili, pkrempa, virt-maint, yanqzhan, yicui
Version: 9.2 Keywords: Automation, Regression, Triaged, Upstream
Target Milestone: rc Flags: pm-rhel: mirror+
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-8.10.0-2.el9 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-05-09 07:27:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
coredump file none

Description yalzhang@redhat.com 2022-12-05 08:44:48 UTC
Created attachment 1930025 [details]
coredump file

Description of problem:
Restarting virtqemud and then trying to save a running VM crashes virtqemud.

Version-Release number of selected component (if applicable):
# rpm -q libvirt qemu-kvm 
libvirt-8.10.0-1.el9.x86_64
qemu-kvm-7.1.0-6.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a VM and then restart virtqemud:

# virsh start avocado-vt-vm1
Domain 'avocado-vt-vm1' started

# systemctl restart virtqemud

# pidof virtqemud
10948

2. Try to save the VM; virtqemud crashes:

# virsh save avocado-vt-vm1 test.save
error: Disconnected from qemu:///system due to end of file
error: Failed to save domain 'avocado-vt-vm1' to test.save
error: End of file while reading data: Input/output error

# pidof virtqemud
11023

# coredumpctl list
TIME                          PID UID GID SIG     COREFILE EXE                 SIZE
Mon 2022-12-05 03:25:37 EST 10948   0   0 SIGSEGV present  /usr/sbin/virtqemud 1.1M

Actual results:
Restarting virtqemud and then trying to save a running VM crashes virtqemud.

Expected results:
virtqemud should not crash

Additional info:

Comment 1 Peter Krempa 2022-12-05 08:53:24 UTC
The crash occurs in:

#0  __strrchr_evex () at ../sysdeps/x86_64/multiarch/strrchr-evex.S:85
#1  0x00007fd8d10e0a55 in virFileIsSharedFSType (path=0x0, fstypes=2047) at ../src/util/virfile.c:3435
#2  0x00007fd8c015e2c3 in qemuTPMHasSharedStorage (def=<optimized out>) at ../src/qemu/qemu_tpm.c:1030
#3  0x00007fd8c01108ec in qemuMigrationSrcIsAllowed
#4  0x00007fd8c00d5263 in qemuDomainSaveInternal
#5  0x00007fd8c00d5cb8 in qemuDomainSaveFlags (dom=0x7fd8680a7f80, path=0x7fd8a40024b0 "/root/test.save", dxml=0x0, flags=0)
#6  0x00007fd8d12dca98 in virDomainSave (domain=domain@entry=0x7fd8680a7f80, to=<optimized out>) at ../src/libvirt-domain.c:896


It looks like we don't serialize 'tpm->data.emulator.storagepath' in the status XML, so a restart of the daemon clears it. The same thing should happen on migration after a restart of the daemon.
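
The failure mode described above can be modelled outside libvirt. The Python sketch below is not libvirt code: EmulatedTPM, save_status, load_status, parent_dir and the example path are hypothetical names chosen for illustration. It only shows how a field that is derived at startup but left out of the serialized status comes back empty after a reload, and how a later path lookup that assumes a real string then fails.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmulatedTPM:
    # Hypothetical stand-in for the emulator TPM definition.
    version: str
    storagepath: Optional[str] = None  # derived when the domain starts


def save_status(tpm: EmulatedTPM) -> dict:
    # Models the bug: the derived path is not written to the status record.
    return {"version": tpm.version}


def load_status(status: dict) -> EmulatedTPM:
    # After a daemon restart only the serialized fields come back.
    return EmulatedTPM(version=status["version"])


def parent_dir(path: str) -> str:
    # Like a strrchr()-based lookup, this assumes path is a real string.
    return path[: path.rindex("/")]


running = EmulatedTPM(version="2.0", storagepath="/example/swtpm/vm1/tpm2")
reloaded = load_status(save_status(running))
```

In this model, parent_dir(running.storagepath) works, while parent_dir(reloaded.storagepath) raises an exception because the path is None, which is the Python analogue of the NULL 'path' argument seen in frame #1.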

Comment 2 Meina Li 2022-12-05 10:55:06 UTC
Another test scenario that hits a similar crash:

Description of problem:
Dumping a guest fails with dump_image_format set to bzip2

Version-Release number of selected component (if applicable):
libvirt-8.10.0-1.el9.x86_64
qemu-kvm-7.1.0-6.el9.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Start a guest.
# virsh domstate avocado-vt-vm1
running

2. Edit dump_image_format in qemu.conf.
# vim /etc/libvirt/qemu.conf
......
dump_image_format = "bzip2"
......

3. Restart the virtqemud service.
# systemctl restart virtqemud

4. Dump the guest.
# virsh dump avocado-vt-vm1 /var/tmp/vm.core 
error: error: Failed to core dump domain 'avocado-vt-vm1' to /var/tmp/vm.core
Disconnected from qemu:///system due to end of file

Actual results:
Fails at step 4.

Expected results:
Should dump successfully.

Additional info:
1) It does not fail if we edit qemu.conf first and then start the guest.
2) Log:
#0  0x00007fa8c983c528 in __strrchr_sse2 () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7fa8c69fc640 (LWP 140571))]
(gdb) bt
#0  0x00007fa8c983c528 in __strrchr_sse2 () from /lib64/libc.so.6
#1  0x00007fa8c9ee0a55 in virFileIsSharedFSType (path=0x0, fstypes=2047)
    at ../src/util/virfile.c:3435
#2  0x00007fa8c415a2c3 in qemuTPMHasSharedStorage (def=<optimized out>)
    at ../src/qemu/qemu_tpm.c:1030
#3  0x00007fa8c410c8ec in qemuMigrationSrcIsAllowed (driver=0x7fa87c0231e0, 
    vm=0x7fa87c096850, remote=<optimized out>, asyncJob=<optimized out>, 
    flags=<optimized out>) at ../src/qemu/qemu_migration.c:1559
#4  0x00007fa8c40d21f1 in doCoreDump (driver=0x7fa87c0231e0, vm=0x7fa87c096850, 
    path=0x7fa8a8001460 "/var/tmp/vm.core", dump_flags=<optimized out>, dumpformat=0)
    at ../src/qemu/qemu_driver.c:3135
#5  0x00007fa8c40d29e9 in qemuDomainCoreDumpWithFormat (dom=<optimized out>, 
    path=0x7fa8a8001460 "/var/tmp/vm.core", dumpformat=0, flags=0)
    at ../src/qemu/qemu_driver.c:3219
#6  0x00007fa8ca0df756 in virDomainCoreDump (domain=domain@entry=0x7fa87c01fee0, 
    to=<optimized out>, flags=0) at ../src/libvirt-domain.c:1427
#7  0x000055d36cd18fcc in remoteDispatchDomainCoreDump (server=0x55d36e4a7080, 
    msg=0x55d36e4af840, args=0x7fa8a8000b80, rerr=0x7fa8c69fb9a0, 
    client=<optimized out>) at src/remote/remote_daemon_dispatch_stubs.h:4894
#8  remoteDispatchDomainCoreDumpHelper (server=0x55d36e4a7080, 
    client=<optimized out>, msg=0x55d36e4af840, rerr=0x7fa8c69fb9a0, 
    args=0x7fa8a8000b80, ret=0x0) at src/remote/remote_daemon_dispatch_stubs.h:4873
#9  0x00007fa8c9ff112c in virNetServerProgramDispatchCall (msg=0x55d36e4af840, 
    client=0x55d36e4b50b0, server=0x55d36e4a7080, prog=0x55d36e4a8410)
    at ../src/rpc/virnetserverprogram.c:428
#10 virNetServerProgramDispatch (prog=0x55d36e4a8410, server=0x55d36e4a7080, 
    client=0x55d36e4b50b0, msg=0x55d36e4af840)
    at ../src/rpc/virnetserverprogram.c:302
#11 0x00007fa8c9ff6e78 in virNetServerProcessMsg (msg=<optimized out>, 
    prog=<optimized out>, client=<optimized out>, srv=0x55d36e4a7080)
    at ../src/rpc/virnetserver.c:135
#12 virNetServerHandleJob (jobOpaque=0x55d36e4a2750, opaque=0x55d36e4a7080)
    at ../src/rpc/virnetserver.c:155
#13 0x00007fa8c9f31cf3 in virThreadPoolWorker (opaque=<optimized out>)
    at ../src/util/virthreadpool.c:164
#14 0x00007fa8c9f312a9 in virThreadHelper (data=<optimized out>)
    at ../src/util/virthread.c:256
#15 0x00007fa8c989f802 in start_thread () from /lib64/libc.so.6
#16 0x00007fa8c983f450 in clone3 () from /lib64/libc.so.6

Comment 3 Michal Privoznik 2022-12-05 12:18:05 UTC
Patches proposed on the list:

https://listman.redhat.com/archives/libvir-list/2022-December/236084.html

Comment 4 Michal Privoznik 2022-12-05 13:29:23 UTC
Merged upstream as:

7a20341270 qemu: Init ext devices paths on reconnect
3458c3ff8c qemu_extdevice: Expose qemuExtDevicesInitPaths()
f1958a3e5e qemu_extdevice: Init paths in qemuExtDevicesPrepareDomain()
107ebe62f4 qemu_process: Document qemuProcessPrepare{Domain,Host}() order

v8.10.0-69-g7a20341270
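
Going only by the commit subjects above, the approach is to compute the external device paths again when the daemon reconnects to a running domain, instead of relying on them being stored. A minimal Python sketch of that pattern follows. It is an illustration under assumptions, not the actual patch: init_paths, prepare_domain, reconnect, the base directory and the path layout are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmulatedTPM:
    # Hypothetical stand-in; only the derived path matters here.
    storagepath: Optional[str] = None


def init_paths(tpm: EmulatedTPM, base_dir: str, vm_uuid: str) -> None:
    # Derive the path from stable inputs. Safe to call more than once.
    if tpm.storagepath is None:
        tpm.storagepath = f"{base_dir}/{vm_uuid}/tpm"  # assumed layout


def prepare_domain(tpm: EmulatedTPM, base_dir: str, vm_uuid: str) -> EmulatedTPM:
    # Startup path: paths are initialized before the domain is launched.
    init_paths(tpm, base_dir, vm_uuid)
    return tpm


def reconnect(tpm: EmulatedTPM, base_dir: str, vm_uuid: str) -> EmulatedTPM:
    # Daemon restart path: the same initialization runs again, so a field
    # that was never serialized is rebuilt rather than left empty.
    init_paths(tpm, base_dir, vm_uuid)
    return tpm
```

Because both entry points share one idempotent helper, a domain started before the restart and one reattached after it end up with the same path.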

Comment 5 Yanqiu Zhang 2022-12-07 10:21:30 UTC
Reproduced on:
libvirt-8.10.0-1.el9.x86_64
qemu-kvm-7.1.0-5.el9.x86_64

Steps:
# systemctl restart virtqemud

1.# virsh managedsave avocado-vt-vm1 
error: Disconnected from qemu:///system due to end of file
error: Failed to save domain 'avocado-vt-vm1' state
error: End of file while reading data: Input/output error

2.# virsh dump avocado-vt-vm1 avocado.dump
error: Disconnected from qemu:///system due to end of file
error: Failed to core dump domain 'avocado-vt-vm1' to avocado.dump
error: End of file while reading data: Input/output error

3.# virsh snapshot-create-as avocado-vt-vm1 sp1 --memspec file=/tmp/foo.image
error: Disconnected from qemu:///system due to end of file
error: End of file while reading data: Input/output error

4.# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
error: Disconnected from qemu:///system due to end of file
error: End of file while reading data: Input/output error

Verified on:
libvirt-8.10.0-2.el9.x86_64

(# yum upgrade libvirt)

1.# systemctl restart virtqemud
# virsh managedsave avocado-vt-vm1 

Domain 'avocado-vt-vm1' state saved by libvirt

# virsh start avocado-vt-vm1 
Domain 'avocado-vt-vm1' started

2.# systemctl restart virtqemud
# virsh dump avocado-vt-vm1 vm.dump

Domain 'avocado-vt-vm1' dumped to vm.dump

3.# systemctl restart virtqemud
#  virsh snapshot-create-as avocado-vt-vm1 sp1 --memspec file=/tmp/foo.image
Domain snapshot sp1 created

# virsh snapshot-delete avocado-vt-vm1 sp1 --metadata
Domain snapshot sp1 deleted

4. # systemctl restart virtqemud
# virsh migrate avocado-vt-vm1 --live qemu+ssh://hostB/system --verbose
Migration: [100 %]
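
The four checks above could be scripted roughly as follows. This is a sketch under assumptions: verification_plan, daemon_pid and run_plan are hypothetical helper names, the commands are the ones used in this comment, and run_plan needs root on a host where the domain is running and the migration destination is reachable. It treats a change of the virtqemud PID that was not caused by an explicit restart as a crash.

```python
import subprocess

RESTART = ["systemctl", "restart", "virtqemud"]


def verification_plan(domain, dest_uri):
    """Return argv lists for the four checks, each preceded by a daemon restart."""
    checks = [
        [["virsh", "managedsave", domain], ["virsh", "start", domain]],
        [["virsh", "dump", domain, "vm.dump"]],
        [["virsh", "snapshot-create-as", domain, "sp1",
          "--memspec", "file=/tmp/foo.image"],
         ["virsh", "snapshot-delete", domain, "sp1", "--metadata"]],
        [["virsh", "migrate", domain, "--live", dest_uri, "--verbose"]],
    ]
    plan = []
    for steps in checks:
        plan.append(RESTART)
        plan.extend(steps)
    return plan


def daemon_pid():
    """PID string reported by pidof for virtqemud, or None if it is not running."""
    out = subprocess.run(["pidof", "virtqemud"], capture_output=True, text=True)
    return out.stdout.strip() or None


def run_plan(plan):
    """Run the plan; return the first command after which the PID changed, else None."""
    pid = None
    for argv in plan:
        subprocess.run(argv, check=False)
        if argv == RESTART:
            pid = daemon_pid()  # new baseline after an intentional restart
        elif daemon_pid() != pid:
            return argv  # PID changed without a restart: the daemon died
    return None
```

For example, run_plan(verification_plan("avocado-vt-vm1", "qemu+ssh://hostB/system")) would be expected to return None on a fixed build.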

Comment 8 Yanqiu Zhang 2022-12-08 07:38:53 UTC
Verified per comment 5.
Other regression tests for vTPM also pass:
https://libvirt-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/libvirt-RHEL-9.2-runtest-x86_64-function-tpm_emulator/22/testReport/
libvirt	libvirt-8.10.0-2.el9.x86_64
qemu-kvm	qemu-kvm-7.1.0-6.el9.x86_64
kernel	kernel-5.14.0-205.el9.x86_64

Comment 10 errata-xmlrpc 2023-05-09 07:27:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (libvirt bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2171