Bug 2216915

Summary: DBus call to org.libvirt.Domain GetStats times out
Product: [Fedora] Fedora
Reporter: Jelle van der Waa <jvanderwaa>
Component: libvirt
Assignee: Libvirt Maintainers <libvirt-maint>
Status: NEW ---
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: unspecified
Version: 39
CC: berrange, clalancette, crobinso, jforbes, laine, libvirt-maint, virt-maint
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
URL: https://github.com/cockpit-project/cockpit-machines/pull/1104
Whiteboard: CockpitTest

Description Jelle van der Waa 2023-06-23 08:26:49 UTC
In our automated Cockpit CI, the test that adds a vsock interface now fails because the following D-Bus call times out. This is a regression.

> debug: dbus: {"timeout":5000,"type":"uu","call":["/org/libvirt/QEMU/domain/_ab6b7073_b424_4ccd_9269_7b9be0bd3c44","org.libvirt.Domain","GetStats",[45,0]],"id":"1"}
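
For readers unfamiliar with the arguments: the first `u` passed to GetStats (45) appears to be a bitmask of stats groups. The sketch below decodes it, assuming libvirt-dbus forwards the value unchanged to libvirt and that the bit values match libvirt's virDomainStatsTypes enum as I remember it. The function name `decode_stats` is made up for this illustration.

```sh
# Hypothetical helper: decode a GetStats mask into stats group names.
# Assumption: bit values follow libvirt's virDomainStatsTypes enum.
decode_stats() {
    mask=$1
    out=""
    for entry in 1:STATE 2:CPU_TOTAL 4:BALLOON 8:VCPU 16:INTERFACE 32:BLOCK; do
        bit=${entry%%:*}     # numeric bit value before the colon
        name=${entry#*:}     # group name after the colon
        if [ $(( mask & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
}

decode_stats 45
# prints: STATE BALLOON VCPU BLOCK
```

This agrees with the keys seen in the busctl output later in this report (state.*, balloon.*, vcpu.*, block.*).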


Reproducible: Always

Steps to Reproduce:
1. The test spins up a VM.
2. It adds a vsock interface with the custom identifier 5.
3. It removes the custom identifier from the vsock interface.
4. We then trigger "Force shut down"; the GetStats D-Bus call times out, which causes our test to fail.

Expected Results:  
This test used to pass on Rawhide. It is rather hard for us to pinpoint when it started failing, but it is likely related to the libvirt updates in Rawhide:

  libvirt-client (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-config-network (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-interface (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-network (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-nodedev (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-qemu (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-storage-core (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-storage-disk (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-storage-iscsi (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-storage-iscsi-direct (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-daemon-driver-storage-logical (9.0.0-3.fc38 -> 9.3.0-2.fc39)
  libvirt-libs (9.0.0-3.fc38 -> 9.3.0-2.fc39)

We run the following virt-xml commands before we press `force shut down`:

virt-xml -c qemu:///system subVmTest1 --add-device --vsock cid.auto=yes,cid.address=3
virt-xml -c qemu:///system subVmTest1 --edit --vsock cid.auto=no,cid.address=5
virt-xml -c qemu:///system subVmTest1 --add-device --vsock cid.auto=no,cid.address=5 --update
virt-xml -c qemu:///system subVmTest1 --edit --vsock cid.auto=yes,cid.address=5

Comment 1 Jelle van der Waa 2023-06-23 08:52:33 UTC
I tried to reproduce it outside our CI (which does not use /dev/kvm) with:

virt-install --connect qemu:///system --name subVmTest1 --os-variant cirros0.4.0 --boot hd,network --vcpus 1 --memory 128 --import --disk /var/lib/libvirt/images/cirros.qcow2 --graphics none

(Cirros image is used in our tests, but probably not relevant)

virsh dumpxml subVmTest1 | grep uuid

Build the D-Bus object path from the UUID by replacing each '-' with '_' and prefixing it with '_'.

In a new terminal:

while true; do busctl call org.libvirt /org/libvirt/QEMU/domain/_87a881f2_8fcc_4d17_9e01_62a53dc6fb71 org.libvirt.Domain GetStats uu 45 0; done
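
The UUID-to-object-path conversion used in the busctl call can be scripted. This is a minimal sketch: the function name `uuid_to_object_path` is made up here, and the path prefix is simply copied from the call in this report.

```sh
# Hypothetical helper: turn a domain UUID into the libvirt-dbus object path.
# The prefix /org/libvirt/QEMU/domain/_ is taken from the busctl call in this report.
uuid_to_object_path() {
    printf '/org/libvirt/QEMU/domain/_%s\n' "$(printf '%s' "$1" | sed 's/-/_/g')"
}

uuid_to_object_path 87a881f2-8fcc-4d17-9e01-62a53dc6fb71
# prints: /org/libvirt/QEMU/domain/_87a881f2_8fcc_4d17_9e01_62a53dc6fb71
```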

Then:

[root@fedora-rawhide-127-0-0-2-2201 ~]# virt-xml -c qemu:///system subVmTest1 --add-device --vsock cid.auto=yes,cid.address=3
virt-xml -c qemu:///system subVmTest1 --edit --vsock cid.auto=no,cid.address=5
virt-xml -c qemu:///system subVmTest1 --add-device --vsock cid.auto=no,cid.address=5 --update
virt-xml -c qemu:///system subVmTest1 --edit --vsock cid.auto=yes,cid.address=5
Domain 'subVmTest1' defined successfully.
Changes will take effect after the domain is fully powered off.
Domain 'subVmTest1' defined successfully.
Changes will take effect after the domain is fully powered off.
Device hotplug successful.
ERROR    XML error: only a single vsock device is supported
Domain 'subVmTest1' defined successfully.
Changes will take effect after the domain is fully powered off.
[root@fedora-rawhide-127-0-0-2-2201 ~]# virsh destroy subVmTest1

The continuously running busctl loop then hangs:

a{sv} 28 "state.state" v i 1 "state.reason" v i 1 "balloon.current" v t 131072 "balloon.maximum" v t 131072 "balloon.swap_in" v t 0 "balloon.last-update" v t 1687510227 "balloon.rss" v t 193388 "vcpu.current" v u 1 "vcpu.maximum" v u 1 "vcpu.0.state" v i 1 "vcpu.0.time" v t 9030000000 "vcpu.0.wait" v t 0 "vcpu.0.delay" v t 2787627256 "block.count" v u 1 "block.0.name" v s "vda" "block.0.path" v s "/var/lib/libvirt/images/cirros.qcow2" "block.0.backingIndex" v u 1 "block.0.rd.reqs" v t 1130 "block.0.rd.bytes" v t 20962304 "block.0.rd.times" v t 248338149 "block.0.wr.reqs" v t 52 "block.0.wr.bytes" v t 146432 "block.0.wr.times" v t 21919171 "block.0.fl.reqs" v t 11 "block.0.fl.times" v t 30737474 "block.0.allocation" v t 19005440 "block.0.capacity" v t 46137344 "block.0.physical" v t 18956288

Call failed: Connection timed out
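
To make the first failing call easier to spot than with an endless `while true` loop, a wrapper along these lines might help. It is only a sketch: `poll_until_failure` is a made-up name, and the busctl invocation shown in the comment assumes busctl's `--timeout=` option and reuses the object path from this comment.

```sh
# Hypothetical helper: run a command up to MAX times, stop at the first failure.
# Intended use (not executed here):
#   poll_until_failure 1000 busctl --timeout=5 call org.libvirt \
#       /org/libvirt/QEMU/domain/_87a881f2_8fcc_4d17_9e01_62a53dc6fb71 \
#       org.libvirt.Domain GetStats uu 45 0
poll_until_failure() {
    max=$1
    shift
    i=0
    while [ "$i" -lt "$max" ]; do
        i=$((i + 1))
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            printf 'call %d failed at %s\n' "$i" "$(date -u +%H:%M:%S)"
            return 1
        fi
    done
    printf 'all %d calls succeeded\n' "$max"
}

# Stand-in command so the sketch runs without libvirt:
poll_until_failure 3 true
# prints: all 3 calls succeeded
```

The timestamp of the failing call can then be matched against the time `virsh destroy` was issued.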

Comment 2 Fedora Release Engineering 2023-08-16 08:11:21 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle.
Changing version to 39.