Bug 1918364 - Can't connect to ballooning device when using virtio-transitional or virtio-non-transitional
Summary: Can't connect to ballooning device when using virtio-transitional or virtio-n...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: libvirt
Version: 8.3
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: unspecified
Target Milestone: rc
Target Release: 8.4
Assignee: Andrea Bolognani
QA Contact: Meina Li
URL:
Whiteboard:
Depends On: 1911786
Blocks:
Reported: 2021-01-20 14:47 UTC by Andrea Bolognani
Modified: 2021-02-22 15:40 UTC (History)

Fixed In Version: libvirt-6.6.0-13.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1911786
Environment:
Last Closed: 2021-02-22 15:39:42 UTC
Type: Bug
Target Upstream Version: 7.0.0
Embargoed:



Description Andrea Bolognani 2021-01-20 14:47:11 UTC
+++ This bug was initially created as a clone of Bug #1911786 +++

Description of problem:

I have problems with getting memory stats via https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainMemoryStats when setting the ballooning model to `virtio-transitional` or `virtio-non-transitional`. Only `virtio` seems to work.


QEMU commandline: 

> {"component":"virt-launcher","level":"info","msg":"LC_ALL=C \\PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \\HOME=/var/lib/libvirt/qemu/domain-1-default_vmi-fedora \\XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-1-default_vmi-fedora/.local/share \\XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-1-default_vmi-fedora/.cache \\XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-1-default_vmi-fedora/.config \\QEMU_AUDIO_DRV=none \\/usr/libexec/qemu-kvm \\-name guest=default_vmi-fedora,debug-threads=on \\-S \\-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-default_vmi-fedora/master-key.aes \\-machine pc-q35-rhel8.3.0,accel=kvm,usb=off,dump-guest-core=off \\-cpu Skylake-Client-IBRS,ss=on,vmx=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,ibpb=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on \\-m 977 \\-overcommit mem-lock=off \\-smp 1,sockets=1,dies=1,cores=1,threads=1 \\-object iothread,id=iothread1 \\-uuid 01847eb3-5f3b-4c22-95de-c70369fdeafb \\-smbios type=1,manufacturer=KubeVirt,product=None,uuid=01847eb3-5f3b-4c22-95de-c70369fdeafb,family=KubeVirt \\-no-user-config \\-nodefaults \\-chardev socket,id=charmonitor,fd=18,server,nowait \\-mon chardev=charmonitor,id=monitor,mode=control \\-rtc base=utc \\-no-shutdown \\-boot strict=on \\-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \\-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \\-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \\-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \\-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \\-device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \\-device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 \\-device 
pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 \\-device virtio-scsi-pci,id=scsi0,bus=pci.2,addr=0x0 \\-device virtio-serial-pci-non-transitional,id=virtio-serial0,bus=pci.3,addr=0x0 \\-blockdev '{\"driver\":\"file\",\"filename\":\"/var/run/kubevirt/container-disks/disk_0.img\",\"node-name\":\"libvirt-3-storage\",\"cache\":{\"direct\":true,\"no-flush\":false},\"auto-read-only\":true,\"discard\":\"unmap\"}' \\-blockdev '{\"node-name\":\"libvirt-3-format\",\"read-only\":true,\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"qcow2\",\"file\":\"libvirt-3-storage\",\"backing\":null}' \\-blockdev '{\"driver\":\"file\",\"filename\":\"/var/run/kubevirt-ephemeral-disks/disk-data/containerdisk/disk.qcow2\",\"node-name\":\"libvirt-2-storage\",\"cache\":{\"direct\":true,\"no-flush\":false},\"auto-read-only\":true,\"discard\":\"unmap\"}' \\-blockdev '{\"node-name\":\"libvirt-2-format\",\"read-only\":false,\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"qcow2\",\"file\":\"libvirt-2-storage\",\"backing\":\"libvirt-3-format\"}' \\-device virtio-blk-pci-non-transitional,bus=pci.4,addr=0x0,drive=libvirt-2-format,id=ua-containerdisk,bootindex=1,write-cache=on \\-blockdev '{\"driver\":\"file\",\"filename\":\"/var/run/kubevirt-ephemeral-disks/cloud-init-data/default/vmi-fedora/noCloud.iso\",\"node-name\":\"libvirt-1-storage\",\"cache\":{\"direct\":true,\"no-flush\":false},\"auto-read-only\":true,\"discard\":\"unmap\"}' \\-blockdev '{\"node-name\":\"libvirt-1-format\",\"read-only\":false,\"cache\":{\"direct\":true,\"no-flush\":false},\"driver\":\"raw\",\"file\":\"libvirt-1-storage\"}' \\-device virtio-blk-pci-non-transitional,bus=pci.5,addr=0x0,drive=libvirt-1-format,id=ua-cloudinitdisk,write-cache=on \\-netdev tap,fd=20,id=hostua-default,vhost=on,vhostfd=21 \\-device virtio-net-pci-non-transitional,host_mtu=1440,netdev=hostua-default,id=ua-default,mac=1a:a3:3d:b3:65:0a,bus=pci.1,addr=0x0,romfile= \\-chardev 
socket,id=charserial0,fd=22,server,nowait \\-device isa-serial,chardev=charserial0,id=serial0 \\-chardev socket,id=charchannel0,fd=23,server,nowait \\-device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 \\-vnc vnc=unix:/var/run/kubevirt-private/ce5be773-07e1-41b8-874f-6b6218b2859c/virt-vnc \\-device VGA,id=video0,vgamem_mb=16,bus=pcie.0,addr=0x1 \\-device virtio-balloon-pci-non-transitional,id=balloon0,bus=pci.6,addr=0x0 \\-object rng-random,id=objrng0,filename=/dev/urandom \\-device virtio-rng-pci-non-transitional,rng=objrng0,id=rng0,bus=pci.7,addr=0x0 \\-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \\-msg timestamp=on","subcomponent":"qemu","timestamp":"2020-12-31T08:38:25.442441Z"}

The failure log line:

> {"component":"virt-launcher","level":"error","msg":"internal error: Cannot determine balloon device path","pos":"qemuMonitorInitBalloonObjectPath:1022","subcomponent":"libvirt","thread":"32","timestamp":"2020-12-31T08:40:38.917000Z"}
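A minimal sketch, in Python, of spotting this failure in virt-launcher style JSON logs. The helper name `is_balloon_path_error` is hypothetical; the field names (`level`, `msg`) and the error text are taken from the log line above.

```python
import json

# Error text reported by libvirt in the log line above.
BALLOON_ERROR = "Cannot determine balloon device path"


def is_balloon_path_error(log_line):
    """Return True if a JSON log line reports the balloon path error."""
    # Drop the "> " quoting prefix used in this report, if present.
    line = log_line.lstrip("> ").strip()
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        return False
    if not isinstance(entry, dict):
        return False
    return entry.get("level") == "error" and BALLOON_ERROR in entry.get("msg", "")


sample = '> {"component":"virt-launcher","level":"error","msg":"internal error: Cannot determine balloon device path","pos":"qemuMonitorInitBalloonObjectPath:1022","subcomponent":"libvirt","thread":"32","timestamp":"2020-12-31T08:40:38.917000Z"}'
print(is_balloon_path_error(sample))
```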


Version-Release number of selected component (if applicable):

libvirt version: 6.6.0
qemu version: qemu-kvm-5.1.0

--- Additional comment from Meina Li on 2021-01-06 04:19:17 CET ---

Reproduced on:
libvirt-6.6.0-11.module+el8.3.1+9196+74a80ca4.x86_64
qemu-kvm-5.1.0-17.module+el8.3.1+9213+7ace09c3.x86_64

Steps to reproduce: check the memory stats after the guest has fully booted
1. With virtio-transitional memballoon.
# virsh dumpxml lmn | grep /memballoon -B3
    <memballoon model='virtio-transitional'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x09' slot='0x01' function='0x0'/>
    </memballoon>
# virsh dommemstat lmn
actual 1572864
rss 635228
--------Can't get expected stats

2. With virtio-non-transitional memballoon.
# virsh dumpxml lmn | grep /memballoon -B3
    <memballoon model='virtio-non-transitional'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </memballoon>
# virsh dommemstat lmn
actual 1572864
rss 530152
--------Can't get expected stats

3. With virtio memballoon.
# virsh dumpxml lmn | grep /memballoon -B3
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </memballoon>
# virsh dommemstat lmn
actual 1572864
swap_in 0
swap_out 0
major_fault 269
minor_fault 138820
unused 1186396
available 1344380
usable 1145904
last_update 1609903086
disk_caches 61452
hugetlb_pgalloc 0
hugetlb_pgfail 0
rss 523996
-------Can get expected stats
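
The three cases above differ only in which fields `virsh dommemstat` prints. A minimal Python sketch of telling them apart automatically follows. The helper names `parse_dommemstat` and `has_guest_balloon_stats` are hypothetical, and treating `actual` and `rss` as the only fields available without balloon stats is an assumption based on the failing outputs above.

```python
def parse_dommemstat(output):
    """Parse `virsh dommemstat` output ("name value" per line) into a dict."""
    stats = {}
    for line in output.splitlines():
        parts = line.split()
        # Skip blank lines and anything that is not a "name integer" pair.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


# Assumption: these are the fields reported even when the balloon
# stats cannot be fetched, as seen in the failing cases above.
FIELDS_WITHOUT_BALLOON_STATS = {"actual", "rss"}


def has_guest_balloon_stats(stats):
    """Return True if any field beyond `actual` and `rss` was reported."""
    return bool(set(stats) - FIELDS_WITHOUT_BALLOON_STATS)


broken = parse_dommemstat("actual 1572864\nrss 635228\n")
working = parse_dommemstat("actual 1572864\nswap_in 0\nunused 1186396\nrss 523996\n")
print(has_guest_balloon_stats(broken))   # False
print(has_guest_balloon_stats(working))  # True
```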

--- Additional comment from Roman Mohr on 2021-01-07 13:34:43 CET ---

I am working around this by sticking with `virtio`.

Giving it priority high since I don't know what the default is on q35 if I simply choose `virtio`.
The priority can be lowered if `virtio` defaults to `virtio-transitional` because then we have no issue on older guests.
Feel free to lower the priority if `virtio` defaults to `virtio-transitional` in this case and let me know.

--- Additional comment from Jaroslav Suchanek on 2021-01-07 17:18:33 CET ---

Andrea, can you please investigate what is the behavior behind this model and reply to comment 3? Adding Cole to CC, as he made the original changes I believe. Thanks.

--- Additional comment from Andrea Bolognani on 2021-01-11 19:29:40 CET ---

(In reply to Jaroslav Suchanek from comment #4)
> Andrea, can you please investigate what is the behavior behind this model
> and reply to comment 3? Adding Cole to CC, as he made the original changes I
> believe. Thanks.

I'll dig further tomorrow, but the issue seems to be that in
qemuMonitorInitBalloonObjectPath() we look for the memballoon based
on two pieces of data: its alias and its type, where the latter is
expected to be either virtio-balloon-pci or virtio-balloon-ccw based
on the device's address type.

  https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_monitor.c#L990

However, the (non-)transitional devices have different QOM types:

  # <memballoon model='virtio'>
  $ virsh qemu-monitor-command test --hmp qom-list /machine/peripheral/ | grep balloon
  balloon0 (child<virtio-balloon-pci>)

  # <memballoon model='virtio-non-transitional'>
  $ virsh qemu-monitor-command test --hmp qom-list /machine/peripheral/ | grep balloon
  balloon0 (child<virtio-balloon-pci-non-transitional>)

Since libvirt always expects the type to be virtio-balloon-pci, it
can't find the memballoon when (non-)transitional devices are used.
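
The mismatch can be illustrated with a small Python sketch. This is not the libvirt implementation (which is C code talking to the QEMU monitor); the helper names `expected_qom_type` and `find_balloon` are hypothetical. The QOM type names for `virtio` and `virtio-non-transitional` come from the output above, `virtio-balloon-pci-transitional` is the matching QEMU device name, and treating every CCW memballoon as `virtio-balloon-ccw` is an assumption.

```python
import re

# memballoon model -> QOM type name for PCI devices.
PCI_QOM_TYPES = {
    "virtio": "virtio-balloon-pci",
    "virtio-transitional": "virtio-balloon-pci-transitional",
    "virtio-non-transitional": "virtio-balloon-pci-non-transitional",
}


def expected_qom_type(model, address_type):
    """Pick the QOM type to search for from the model and address type."""
    if address_type == "ccw":
        # Assumption: only the plain CCW device is considered here.
        return "virtio-balloon-ccw"
    return PCI_QOM_TYPES[model]


# Matches HMP qom-list lines such as "balloon0 (child<virtio-balloon-pci>)".
QOM_LINE = re.compile(r"^\s*(\S+) \(child<([^>]+)>\)\s*$")


def find_balloon(qom_list_output, alias, qom_type):
    """Return True if `alias` is listed with exactly the type `qom_type`."""
    for line in qom_list_output.splitlines():
        m = QOM_LINE.match(line)
        if m and m.group(1) == alias and m.group(2) == qom_type:
            return True
    return False


listing = "balloon0 (child<virtio-balloon-pci-non-transitional>)"
# Always searching for virtio-balloon-pci misses the device ...
print(find_balloon(listing, "balloon0", "virtio-balloon-pci"))  # False
# ... while deriving the type from the model finds it.
print(find_balloon(listing, "balloon0",
                   expected_qom_type("virtio-non-transitional", "pci")))  # True
```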

(In reply to Roman Mohr from comment #3)
> I am working around this by sticking with `virtio`.
>
> Giving it priority high since I don't know what the default is on q35 if I
> simply choose `virtio`.
> The priority can be lowered if `virtio` defaults to `virtio-transitional`
> because then we have no issue on older guests.
> Feel free to lower the priority if `virtio` defaults to
> `virtio-transitional` in this case and let me know.

As you can see above, virtio is just virtio :) It's not an alias for
either one of the other options, which QEMU considers to be
completely separate devices - though obviously they share most of the
code.

When leaving PCI address assignment to libvirt, on a q35 machine type
the memballoon will be placed behind a pcie-root-port and so it will
behave like the non-transitional device. This is consistent with all
other virtio devices.
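
As a rough sketch of the rule just described, in Python: the function name `effective_virtio_mode` is hypothetical, and the sketch assumes that a plain `virtio` device placed on a conventional PCI slot behaves as transitional, which is not stated in this report.

```python
def effective_virtio_mode(model, behind_pcie_port):
    """Return how a memballoon of the given model is expected to behave."""
    if model == "virtio-transitional":
        return "transitional"
    if model == "virtio-non-transitional":
        return "non-transitional"
    if model == "virtio":
        # Plain virtio follows its placement: behind a pcie-root-port it
        # behaves like the non-transitional device (see above). The
        # conventional PCI case is an assumption.
        return "non-transitional" if behind_pcie_port else "transitional"
    raise ValueError(f"unknown memballoon model: {model!r}")


# Default q35 placement by libvirt: behind a pcie-root-port.
print(effective_virtio_mode("virtio", behind_pcie_port=True))
```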

--- Additional comment from Andrea Bolognani on 2021-01-12 18:48:36 CET ---

Patch posted upstream.

  https://www.redhat.com/archives/libvir-list/2021-January/msg00621.html

--- Additional comment from Andrea Bolognani on 2021-01-13 15:19:12 CET ---

Fix pushed upstream.

  commit 0a6cb05e953d315b7a05103d707cff4d36221211
  Author: Andrea Bolognani <abologna>
  Date:   Tue Jan 12 17:17:44 2021 +0100

    qemu: Fix memstat for (non-)transitional memballoon
    
    Depending on the memballoon model, the corresponding QOM node
    will have a different type and we need to account for this
    when searching for it in the QOM tree.
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1911786
    
    Signed-off-by: Andrea Bolognani <abologna>
    Reviewed-by: Daniel Henrique Barboza <danielhb413>
    Reviewed-by: Michal Privoznik <mprivozn>

  v7.0.0-rc2-2-g0a6cb05e95

--- Additional comment from Roman Mohr on 2021-01-19 18:20:03 CET ---

Thanks Andrea. As discussed on github, it would be great to get a backport to 8.3, so that we can consume it in CNV 2.6 which is based on 8.3.

Comment 1 Andrea Bolognani 2021-01-20 14:50:14 UTC
Created RHEL AV 8.3.1 bug to track backporting the fix, as per
Roman's request.

See also

  https://github.com/kubevirt/kubevirt/pull/4730#issuecomment-762992036

Comment 5 Meina Li 2021-01-25 07:33:49 UTC
Verified Version:
libvirt-6.6.0-13.module+el8.3.1+9548+0a8fede5.x86_64
qemu-kvm-5.1.0-18.module+el8.3.1+9507+32d6953c.x86_64

Verified Steps:
1. Start a guest with a virtio-non-transitional memballoon device.
# virsh dumpxml lmn |grep /memballoon -B3
    <memballoon model='virtio-non-transitional'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
    </memballoon>
# virsh start lmn
Domain 'lmn' started

2. Check memory statistics.
# virsh dommemstat lmn
actual 1572864
swap_in 0
swap_out 0
major_fault 346
minor_fault 149519
unused 1170256
available 1344388
usable 1135620
last_update 1611559956
disk_caches 72376
hugetlb_pgalloc 0
hugetlb_pgfail 0
rss 525492

3. Start a guest with a virtio-transitional memballoon device.
# virsh dumpxml lmn |grep /memballoon -B3
    <memballoon model='virtio-transitional'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x09' slot='0x01' function='0x0'/>
    </memballoon>
# virsh start lmn
Domain 'lmn' started

4. Check memory statistics.
# virsh dommemstat lmn
actual 1572864
swap_in 0
swap_out 0
major_fault 234
minor_fault 134170
unused 1190900
available 1344388
usable 1148000
last_update 1611559746
disk_caches 56816
hugetlb_pgalloc 0
hugetlb_pgfail 0
rss 375896

The results are as expected.

Comment 7 errata-xmlrpc 2021-02-22 15:39:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:8.3 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0639

