Bug 816674

Summary: libvirt should cache qemu capabilities
Product: [Community] Virtualization Tools
Reporter: Eric Blake <eblake>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED NEXTRELEASE
QA Contact:
Severity: low
Docs Contact:
Priority: medium
Version: unspecified
CC: acathrow, ajia, dallan, dyasny, dyuan, eblake, mzhan, pkrempa, rwu
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 816662
Environment:
Last Closed: 2012-10-29 06:14:47 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 816662
Bug Blocks:

Description Eric Blake 2012-04-26 16:28:43 UTC
Libvirt should know the capabilities of any given qemu binary (based on inode and timestamp); these capabilities are the same no matter how many VMs use the binary.  Right now, we are recomputing the capabilities for every VM we start, which can cost a lot of time.

Also, for persistent guests, some of the capabilities are determined by parsing qemu -help output when the guest XML is first parsed, then recomputed later when the guest is started; since some of the capabilities depend on monitor command output, that means that the set of capabilities tied to a guest can vary over time.  Caching the capabilities will help us accurately reflect the set of capabilities up front, rather than missing some bits until the guest is started.
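As a rough illustration of the idea above (not libvirt's actual implementation, which is written in C), a cache keyed on the binary's path and validated by its inode and modification time might look like the following sketch. The names CapsCache, lookup and the probe callback are hypothetical; the probe stands in for whatever expensive step determines the capabilities of a binary.

```python
import os
from typing import Callable, Dict, FrozenSet, Tuple

# Hypothetical representation: a set of capability flag names for one binary.
Caps = FrozenSet[str]
Identity = Tuple[int, int]  # (inode, mtime in nanoseconds)


class CapsCache:
    """Cache capabilities per binary path, revalidated by (inode, mtime)."""

    def __init__(self, probe: Callable[[str], Caps]) -> None:
        # probe is the expensive step (an assumption: e.g. running the binary).
        self._probe = probe
        self._entries: Dict[str, Tuple[Identity, Caps]] = {}

    @staticmethod
    def _identity(path: str) -> Identity:
        st = os.stat(path)
        return (st.st_ino, st.st_mtime_ns)

    def lookup(self, path: str) -> Caps:
        ident = self._identity(path)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == ident:
            # Binary unchanged since the last probe: reuse the cached result.
            return cached[1]
        # First request, or the binary was replaced or modified: probe again.
        caps = self._probe(path)
        self._entries[path] = (ident, caps)
        return caps
```

With a cache like this, every VM that uses the same unchanged binary shares one probe result, and a guest's capabilities no longer depend on whether it has been started before.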

+++ This bug was initially created as a clone of Bug #816662 +++

Description of problem:
virsh blockpull raises an improper error message for an offline domain.

Version-Release number of selected component (if applicable):
# rpm -q libvirt qemu-kvm-rhev
libvirt-0.9.10-14.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.285.el6.x86_64

How reproducible:
always

Steps to Reproduce:
$ qemu-img create /var/lib/libvirt/images/test 1M

$ cat > /tmp/test.xml <<EOF
<domain type='qemu'>
  <name>test</name>
  <memory>219200</memory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64'>hvm</type>
    <boot dev='hd'/>
  </os>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/test'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <input type='mouse' bus='ps2'/>
    <graphics type='spice' autoport='yes' listen='0.0.0.0'/>
  </devices>
</domain>
EOF

$ virsh define /tmp/test.xml
$ valgrind -v virsh blockpull test /var/lib/libvirt/images/test --wait

  
Actual results:
error: unsupported configuration: block jobs not supported with this QEMU binary

Expected results:


Additional info:
Eric has confirmed this issue; the following is his comment:
"the error message for an offline domain should be nicer."

--- Additional comment from ajia on 2012-04-26 10:07:02 MDT ---

(In reply to comment #0)
> $ valgrind -v virsh blockpull test /var/lib/libvirt/images/test --wait
It should be 'virsh blockpull test /var/lib/libvirt/images/test --wait' without
valgrind.

--- Additional comment from eblake on 2012-04-26 10:18:22 MDT ---

Yuck.  The problem here is that we don't learn some of qemu's capabilities until after we start the guest, so the behavior depends on what you have previously done with the guest:

offline
check capability -> not present
start guest
check capability -> present
stop guest
check capability -> present

I can do a quick fix for the symptoms (for the 3 or 4 capabilities that are conditional until the guest is first started, ensure that we are _always_ checking for an online guest before checking those caps), but the _real_ fix is to cache qemu capabilities once, instead of re-computing them per-VM, and as part of the up-front caching, compute even the capabilities that right now are only visible through the guest monitor.
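The quick fix described above amounts to reordering two checks. The sketch below shows that ordering only; it is not the libvirt code, and the function name, exception classes, the "blockjob" flag name and the "domain is not running" wording are all assumptions made for illustration. Only the capability error text is taken from this report.

```python
class OperationInvalid(Exception):
    """Hypothetical error: the operation needs a running domain."""


class ConfigUnsupported(Exception):
    """Hypothetical error: the QEMU binary lacks a required capability."""


def check_block_job_allowed(domain_active: bool, caps: set) -> None:
    # Check the domain state first. Capabilities learned from the monitor
    # are unknown until the guest has been started, so testing them on an
    # offline domain would produce a misleading "not supported" error.
    if not domain_active:
        raise OperationInvalid("domain is not running")
    # Only now is the capability set trustworthy for this check.
    if "blockjob" not in caps:
        raise ConfigUnsupported(
            "block jobs not supported with this QEMU binary")
```

An offline domain then fails with the state error regardless of what the capability set currently contains, which removes the dependence on the guest's history shown in the sequence above.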

--- Additional comment from eblake on 2012-04-26 10:24:03 MDT ---

I am keeping this bug for the quick fix in 6.3 (swapping the checks is a trivial way to get a better error message for offline domains) and cloning it into 6.4 for the bigger issue of caching capability checks.

Comment 2 Peter Krempa 2012-10-29 06:14:47 UTC
This functionality was added upstream with

commit 85a7b5e1ce5adae6a0a4025d4f01f8a2373a962b
Author: Daniel P. Berrange <berrange>
Date:   Wed Aug 22 13:54:13 2012 +0100

    Add a qemu capabilities cache manager
    
    Introduce a qemuCapsCachePtr object to provide a global cache
    of capabilities for QEMU binaries. The cache auto-populates
    on first request for capabilities about a binary, and will
    auto-refresh if the binary has changed since a previous cache
    was populated
    
    Signed-off-by: Daniel P. Berrange <berrange>


and a few following commits. This will be released as a part of the upcoming 1.0.0 release.