Bug 1138340 - vdsm relies on unsupported 'min_guarantee' VM configuration parameter
Summary: vdsm relies on unsupported 'min_guarantee' VM configuration parameter
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm   
Version: 3.5
Hardware: Unspecified
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.4.4
Assignee: Martin Sivák
QA Contact: Gil Klein
URL:
Whiteboard: sla
Keywords:
Depends On:
Blocks: 1073943 1154665
 
Reported: 2014-09-04 14:25 UTC by Adam Litke
Modified: 2014-12-19 17:34 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-09-24 08:09:02 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Sample domain XML (4.72 KB, text/plain)
2014-09-04 14:25 UTC, Adam Litke
before_vm_start hook to remove min_guarantee (1.56 KB, text/x-python)
2014-12-18 23:48 UTC, Paul Heinlein


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 32741 | master | ABANDONED | Force vdsm to stay with libvirt <= 1.2.7 | Never
oVirt gerrit | 32763 | ovirt-3.5 | MERGED | Force vdsm to stay with libvirt <= 1.2.7 | Never
oVirt gerrit | 32764 | ovirt-3.4 | MERGED | Force vdsm to stay with libvirt <= 1.2.7 | Never
oVirt gerrit | 32774 | master | MERGED | Do not add the memtune/min_guarantee element to the libvirt xml | Never
oVirt gerrit | 32788 | ovirt-3.5 | MERGED | Do not add the memtune/min_guarantee element to the libvirt xml | Never
oVirt gerrit | 32789 | ovirt-3.4 | MERGED | Do not add the memtune/min_guarantee element to the libvirt xml | Never
oVirt gerrit | 32928 | master | MERGED | Add a libvirt migration hook to filter out min_guarantee element | Never
oVirt gerrit | 33050 | ovirt-3.5 | MERGED | Add a libvirt migration hook to filter out min_guarantee element | Never
oVirt gerrit | 33051 | ovirt-3.4 | MERGED | Add a libvirt migration hook to filter out min_guarantee element | Never
Red Hat Bugzilla | 1122455 | None | CLOSED | libvirt should refuse to start domain with unsupported/useless min-guarantee element in qemu driver | 2019-02-19 12:53 UTC

Internal Trackers: 1122455

Description Adam Litke 2014-09-04 14:25:29 UTC
Created attachment 934459
Sample domain XML

Description of problem:

When creating a VM, vdsm uses the <min_guarantee/> element in the domain XML to pass through the minimum amount of memory that should always be available to the VM.  MOM uses this value in its memory ballooning algorithm.  This field is not supported by qemu, but in the past libvirt allowed it to be set anyway.  libvirt now reports an error because qemu won't honor it.

The proper long-term solution is to use the metadata feature of libvirt to attach this information as arbitrary metadata.  We need to explore backwards compatibility and upgrade issues for oVirt when making this change.
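
A minimal sketch of what moving the value into libvirt's <metadata> element could look like, using only the Python standard library. The namespace URI, the prefix, and the function name are hypothetical and are not taken from vdsm or oVirt. libvirt expects custom metadata elements to live in their own XML namespace, which is why the sketch introduces one.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace and prefix, chosen for illustration only.
EXAMPLE_NS = "http://example.org/ovirt/vm/1.0"
ET.register_namespace("ovirtex", EXAMPLE_NS)


def move_min_guarantee_to_metadata(domain_xml):
    """Return domain XML with memtune/min_guarantee moved into <metadata>."""
    root = ET.fromstring(domain_xml)
    memtune = root.find("memtune")
    if memtune is None:
        return domain_xml
    guarantee = memtune.find("min_guarantee")
    if guarantee is None:
        return domain_xml

    # Keep the value (and its unit, if present) under a namespaced element.
    metadata = root.find("metadata")
    if metadata is None:
        metadata = ET.SubElement(root, "metadata")
    custom = ET.SubElement(metadata, "{%s}min_guarantee" % EXAMPLE_NS)
    custom.text = (guarantee.text or "").strip()
    if guarantee.get("unit"):
        custom.set("unit", guarantee.get("unit"))

    # Drop the unsupported element, and the <memtune> wrapper if now empty.
    memtune.remove(guarantee)
    if len(memtune) == 0:
        root.remove(memtune)
    return ET.tostring(root, encoding="unicode")
```

Any consumer of the value, such as MOM, would then need to read it from the metadata element instead of from memtune. That is the backwards compatibility question raised above.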

Here is the error returned by vdsm:

Thread-35147::ERROR::2014-09-04 09:09:15,331::vm::2336::vm.Vm::(_startUnderlyingVm) vmId=`e167e221-9dff-40b1-b911-d83ea74bbb20`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2276, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 3367, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib64/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 3424, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: unsupported configuration: Parameter 'min_guarantee' not supported by QEMU.


Version-Release number of selected component (if applicable):


How reproducible: Always


Steps to Reproduce:
1. Ensure libvirt >= libvirt-1.2.8-1 is installed on the host
2. Start a VM

Actual results:
VM creation fails due to above error

Expected results:
VM creation succeeds

Additional info:
Reproducible on el7 with the following package versions:
libvirt-1.2.8-1.el7.x86_64
qemu-kvm-rhev-2.1.0-3.el7.x86_64

Comment 1 Peter Krempa 2014-09-04 14:46:02 UTC
Please note that using an invalid VM XML will also affect migrations as the target of the migration will refuse to start the VM with invalid configuration.

Comment 2 Martin Sivák 2014-09-04 15:30:50 UTC
I believe it is a bad decision on the libvirt side, as they are forcing their users to store a logically and semantically identical value differently depending on the hypervisor.

Unfortunately for us, they seem to be pretty adamant about it. We will also have to revert the fix if qemu starts supporting it in the future.

Comment 3 Dan Kenigsberg 2014-09-04 17:44:38 UTC
If libvirt does not agree to ignore this, we have to backport the fix all the way to ovirt-3.3, when we introduced its usage (http://gerrit.ovirt.org/15799).

VMs started by ovirt-3.3.0 cannot migrate to destinations with new libvirt; but as long as we support ovirt-3.3 we must make sure that ovirt-3.3.5 would be able to start migratable machines.

For people with long-living VMs started back in 3.3.0, we must supply a libvirt hook that strips <min_guarantee> on the destination. Peter, could you help us write one?
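
A rough sketch of such a filter, written against libvirt's documented qemu hook interface: the script is called with the guest name, operation, and sub-operation as arguments, receives the domain XML on stdin, and for the `migrate` operation its stdout is used as the incoming domain XML. The function names are hypothetical, and this is not the hook that was eventually merged.

```python
import sys
import xml.etree.ElementTree as ET


def strip_min_guarantee(domain_xml):
    """Return the domain XML without any memtune/min_guarantee element."""
    root = ET.fromstring(domain_xml)
    for memtune in root.findall("memtune"):
        for node in memtune.findall("min_guarantee"):
            memtune.remove(node)
        # Remove <memtune> entirely if nothing else is configured in it.
        if len(memtune) == 0:
            root.remove(memtune)
    return ET.tostring(root, encoding="unicode")


def run_hook(argv, stdin, stdout):
    """argv mirrors a hook call: [script, guest, operation, sub-operation, extra]."""
    operation = argv[2] if len(argv) > 2 else ""
    if operation == "migrate":
        # Only the migrate operation treats the hook's stdout as replacement XML.
        stdout.write(strip_min_guarantee(stdin.read()))
    return 0


# A script installed as /etc/libvirt/hooks/qemu would end with:
#   sys.exit(run_hook(sys.argv, sys.stdin, sys.stdout))
```

This sketch simply drops the value. Whether it should instead be converted into custom metadata is a separate decision.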

Comment 4 Peter Krempa 2014-09-05 11:16:57 UTC
Does the hook need to convert the used value to anything in the custom metadata?

Comment 5 Peter Krempa 2014-09-05 13:48:46 UTC
A different option I'm now investigating is that we might be able to implement the min_guarantee element in a way that would work for both libvirt (and the documented meaning of the element) and oVirt where it shouldn't have impact on the current code.

I'm looking into the feasibility of this. In that case, only the upstream 1.2.8 release of libvirt wouldn't work.

Comment 6 Peter Krempa 2014-09-05 15:03:51 UTC
(In reply to Peter Krempa from comment #5)
> A different option I'm now investigating is that we might be able to
> implement the min_guarantee element in a way that would work for both
> libvirt (and the documented meaning of the element) and oVirt where it
> shouldn't have impact on the current code.
> 
> I'm looking into the feasibility of this. In that case, only the upstream
> 1.2.8 release of libvirt wouldn't work.

Unfortunately, the semantics of the min_guarantee field are to guarantee that a certain amount of RAM (not including swap) is available to a single guest. Currently libvirt isn't able to make such a guarantee, as cgroups don't expose such a mechanism.

Thus we won't be able to provide a suitable implementation.

Comment 7 Martin Sivák 2014-09-08 08:29:03 UTC
If I remember correctly we were talking about ballooning limits or qemu. cgroups have nothing to do with this as they are used for the upper limits only.

Is the integration to balloon limits really something that would violate the documented semantics?

Comment 8 Doron Fediuck 2014-09-15 14:22:29 UTC
Dan,
3.3.5?

Comment 9 Dan Kenigsberg 2014-09-15 14:48:44 UTC
We need to backport this to all supported stable versions. 3.3.z is not one of them; changing to 3.4.4.

Comment 10 Sandro Bonazzola 2014-09-24 08:09:02 UTC
oVirt 3.4.4 has been released.

Comment 11 Paul Heinlein 2014-12-18 23:48:39 UTC
Created attachment 970915
before_vm_start hook to remove min_guarantee

Running oVirt Engine 3.4.4 (on Fedora 19) with Fedora 21 hosts, I had to add a before_vm_start hook to remove the min_guarantee element from the domain XML. The attached python script works in our environment.
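
For readers without access to the attachment, the core of a hook like this might look roughly as follows. This is an illustrative sketch and not the attached script. It works on an `xml.dom.minidom` document because that is what vdsm's `hooking` helper module hands to hooks. The function name is hypothetical, and the wiring at the bottom is left as a comment because it only runs inside a vdsm hook environment.

```python
from xml.dom import minidom


def remove_min_guarantee(domxml):
    """Remove every <min_guarantee> element from a minidom Document, in place."""
    for node in list(domxml.getElementsByTagName("min_guarantee")):
        parent = node.parentNode
        parent.removeChild(node)
        node.unlink()
        # Drop the enclosing <memtune> too if it has no child elements left.
        if (
            parent.nodeType == parent.ELEMENT_NODE
            and parent.tagName == "memtune"
            and not any(c.nodeType == c.ELEMENT_NODE for c in parent.childNodes)
        ):
            parent.parentNode.removeChild(parent)
    return domxml


# Inside a script placed in vdsm's before_vm_start hook directory, the
# wiring would be roughly (assumes vdsm's hooking module is importable):
#   import hooking
#   hooking.write_domxml(remove_min_guarantee(hooking.read_domxml()))
```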

Comment 12 Dan Kenigsberg 2014-12-19 08:54:52 UTC
Which vdsm version did you use on your hosts?

Comment 13 Paul Heinlein 2014-12-19 17:34:11 UTC
The host in question is running Fedora 21, with vdsm 4.14.8.1-1.fc21.

The engine (version 3.4.4) is running on Fedora 19.

Most of our hosts are currently running Fedora 19, but under the assumption that F19 will soon reach end of support now that F21 has been released, I'm starting to test F21 on the hosts.

It took a workaround outlined by Adam Litke in Bugzilla 1138807 to get oVirt Engine 3.4.4 to recognize the F21 host.

Once that step was done, I could not start a VM on the F21 host or migrate a running VM to it. That was the point at which I stumbled on this ticket.

We have a schedule for updating oVirt Engine to 3.5, but for the time being we're stuck with 3.4.4.

