Bug 1579909
Summary: Cannot start VM with QoS IOPS after host and engine upgrade from 4.1 to 4.2

| Field | Value |
|---|---|
| Product | [oVirt] vdsm |
| Component | Core |
| Version | 4.20.23 |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | unspecified |
| Hardware | x86_64 |
| OS | Linux |
| Target Milestone | ovirt-4.2.4 |
| Fixed In Version | vdsm-4.20.28-1 |
| Reporter | ernest.beinrohr |
| Assignee | Francesco Romani <fromani> |
| QA Contact | Liran Rotenberg <lrotenbe> |
| CC | ahadas, bugs, ernest.beinrohr, fromani, gveitmic, michal.skrivanek, msivak |
| Flags | rule-engine: ovirt-4.2+ |
| Doc Type | If docs needed, set a value |
| Bug Blocks | 1589612 (view as bug list) |
| oVirt Team | Virt |
| Type | Bug |
| Last Closed | 2018-06-26 08:38:48 UTC |

Doc Text:

Vdsm uses the libvirt domain metadata section to store extra data that is required to configure a VM but is not properly represented in the standard libvirt domain XML. This happens every time a VM starts. Vdsm also tried to store the drive IO tune settings in the metadata, which was redundant because the IO tune settings already have a proper representation in the domain XML. Furthermore, the store operation for the IO tune settings had an implementation bug that made it impossible to successfully start the VM. The bug appears only when IO tune settings are enabled.
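As background for the Doc Text above: vdsm keeps its extra per-VM configuration inside the libvirt domain's <metadata> element, under its own XML namespace. The Python sketch below shows where that section sits in the domain XML; the http://ovirt.org/vm/1.0 namespace matches 4.2-era vdsm, but the child element shown is an illustrative placeholder, not the real schema.

```python
# Illustrative only: locate vdsm's metadata section in a libvirt domain XML.
# The ovirt-vm namespace is the one 4.2-era vdsm uses; <ovirt-vm:custom> is
# a placeholder child, not the exact vdsm metadata schema.
import xml.etree.ElementTree as ET

OVIRT_VM_NS = "http://ovirt.org/vm/1.0"

DOMAIN_XML = """<domain type="kvm">
  <name>example</name>
  <metadata>
    <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
      <ovirt-vm:custom/>
    </ovirt-vm:vm>
  </metadata>
</domain>"""

root = ET.fromstring(DOMAIN_XML)
# Find the vdsm-owned subtree inside <metadata>.
md = root.find("metadata/{%s}vm" % OVIRT_VM_NS)
print(ET.tostring(md, encoding="unicode"))
```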
Description (ernest.beinrohr, 2018-05-18 16:04:14 UTC)
From the original email thread:

Also, please make sure to report the steps you took to get this error. Are you just starting a new VM using a 4.2 Engine and a 4.2 host, or are you migrating an old VM created with Engine 4.1? Just for the sake of completeness (not sure it applies here), this flow is NOT supported:

1. have a VM happily running on a 4.1 host
2. upgrade Vdsm on that host from 4.1 to 4.2 while the VM is running
3. restart Vdsm

Please also attach engine.log.

Created attachment 1439439 [details]
engine log from a failed start
Created attachment 1439440 [details]
VM XML - no IO limits in this disk's profile

Created attachment 1439441 [details]
VM XML - IO limits in this disk's profile - does not start with this one on a 4.2 host
Comment #6 (ernest.beinrohr):

I just created a new VM on my 4.2 cluster. It starts OK when the disk profile has no IO limits; once I select another disk profile with limits, the VM does not start. I'm attaching the engine log from the failed start and also two XMLs from the log: one with an IO-limit disk profile, which does not start, and one without IO limits, which starts OK.

The diff is plain enough:

```diff
diff --git a/bug_bad.xml b/bug_ok.xml
index 40d2689..3aad896 100644
--- a/bug_bad.xml
+++ b/bug_ok.xml
@@ -98,7 +98,6 @@
                 <alias name="ua-e94f8c97-bb7a-4dbb-a1a1-1470d70250aa"/>
                 <address bus="0" controller="0" target="0" type="drive" unit="0"/>
                 <serial>e94f8c97-bb7a-4dbb-a1a1-1470d70250aa</serial>
-                <iotune read_bytes_sec="0" read_iops_sec="300" total_bytes_sec="0" total_iops_sec="0" write_bytes_sec="0" write_iops_sec="300"/>
             </disk>
         </devices>
         <pm>
```

PS: on a 4.1 host both VMs start.

Comment:

Vdsm bug, unwanted side effect of patch https://gerrit.ovirt.org/#/c/90435/

Comment #8 (Francesco Romani):

(In reply to ernest.beinrohr from comment #6)

Please note that the XML generated by Engine looks wrong: it should follow the format documented at https://libvirt.org/formatdomain.html#elementsDisks. Vdsm has code to deal with the format as per the libvirt docs. So, with the incoming patch, Vdsm will let the VM start, but until we also get an Engine fix, the IO tune settings will be silently discarded.

Comment:

(In reply to Francesco Romani from comment #8)
> Please note that the XML generated by Engine looks wrong.

Ack, posted a fix.
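To make the format mismatch in comment #8 concrete: the unpatched 4.2 Engine emitted the limits as attributes on <iotune>, while libvirt documents them as child elements. Below is a minimal, hypothetical Python sketch of that normalization; it is not the actual Engine fix (that change lives in the Engine's Java code), only an illustration of the shape of the transformation.

```python
# Hypothetical illustration: rewrite the attribute-style <iotune> that the
# unpatched 4.2 Engine emitted into the element-style form that libvirt
# documents (https://libvirt.org/formatdomain.html#elementsDisks).
import xml.etree.ElementTree as ET

def normalize_iotune(disk: ET.Element) -> None:
    """Turn <iotune a="1" b="2"/> into <iotune><a>1</a><b>2</b></iotune>."""
    iotune = disk.find("iotune")
    if iotune is None or not iotune.attrib:
        return  # no iotune, or already element-style
    for name, value in sorted(iotune.attrib.items()):
        child = ET.SubElement(iotune, name)  # one child element per limit
        child.text = value
    iotune.attrib.clear()

xml = '''<disk>
  <iotune read_bytes_sec="0" read_iops_sec="300" total_bytes_sec="0"
          total_iops_sec="0" write_bytes_sec="0" write_iops_sec="300"/>
</disk>'''
disk = ET.fromstring(xml)
normalize_iotune(disk)
print(ET.tostring(disk, encoding="unicode"))
```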
Comment #10 (ernest.beinrohr):

A new machine created on the 4.2 engine has this iotune format:

```xml
<iotune read_bytes_sec="0" read_iops_sec="300" total_bytes_sec="0" total_iops_sec="0" write_bytes_sec="0" write_iops_sec="300"/>
```

whereas the one the 4.1 engine generated (but also not working on a 4.2 host) has this format:

```xml
<iotune>
    <read_bytes_sec>0</read_bytes_sec>
    <read_iops_sec>100</read_iops_sec>
    <total_bytes_sec>0</total_bytes_sec>
    <total_iops_sec>0</total_iops_sec>
    <write_bytes_sec>0</write_bytes_sec>
    <write_iops_sec>100</write_iops_sec>
</iotune>
```

Both formats work on a 4.1 host.

Comment:

(In reply to ernest.beinrohr from comment #10)

Hi, thanks for the additional information, it matches our findings. I believe that patch https://gerrit.ovirt.org/#/c/91432/ should solve the issue and, if so, it will be included in the next oVirt 4.2 release.

Note about verification: to properly fix this bug we need both the Vdsm patch and the Engine patch.

Vdsm patched, Engine unpatched:

1. Run a VM with iotune settings; it should start and work as usual.
2. Using "virsh -r dumpxml", or vdsm-client dumpxmls, inspect the domain XML; it should NOT have iotune settings configured (see the helper sketch below).

Vdsm and Engine both patched:

1. Run a VM with iotune settings; it should start and work as usual.
2. Using "virsh -r dumpxml", or vdsm-client dumpxmls, inspect the domain XML; it should have iotune settings configured.

Comment:

INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Open patch attached]. For more info please contact: infra.

Comment:

Moving back to POST as per comment #13. Unmerged patches are not needed to solve this issue. We may keep this bug open for a different reason, however: without the Engine patch, a VM with QoS settings will start, but will ignore the aforementioned QoS settings.

Comment #16 (Francesco Romani):

Just noticed that the Engine patch is merged, so moving back to MODIFIED: https://gerrit.ovirt.org/#/c/91492/

Comment:

(In reply to Francesco Romani from comment #16)

Right, and since all fixes were merged before the 4.2.4 tagging, moving to ON_QA.
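The verification note above comes down to checking whether the live domain XML carries <iotune> under any disk. Below is a small, hypothetical Python helper for that check; it assumes virsh is on the host's PATH and that read-only access works without a password, and the VM name is one of the example profiles from the verification that follows. The same information is also exposed through vdsm-client VM getIoTune, as the verification steps show.

```python
# Hypothetical helper for the verification note above: dump the live domain
# XML via read-only virsh and report whether any disk carries <iotune>.
import subprocess
import xml.etree.ElementTree as ET

def has_iotune(vm_name: str) -> bool:
    """Return True if the running domain has iotune settings on any disk."""
    xml = subprocess.run(
        ["virsh", "-r", "dumpxml", vm_name],
        check=True, capture_output=True, text=True,
    ).stdout
    return ET.fromstring(xml).find("devices/disk/iotune") is not None

# Example VM name taken from the verification steps below.
print(has_iotune("golden_env_mixed_virtio_0"))
```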
Comment (Liran Rotenberg):

Verified on: ovirt-engine-4.2.4.1-0.1.el7.noarch, vdsm-4.20.29-1.el7ev.x86_64

Steps of verification:

1. Deploy RHEV 4.1: ovirt-engine-4.1.11.2-0.1.el7.noarch, vdsm-4.19.51-1.el7ev.x86_64.
2. Create a storage QoS profile in the datacenter.
3. Set IOPS read/write to 100 and 100. Note: I created two QoS profiles, one with IOPS and one with throughput (golden_env_mixed_virtio_0 is with IOPS, golden_env_mixed_virtio_1 is with throughput).
4. Create a VM, set the QoS profile on the VM disk, and start the VM.
5. From the host run:

```
# virsh -r dumpxml golden_env_mixed_virtio_0 | grep -a3 iotune
      <iotune>
        <read_iops_sec>100</read_iops_sec>
        <write_iops_sec>100</write_iops_sec>
      </iotune>
```

For the throughput profile:

```
      <iotune>
        <read_bytes_sec>104857600</read_bytes_sec>
        <write_bytes_sec>104857600</write_bytes_sec>
      </iotune>
```

6. Shut down the VM.
7. Upgrade engine and host to 4.2: ovirt-engine-4.2.4.1-0.1.el7.noarch, vdsm-4.20.29-1.el7ev.x86_64.
8. Start the VM (done on cluster version 4.1).
9. From the host run:

```
# virsh -r dumpxml golden_env_mixed_virtio_0 | grep -a3 iotune
      <iotune>
        <read_iops_sec>100</read_iops_sec>
        <write_iops_sec>100</write_iops_sec>
      </iotune>

# vdsm-client VM getIoTune vmID='2a66838a-6a2f-4f7d-9486-561ca4ddc6bf'
[
    {
        "ioTune": {
            "write_bytes_sec": 0,
            "total_iops_sec": 0,
            "read_iops_sec": 100,
            "read_bytes_sec": 0,
            "write_iops_sec": 100,
            "total_bytes_sec": 0
        },

# virsh -r dumpxml golden_env_mixed_virtio_1 | grep -a3 iotune
      <iotune>
        <read_bytes_sec>104857600</read_bytes_sec>
        <write_bytes_sec>104857600</write_bytes_sec>
      </iotune>

# vdsm-client VM getIoTune vmID='78488f39-064f-4440-8586-a91ba2fa0f55'
[
    {
        "ioTune": {
            "write_bytes_sec": 104857600,
            "total_iops_sec": 0,
            "read_iops_sec": 0,
            "read_bytes_sec": 104857600,
            "write_iops_sec": 0,
            "total_bytes_sec": 0
        },
```

10. Repeat steps 8-9 on a cluster with Compatibility Version 4.2.
11. From the engine log, the domain XML (cluster version 4.2) shows, for IOPS:

```xml
<iotune>
    <read_bytes_sec>0</read_bytes_sec>
    <read_iops_sec>100</read_iops_sec>
    <total_bytes_sec>0</total_bytes_sec>
    <total_iops_sec>0</total_iops_sec>
    <write_bytes_sec>0</write_bytes_sec>
    <write_iops_sec>100</write_iops_sec>
</iotune>
```

and for throughput:

```xml
<iotune>
    <read_bytes_sec>104857600</read_bytes_sec>
    <read_iops_sec>0</read_iops_sec>
    <total_bytes_sec>0</total_bytes_sec>
    <total_iops_sec>0</total_iops_sec>
    <write_bytes_sec>104857600</write_bytes_sec>
    <write_iops_sec>0</write_iops_sec>
</iotune>
```

Results: the VM starts with iotune after upgrading the engine from 4.1 to 4.2, with cluster compatibility of both 4.1 and 4.2.

Comment:

This bugzilla is included in the oVirt 4.2.4 release, published on June 26th 2018. Since the problem described in this bug report should be resolved in the oVirt 4.2.4 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.