Bug 1458548
| Summary: | [vdsm] Live storage migration fails on "libvirtError: Requested operation is not valid: domain is not transient" during diskReplicateStart | | |
|---|---|---|---|
| Product: | [oVirt] vdsm | Reporter: | Elad <ebenahar> |
| Component: | Core | Assignee: | Milan Zamazal <mzamazal> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Elad <ebenahar> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.20.0 | CC: | amureini, bugs, bzlotnik, eshenitz, fromani, michal.skrivanek, mzamazal, tnisan |
| Target Milestone: | ovirt-4.2.0 | Keywords: | Automation, Regression |
| Target Release: | --- | Flags: | rule-engine: ovirt-4.2+, rule-engine: blocker+ |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2017-12-20 10:49:48 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1459113, 1459183 | | |
| Bug Blocks: | | | |
| Attachments: | | | |

Description
Elad
2017-06-04 09:09:34 UTC
Milan/Francesco - the very recent change 15808ddf9035287575a59d9aa879598a33a2b6bc changed our domains to be persistent, which I'm guessing is the root cause here. Does this make any sense to you?

Well, this is very surprising. Looks like we actually hit one QEMU issue.
libvirt actively and explicitly prevents virDomainBlockCopy with persistent domains. Totally unexpected.
In src/qemu/qemu_driver.c we have:

    commit c1eb38053d616d764c0c5381301b4cd5d2c45921
    Author: Eric Blake <eblake>
    Date: Fri Oct 19 17:46:08 2012 -0600

    if (vm->persistent) {
        /* XXX if qemu ever lets us start a new domain with mirroring
         * already active, we can relax this; but for now, the risk of
         * 'managedsave' due to libvirt-guests means we can't risk
         * this on persistent domains. */
        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                       _("domain is not transient"));
        goto cleanup;
    }

Now we need to see if this restriction is still relevant after 5 years.
We'll need, at the very least, a libvirt bug to depend on.
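
The guard quoted above can be mirrored on the caller side to make the failure mode explicit. The following is only a minimal sketch, not vdsm's actual code: `ensure_block_copy_allowed`, `BlockCopyNotAllowed` and `FakeDomain` are hypothetical names introduced for illustration. It assumes the domain object exposes `isPersistent()` and `name()` the way `libvirt.virDomain` does in the libvirt Python bindings, and uses a stand-in object so it runs without libvirt installed.

```python
class BlockCopyNotAllowed(Exception):
    """Hypothetical error raised by this sketch for persistent domains."""


def ensure_block_copy_allowed(dom):
    """Reject persistent domains before a block copy is attempted.

    `dom` is assumed to behave like libvirt.virDomain: isPersistent()
    returns a truthy value when the domain is persistent. This mirrors
    the `if (vm->persistent)` check in the quoted libvirt snippet.
    """
    if dom.isPersistent():
        raise BlockCopyNotAllowed(
            "domain %s is persistent; libvirt would reject the block copy "
            "with 'domain is not transient'" % dom.name())


class FakeDomain(object):
    """Stand-in for libvirt.virDomain (illustration only)."""

    def __init__(self, name, persistent):
        self._name = name
        self._persistent = persistent

    def name(self):
        return self._name

    def isPersistent(self):
        # The libvirt binding returns 1 or 0; the stand-in does the same.
        return 1 if self._persistent else 0


# A transient domain passes the check silently.
ensure_block_copy_allowed(FakeDomain("transient-vm", persistent=False))

# A persistent domain is rejected before any libvirt call would be made.
try:
    ensure_block_copy_allowed(FakeDomain("persistent-vm", persistent=True))
except BlockCopyNotAllowed as exc:
    print(exc)
```

A pre-check like this only turns the libvirt error into an earlier, clearer one; it does not make live storage migration work on persistent domains, which is why a libvirt-side change is tracked separately.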
Assigning to Milan in the meantime while he works with libvirt devs to see if this limitation can/should be lifted.

Note: this bug only occurs when the volume format from which the LSM auto-generated snapshot is taken is COW.

I got information about the problem from libvirt developers. libvirt could probably add a small feature to permit running block-copy operations also on persistent domains, with the same limitations as for transient domains, see https://bugzilla.redhat.com/1459113.

*** Bug 1461468 has been marked as a duplicate of this bug. ***

Tested the scenario described in the bug description (https://polarion.engineering.redhat.com/polarion/#/project/RHEVM3/workitem?id=RHEVM3-6057).

It passed:

    2017-07-19 17:28:11,004 - MainThread - art.ll_lib.jobs - INFO - JOB 'Migrating Disk disk_virtiocow_1916450315 from iscsi_1 to iscsi_2' TOOK 79.777 seconds
    2017-07-19 17:28:11,004 - MainThread - art.ll_lib.jobs - INFO - All jobs are gone
    2017-07-19 17:41:16,309 - MainThread - art.logging - INFO - Status: passed

Tested using:

    ovirt-engine-4.2.0-0.0.master.20170717104433.gita1ba045.el7.centos.noarch
    vdsm-4.20.1-202.git9f953f3.el7.centos.x86_64
    libvirt-daemon-3.2.0-14.el7.x86_64

This bugzilla is included in the oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be resolved in the oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.