Bug 1461101
| Summary: | [downstream clone - 4.1.3] [downstream clone] LiveMerge fails with libvirtError: Block copy still active. Disk not ready for pivot | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ |
| Component: | vdsm | Assignee: | Ala Hino <ahino> |
| Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | unspecified | CC: | ahino, alitke, amureini, bazulay, bugs, creatmbox, eedri, gshinar, jspanko, lsurette, mkalinin, ratamir, srevivo, tnisan, trichard, ycui, ykaul, ylavi |
| Target Milestone: | ovirt-4.1.3 | Keywords: | ZStream |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: |
Previously, the method used to test the completion of a live merge operation was incorrect; it checked the live merge progress value available from the libvirt API, which does not provide the status of a live merge operation. As a result, the live merge was detected as completed before the operation was actually completed. Trying to finalize the merge operation failed repeatedly until the operation was actually completed, logging multiple errors during the process.
Now, live merge completion is detected using the libvirt XML, so the operation should complete successfully without logging errors.
|
Story Points: | --- |
| Clone Of: | 1427184 | Environment: | |
| Last Closed: | 2017-07-27 18:03:45 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1427184 | ||
| Bug Blocks: | |||
|
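The Doc Text above says that live merge completion is now detected from the libvirt XML rather than from the progress value. Below is a minimal sketch of such a check, assuming the domain XML exposes a `<mirror>` element with a `ready` attribute on the disk being merged. The function name `is_ready_for_pivot`, the sample XML, and its file paths are hypothetical, and this is not the actual vdsm implementation.

```python
import xml.etree.ElementTree as ET


def is_ready_for_pivot(domain_xml, drive_name):
    """Return True if the named disk has a <mirror> element with ready='yes'.

    Hypothetical helper for illustration only; not taken from vdsm.
    """
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != drive_name:
            continue
        mirror = disk.find("mirror")
        # No <mirror> element means no block copy/commit is active on this
        # disk; a <mirror> without ready='yes' means the job is still running.
        return mirror is not None and mirror.get("ready") == "yes"
    # Disk not found in the domain XML.
    return False


# Hand-written sample XML (assumption): 'vda' is ready, 'vdb' is still copying.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/hypothetical/path/top.qcow2'/>
      <mirror type='file' job='active-commit' ready='yes'>
        <source file='/hypothetical/path/base.qcow2'/>
      </mirror>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/hypothetical/path/other.qcow2'/>
      <mirror type='file' job='active-commit'>
        <source file='/hypothetical/path/other-base.qcow2'/>
      </mirror>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(is_ready_for_pivot(SAMPLE_XML, "vda"))
    print(is_ready_for_pivot(SAMPLE_XML, "vdb"))
```

In a real setup the XML string would presumably come from the domain's `XMLDesc()` call, and the pivot would be attempted only once the check returns True, instead of relying on the progress counters alone.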
Description
rhev-integ
2017-06-13 13:59:34 UTC
Created attachment 1201372 [details]
Engine Log
(Originally by mst)
(Originally by rhev-integ)
Created attachment 1201373 [details]
VDSM Log
(Originally by mst)
(Originally by rhev-integ)
Created attachment 1201374 [details]
Task State
(Originally by mst)
(Originally by rhev-integ)
Yesterday the error occurred once again, this time on a VM with 6 disks and a disk snapshot that contained 4 disks. Result: the task is still running, but vdsm shows an abort traceback. Interestingly, after the live merge was initiated, qemu generated high disk I/O for more than 6 hours. We assume that some merge operation was in progress, but we do not know what was really going on. See the attached screenshot.
(Originally by mst)
(Originally by rhev-integ)
Created attachment 1210227 [details]
io during failing live merge
(Originally by mst)
(Originally by rhev-integ)
VDSM log of second error attached
(Originally by mst)
(Originally by rhev-integ)
Created attachment 1210247 [details]
vdsm log 2nd error
(Originally by mst)
(Originally by rhev-integ)
I see the same with a customer:
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 372, in wrapper
    return f(*a, **kw)
  File "/usr/share/vdsm/virt/vm.py", line 4924, in run
    self.tryPivot()
  File "/usr/share/vdsm/virt/vm.py", line 4893, in tryPivot
    ret = self.vm._dom.blockJobAbort(self.drive.name, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 733, in blockJobAbort
    if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self)
libvirtError: block copy still active: disk 'vda' not ready for pivot yet
A VM with high disk I/O (hosting MySQL and Nagios) is not able to live migrate its disks; the job fails.
(Originally by Jaroslav Spanko)
(Originally by rhev-integ)
*** Bug 1406851 has been marked as a duplicate of this bug. ***
(Originally by Ala Hino)
(Originally by rhev-integ)
4.0.6 has been the last oVirt 4.0 release, please re-target this bug.
(Originally by Sandro Bonazzola)
(Originally by rhev-integ)
*** Bug 1419767 has been marked as a duplicate of this bug. ***
(Originally by Tal Nisan)
(Originally by rhev-integ)
Ala, the attached patch is merged to the 4.1 branch, but it doesn't seem to solve anything (just handle the erroneous situation better). Is the "real" fix on track for 4.1.2?
(Originally by Allon Mureinik)
Indeed, the merged patch doesn't fix the issue but better handles the expected exception.
The "real" fix is based on handling libvirt events. Actually, this BZ is identical to BZ 1438850 that is targeted to 4.2.
Close duplicate this BZ?
(Originally by Ala Hino)
(In reply to Ala Hino from comment #17)
> Close duplicate this BZ?
No. This is a downstream clone used to hold the customer ticket.
(Originally by Allon Mureinik)
(In reply to Ala Hino from comment #17)
> Indeed, the merged patch doesn't fix the issue but better handles the
> expected exception.
>
> The "real" fix is based on handling libvirt events. Actually, this BZ is
> identical to BZ 1438850 that is targeted to 4.2.
So, this was eventually solved for 4.1.3. Fixing target release to 4.1.3 so it can be included in ET.
(Originally by Allon Mureinik)
(In reply to rhev-integ from comment #20)
> (In reply to Ala Hino from comment #17)
> > Indeed, the merged patch doesn't fix the issue but better handles the
> > expected exception.
> >
> > The "real" fix is based on handling libvirt events. Actually, this BZ is
> > identical to BZ 1438850 that is targeted to 4.2.
> So, this was eventually solved for 4.1.3. Fixing target release to 4.1.3 so
> it can be included in ET.
>
> (Originally by Allon Mureinik)
Setting to ON_QA based on this.
Note that the work Kevin did on bug 1376580 should also verify this one, but it is up to QA stakeholders whether they want to re-verify this or just mark it as VERIFIED based on that one.
Moving to VERIFIED based on Comment 21
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2017:1815