Bug 1427184
| Summary: | [downstream clone] LiveMerge fails with libvirtError: Block copy still active. Disk not ready for pivot | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ | |
| Component: | vdsm | Assignee: | Ala Hino <ahino> | |
| Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> | |
| Severity: | high | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | unspecified | CC: | ahino, alitke, bburmest, bugs, creatmbox, eedri, jspanko, lsurette, mkalinin, ratamir, srevivo, tnisan, ycui, ykaul, ylavi | |
| Target Milestone: | ovirt-4.2.0 | Keywords: | ZStream | |
| Target Release: | 4.2.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: |
Previously, the libvirt API would report live merges as complete before they were completed, resulting in errors.
With this release, live merge progress is detected using the libvirt domain XML, resulting in correct reporting of live merge completion status.
|
Story Points: | --- | |
| Clone Of: | 1376580 | |||
| : | 1461101 (view as bug list) | Environment: | ||
| Last Closed: | 2018-05-15 17:50:23 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1376580, 1438850 | |||
| Bug Blocks: | 1461101 | |||
|
Description
rhev-integ
2017-02-27 14:35:58 UTC
Created attachment 1201372 [details]
Engine Log
(Originally by mst)
Created attachment 1201373 [details]
VDSM Log
(Originally by mst)
Created attachment 1201374 [details]
Task State
(Originally by mst)
Yesterday the error occurred once again, this time on a VM with 6 disks and a disk snapshot that contained 4 disks. Result: the task is still running, but vdsm shows an abort traceback. Interestingly, after the live merge was initiated, qemu generated high disk I/O for more than 6 hours. We assume that some merge operation was running, but we do not know what was really going on. See the attached screenshot.
(Originally by mst)
Created attachment 1210227 [details]
io during failing live merge
(Originally by mst)
VDSM log of the second error attached.
(Originally by mst)
Created attachment 1210247 [details]
vdsm log 2nd error
(Originally by mst)
I see the same error with a customer:
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 372, in wrapper
    return f(*a, **kw)
  File "/usr/share/vdsm/virt/vm.py", line 4924, in run
    self.tryPivot()
  File "/usr/share/vdsm/virt/vm.py", line 4893, in tryPivot
    ret = self.vm._dom.blockJobAbort(self.drive.name, flags)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 69, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 123, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/utils.py", line 917, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 733, in blockJobAbort
    if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self)
libvirtError: block copy still active: disk 'vda' not ready for pivot yet
A VM with high disk I/O (hosting MySQL and Nagios) is not able to live migrate its disks; the job fails.
(Originally by Jaroslav Spanko)
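For illustration only: the traceback above shows the pivot being requested while libvirt still considers the block copy active. The Doc Text says the fix detects live merge progress from the libvirt XML. A minimal sketch of that kind of check follows, assuming the disk's `<mirror>` element carries `ready='yes'` once the job can be pivoted. This is not the vdsm patch itself; `mirror_ready`, `SAMPLE_XML`, and the file paths are hypothetical, and on a real host the XML would come from the domain (for example via `XMLDesc(0)` in the libvirt Python bindings).

```python
import xml.etree.ElementTree as ET


def mirror_ready(domain_xml, drive_name):
    """Return True if the disk whose target dev is drive_name has a
    <mirror> element marked ready='yes' (hypothetical helper)."""
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != drive_name:
            continue
        mirror = disk.find("mirror")
        # No <mirror> element, or no ready='yes' attribute: not safe to pivot.
        return mirror is not None and mirror.get("ready") == "yes"
    # Drive not found in the domain XML.
    return False


# Hypothetical domain XML: vda is ready for pivot, vdb is still copying.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/hypothetical/top.qcow2'/>
      <mirror type='file' job='active-commit' ready='yes'>
        <source file='/hypothetical/base.qcow2'/>
      </mirror>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/hypothetical/other-top.qcow2'/>
      <mirror type='file' job='active-commit'>
        <source file='/hypothetical/other-base.qcow2'/>
      </mirror>
      <target dev='vdb' bus='virtio'/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    for name in ("vda", "vdb"):
        print(name, mirror_ready(SAMPLE_XML, name))
```

A caller would attempt the pivot only after such a check returns True, and would otherwise keep waiting instead of raising.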
*** Bug 1406851 has been marked as a duplicate of this bug. ***
(Originally by Ala Hino)
4.0.6 has been the last oVirt 4.0 release, please re-target this bug.
(Originally by Sandro Bonazzola)
*** Bug 1419767 has been marked as a duplicate of this bug. ***
(Originally by Tal Nisan)
Ala, the attached patch is merged to the 4.1 branch, but it doesn't seem to solve anything (just handle the erroneous situation better). Is the "real" fix on track for 4.1.2?
Indeed, the merged patch doesn't fix the issue but better handles the expected exception. The "real" fix is based on handling libvirt events. Actually, this BZ is identical to BZ 1438850 that is targeted to 4.2. Close duplicate this BZ?
(In reply to Ala Hino from comment #17)
> Close duplicate this BZ?
No. This is a downstream clone used to hold the customer ticket.
(In reply to Ala Hino from comment #17)
> Indeed, the merged patch doesn't fix the issue but better handles the
> expected exception.
>
> The "real" fix is based on handling libvirt events. Actually, this BZ is
> identical to BZ 1438850 that is targeted to 4.2.
So, this was eventually solved for 4.1.3. Fixing target release to 4.1.3 so it can be included in ET.
Verified with the following code:
-------------------------------------------
ovirt-engine-4.2.0-0.0.master.20170621095718.git8901d14.el7.centos.noarch
vdsm-4.20.1-66.git228c7be.el7.centos.x86_64
Verified with the following scenario:
-------------------------------------------
1. Create a VM with disk, install OS and write data
2. Create snap1
3. Write 500m new data
4. Create snap2
5. Write 500m new data
6. Create snap3
7. Write 2g of new data and delete snap2 during the write operation
>>>>> Snapshot is Deleted successfully, Live Merge is successful, writes completed successfully.
Moving to VERIFIED!
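For anyone reproducing the scenario: steps 3, 5, and 7 above each write a fixed amount of new data inside the guest. A minimal sketch of one way to generate that load, assuming Python is available in the guest. `write_data` and the example path are hypothetical, not what QA actually ran, and "500m" is taken here to mean 500 MiB.

```python
import os


def write_data(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes of random data to path in chunks and sync it to disk.

    Returns the number of bytes written (hypothetical helper).
    """
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            # Random data avoids trivially sparse or compressible writes.
            f.write(os.urandom(n))
            written += n
        f.flush()
        # Force the data out of the page cache so it reaches the virtual disk.
        os.fsync(f.fileno())
    return written


if __name__ == "__main__":
    # Example: roughly the "500m" write from steps 3 and 5 (path is a placeholder).
    write_data("/tmp/livemerge-test.bin", 500 * 1024 * 1024)
```

For step 7, the same call with a 2 GiB size would be started first, and snap2 deleted from the engine while it is still running.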
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHEA-2018:1489