Bug 1788464 - LSM's diskReplicateFinish failed on libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Could not open backing file: Failed to get shared "write" lock
Keywords:
Status: CLOSED DUPLICATE of bug 1833780
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ovirt-4.4.5
Assignee: Benny Zlotnik
QA Contact: Avihai
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-01-07 10:05 UTC by Evelina Shames
Modified: 2020-11-23 15:37 UTC (History)
CC List: 2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-11-23 15:37:22 UTC
oVirt Team: Storage
Embargoed:
pm-rhel: ovirt-4.4+


Attachments (Terms of Use)
logs (6.14 MB, application/zip)
2020-01-07 10:05 UTC, Evelina Shames
no flags

Description Evelina Shames 2020-01-07 10:05:00 UTC
Created attachment 1650326 [details]
logs

Description of problem:
Live storage migration fails in the diskReplicateFinish phase with a libvirtError:

vdsm-log:
2020-01-03 09:56:41,786-0500 ERROR (jsonrpc/3) [virt.vm] (vmId='40544673-e7c2-4e3c-a00d-284bd78b9c1f') Unable to stop the replication for the drive: vdb (vm:4354)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 4351, in diskReplicateFinish
    self._dom.blockJobAbort(drive.name, blockJobFlags)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 793, in blockJobAbort
    if ret == -1: raise libvirtError ('virDomainBlockJobAbort() failed', dom=self)
libvirt.libvirtError: internal error: unable to execute QEMU command 'block-job-complete': Could not open backing file: Failed to get shared "write" lock
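For context, the blockJobAbort() call in the traceback is what triggers the failing QEMU command: passing the pivot flag tells libvirt to complete the block-copy mirror by issuing 'block-job-complete' and switching the VM to the destination image. A minimal sketch of that flag choice (the helper name is hypothetical; the constant values are copied from libvirt-python's virDomainBlockJobAbortFlags):

```python
# Flag values from libvirt-python (virDomainBlockJobAbortFlags).
VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC = 1
VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT = 2


def replicate_finish_flags(pivot):
    """Flags a vdsm-style diskReplicateFinish would pass to blockJobAbort().

    pivot=True  -> libvirt issues QEMU 'block-job-complete' (switch to the
                   destination image; the call that fails in this bug).
    pivot=False -> libvirt issues QEMU 'block-job-cancel' (abort the copy
                   and keep running on the source image).
    """
    return VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT if pivot else 0
```

The "Failed to get shared 'write' lock" error means QEMU could not acquire its image-file lock on the backing file when opening the destination chain during the pivot.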

engine-log:
2020-01-03 16:56:42,993+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-86) [disks_syncAction_c2aee412-3eef-4c35] EngineException: Failed to change disk image (Failed with error FAILED_CHANGE_CD_IS_MOUNTED and code 41): org.ovirt.engine.core.common.errors.EngineException: EngineException: Failed to change disk image (Failed with error FAILED_CHANGE_CD_IS_MOUNTED and code 41)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.replicateDiskFinish(LiveMigrateDiskCommand.java:434)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.completeLiveMigration(LiveMigrateDiskCommand.java:398)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand.performNextOperation(LiveMigrateDiskCommand.java:247)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.0-0.13.master.el7.noarch
vdsm-4.40.0-164.git38a19bb.el8ev.x86_64
libvirt-5.6.0-6.module+el8.1.0+4244+9aa4e6bb.x86_64


How reproducible:
Once

Steps to Reproduce:
1. Create a VM with disks from all allowed disks permutations
2. Run the VM
3. Move all the VM disks concurrently to a different storage domain
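Step 3 above can be sketched as follows. This is a hypothetical stand-alone illustration of the concurrent moves, not the oVirt SDK: `move_disk` is a placeholder for a real client call such as the REST API's disk move operation.

```python
# Sketch of moving all VM disks concurrently (hypothetical helper; a real
# client would invoke the oVirt API's disk move for each disk instead).
from concurrent.futures import ThreadPoolExecutor


def move_disk(disk_id, target_sd):
    # Placeholder for the actual API call that starts a live storage
    # migration of one disk to the target storage domain.
    return (disk_id, target_sd)


def move_all(disk_ids, target_sd):
    # One worker per disk, so all migrations are started concurrently,
    # as in the reproduction steps.
    with ThreadPoolExecutor(max_workers=max(len(disk_ids), 1)) as pool:
        return list(pool.map(lambda d: move_disk(d, target_sd), disk_ids))
```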


Actual results:
Operation failed

Expected results:
Operation should succeed

Additional info:
logs are attached

Comment 1 Tal Nisan 2020-01-13 15:43:06 UTC
This is the exact same bug as bug 1597019, but this time with libvirt logs. Benny, please have a look.

Comment 2 Benny Zlotnik 2020-11-23 15:37:22 UTC

*** This bug has been marked as a duplicate of bug 1833780 ***

