Bug 1157222
| Field | Value |
|---|---|
| Summary | Deleting all snapshot disks during a vdsm restart causes some of the snapshot disks to stay in locked status |
| Product | Red Hat Enterprise Virtualization Manager |
| Component | ovirt-engine |
| Status | CLOSED CURRENTRELEASE |
| Severity | high |
| Priority | unspecified |
| Version | 3.5.0 |
| Target Release | 3.5.0 |
| Hardware | Unspecified |
| OS | Unspecified |
| Whiteboard | storage |
| Fixed In Version | org.ovirt.engine-root-3.5.0-25 |
| Doc Type | Bug Fix |
| Type | Bug |
| Reporter | lkuchlan <lkuchlan> |
| Assignee | Daniel Erez <derez> |
| QA Contact | lkuchlan <lkuchlan> |
| CC | acanan, amureini, derez, ecohen, gklein, gpadgett, iheim, lkuchlan, lpeer, lsurette, rbalakri, Rhev-m-bugs, rnori, scohen, tnisan, yeylon |
| oVirt Team | Storage |
Daniel, you handled something similar, isn't this already solved?

(In reply to Tal Nisan from comment #1)
> Daniel, you handled something similar, isn't this already solved?

Yes, it should already be solved. Added the relevant oVirt gerrit external tracker.

Tested using RHEVM 3.5 vt11; most of the snapshot volumes still remain in locked status.

Hi Liron,

Can you please attach the recent logs from the engine and vdsm?

Created attachment 962810 [details]
logs and image

Please find attached the logs.
Hi Ravi,

Looking at the engine log, several MergeSnapshotsVDSCommand calls hit a connectivity error [1] when sent to VDSM (as the service had been stopped), which resulted in a transaction roll-back of RemoveSnapshotSingleDiskCommand [2]. That is, upon failure only 'CommandBase -> rollback' was executed. Can we invoke the compensation mechanism in this flow so the disk gets properly unlocked? Or should we just manually override the 'rollback' method?

The flow: RemoveSnapshotSingleDiskCommand is invoked as an internal command by RemoveDiskSnapshotTaskHandler, which is called by RemoveDiskSnapshotsCommand.

[1]
```
2014-11-29 21:12:26,019 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (org.ovirt.thread.pool-7-thread-8) [38273561] ERROR, MergeSnapshotsVDSCommand(
    storagePoolId = 00000002-0002-0002-0002-0000000000b8, ignoreFailoverLimit = false,
    storageDomainId = 0f599def-1b9b-45be-83bc-f4271ce26569,
    imageGroupId = beaddb35-ab14-4589-aaee-40feaed2d573,
    imageId = e030be66-68a5-4afa-a0df-bb5b7465cc70,
    imageId2 = 7b75eda2-3013-4c4a-a9ab-127c3b3b0bd8,
    vmId = 00000000-0000-0000-0000-000000000000, postZero = false),
    exception: XmlRpcRunTimeException: Connection issues during send request, log id: 4e0a56ae:
org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcRunTimeException: Connection issues during send request
    at org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.<init>(FutureMap.java:70) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcIIrsServer.mergeSnapshots(JsonRpcIIrsServer.java:157) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.irsbroker.MergeSnapshotsVDSCommand.executeIrsBrokerCommand(MergeSnapshotsVDSCommand.java:15) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.executeVDSCommand(IrsBrokerCommand.java:156) [vdsbroker.jar:]
    at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56) [vdsbroker.jar:]
    at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:31) [dal.jar:]
    at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:418) [vdsbroker.jar:]
    at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2042) [bll.jar:]
    at org.ovirt.engine.core.bll.storage.StorageHandlingCommandBase.runVdsCommand(StorageHandlingCommandBase.java:639) [bll.jar:]
    at org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand.mergeSnapshots(RemoveSnapshotSingleDiskCommand.java:50) [bll.jar:]
    at org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand.executeCommand(RemoveSnapshotSingleDiskCommand.java:37) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1179) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1318) [bll.jar:]
    at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1943) [bll.jar:]
```

[2]
```
2014-11-29 21:12:26,054 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand] (org.ovirt.thread.pool-7-thread-8) [38273561] Command org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand throw Vdc Bll exception. With error message VdcBLLException: org.ovirt.engine.core.vdsbroker.irsbroker.IRSProtocolException: IRSProtocolException: (Failed with error ENGINE and code 5001)
2014-11-29 21:12:26,080 ERROR [org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand] (org.ovirt.thread.pool-7-thread-8) [38273561] Transaction rolled-back for command: org.ovirt.engine.core.bll.RemoveSnapshotSingleDiskCommand.
```

In my testing, both compensate and rollback were invoked on RemoveSnapshotSingleDiskCommand. The only option was to override rollback and unlock the disk. Submitted a patch.

Tested using RHEVM 3.5 vt13.3.

RHEV-M 3.5.0 has been released, closing this bug.
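The shape of the fix described above (override rollback so the disk image is unlocked when the VDSM call fails mid-transaction) can be sketched with a self-contained toy model. All class and method names below are illustrative stand-ins, not the real ovirt-engine API: the real code lives in RemoveSnapshotSingleDiskCommand, and the failing call corresponds to MergeSnapshotsVDSCommand.

```java
// Toy model: a command locks the disk image before calling VDSM; if the call
// fails (e.g. vdsm restarting), the transaction rolls back. The default
// rollback leaves the image LOCKED; the fixed command overrides rollback()
// to release the lock explicitly.

enum ImageStatus { OK, LOCKED }

class DiskImage {
    ImageStatus status = ImageStatus.OK;
}

/** Simplified stand-in for the engine's CommandBase lifecycle. */
class ToyCommandBase {
    protected final DiskImage image;

    ToyCommandBase(DiskImage image) { this.image = image; }

    final boolean run() {
        image.status = ImageStatus.LOCKED;   // lock taken before the VDSM call
        try {
            executeCommand();
            image.status = ImageStatus.OK;   // success: unlock normally
            return true;
        } catch (RuntimeException e) {
            rollback();                      // transaction rolled back on failure
            return false;
        }
    }

    protected void executeCommand() { }

    /** Default rollback: reverts DB state but does NOT touch the image lock. */
    protected void rollback() { }
}

/** The fixed command: rollback() additionally unlocks the disk. */
class RemoveSnapshotSingleDiskSketch extends ToyCommandBase {
    RemoveSnapshotSingleDiskSketch(DiskImage image) { super(image); }

    @Override
    protected void executeCommand() {
        // Stands in for the merge call failing while vdsm restarts.
        throw new RuntimeException("Connection issues during send request");
    }

    @Override
    protected void rollback() {
        super.rollback();
        image.status = ImageStatus.OK;       // the fix: unlock on rollback
    }
}

public class Main {
    public static void main(String[] args) {
        DiskImage disk = new DiskImage();
        boolean ok = new RemoveSnapshotSingleDiskSketch(disk).run();
        System.out.println("succeeded=" + ok + " status=" + disk.status);
        // prints: succeeded=false status=OK
    }
}
```

Without the override, the same failure would leave `disk.status` as LOCKED, which is exactly the stuck state reported in this bug.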
Created attachment 950767 [details]
logs

Description of problem:
Deleting all snapshot disks during a vdsm restart causes some of the snapshot disks to stay in locked status.

Version-Release number of selected component (if applicable):
3.5 vt5

How reproducible:
100%

Steps to Reproduce:
1. Add a VM with 3 disks (using an NFS domain) and create a snapshot.
2. Add another 2 disks and create another snapshot.
3. Select all volumes from the snapshot overview and remove them.
4. During the removal process, restart the vdsm service.

Actual results:
Some of the snapshot volumes remain in locked status.

Expected results:
The snapshot volumes should be deleted; otherwise, the volumes should not stay locked if the deletion does not take place.