Bug 1575305 - Live merge fails on NotConnectedError: VM '' was not defined yet or was undefined
Summary: Live merge fails on NotConnectedError: VM '' was not defined yet or was undef...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: 4.20.23
Hardware: x86_64
OS: Unspecified
unspecified
high
Target Milestone: ovirt-4.2.4
: ---
Assignee: Ala Hino
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-05-05 21:06 UTC by Elad
Modified: 2018-05-08 11:15 UTC (History)
3 users (show)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2018-05-08 11:15:01 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.2+


Attachments (Terms of Use)
logs (4.36 MB, application/x-gzip)
2018-05-07 13:55 UTC, Elad
no flags Details

Description Elad 2018-05-05 21:06:13 UTC
Description of problem:
Live merge fails on vdsm with NotConnectedError.

Version-Release number of selected component (if applicable):
hosted engine

How reproducible:
Not always

Steps to Reproduce:
1. Create a VM from template as thin with disk resides on iSCSI
2. Create and attach a preallocated disk to the VM
3. Start the VM
4. Create a snapshot to the VM
5. Delete the snapshot

Actual results:
Live merge fails with:

2018-05-05 07:53:44,889+0300 INFO  (jsonrpc/7) [api.virt] START merge(drive={'imageID': '8f5ca85f-3432-43c7-9562-ad7a1eddfff1', 'volumeID': '99b0352b-9b27-470d-989f-a7d31a75dee4', 'domainID': 'e8a8e491-2487-46fb
-afe9-00781198e966', 'poolID': 'f23d137e-ae43-4b1b-8a8b-f6aed029365f'}, baseVolUUID='ff21f234-72ac-49c5-a5fe-9a9965e0cb82', topVolUUID='99b0352b-9b27-470d-989f-a7d31a75dee4', bandwidth='0', jobUUID='18eaa245-0af
a-48f5-bef0-3ad8be90f50a') from=::ffff:10.46.16.85,46024, flow_id=snapshots_delete_c9564133-b9b8-40dc, vmId=dc7149c8-abf9-44cc-bb7e-cf806715f201 (api:46)
2018-05-05 07:53:44,889+0300 ERROR (jsonrpc/7) [api] FINISH merge error=VM 'dc7149c8-abf9-44cc-bb7e-cf806715f201' was not defined yet or was undefined (api:132)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/common/api.py", line 122, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/API.py", line 736, in merge
    drive, baseVolUUID, topVolUUID, bandwidth, jobUUID)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5837, in merge
    chains = self._driveGetActualVolumeChain([drive])
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5976, in _driveGetActualVolumeChain
    self._updateDomainDescriptor()
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 5123, in _updateDomainDescriptor
    domainXML = self._dom.XMLDesc(0)
  File "/usr/lib/python2.7/site-packages/vdsm/virt/virdomain.py", line 47, in __getattr__
    % self.vmid)
NotConnectedError: VM 'dc7149c8-abf9-44cc-bb7e-cf806715f201' was not defined yet or was undefined


Expected results:
Live merge should succeed

Additional info:
engine.log:

2018-05-05 07:53:44,892+03 INFO  [org.ovirt.engine.core.bll.MergeCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [snapshots_delete_c9564133-b9b8-40dc] Running command: MergeCommand internal: true.
 Entities affected :  ID: e8a8e491-2487-46fb-afe9-00781198e966 Type: Storage
2018-05-05 07:53:44,893+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.MergeVDSCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-9) [snapshots_delete_c9564133-b9b8-40dc] START, MergeVDSCommand(Hos
tName = host_mixed_3, MergeVDSCommandParameters:{hostId='ef0a4bf8-9ca9-4cc8-9611-585296de6e47', vmId='dc7149c8-abf9-44cc-bb7e-cf806715f201', storagePoolId='f23d137e-ae43-4b1b-8a8b-f6aed029365f', storageDomainId=
'e8a8e491-2487-46fb-afe9-00781198e966', imageGroupId='63f78780-2063-448c-8b91-a0f6e74f725f', imageId='ede71ac0-2d4e-4fb8-be4b-5b083e88a19b', baseImageId='57a16a9b-5747-40b6-b62a-a177e2006c5e', topImageId='ede71a
c0-2d4e-4fb8-be4b-5b083e88a19b', bandwidth='0'}), log id: 3a7f192
2018-05-05 07:53:44,890+03 ERROR [org.ovirt.engine.core.bll.MergeCommand] (EE-ManagedThreadFactory-commandCoordinator-Thread-7) [snapshots_delete_c9564133-b9b8-40dc] Engine exception thrown while sending merge c
ommand: org.ovirt.engine.core.common.errors.EngineException: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to MergeVDS, error = Gene
ral Exception: ("VM 'dc7149c8-abf9-44cc-bb7e-cf806715f201' was not defined yet or was undefined",), code = 100 (Failed with error GeneralException and code 100)
        at org.ovirt.engine.core.bll.VdsHandler.handleVdsResult(VdsHandler.java:118) [bll.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.runVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:2045) [bll.jar:]
        at org.ovirt.engine.core.bll.MergeCommand.executeCommand(MergeCommand.java:45) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1133) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1285) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1934) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInNewTransaction(TransactionSupport.java:202) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInRequired(TransactionSupport.java:137) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:105) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1345) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:400) [bll.jar:]

Comment 1 Elad 2018-05-05 21:19:09 UTC
Version-Release number of selected component (if applicable):
vdsm-4.20.27.1-1.el7ev.x86_64
libvirt-3.9.0-14.el7_5.3.x86_64
qemu-img-rhev-2.10.0-21.el7_5.2.x86_64
sanlock-3.6.0-1.el7.x86_64
Red Hat Enterprise Linux Server 7.5
kernel 3.10.0-862.el7.x86_64
ovirt-engine-4.2.3.4-0.1.el7.noarch

Comment 2 Ala Hino 2018-05-07 10:40:49 UTC
I couldn't reproduce the issue on a "regular", i.e. not-hosted engine environment. I am not sure whether the is related to a hosted engine environment. Would appreciate if you can try the same scenario on a regular environment and update us with the results.

Could you please upload the logs?

Comment 3 Elad 2018-05-07 11:11:56 UTC
Couldn't reproduce on a non hosted engine environment so far.

Comment 4 Ala Hino 2018-05-07 11:55:25 UTC
Thanks Elad.
Can you please upload the logs?

Comment 5 Elad 2018-05-07 13:55:46 UTC
Created attachment 1432595 [details]
logs

Comment 6 Ala Hino 2018-05-08 10:23:14 UTC
It seems that VM was stopped after live merge started, hence the VM wasn't found.

From the logs:

1. Live merge started at 2018-05-05 07:53:42,366+03:

2018-05-05 07:53:42,366+03 INFO  [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (default task-8) [snapshots_delete_c9564133-b9b8-40dc] Lock Acquired to object 'EngineLock:{exclusiveLocks='[dc7149c8-
abf9-44cc-bb7e-cf806715f201=VM]', sharedLocks=''}'

2. The VM was stopped at 2018-05-05 07:53:42,941+03:

2018-05-05 07:53:42,941+03 INFO  [org.ovirt.engine.core.bll.StopVmCommand] (default task-9) [vms_syncAction_7b4e35b4-08c8-4189] Running command: StopVmCommand internal: false. Entities affected :  ID: dc7149c8-abf9-44cc-bb7e-cf806715f201 Type: VMAction group STOP_VM with role type USER

Do you remember shutting down the VM? Or was the VM shutdown as a side effect of something else?

Comment 7 Elad 2018-05-08 10:31:16 UTC
This occurred as part of an automated test. The test doesn't shut down the VM intentionally so I guess the shut down is a side effect.
The test includes live merge while there is not enough free space on the storage domain for it to be done.

Comment 8 Ala Hino 2018-05-08 11:14:06 UTC
In such cases, there are two options:

1. Perform cold merge
2. Start the VM again and retry live merge

I think there is no bug here and suggest to close the bug.
What do you think Elad?

Comment 9 Ala Hino 2018-05-08 11:15:01 UTC
Agreed with Elad that this isn't a bug.


Note You need to log in before you can comment on or make changes to this bug.