Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1821001

Summary: Failed to create live snapshot on VM with nfs/gluster disk
Product: [oVirt] ovirt-engine Reporter: Evelina Shames <eshames>
Component: BLL.StorageAssignee: Tal Nisan <tnisan>
Status: CLOSED DUPLICATE QA Contact: Avihai <aefrat>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.4.0CC: bugs, bzlotnik, izuckerm
Target Milestone: ---Keywords: Automation, AutomationBlocker, Regression
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-04-06 09:41:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Logs
none
Engin + vdsm logs none

Description Evelina Shames 2020-04-05 11:25:12 UTC
Created attachment 1676376 [details]
Logs

Description of problem:
Failed to create live snapshot on VM with nfs/gluster disk.

Engine log:
2020-04-05 14:10:02,440+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] EVENT_ID: USER_ADD_DISK_FINISHED_FAILURE(2,022), Add-Disk operation failed to complete.
2020-04-05 14:10:02,445+03 ERROR [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] Ending command 'org.ovirt.engine.core.bll.storage.disk.AddDiskCommand' with failure.
2020-04-05 14:10:02,451+03 ERROR [org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] Ending command 'org.ovirt.engine.core.bll.storage.disk.image.AddImageFromScratchCommand' with failure.
2020-04-05 14:10:02,457+03 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMRevertTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] START, SPMRevertTaskVDSCommand( SPMTaskGuidBaseVDSCommandParameters:{storagePoolId='fea4bd95-fafb-46ed-ad48-b5209fbd0e20', ignoreFailoverLimit='false', taskId='185ce201-5334-4f4d-8f9f-9576b054b762'}), log id: 1b8165e4
2020-04-05 14:10:02,459+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] START, HSMRevertTaskVDSCommand(HostName = host_mixed_3, HSMTaskGuidBaseVDSCommandParameters:{hostId='74e85708-d769-4c72-b8db-c5b34fa80ea7', taskId='185ce201-5334-4f4d-8f9f-9576b054b762'}), log id: 76f50aa3
2020-04-05 14:10:02,464+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] Trying to revert unknown task '185ce201-5334-4f4d-8f9f-9576b054b762'
2020-04-05 14:10:02,464+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMRevertTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] FINISH, HSMRevertTaskVDSCommand, return: , log id: 76f50aa3
2020-04-05 14:10:02,464+03 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.SPMRevertTaskVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [dd837fc1-aadc-44c4-aa22-8bcce41946e6] FINISH, SPMRevertTaskVDSCommand, return: , log id: 1b8165e4
2020-04-05 14:10:02,474+03 WARN  [org.ovirt.engine.core.bll.storage.disk.AddDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] VM is null - no unlocking
2020-04-05 14:10:02,495+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] EVENT_ID: USER_ADD_DISK_FINISHED_FAILURE(2,022), Add-Disk operation failed to complete.
2020-04-05 14:10:02,503+03 WARN  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] Trying to release exclusive lock which does not exist, lock key: '8dbf32d8-64f0-4924-aa9b-025fe5620a4fVM'
2020-04-05 14:10:02,503+03 INFO  [org.ovirt.engine.core.bll.snapshots.CreateSnapshotForVmCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] Lock freed to object 'EngineLock:{exclusiveLocks='[8dbf32d8-64f0-4924-aa9b-025fe5620a4f=VM]', sharedLocks=''}'
2020-04-05 14:10:02,537+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] EVENT_ID: USER_CREATE_SNAPSHOT_FINISHED_FAILURE(69), Failed to complete snapshot 's1' creation for VM 'golden_env_mixed_virtio_1_0'.
2020-04-05 14:10:02,538+03 WARN  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-71) [] Command 'CreateSnapshotForVm' id: '5d32d659-23dd-4e31-930d-eb2c3630b42e' end method execution failed, as the command isn't marked for endAction() retries silently ignoring
2020-04-05 14:10:05,916+03 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [] Polling and updating Async Tasks: 6 tasks, 1 tasks to poll now
2020-04-05 14:10:05,928+03 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HSMGetAllTasksStatusesVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [] Failed in 'HSMGetAllTasksStatusesVDS' method
2020-04-05 14:10:05,957+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), VDSM host_mixed_3 command HSMGetAllTasksStatusesVDS failed: value=Could not acquire resource. Probably resource factory threw an exception.: () abortedcode=100
2020-04-05 14:10:05,958+03 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [] SPMAsyncTask::PollTask: Polling task '3bc165a8-f7a7-4920-b957-87eaf508b879' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') returned status 'finished', result 'cleanSuccess'.
2020-04-05 14:10:05,958+03 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [] BaseAsyncTask::logEndTaskFailure: Task '3bc165a8-f7a7-4920-b957-87eaf508b879' (Parent Command 'Unknown', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Could not acquire resource. Probably resource factory threw an exception.: () abortedcode=100, code = 100',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = value=Could not acquire resource. Probably resource factory threw an exception.: () abortedcode=100, code = 100'
2020-04-05 14:10:05,958+03 INFO  [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-21) [] SPMAsyncTask::ClearAsyncTask: Attempting to clear task '3bc165a8-f7a7-4920-b957-87eaf508b879'

VDSM log:
2020-04-05 14:09:55,029+0300 ERROR (virt/638f42f9) [root] Job '638f42f9-6706-41bc-99b1-a83a06e90572' failed (jobs:223)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/jobs.py", line 159, in run
    self._run()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/jobs/snapshot.py", line 121, in _run
    snap.snapshot()
  File "/usr/lib/python3.6/site-packages/vdsm/virt/jobs/snapshot.py", line 488, in snapshot
    raise RuntimeError("Failed to execute snapshot, "
RuntimeError: Failed to execute snapshot, considering the operation as failure


Version-Release number of selected component (if applicable):
vdsm-4.40.11-1.el8ev.x86_64
ovirt-engine-4.4.0-0.31.master.el8ev.noarch


How reproducible:
100%

Steps to Reproduce:
1.Create VM with nfs/gluster disk
2.Run VM
3.Try to create live snapshot (with memory)

Actual results:
Operation fails

Expected results:
Operation should succeed.

Additional info:
Logs are attached

Comment 1 Ilan Zuckerman 2020-04-05 12:28:11 UTC
Facing the same issue while analyzing tier1 4.4 during live snapshot volume creation.

Environment:

ovirt-engine-4.4.0-0.31.master.el8ev.noarch
vdsm-4.40.11-1.el8ev.x86_64
libvirt-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64

Attaching relevant logs.

Comment 2 Ilan Zuckerman 2020-04-05 12:28:54 UTC
Created attachment 1676398 [details]
Engin + vdsm logs

Comment 3 Avihai 2020-04-05 22:03:06 UTC
Issue seen in multiple cases, marking as an automation blocker.

(In reply to Ilan Zuckerman from comment #1)
> Facing the same issue while analyzing tier1 4.4 during live snapshot volume
> creation.
> 
> Environment:
> 
> ovirt-engine-4.4.0-0.31.master.el8ev.noarch
> vdsm-4.40.11-1.el8ev.x86_64
> libvirt-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64
> 
> Attaching relevant logs.

Ilan, please add which Automation TestCase did you see this issue on (add this to QA whiteboard as well) and what was the full scenario with printouts here.

Comment 4 Ilan Zuckerman 2020-04-06 05:54:20 UTC
(In reply to Avihai from comment #3)
> Issue seen in multiple cases, marking as an automation blocker.
> 
> (In reply to Ilan Zuckerman from comment #1)
> > Facing the same issue while analyzing tier1 4.4 during live snapshot volume
> > creation.
> > 
> > Environment:
> > 
> > ovirt-engine-4.4.0-0.31.master.el8ev.noarch
> > vdsm-4.40.11-1.el8ev.x86_64
> > libvirt-6.0.0-16.module+el8.2.0+6131+4e715f3b.x86_64
> > 
> > Attaching relevant logs.
> 
> Ilan, please add which Automation TestCase did you see this issue on (add
> this to QA whiteboard as well) and what was the full scenario with printouts
> here.

The Automatipn TestCase is added to whiteboard, adding the test steps of the full scenario here.


Test Setup   2: Creating VM vm_TestCase11660_0419571544
Test Setup   3: [class] Description: clone vm from a pre-defined template latest-rhel-guest-image-8.2-infra with {'vol_sparse': True, 'name': 'vm_TestCase11660_0419571544', 'clone': False, 'cluster': 'golden_env_mixed_1', 'storagedomain': 'nfs_1', 'display_type': 'spice', 'timeout': 900, 'virtio_scsi': True, 'vol_format': 'cow', 'os_type': 'RHEL6x64', 'type': 'desktop', 'wait': True}
storage/rhevmtests.storage.storage_snapshots.test_live_snapshot.TestCase11660.test_live_snapshot[nfs]
Create a snapshot while VM is running
STORAGE: NFS
Test Step   4: Start VMs with {'vm_list': ['vm_TestCase11660_0419571544'], 'wait_for_ip': True, 'wait_for_status': 'powering_up', 'max_workers': 2}
Test Step   5: Start writing continuously on VM vm_TestCase11660_0419571544 via dd
Test Step   6: Creating live snapshot on a VM vm_TestCase11660_0419571544
Test Step   7: Adding new snapshot to VM vm_TestCase11660_0419571544 with all disks
Test Step   8: Add snapshot to VM vm_TestCase11660_0419571544 with {'description': 'snap_TestCase11660_0419573246', 'wait': True}
ERROR Result: FAILED

Comment 5 Benny Zlotnik 2020-04-06 09:41:50 UTC
2020-04-05 13:46:37,442+0300 ERROR (virt/fd16aacc) [virt.vm] (vmId='3faebc4f-66f8-4e28-b8a6-a4776f94d777') Unable to take snapshot (snapshot:463)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/virt/jobs/snapshot.py", line 453, in snapshot
    self._vm.run_dom_snapshot(snapxml, snap_flags)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/vm.py", line 5585, in run_dom_snapshot
    self._dom.snapshotCreateXML(snapxml, snap_flags)
  File "/usr/lib/python3.6/site-packages/vdsm/virt/virdomain.py", line 101, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/libvirtconnection.py", line 131, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/common/function.py", line 94, in wrapper
    return func(inst, *args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/libvirt.py", line 2800, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirt.libvirtError: internal error: unable to execute QEMU command 'blockdev-add': Image is not in qcow2 format

dupe of 1820068

*** This bug has been marked as a duplicate of bug 1820068 ***