Description of problem:
As part of testing https://bugzilla.redhat.com/show_bug.cgi?id=1837199, creating snapshots of a VM with 13 disks (each disk on a different SD) produces errors:

aborting: Task is aborted: 'value=dictionary changed size during iteration abortedcode=100' (task:1190)

Version-Release number of selected component (if applicable):
RHV-4.4.1-11
VDSM-4.40.22-1

How reproducible:
Create a VM with 13 disks. The disks are distributed across 12 SDs (FC).

Steps to Reproduce:
1. 2 concurrent runs
2. Create VM from template
3. Create 10 snapshots
4. Remove VM

Actual results:
Got errors during the test.

Expected results:
No errors.

Additional info:
Test timestamp:
Start - 2020-07-08 14:30
End - 2020-07-09 5:30

VDSM & ENGINE log files:
https://drive.google.com/drive/folders/1KsaQh2f0Ej487VVcazOjZwZh6jUkyHHG

LAB INFO:
The test ran on a DC with:
10 Hosts
1350 VMs
50 Active SDs (FC) (+27 unattached SDs)
Total LUNs: 328 (multipath)

The 1350 VMs are defined on 12 SDs, the same SDs used by the VMs with 13 disks.

VG                                    #PV  #LV  #SN  Attr    VSize     VFree
09876054-0698-4b1f-b258-6fc2689807c2    1  126    0  wz--n-  <5.00t    4.87t
10afc0f1-b8c8-4f3a-ad37-5517431a6ccd    1  135    0  wz--n-  <5.00t    4.86t
37657106-dea4-494e-a828-f24ecbb26b3d    1    9    0  wz--n-  <5.00t    4.99t
3895ca01-a88e-42db-806a-d8ea4937cc35    1  150    0  wz--n-  <5.00t    4.85t
3c994815-e4f3-47f2-81c5-3d5cc38d4ad6    1  107    0  wz--n-  <5.00t    4.89t
47f6c8c7-83db-4e13-8efe-9d2c9c951656    1  126    0  wz--n-  <5.00t    <4.66t
628a0b20-4fff-4922-99dd-6f1ce765f6c5    1  157    0  wz--n-  <5.00t    <4.55t
6f261e89-8fa9-467a-9d50-58e05901ff4a    1  144    0  wz--n-  <5.00t    <4.86t
87964d59-974f-4297-a00c-53f06a2a7061    1  131    0  wz--n-  <5.00t    <4.87t
de355e04-2f60-4019-a7a0-ecf59408736b    1  115    0  wz--n-  <5.00t    <4.89t
f021c5cf-55ad-4b05-9020-7c06f2fff7a9    1  153    0  wz--n-  <5.00t    <4.85t
f479c782-236d-4a15-a1a5-40b09d417fc9    1  156    0  wz--n-  <5.00t    4.84t
f8c33fab-29b5-4598-ab9c-7eabc1b24c56    1  150    0  wz--n-  <5.00t    <4.85t
vg_f01-h05-000-r620                     1    2    0  wz--n-  <930.50g  0

Total SDs of cluster "L0_Group_0":
12 SDs with 'VMs'
38 SDs - Empty
27 SDs - Unattached

*******************************

2020-07-09 02:48:23,454-0400 INFO (jsonrpc/3) [vdsm.api] FINISH deleteImage error=dictionary changed size during iteration from=::ffff:10.1.41.200,56428, flow_id=1be05f02-f2c7-47a3-82c9-22960efbabe8, task_id=579e5d8d-20c1-4b8b-94f8-1494d3d9d424 (api:52)
2020-07-09 02:48:23,455-0400 ERROR (jsonrpc/3) [storage.TaskManager.Task] (Task='579e5d8d-20c1-4b8b-94f8-1494d3d9d424') Unexpected error (task:880)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/task.py", line 887, in _run
    return fn(*args, **kargs)
  File "<decorator-gen-63>", line 2, in deleteImage
  File "/usr/lib/python3.6/site-packages/vdsm/common/api.py", line 50, in method
    ret = func(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 1529, in deleteImage
    allVols = dom.getAllVolumes()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/sd.py", line 841, in getAllVolumes
    return self._manifest.getAllVolumes()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 765, in getAllVolumes
    vols, rems = self.getAllVolumesImages()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 752, in getAllVolumesImages
    allVols = getAllVolumes(self.sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 285, in getAllVolumes
    vols = _getVolsTree(sdUUID)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 203, in _getVolsTree
    for lv in _iter_volumes(sdUUID):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/blockSD.py", line 216, in _iter_volumes
    for lv in lvm.getLV(sdUUID):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 1319, in getLV
    lv = _lvminfo.getLv(vgName, lvName)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 952, in getLv
    lvs = self._reloadlvs(vgName)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 734, in _reloadlvs
    staleLVs = [lvName for v, lvName in self._lvs
  File "/usr/lib/python3.6/site-packages/vdsm/storage/lvm.py", line 734, in <listcomp>
    staleLVs = [lvName for v, lvName in self._lvs
RuntimeError: dictionary changed size during iteration
2020-07-09 02:48:23,456-0400 INFO (jsonrpc/3) [storage.TaskManager.Task] (Task='579e5d8d-20c1-4b8b-94f8-1494d3d9d424') aborting: Task is aborted: 'value=dictionary changed size during iteration abortedcode=100' (task:1190)
Would we need a 4.3 clone for this ticket (GA?)? I don't see that it's any different there as far as popping items from the lvm cache dict outside of locks goes.
(In reply to Amit Bawer from comment #1)
> Would we need a 4.3 clone for this ticket (GA?)?
> I don't see that it's any different there as far as popping items from the
> lvm cache dict outside of locks goes.

The same issue exists in 4.3, of course. We can clone the bug, and if we get acks we can backport the fixes.
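For readers unfamiliar with this failure mode, here is a minimal sketch of what the comments above describe: a dict that is popped from while another code path iterates over it, and one way to guard against it. All names here (LVCache, stale_lvs, unsafe_scan) are hypothetical and only illustrate the pattern. This is not vdsm's lvm.py code and not necessarily the fix that was merged.

```python
import threading


def unsafe_scan(lvs):
    """Pop entries from a dict while iterating over it.

    Single-threaded stand-in for the race: the dict shrinks during
    iteration, so Python raises RuntimeError on the next step.
    """
    for key in lvs:
        lvs.pop(key)


class LVCache:
    """Hypothetical cache keyed by (vg_name, lv_name) tuples."""

    def __init__(self):
        self._lock = threading.Lock()
        self._lvs = {}

    def add(self, vg_name, lv_name):
        with self._lock:
            self._lvs[(vg_name, lv_name)] = object()

    def remove(self, vg_name, lv_name):
        # The pop happens under the same lock that readers take.
        with self._lock:
            self._lvs.pop((vg_name, lv_name), None)

    def stale_lvs(self, vg_name, current_lv_names):
        # Iterate under the lock so no other thread can resize the
        # dict while the list comprehension is running.
        with self._lock:
            return [lv for vg, lv in self._lvs
                    if vg == vg_name and lv not in current_lv_names]
```

The point of the sketch is only that every mutation and every iteration of the shared dict goes through the same lock. Whether the real fix uses a lock, a copy of the keys, or both is not stated in this thread.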
Fixes should be part of build 4.40.23 when available
We want to add more fixes under this ticket, so moving back to NEW.
Can it go back to MODIFIED now?
While the system was being upgraded to 4.4.2:

2020-08-23 10:18:32,001-0400 WARN (MainThread) [storage.HSM] Failed to stop RepoStats thread (hsm:3432)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/storage/hsm.py", line 3429, in prepareForShutdown
    self.domainMonitor.shutdown()
  File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 254, in shutdown
    self._stopMonitors(self._monitors.values(), shutdown=True)
  File "/usr/lib/python3.6/site-packages/vdsm/storage/monitor.py", line 273, in _stopMonitors
    for monitor in monitors:
RuntimeError: dictionary changed size during iteration
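The traceback above shows a loop over self._monitors.values(), which in Python 3 is a live view of the dict rather than a copy. A minimal sketch of the pattern follows, assuming the dict is modified while the loop runs. The Monitor class and stop_all function are hypothetical illustrations, not vdsm's monitor.py, and in the real case the modification may come from another thread, where a lock could also be needed.

```python
class Monitor:
    """Hypothetical monitor that unregisters itself when stopped."""

    def __init__(self, name, registry):
        self.name = name
        self.stopped = False
        self._registry = registry
        registry[name] = self

    def stop(self):
        self.stopped = True
        # Removing the entry resizes the registry dict.
        self._registry.pop(self.name, None)


def stop_all(registry):
    """Stop every monitor without iterating over the live dict view."""
    # list() takes a snapshot of the values, so later pops from the
    # registry cannot invalidate the loop below.
    snapshot = list(registry.values())
    for monitor in snapshot:
        monitor.stop()
    return snapshot
```

Iterating over registry.values() directly in stop_all would raise the same RuntimeError as in the log, because each stop() call shrinks the dict under the iterator.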
Since this bug has been marked as fixed in version 4.40.27 by assignee and latest version included in oVirt 4.4.2 is 4.40.26 I'm moving this bug to 4.4.3.
Verified version:
redhat-release-8.3-1.0
rhv-release-4.4.3-5-001
vdsm-4.40.30-1

Tested scenarios:
VM snapshot with 13 disks
50 Users VM snapshot

No "RuntimeError: dictionary changed size during iteration" errors occurred during execution.
This bugzilla is included in oVirt 4.4.3 release, published on November 10th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.3 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.