Created attachment 1834217 [details]
vdsm log

Description of problem:

If a thin disk on block storage becomes full during backup, the VM will pause and the disk will not be extended until the backup is finished. If the VM is an HA VM and the backup takes a lot of time, the VM may be killed on the host by the code that kills paused VMs, aborting the backup.

Version-Release number of selected component (if applicable):
vdsm-4.50.0.1

How reproducible:
Always - if you can manage to write data fast enough during a backup. During a real backup, unknown; it is hard to time the guest writes and the backup.

Steps to Reproduce:
1. Run a VM with an empty 2g thin disk on block storage
2. Start a backup (do not download the backup yet)
3. In the guest, try to write 2g to the empty data disk - the VM will pause
4. Download the backup disks
5. Finalize the backup - the VM will resume

Actual results:
VM pauses during the backup and does not resume until the backup is stopped. Vdsm does not try to extend the disk during the backup.

Expected results:
Thin disk extended during the backup without pausing the VM, or with short pauses if the guest writes data faster than vdsm can extend the disk.

Additional info:

Detailed steps:

1. Start a VM with 2 disks

disk 1 - os disk, specs not important.

disk 2:
  size: 2g
  format: qcow2
  storage domain: iSCSI
  allocation policy: thin
  uuid: 25178339-51f7-42dd-968e-a28b19a2d640

2. Start backup

$ ./backup_vm.py -c engine-dev start 6e95d38f-d9b8-4955-878c-da6d631d0ab2
[ 0.0 ] Starting full backup for VM '6e95d38f-d9b8-4955-878c-da6d631d0ab2'
[ 0.3 ] Waiting until backup 'c9753476-fe7f-4e9d-a7dc-38186a7997cb' is ready
[ 26.7 ] Created checkpoint 'cf36474c-22ae-4b89-b164-b7824ba7995a' (to use in --from-checkpoint-uuid for the next incremental backup)
[ 26.8 ] Backup 'c9753476-fe7f-4e9d-a7dc-38186a7997cb' is ready

3.
In the guest, write 2g to the data disk

# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda      8:0    0    6G  0 disk
├─sda1   8:1    0    1M  0 part
├─sda2   8:2    0    1G  0 part /boot
├─sda3   8:3    0  615M  0 part [SWAP]
└─sda4   8:4    0  4.4G  0 part /
sdb      8:16   0    2G  0 disk
sr0     11:0    1 1024M  0 rom

# realpath /dev/disk/by-id/*25178339-51f7-42dd-968e-a28b19a2d640
/dev/sdb

# dd if=/dev/zero bs=1M count=2048 of=/dev/sdb oflag=direct conv=fsync status=progress
882900992 bytes (883 MB, 842 MiB) copied, 4 s, 221 MB/s

VM paused. In the vdsm log we see:

3.a. Block threshold exceeded for drive sdb

2021-10-18 15:06:42,843+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Block threshold 536870912 exceeded by 327680 for drive 'sdb[26]' (/rhev/data-center/mnt/blockSD/feab3738-c158-4d48-8a41-b5a95c057a50/images/25178339-51f7-42dd-968e-a28b19a2d640/b302faf4-ad02-4163-be25-c56043b52207) (drivemonitor:173)

3.b. VM was paused

2021-10-18 15:06:45,519+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') abnormal vm stop device error enospc (vm:5193)
2021-10-18 15:06:45,519+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028)

3.c. No drives were extended

2021-10-18 15:06:45,541+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No VM drives were extended (vm:5200)

Vdsm does not try to extend the disk after this point.

4.
Download backup for this vm

$ ./backup_vm.py -c engine-dev download --backup-uuid c9753476-fe7f-4e9d-a7dc-38186a7997cb 6e95d38f-d9b8-4955-878c-da6d631d0ab2
[ 0.0 ] Downloading VM '6e95d38f-d9b8-4955-878c-da6d631d0ab2' disks
[ 0.1 ] Downloading full backup for disk '242c082f-ca51-4cf4-82b9-fd3a3c92f38b'
[ 1.3 ] Image transfer 'fefd5712-57ae-47f6-953a-73efbccefac8' is ready
[ 100.00% ] 6.00 GiB, 4.74 seconds, 1.27 GiB/s
[ 6.0 ] Finalizing image transfer
[ 7.1 ] Download completed successfully
[ 7.1 ] Downloading full backup for disk '25178339-51f7-42dd-968e-a28b19a2d640'
[ 8.2 ] Image transfer '51f8321b-4e2b-483a-be66-4274a1b5b7d9' is ready
[ 100.00% ] 2.00 GiB, 0.11 seconds, 18.89 GiB/s
[ 8.3 ] Finalizing image transfer
[ 10.4 ] Download completed successfully
[ 10.5 ] Finished downloading disks

Checking backups:

$ qemu-img info 242c082f-ca51-4cf4-82b9-fd3a3c92f38b.202110181508.full.qcow2
image: 242c082f-ca51-4cf4-82b9-fd3a3c92f38b.202110181508.full.qcow2
file format: qcow2
virtual size: 6 GiB (6442450944 bytes)
disk size: 1.58 GiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

$ qemu-img info 25178339-51f7-42dd-968e-a28b19a2d640.202110181508.full.qcow2
image: 25178339-51f7-42dd-968e-a28b19a2d640.202110181508.full.qcow2
file format: qcow2
virtual size: 2 GiB (2147483648 bytes)
disk size: 196 KiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

The backup of the data disk is correct; it contains no data since the disk was empty when the backup was started.

5.
Finalize the backup

$ ./backup_vm.py -c engine-dev stop 6e95d38f-d9b8-4955-878c-da6d631d0ab2 c9753476-fe7f-4e9d-a7dc-38186a7997cb
[ 0.0 ] Finalizing backup 'c9753476-fe7f-4e9d-a7dc-38186a7997cb'
[ 6.2 ] Backup 'c9753476-fe7f-4e9d-a7dc-38186a7997cb' completed successfully

The VM resumes after the backup was stopped. In the guest, the dd command completed:

# dd if=/dev/zero bs=1M count=2048 of=/dev/sdb oflag=direct conv=fsync status=progress
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 11 s, 193 MB/s
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 11.1553 s, 193 MB/s

In the vdsm log we see:

Backup was stopped:

2021-10-18 15:11:06,724+0300 INFO (jsonrpc/6) [api.virt] START stop_backup(backup_id='c9753476-fe7f-4e9d-a7dc-38186a7997cb') from=::ffff:192.168.122.11,48604, flow_id=3d72a830-95a6-4a57-9a60-4676939c9f0e, vmId=6e95d38f-d9b8-4955-878c-da6d631d0ab2 (api:48)
...
2021-10-18 15:11:06,730+0300 INFO (jsonrpc/6) [api.virt] FINISH stop_backup return={'status': {'code': 0, 'message': 'Done'}} from=::ffff:192.168.122.11,48604, flow_id=3d72a830-95a6-4a57-9a60-4676939c9f0e, vmId=6e95d38f-d9b8-4955-878c-da6d631d0ab2 (api:54)

Soon after that, vdsm detects that the disk needs extension:

2021-10-18 15:11:07,335+0300 DEBUG (periodic/1) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Extension info for drive sdb volume b302faf4-ad02-4163-be25-c56043b52207: BlockInfo(capacity=2147483648, allocation=1074135040, physical=1073741824) (vm:1277)

And tries to extend the disk:

2021-10-18 15:11:07,335+0300 INFO (periodic/1) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Requesting extension for volume b302faf4-ad02-4163-be25-c56043b52207 on domain feab3738-c158-4d48-8a41-b5a95c057a50 (apparent: 1073741824, capacity: 2147483648, allocated: 1074135040, physical: 1073741824 threshold_state: exceeded) (vm:1350)

The data disk was extended:

# lvs feab3738-c158-4d48-8a41-b5a95c057a50/b302faf4-ad02-4163-be25-c56043b52207
LV VG Attr LSize Pool Origin Data% Meta%
Move Log Cpy%Sync Convert
b302faf4-ad02-4163-be25-c56043b52207 feab3738-c158-4d48-8a41-b5a95c057a50 -wi-ao---- 2.25g

The root cause is that vdsm does not try to extend thin disks during backup.
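For context, the chunked extension behavior described in this report can be sketched as follows. This is illustrative only - the 1g chunk size and the 50% free-space watermark are assumptions based on vdsm's default configuration, and the function names are made up for the example, not vdsm's actual code:

```python
from collections import namedtuple

# Mirrors the BlockInfo tuples seen in the vdsm log above.
BlockInfo = namedtuple("BlockInfo", "capacity allocation physical")

GiB = 1024**3
CHUNK_SIZE = 1 * GiB     # assumed extension chunk
UTILIZATION = 0.5        # assumed free-space watermark (50%)

def needs_extension(info):
    # Extend when the guest has written close to (or past) the end
    # of the logical volume backing the thin disk.
    free = info.physical - info.allocation
    return free < CHUNK_SIZE * UTILIZATION

def next_size(info):
    # Grow one chunk at a time; allow going slightly past capacity
    # to leave room for qcow2 metadata (the 2g disk above ended up
    # on a 2.25g LV).
    return min(info.physical + CHUNK_SIZE, info.capacity + CHUNK_SIZE)

# Values from the log: 2g disk, 1g LV, guest wrote past the threshold.
info = BlockInfo(capacity=2147483648, allocation=1074135040, physical=1073741824)
print(needs_extension(info))  # True - allocation already exceeds physical
print(next_size(info))        # 2147483648
```

The bug is not in this decision logic itself; the problem is that during backup the decision was never reached, because no allocation information was available for the drive.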
Suggesting for 4.4.9.
In a normal backup we start the backup and download the disks immediately. Trying to reproduce this with the normal backup flow:

1. Start a VM with a 20g thin disk

2. In the VM, start a dd process writing 20g to the thin disk

# dd if=/dev/zero bs=1M count=20480 of=/dev/sdb oflag=direct conv=fsync status=progress

3. Perform a full backup

$ ./backup_vm.py -c engine-dev full 6e95d38f-d9b8-4955-878c-da6d631d0ab2
[ 0.0 ] Starting full backup for VM '6e95d38f-d9b8-4955-878c-da6d631d0ab2'
[ 0.3 ] Waiting until backup 'd9d862e4-c48a-4b30-ad59-4d6bf98adcfa' is ready
[ 20.5 ] Created checkpoint '1a63fe50-31d3-4c60-bda2-f9b26f541e96' (to use in --from-checkpoint-uuid for the next incremental backup)
[ 20.5 ] Downloading full backup for disk '242c082f-ca51-4cf4-82b9-fd3a3c92f38b'
[ 21.6 ] Image transfer '95f28835-b224-4bd9-adaf-7f1f9f4c0fd7' is ready
[ 100.00% ] 6.00 GiB, 5.45 seconds, 1.10 GiB/s
[ 27.1 ] Finalizing image transfer
[ 28.1 ] Download completed successfully
[ 28.1 ] Downloading full backup for disk 'd6aee2d9-01a7-432b-978d-c8f61b5bbd21'
[ 29.3 ] Image transfer '30bca4a2-62d1-4be5-82dc-31fd50f345d2' is ready
[ 100.00% ] 20.00 GiB, 15.14 seconds, 1.32 GiB/s
[ 44.4 ] Finalizing image transfer
[ 53.5 ] Download completed successfully
[ 53.5 ] Finalizing backup
[ 65.7 ] Full backup 'd9d862e4-c48a-4b30-ad59-4d6bf98adcfa' completed successfully

The VM paused after starting the backup, and resumed when the backup was done.
Checking vdsm log: Backup started: 2021-10-18 16:14:34,860+0300 INFO (jsonrpc/1) [api.virt] START start_backup(config={'backup_id': 'd9d862e4-c48a-4b30-ad59-4d6bf98adcfa', 'disks': [{'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/feab3738-c158-4d48-8a41-b5a95c057a50/images/fa3d4701-e435-4ff8-86b7-1a7d5ea4b29d/622c57d7-5b7c-4f4c-b5e1-d56596b81a53', 'type': 'block'}, 'imageID': '242c082f-ca51-4cf4-82b9-fd3a3c92f38b', 'volumeID': '44d498a1-54a5-4371-8eda-02d839d7c840', 'backup_mode': 'full', 'domainID': 'feab3738-c158-4d48-8a41-b5a95c057a50'}, {'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/feab3738-c158-4d48-8a41-b5a95c057a50/images/ea177d1a-c720-4b34-bf38-1fbeb3601c16/84355ad9-1d51-4d49-82c9-c34f9275ec93', 'type': 'block'}, 'imageID': 'd6aee2d9-01a7-432b-978d-c8f61b5bbd21', 'volumeID': '5354890d-8a69-44db-ab5f-ef76271d8cdf', 'backup_mode': 'full', 'domainID': 'feab3738-c158-4d48-8a41-b5a95c057a50'}], 'from_checkpoint_id': None, 'parent_checkpoint_id': 'cf36474c-22ae-4b89-b164-b7824ba7995a', 'require_consistency': False, 'to_checkpoint_id': '1a63fe50-31d3-4c60-bda2-f9b26f541e96'}) from=::ffff:192.168.122.11,48604, flow_id=a5cfc7e5-385d-4d9b-bb1a-9750406156b4, vmId=6e95d38f-d9b8-4955-878c-da6d631d0ab2 (api:48) 2021-10-18 16:14:35,419+0300 DEBUG (jsonrpc/1) [api] FINISH start_backup response={'result': {'disks': {'242c082f-ca51-4cf4-82b9-fd3a3c92f38b': 'nbd:unix:/run/vdsm/backup/d9d862e4-c48a-4b30-ad59-4d6bf98adcfa:exportname=sda', 'd6aee2d9-01a7-432b-978d-c8f61b5bbd21': 'nbd:unix:/run/vdsm/backup/d9d862e4-c48a-4b30-ad59-4d6bf98adcfa:exportname=sde'}, ... 
VM paused:

2021-10-18 16:14:40,688+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') abnormal vm stop device error enospc (vm:5193)
2021-10-18 16:14:40,689+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028)
2021-10-18 16:14:40,969+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No VM drives were extended (vm:5200)

Backup stopped:

2021-10-18 16:15:08,803+0300 DEBUG (jsonrpc/6) [api] START stop_backup args=(<vdsm.API.VM object at 0x7f75001529b0>, 'd9d862e4-c48a-4b30-ad59-4d6bf98adcfa') kwargs={} (api:122)
2021-10-18 16:15:08,811+0300 DEBUG (jsonrpc/6) [api] FINISH stop_backup response=None (api:137)

Vdsm tries to extend the disk:

2021-10-18 16:15:08,894+0300 DEBUG (periodic/2) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Extension info for drive sde volume 5354890d-8a69-44db-ab5f-ef76271d8cdf: BlockInfo(capacity=21474836480, allocation=7516520448, physical=7516192768) (vm:1277)
2021-10-18 16:15:08,894+0300 INFO (periodic/2) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Requesting extension for volume 5354890d-8a69-44db-ab5f-ef76271d8cdf on domain feab3738-c158-4d48-8a41-b5a95c057a50 (apparent: 7516192768, capacity: 21474836480, allocated: 7516520448, physical: 7516192768 threshold_state: exceeded) (vm:1350)

Guest resumed:

2021-10-18 16:15:13,225+0300 DEBUG (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') event Resumed detail 0 opaque None (vm:5634)
2021-10-18 16:15:13,226+0300 INFO (mailbox-hsm/0) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: continue (vm:6028)

Timeline:
16:14:34 backup started
16:14:40 vm paused
16:15:08 backup stopped
16:15:13 vm resumed

VM was paused for 61 seconds during the backup.

So this is easy to reproduce (and test) with the normal backup flow.
(In reply to Nir Soffer from comment #3) > VM was paused for 61 seconds during the backup. Correction - vm was paused for 33 seconds during backup.
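The corrected number can be checked directly from the timeline timestamps above (paused at 16:14:40, resumed at 16:15:13):

```python
from datetime import datetime

def pause_seconds(paused, resumed, fmt="%H:%M:%S"):
    # Difference between the pause and resume timestamps from the log
    # (assumes both fall on the same day, as in this timeline).
    return (datetime.strptime(resumed, fmt) - datetime.strptime(paused, fmt)).seconds

print(pause_seconds("16:14:40", "16:15:13"))  # 33
```

The 61 seconds quoted first was measured from the backup start (16:14:34) rather than from the pause, plus a reading error; the pause itself lasted 33 seconds.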
Same backup flow with the fix: 1. In the guest, start writing 20g: # dd if=/dev/zero bs=1M count=20480 of=/dev/sdb oflag=direct conv=fsync status=progress 2. Run full backup $ ./backup_vm.py -c engine-dev full --backup-dir /data/scratch/nsoffer/backups 6e95d38f-d9b8-4955-878c-da6d631d0ab2 [ 0.0 ] Starting full backup for VM '6e95d38f-d9b8-4955-878c-da6d631d0ab2' [ 0.3 ] Waiting until backup '87e1cf21-911d-4207-bf2f-0f23dab05825' is ready [ 27.6 ] Created checkpoint 'e67f5ec0-7e66-43c0-bcd7-ad0c8c2cd9c5' (to use in --from-checkpoint-uuid for the next incremental backup) [ 27.7 ] Downloading full backup for disk '58550e51-f836-4bfc-8ae6-d40b5dc750b4' [ 28.8 ] Image transfer '191e9516-535d-472c-bce9-4542b41b4484' is ready [ 100.00% ] 20.00 GiB, 14.69 seconds, 1.36 GiB/s [ 43.5 ] Finalizing image transfer [ 52.6 ] Download completed successfully [ 52.6 ] Downloading full backup for disk '242c082f-ca51-4cf4-82b9-fd3a3c92f38b' [ 53.7 ] Image transfer '4947b46b-d9dd-42b2-9918-1927f052ea04' is ready [ 100.00% ] 6.00 GiB, 6.06 seconds, 1014.58 MiB/s [ 59.8 ] Finalizing image transfer [ 67.9 ] Download completed successfully [ 67.9 ] Finalizing backup [ 73.0 ] Full backup '87e1cf21-911d-4207-bf2f-0f23dab05825' completed successfully Because all backup data is zero, we expect a very small backup: $ qemu-img info /data/scratch/nsoffer/backups/58550e51-f836-4bfc-8ae6-d40b5dc750b4.202110181747.full.qcow2 image: /data/scratch/nsoffer/backups/58550e51-f836-4bfc-8ae6-d40b5dc750b4.202110181747.full.qcow2 file format: qcow2 virtual size: 20 GiB (21474836480 bytes) disk size: 960 KiB cluster_size: 65536 Format specific information: compat: 1.1 compression type: zlib lazy refcounts: false refcount bits: 16 corrupt: false extended l2: false We backed up about 5g of zeroes at the start of the disk: $ qemu-img map --output json /data/scratch/nsoffer/backups/58550e51-f836-4bfc-8ae6-d40b5dc750b4.202110181747.full.qcow2 [{ "start": 0, "length": 5404360704, "depth": 0, "present": true, 
"zero": true, "data": false},
{ "start": 5404360704, "length": 16070475776, "depth": 0, "present": false, "zero": true, "data": false}]

3. Run incremental backup

$ ./backup_vm.py -c engine-dev incremental --backup-dir /data/scratch/nsoffer/backups --from-checkpoint-uuid e67f5ec0-7e66-43c0-bcd7-ad0c8c2cd9c5 6e95d38f-d9b8-4955-878c-da6d631d0ab2
[ 0.0 ] Starting incremental backup for VM '6e95d38f-d9b8-4955-878c-da6d631d0ab2'
[ 0.4 ] Waiting until backup '3369d87d-dd01-4d46-93fb-904a480a79ca' is ready
[ 15.5 ] Created checkpoint '2761217d-c942-4827-8b4e-ccfa20186f1c' (to use in --from-checkpoint-uuid for the next incremental backup)
[ 15.6 ] Downloading incremental backup for disk '242c082f-ca51-4cf4-82b9-fd3a3c92f38b'
[ 16.7 ] Image transfer 'a070fb56-38a4-4eac-b2c6-47d70ce154c6' is ready
[ 100.00% ] 6.00 GiB, 0.22 seconds, 27.31 GiB/s
[ 16.9 ] Finalizing image transfer
[ 18.0 ] Download completed successfully
[ 18.0 ] Downloading incremental backup for disk '58550e51-f836-4bfc-8ae6-d40b5dc750b4'
[ 19.1 ] Image transfer 'd04a873e-327b-4a22-941d-09e79e5ba82a' is ready
[ 100.00% ] 20.00 GiB, 35.66 seconds, 574.39 MiB/s
[ 54.8 ] Finalizing image transfer
[ 63.9 ] Download completed successfully
[ 63.9 ] Finalizing backup
[ 70.2 ] Incremental backup '3369d87d-dd01-4d46-93fb-904a480a79ca' completed successfully

The incremental backup contains more zeroes, so it is also a relatively small file:

$ qemu-img info /data/scratch/nsoffer/backups/58550e51-f836-4bfc-8ae6-d40b5dc750b4.202110181749.incremental.qcow2
image: /data/scratch/nsoffer/backups/58550e51-f836-4bfc-8ae6-d40b5dc750b4.202110181749.incremental.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 2.12 MiB
cluster_size: 65536
Format specific information:
    compat: 1.1
    compression type: zlib
    lazy refcounts: false
    refcount bits: 16
    corrupt: false
    extended l2: false

When I started the incremental backup, the write in the guest was almost finished, so we seem to get the rest of the zeroes until the
end of the disk.

$ qemu-img map --output json /data/scratch/nsoffer/backups/58550e51-f836-4bfc-8ae6-d40b5dc750b4.202110181749.incremental.qcow2
[{ "start": 0, "length": 5403312128, "depth": 0, "present": false, "zero": true, "data": false},
{ "start": 5403312128, "length": 16071524352, "depth": 0, "present": true, "zero": true, "data": false}]

The VM paused a few times during the backup, each time very briefly - expected when writing as fast as possible to a thin disk with the default configuration.

Looking in the vdsm log:

1. Full backup starting...

2021-10-18 17:47:44,862+0300 INFO (jsonrpc/1) [api.virt] START start_backup(config={'backup_id': '87e1cf21-911d-4207-bf2f-0f23dab05825', 'disks': [{'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/feab3738-c158-4d48-8a41-b5a95c057a50/images/782f6dce-00cf-4e56-b595-6f990f1764b6/214b2565-72fd-4e4d-b4ed-a96f33040348', 'type': 'block'}, 'imageID': '58550e51-f836-4bfc-8ae6-d40b5dc750b4', 'volumeID': '344b8342-47ea-4033-bf68-008806eeeeac', 'backup_mode': 'full', 'domainID': 'feab3738-c158-4d48-8a41-b5a95c057a50'}, {'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/feab3738-c158-4d48-8a41-b5a95c057a50/images/cfc5096e-a8e0-4e9e-b65e-341daeaff190/ec6d9517-3291-4703-8b6a-5d1e7536533e', 'type': 'block'}, 'imageID': '242c082f-ca51-4cf4-82b9-fd3a3c92f38b', 'volumeID': '44d498a1-54a5-4371-8eda-02d839d7c840', 'backup_mode': 'full', 'domainID': 'feab3738-c158-4d48-8a41-b5a95c057a50'}], 'from_checkpoint_id': None, 'parent_checkpoint_id': '1a63fe50-31d3-4c60-bda2-f9b26f541e96', 'require_consistency': False, 'to_checkpoint_id': 'e67f5ec0-7e66-43c0-bcd7-ad0c8c2cd9c5'}) from=::ffff:192.168.122.11,50832, flow_id=2a7b49d0-2c11-47b7-9f88-2e84d65693ee, vmId=6e95d38f-d9b8-4955-878c-da6d631d0ab2 (api:48)

2.
Vm was paused 2021-10-18 17:47:44,875+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') abnormal vm stop device ua-58550e51-f836-4bfc-8ae6-d40b5dc750b4 error enospc (vm:5193) 2021-10-18 17:47:44,875+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028) 3. Vdsm extends the disk 2021-10-18 17:47:44,893+0300 DEBUG (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Extension info for drive sdb volume 344b8342-47ea-4033-bf68-008806eeeeac: BlockInfo(capacity=21474836480, allocation=5369757696, physical=5368709120) (vm:1277) 2021-10-18 17:47:44,893+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Requesting extension for volume 344b8342-47ea-4033-bf68-008806eeeeac on domain feab3738-c158-4d48-8a41-b5a95c057a50 (apparent: 5368709120, capacity: 21474836480, allocated: 5369757696, physical: 5368709120 threshold_state: exceeded) (vm:1350) 4. Full backup was started 2021-10-18 17:47:45,650+0300 DEBUG (jsonrpc/1) [api] FINISH start_backup response={'result': {'disks': {'58550e51-f836-4bfc-8ae6-d40b5dc750b4': 'nbd:unix:/run/vdsm/backup/87e1cf21-911d-4207-bf2f-0f23dab05825:exportname=sdb', '242c082f-ca51-4cf4-82b9-fd3a3c92f38b': 'nbd:unix:/run/vdsm/backup/87e1cf21-911d-4207-bf2f-0f23dab05825:exportname=sda'} ... 5. VM resumed 2021-10-18 17:47:45,397+0300 DEBUG (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') event Resumed detail 0 opaque None (vm:5634) 2021-10-18 17:47:45,405+0300 INFO (mailbox-hsm/4) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: continue (vm:6028) 6. VM paused 2021-10-18 17:48:08,265+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') abnormal vm stop device error enospc (vm:5193) 2021-10-18 17:48:08,266+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028) 7. 
Vdsm tries to extend the disk

2021-10-18 17:48:08,279+0300 WARN (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:08,279+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') Requesting extension for volume 344b8342-47ea-4033-bf68-008806eeeeac on domain feab3738-c158-4d48-8a41-b5a95c057a50 (apparent: 7516192768, capacity: 21474836480, allocated: 0, physical: 7516192768 threshold_state: exceeded) (vm:1350)

8. VM resumed

2021-10-18 17:48:09,429+0300 INFO (mailbox-hsm/3) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: continue (vm:6028)

9. Backup stopped

2021-10-18 17:48:26,699+0300 INFO (jsonrpc/3) [api.virt] START stop_backup(backup_id='87e1cf21-911d-4207-bf2f-0f23dab05825') from=::ffff:192.168.122.11,50832, flow_id=2529ef2a-973e-4425-9ea0-999b6d4550f0, vmId=6e95d38f-d9b8-4955-878c-da6d631d0ab2 (api:48)

Collecting all pause/resume events during the backup:

# egrep 'CPU stopped:|CPU running:' vdsm.log
2021-10-18 17:47:44,875+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028)
2021-10-18 17:47:44,898+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onSuspend (vm:6028)
2021-10-18 17:47:44,903+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028)
2021-10-18 17:47:45,405+0300 INFO (mailbox-hsm/4) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: continue (vm:6028)
2021-10-18 17:47:45,498+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: onResume (vm:6028)

Pause time: 0.5 seconds

2021-10-18 17:48:08,266+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028)
2021-10-18 17:48:08,286+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onSuspend (vm:6028)
2021-10-18 17:48:09,429+0300 INFO (mailbox-hsm/3) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: continue (vm:6028)
2021-10-18 17:48:09,525+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: onResume (vm:6028)

Pause time: 1.2 seconds

2021-10-18 17:48:24,250+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onIOError (vm:6028)
2021-10-18 17:48:24,329+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU stopped: onSuspend (vm:6028)
2021-10-18 17:48:25,404+0300 INFO (mailbox-hsm/3) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: continue (vm:6028)
2021-10-18 17:48:25,502+0300 INFO (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') CPU running: onResume (vm:6028)

Pause time: 1.2 seconds

We can see that vdsm handles ENOSPC events correctly; the VM is paused for a short time and resumes when vdsm completes the extension.
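The pause times quoted above can be extracted from the log by pairing the first "CPU stopped" event of each pause with the next "CPU running" event. A rough sketch (the sample lines below are trimmed copies of the log lines above):

```python
import re
from datetime import datetime

# Trimmed sample of the egrep output above (second pause).
LOG = """\
2021-10-18 17:48:08,266+0300 INFO (libvirt/events) [virt.vm] CPU stopped: onIOError (vm:6028)
2021-10-18 17:48:08,286+0300 INFO (libvirt/events) [virt.vm] CPU stopped: onSuspend (vm:6028)
2021-10-18 17:48:09,429+0300 INFO (mailbox-hsm/3) [virt.vm] CPU running: continue (vm:6028)
2021-10-18 17:48:09,525+0300 INFO (libvirt/events) [virt.vm] CPU running: onResume (vm:6028)
"""

def pause_times(log):
    # Pair the first "CPU stopped" of each pause with the first
    # "CPU running" that follows; duplicate events are ignored.
    pattern = re.compile(r"^(\S+ \S+?)\+\d+ .*CPU (stopped|running)", re.MULTILINE)
    pauses = []
    stopped_at = None
    for m in pattern.finditer(log):
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if m.group(2) == "stopped" and stopped_at is None:
            stopped_at = ts
        elif m.group(2) == "running" and stopped_at is not None:
            pauses.append((ts - stopped_at).total_seconds())
            stopped_at = None
    return pauses

print(pause_times(LOG))  # ~1.2 seconds, matching the analysis above
```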
We log a new warning when libvirt reports allocation=0 while checking whether a disk should be extended. Here are the new warnings during the backup:

# grep 'No allocation info' vdsm.log
2021-10-18 17:47:49,174+0300 WARN (periodic/3) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:47:51,114+0300 WARN (periodic/0) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:07,092+0300 WARN (periodic/3) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:08,279+0300 WARN (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:09,101+0300 WARN (periodic/1) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:13,140+0300 WARN (periodic/0) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:15,153+0300 WARN (periodic/2) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:23,098+0300 WARN (periodic/2) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
2021-10-18 17:48:24,316+0300 WARN (libvirt/events) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251) 2021-10-18 17:48:25,118+0300 WARN (periodic/1) [virt.vm] (vmId='6e95d38f-d9b8-4955-878c-da6d631d0ab2') No allocation info for drive sdb, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251)
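The fallback behavior behind these warnings can be sketched like this. Illustrative pseudologic only - the function and constant names are invented for the example and are not vdsm's actual code:

```python
EXCEEDED = "exceeded"  # the drive's block threshold was already crossed

def drive_needs_extension(allocation, threshold_state):
    # Normally the decision is based on the allocation libvirt reports.
    # During backup libvirt may report allocation=0 (libvirt bug 2015281),
    # so when the block threshold was already exceeded we assume the
    # drive needs extension instead of silently doing nothing - this is
    # where the "No allocation info ... assuming that drive needs
    # extension" warning would be logged.
    if allocation == 0:
        return threshold_state == EXCEEDED
    return threshold_state == EXCEEDED

print(drive_needs_extension(0, EXCEEDED))  # True  - assume extension needed
print(drive_needs_extension(0, "set"))     # False - threshold not crossed
```

The point of the warning is visibility: the extension still happens, but the decision is based on the threshold event alone rather than on a real allocation value.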
The root cause is libvirt bug 2015281.
A build for 4.4.9 is available here in case you want to test this before the official compose. https://jenkins.ovirt.org/job/vdsm_standard-check-patch/30223/artifact/build-artifacts.build-py3.el8stream.x86_64/
Hi Nir,

Is this a new issue (regression) in RHEL 8.5 / RHV 4.4.9? Will we need this workaround backported to earlier RHV 4.4.8 / RHEL 8.4 builds as well?
(In reply to Avihai from comment #9) > Is this a RHEL8.5/RHV4.4.9 new issue(regression) ? I did not test. > Will we need this workaround backported to earlier RHV4.4.8/RHEL8.4 builds > as well? Unlikely.
Failed verification according to steps in comment

Versions:
ovirt-engine-4.4.9.3-0.3.el8ev.noarch
vdsm-4.40.90.3-1.el8ev.x86_64
libvirt-7.6.0-5.module+el8.5.0+12933+58cb48a1

Steps to reproduce:

1. Start a VM with a 20g thin disk
2. In the VM, start a dd process, writing 20g to the thin disk

# dd if=/dev/zero bs=1M count=20480 of=/dev/sda oflag=direct conv=fsync status=progress

3. Perform full backup

[root@storage-ge3-vdsm3 examples]# ./backup_vm.py -c engine full 4faa9712-c1aa-4cc2-8396-169138858eea
[ 0.0 ] Starting full backup for VM '4faa9712-c1aa-4cc2-8396-169138858eea'
Traceback (most recent call last):
  File "./backup_vm.py", line 525, in <module>
    main()
  File "./backup_vm.py", line 191, in main
    args.command(args)
  File "./backup_vm.py", line 205, in cmd_full
    backup = start_backup(connection, args)
  File "./backup_vm.py", line 336, in start_backup
    description=args.description
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py", line 34151, in add
    return self._internal_add(backup, headers, query, wait)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 232, in _internal_add
    return future.wait() if wait else future
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 229, in callback
    self._check_fault(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot backup VM. The Virtual Machine should be in Up or Down status.]". HTTP response code is 409.
Host: 2021-10-21 11:15:49,471+0300 INFO (libvirt/events) [virt.vm] (vmId='4faa9712-c1aa-4cc2-8396-169138858eea') abnormal vm stop device ua-02921e71-459d-4b52-844d-380b91e3f110 error enospc (vm:5193) 2021-10-21 11:15:49,471+0300 INFO (libvirt/events) [virt.vm] (vmId='4faa9712-c1aa-4cc2-8396-169138858eea') CPU stopped: onIOError (vm:6028) 2021-10-21 11:15:49,489+0300 INFO (libvirt/events) [virt.vm] (vmId='4faa9712-c1aa-4cc2-8396-169138858eea') Requesting extension for volume 6296825e-4e12-4314-a13a-1be6df45950f on domain 759f583c-6331-4820-9ba6-ec971ef62b3c (apparent: 1073741824, capacity: 21474836480, allocated: 1074135040, physical: 1073741824 threshold_state: exceeded) (vm:1350) 2021-10-21 11:15:49,489+0300 DEBUG (libvirt/events) [storage.TaskManager.Task] (Task='e339e041-0511-4154-b807-261d8fc795bb') moving from state finished -> state preparing (task:624)
Adding some information about the VM during this error: the VM was paused and resumed several times while writing to the disk, and during all this time got these errors. The disk was fully used at the end of the dd run, and then the VM returned to the Active state.
(In reply to sshmulev from comment #12)
> Failed verification according to steps in comment
>
> Versions:
> ovirt-engine-4.4.9.3-0.3.el8ev.noarch
> vdsm-4.40.90.3-1.el8ev.x86_64
> libvirt-7.6.0-5.module+el8.5.0+12933+58cb48a1
>
> steps to reproduce:
>
> 1. Start a VM with 20g thin disk
> 2. In the VM, starts dd process, writing 20g to the think disk
>
> # dd if=/dev/zero bs=1M count=20480 of=/dev/sda oflag=direct conv=fsync
> status=progress
>
> 3. Perform full backup
> [root@storage-ge3-vdsm3 examples]# ./backup_vm.py -c engine full
> 4faa9712-c1aa-4cc2-8396-169138858eea
> [ 0.0 ] Starting full backup for VM '4faa9712-c1aa-4cc2-8396-169138858eea'
> Traceback (most recent call last):
...
> ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is
> "[Cannot backup VM. The Virtual Machine should be in Up or Down status.]".
> HTTP response code is 409.

This is an expected failure; we don't allow starting a backup when the VM is paused. You should retry the backup operation in this case.

> Host:
> 2021-10-21 11:15:49,471+0300 INFO (libvirt/events) [virt.vm]
> (vmId='4faa9712-c1aa-4cc2-8396-169138858eea') abnormal vm stop device
> ua-02921e71-459d-4b52-844d-380b91e3f110 error enospc (vm:5193)
> 2021-10-21 11:15:49,471+0300 INFO (libvirt/events) [virt.vm]
> (vmId='4faa9712-c1aa-4cc2-8396-169138858eea') CPU stopped: onIOError
> (vm:6028)
> 2021-10-21 11:15:49,489+0300 INFO (libvirt/events) [virt.vm]
> (vmId='4faa9712-c1aa-4cc2-8396-169138858eea') Requesting extension for
> volume 6296825e-4e12-4314-a13a-1be6df45950f on domain
> 759f583c-6331-4820-9ba6-ec971ef62b3c (apparent: 1073741824, capacity:
> 21474836480, allocated: 1074135040, physical: 1073741824 threshold_state:
> exceeded) (vm:1350)

If this happens during the backup, it shows that the issue is fixed. Without the fix we do not see extend attempts. Please check comment 5 and comment 6 carefully; they show what to look for in the vdsm logs.
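A backup client can handle this 409 conflict by retrying the operation. A minimal generic retry helper might look like this (a sketch - the delay, retry count, and the way the error is matched against ovirtsdk4's exception are assumptions, not part of the SDK):

```python
import time

def retry(func, retries=5, delay=10, should_retry=lambda e: True):
    # Call func(), retrying when it raises an error that should_retry
    # accepts, e.g. the "VM should be in Up or Down status" conflict
    # seen above while the VM is briefly paused.
    for attempt in range(retries):
        try:
            return func()
        except Exception as e:
            if attempt == retries - 1 or not should_retry(e):
                raise
            time.sleep(delay)

# Example with a flaky operation that fails once and then succeeds,
# standing in for the SDK call that raised the 409.
calls = []
def start_backup():
    calls.append(1)
    if len(calls) < 2:
        raise RuntimeError("Cannot backup VM. The Virtual Machine "
                           "should be in Up or Down status.")
    return "backup-id"

print(retry(start_backup, delay=0))  # backup-id
```

In a real client you would pass a `should_retry` that inspects the `ovirtsdk4.Error` message or HTTP code, so unrelated failures still fail fast.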
Verified.

Versions:
ovirt-engine-4.4.9.3-0.3.el8ev.noarch
vdsm-4.40.90.3-1.el8ev.x86_64
libvirt-7.6.0-5.module+el8.5.0+12933+58cb48a1

Steps to reproduce:

1. Start a VM with a 20g thin disk
2. In the VM, start a dd process, writing 20g to the thin disk

# dd if=/dev/zero bs=1M count=20480 of=/dev/sda oflag=direct conv=fsync status=progress

During the write, started a backup.

Full backup:

[root@storage-ge3-vdsm3 examples]# ./backup_vm.py -c engine full dc411543-5a9c-440e-9567-12b83214ea28
[ 0.0 ] Starting full backup for VM 'dc411543-5a9c-440e-9567-12b83214ea28'
[ 1.3 ] Waiting until backup 'bf26ebff-b3b3-447e-b284-5871e54f553f' is ready
[ 21.6 ] Created checkpoint '09a6a04a-8b82-4844-b2dc-06cd8f2fdd9b' (to use in --from-checkpoint-uuid for the next incremental backup)
[ 21.7 ] Downloading full backup for disk '27616873-0fb2-4df5-9336-1a52b95614d8'
[ 22.9 ] Image transfer '2a4551db-bf02-4e8d-8e8f-30ed1fbe6edc' is ready
[ 100.00% ] 20.00 GiB, 17.28 seconds, 1.16 GiB/s
[ 40.1 ] Finalizing image transfer
[ 48.3 ] Download completed successfully
[ 48.3 ] Downloading full backup for disk 'a95c2a14-8bd1-4ab4-8123-a5967bf7787a'
[ 49.4 ] Image transfer '360cd5b3-e09e-4cff-8029-2f777719eda2' is ready
[ 100.00% ] 10.00 GiB, 13.56 seconds, 755.12 MiB/s
[ 63.0 ] Finalizing image transfer
[ 74.2 ] Download completed successfully
[ 74.2 ] Finalizing backup
[ 86.5 ] Full backup 'bf26ebff-b3b3-447e-b284-5871e54f553f' completed successfully

And then incremental backup:

[root@storage-ge3-vdsm3 examples]# ./backup_vm.py -c engine incremental --from-checkpoint-uuid 09a6a04a-8b82-4844-b2dc-06cd8f2fdd9b b08f99b4-ddad-46c3-93ff-74b3140e4824
[ 0.0 ] Starting incremental backup for VM 'b08f99b4-ddad-46c3-93ff-74b3140e4824'
[ 0.2 ] VM 'b08f99b4-ddad-46c3-93ff-74b3140e4824' does not exist

[root@storage-ge3-vdsm3 examples]# ./backup_vm.py -c engine incremental --from-checkpoint-uuid 09a6a04a-8b82-4844-b2dc-06cd8f2fdd9b dc411543-5a9c-440e-9567-12b83214ea28
[ 0.0 ] Starting incremental backup for VM
'dc411543-5a9c-440e-9567-12b83214ea28' [ 1.4 ] Waiting until backup '3d217d41-f249-471f-9c23-2277ce434bdb' is ready [ 18.6 ] Created checkpoint 'e160c9ed-1d37-455e-8c9d-7b3d3e786c5e' (to use in --from-checkpoint-uuid for the next incremental backup) [ 18.7 ] Downloading incremental backup for disk '27616873-0fb2-4df5-9336-1a52b95614d8' [ 19.8 ] Image transfer '2ee53e1c-b0b3-40c7-b506-6a2b3ef13fc0' is ready [ 100.00% ] 20.00 GiB, 48.53 seconds, 422.04 MiB/s [ 68.3 ] Finalizing image transfer [ 75.5 ] Download completed successfully [ 75.5 ] Downloading incremental backup for disk 'a95c2a14-8bd1-4ab4-8123-a5967bf7787a' [ 76.6 ] Image transfer '98be34d3-5c19-4295-bc9a-eb80ab4df2ac' is ready [ 100.00% ] 10.00 GiB, 0.17 seconds, 57.49 GiB/s [ 76.7 ] Finalizing image transfer [ 79.8 ] Download completed successfully [ 79.8 ] Finalizing backup [ 84.0 ] Incremental backup '3d217d41-f249-471f-9c23-2277ce434bdb' completed successfully start_backup: 2021-10-21 18:12:11,043+0300 INFO (jsonrpc/1) [api.virt] START start_backup(config={'backup_id': 'bf26ebff-b3b3-447e-b284-5871e54f553f', 'disks': [{'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/d754ab8a-cffc-49d0-b45b-e41b9502a7a1/images/32ede744-e380-415f-aa59-2138a38ce618/8707d29a-1ff4-4326-b3e5-3833c918edb7', 'type': 'block'}, 'imageID': '27616873-0fb2-4df5-9336-1a52b95614d8', 'volumeID': '7e392d9b-4ba0-4bbb-81b1-b9ee3562e048', 'backup_mode': 'full', 'domainID': 'd754ab8a-cffc-49d0-b45b-e41b9502a7a1'}, {'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/759f583c-6331-4820-9ba6-ec971ef62b3c/images/05a23cf2-0288-495b-b152-f20175619b66/3a029b94-1119-498a-825c-019bd6e5b020', 'type': 'block'}, 'imageID': 'a95c2a14-8bd1-4ab4-8123-a5967bf7787a', 'volumeID': '01205056-6033-4b98-8cda-751a1d2150a3', 'backup_mode': 'full', 'domainID': '759f583c-6331-4820-9ba6-ec971ef62b3c'}], 'from_checkpoint_id': None, 'parent_checkpoint_id': None, 'require_consistency': False, 'to_checkpoint_id': 
'09a6a04a-8b82-4844-b2dc-06cd8f2fdd9b'}) from=::ffff:10.46.12.144,52796, flow_id=e747aac2-b2e0-43e8-bc7c-db6eb09df60d, vmId=dc411543-5a9c-440e-9567-12b83214ea28 (api:48) VM paused: 2021-10-21 18:12:11,870+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU stopped: onIOError (vm:6028) 2021-10-21 18:12:11,892+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU stopped: onSuspend (vm:6028) 2021-10-21 18:12:11,044+0300 INFO (jsonrpc/1) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') Freezing guest filesystems (vm:4276) 2021-10-21 18:12:11,097+0300 INFO (jsonrpc/1) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') 1 guest filesystems frozen (vm:4295) Requesting for extension: 2021-10-21 18:12:11,870+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') abnormal vm stop device error enospc (vm:5193) 2021-10-21 18:12:11,870+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU stopped: onIOError (vm:6028) 2021-10-21 18:12:11,887+0300 WARN (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') No allocation info for drive sda, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251) 2021-10-21 18:12:11,887+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') Requesting extension for volume 7e392d9b-4ba0-4bbb-81b1-b9ee3562e048 on domain d754ab8a-cffc-49d0-b45b-e41b9502a7a1 (apparent: 5368709120, capacity: 21474836480, allocated: 0, physical: 5368709120 threshold_state: exceeded) (vm:1350) 2021-10-21 18:12:11,098+0300 INFO (jsonrpc/1) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') Starting backup for backup_id: 'bf26ebff-b3b3-447e-b284-5871e54f553f', backup xml: <?xml version='1.0' encoding='utf-8'?> VM running again: 2021-10-21 18:12:11,219+0300 INFO (jsonrpc/1) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') Thawing guest 
filesystems (vm:4303) 2021-10-21 18:12:11,230+0300 INFO (jsonrpc/1) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') 1 guest filesystems thawed (vm:4322) 2021-10-21 18:12:15,221+0300 INFO (mailbox-hsm/3) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU running: continue (vm:6028) 2021-10-21 18:12:15,319+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU running: onResume (vm:6028) start_backup: 2021-10-21 18:13:49,512+0300 INFO (jsonrpc/5) [api.virt] START start_backup(config={'backup_id': '3d217d41-f249-471f-9c23-2277ce434bdb', 'disks': [{'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/d754ab8a-cffc-49d0-b45b-e41b9502a7a1/images/4db0c94c-5702-4802-8baa-5c1b4b7c1f30/85733056-4b8b-4bce-b3d7-79de890c6a18', 'type': 'block'}, 'imageID': '27616873-0fb2-4df5-9336-1a52b95614d8', 'volumeID': '7e392d9b-4ba0-4bbb-81b1-b9ee3562e048', 'backup_mode': 'incremental', 'domainID': 'd754ab8a-cffc-49d0-b45b-e41b9502a7a1'}, {'checkpoint': True, 'scratch_disk': {'path': '/rhev/data-center/mnt/blockSD/759f583c-6331-4820-9ba6-ec971ef62b3c/images/a8e4b6e7-a3db-46e9-8cf1-1bc7f21c54e8/e1461e78-d7fa-4a1f-93c7-569e5367f57e', 'type': 'block'}, 'imageID': 'a95c2a14-8bd1-4ab4-8123-a5967bf7787a', 'volumeID': '01205056-6033-4b98-8cda-751a1d2150a3', 'backup_mode': 'incremental', 'domainID': '759f583c-6331-4820-9ba6-ec971ef62b3c'}], 'from_checkpoint_id': '09a6a04a-8b82-4844-b2dc-06cd8f2fdd9b', 'parent_checkpoint_id': '09a6a04a-8b82-4844-b2dc-06cd8f2fdd9b', 'require_consistency': False, 'to_checkpoint_id': 'e160c9ed-1d37-455e-8c9d-7b3d3e786c5e'}) from=::ffff:10.46.12.144,52796, flow_id=f297676e-4aeb-4e04-a497-bf0bdd9a2769, vmId=dc411543-5a9c-440e-9567-12b83214ea28 (api:48) VM paused: 2021-10-21 18:13:03,892+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU stopped: onIOError (vm:6028) 2021-10-21 18:13:03,914+0300 INFO (libvirt/events) [virt.vm] 
(vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU stopped: onSuspend (vm:6028) 2021-10-21 18:13:49,513+0300 INFO (jsonrpc/5) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') Freezing guest filesystems (vm:4276) 2021-10-21 18:13:49,544+0300 INFO (jsonrpc/5) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') 1 guest filesystems frozen (vm:4295) Requesting for extension: 2021-10-21 18:13:52,728+0300 WARN (periodic/3) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') No allocation info for drive sda, but block threshold was exceeded - assuming that drive needs extension (drivemonitor:251) 2021-10-21 18:13:52,729+0300 INFO (periodic/3) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') Requesting extension for volume 7e392d9b-4ba0-4bbb-81b1-b9ee3562e048 on domain d754ab8a-cffc-49d0-b45b-e41b9502a7a1 (apparent: 20401094656, capacity: 21474836480, allocated: 0, physical: 20401094656 threshold_state: exceeded) (vm:1350) VM resumes: 2021-10-21 18:13:08,894+0300 INFO (mailbox-hsm/3) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU stopped: continue (vm:6028) 2021-10-21 18:13:08,989+0300 INFO (libvirt/events) [virt.vm] (vmId='dc411543-5a9c-440e-9567-12b83214ea28') CPU running: onResume (vm:6028) stop_backup: 2021-10-21 18:14:51,156+0300 INFO (jsonrpc/2) [api.virt] START stop_backup(backup_id='3d217d41-f249-471f-9c23-2277ce434bdb') from=::ffff:10.46.12.144,52796, flow_id=f350c4f4-f8f3-4e06-b2e4-aafd07d9cc65, vmId=dc411543-5a9c-440e-9567-12b83214ea28 (api:48) 2021-10-21 18:14:51,165+0300 INFO (jsonrpc/2) [storage.backup] Removing scratch disks for backup id: 3d217d41-f249-471f-9c23-2277ce434bdb (backup:587) 2021-10-21 18:14:51,166+0300 DEBUG (jsonrpc/2) [storage.TaskManager.Task] (Task='effafdd1-7f1f-4830-a578-14d641cb3308') moving from state finished -> state preparing (task:624) 2021-10-21 18:14:51,166+0300 INFO (jsonrpc/2) [vdsm.api] START list_transient_disks(owner_name='dc411543-5a9c-440e-9567-12b83214ea28') from=::ffff:10.46.12.144,52796, 
flow_id=f350c4f4-f8f3-4e06-b2e4-aafd07d9cc65, task_id=effafdd1-7f1f-4830-a578-14d641cb3308 (api:48)
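The "what to look for in vdsm logs" advice from the earlier reply can be sketched as a grep. The sample below is a trimmed excerpt of the log lines quoted above; on a real host you would run the same grep against /var/log/vdsm/vdsm.log instead.

```shell
# Trimmed excerpt of the vdsm log lines from this verification run.
sample='2021-10-21 18:12:11,870+0300 INFO (libvirt/events) [virt.vm] abnormal vm stop device error enospc (vm:5193)
2021-10-21 18:12:11,887+0300 INFO (libvirt/events) [virt.vm] Requesting extension for volume 7e392d9b-4ba0-4bbb-81b1-b9ee3562e048 on domain d754ab8a-cffc-49d0-b45b-e41b9502a7a1 (vm:1350)
2021-10-21 18:12:15,319+0300 INFO (libvirt/events) [virt.vm] CPU running: onResume (vm:6028)'

# With the fix, every ENOSPC pause should be followed by an extension
# request and a resume; without the fix, no "Requesting extension" line
# appears while the backup is running.
printf '%s\n' "$sample" | grep -E 'abnormal vm stop|Requesting extension|CPU running'
```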