Bug 1883949 - ovirt_disk Ansible module uses the physical size of a qcow2 file instead of the virtual size
Summary: ovirt_disk Ansible module uses the physical size of a qcow2 file instead of the virtual size
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-ansible-collection
Version: 4.3.10
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ovirt-4.5.0
Target Release: 4.5.0
Assignee: Martin Necas
QA Contact: Barbora Dolezalova
URL:
Whiteboard:
Depends On: 1923178 2014017
Blocks:
 
Reported: 2020-09-30 14:54 UTC by Juan Orti
Modified: 2022-06-13 13:06 UTC
CC List: 6 users

Fixed In Version: ovirt-ansible-collection-2.0.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-26 17:25:09 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:
emarcus: needinfo-
bdolezal: testing_plan_complete+


Attachments: none


Links
System ID Private Priority Status Summary Last Updated
Github oVirt ovirt-ansible-collection pull 183 0 None closed ovirt_disk: automatically detect virtual size of qcow image 2021-02-21 07:42:02 UTC
Github oVirt ovirt-ansible-collection pull 215 0 None closed ovirt_disk: fix transfer ending 2021-02-21 07:42:02 UTC
Github oVirt ovirt-ansible-collection pull 358 0 None Merged ovirt_disk: use imageio client 2021-11-23 14:36:17 UTC
Red Hat Knowledge Base (Solution) 5445891 0 None None None 2020-09-30 14:54:30 UTC
Red Hat Product Errata RHSA-2022:4712 0 None None None 2022-05-26 17:25:31 UTC

Description Juan Orti 2020-09-30 14:54:31 UTC
Description of problem:
When uploading a qcow2 disk to RHV 4.3 with the ovirt_disk Ansible module and the size is not specified, the module uses the physical file size instead of the qcow2 virtual size.

Version-Release number of selected component (if applicable):
ansible-2.9.10-1.el7ae.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create playbook to upload qcow2 disk

---
- name: Create thin vdisk in ovirt playbook
  hosts: localhost
  gather_facts: no
  vars:
    ovirt_url: "https://rhvm.example.com/ovirt-engine/api"
    ovirt_username: admin@internal
    ovirt_password: 1234
    ca_file: "/etc/pki/ovirt-engine/ca.pem"

  tasks:
    - block:
      - name: Obtain SSO token using username/password credentials
        ovirt_auth:
          url: "{{ ovirt_url }}"
          username: "{{ ovirt_username }}"
          password: "{{ ovirt_password }}"
          ca_file: "{{ ca_file }}"
      
      - name: Upload qcow2 image and create Disk
        ovirt_disk:
          auth: "{{ ovirt_auth }}"
          name: "Disk-test"
          #size: "4GiB"  <----- Do not specify size
          format: cow
          sparse: true
          sparsify: true
          bootable: yes
          upload_image_path: /var/tmp/Fedora-Cloud-Base-32-1.6.x86_64.qcow2
          storage_domain: data-block
          interface: virtio

      always:
        - name: Always revoke the SSO token
          ovirt_auth:
            state: absent
            ovirt_auth: "{{ ovirt_auth }}"

2. Run playbook

Actual results:

The task fails in RHVM:

VDSM rhvh01.example.com command VerifyUntrustedVolumeVDS failed: Image verification failed: 'reason=Image virtual size 4294967296 is bigger than volume size 302841856'

The Ansible playbook fails:

TASK [Upload qcow2 image and create Disk] *****************************************************************************************************************************************************
task path: /root/ansible/create-thin-vdisk.yml:20
<127.0.0.1> ESTABLISH LOCAL CONNECTION FOR USER: root
<127.0.0.1> EXEC /bin/sh -c 'echo ~root && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '( umask 77 && mkdir -p "` echo /root/.ansible/tmp `"&& mkdir /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342 && echo ansible-tmp-1601477202.35-10722-36398564875342="` echo /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342 `" ) && sleep 0'
Using module file /usr/lib/python2.7/site-packages/ansible/modules/cloud/ovirt/ovirt_disk.py
<127.0.0.1> PUT /root/.ansible/tmp/ansible-local-10676mFQ18F/tmp48vOSf TO /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342/AnsiballZ_ovirt_disk.py
<127.0.0.1> EXEC /bin/sh -c 'chmod u+x /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342/ /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342/AnsiballZ_ovirt_disk.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c '/usr/bin/python2 /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342/AnsiballZ_ovirt_disk.py && sleep 0'
<127.0.0.1> EXEC /bin/sh -c 'rm -f -r /root/.ansible/tmp/ansible-tmp-1601477202.35-10722-36398564875342/ > /dev/null 2>&1 && sleep 0'
The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_ovirt_disk_payload_ZfeNgf/ansible_ovirt_disk_payload.zip/ansible/modules/cloud/ovirt/ovirt_disk.py", line 754, in main
  File "/tmp/ansible_ovirt_disk_payload_ZfeNgf/ansible_ovirt_disk_payload.zip/ansible/modules/cloud/ovirt/ovirt_disk.py", line 489, in upload_disk_image
  File "/tmp/ansible_ovirt_disk_payload_ZfeNgf/ansible_ovirt_disk_payload.zip/ansible/modules/cloud/ovirt/ovirt_disk.py", line 428, in transfer
Exception: Error occurred while uploading image. The transfer is in finalizing_failure
fatal: [localhost]: FAILED! => {
    "changed": false, 
    "invocation": {
        "module_args": {
            "activate": null, 
            "auth": {
                "ca_file": "/etc/pki/ovirt-engine/ca.pem", 
                "compress": true, 
                "headers": null, 
                "insecure": false, 
                "kerberos": false, 
                "timeout": 0, 
                "token": "<redacted>", 
                "url": "https://rhvm.example.com/ovirt-engine/api"
            }, 
            "bootable": true, 
            "content_type": "data", 
            "description": null, 
            "download_image_path": null, 
            "fetch_nested": false, 
            "force": false, 
            "format": "cow", 
            "host": null, 
            "id": "0beb0291-8157-472d-b801-c94d1eb9631c", 
            "image_provider": null, 
            "interface": "virtio", 
            "logical_unit": null, 
            "name": "Disk-test", 
            "nested_attributes": [], 
            "openstack_volume_type": null, 
            "poll_interval": 3, 
            "profile": null, 
            "quota_id": null, 
            "shareable": null, 
            "size": null, 
            "sparse": true, 
            "sparsify": true, 
            "state": "present", 
            "storage_domain": "data-block", 
            "storage_domains": null, 
            "timeout": 180, 
            "upload_image_path": "/var/tmp/Fedora-Cloud-Base-32-1.6.x86_64.qcow2", 
            "vm_id": null, 
            "vm_name": null, 
            "wait": true, 
            "wipe_after_delete": null
        }
    }, 
    "msg": "Error occurred while uploading image. The transfer is in finalizing_failure"
}


Expected results:
ovirt_disk should use the qcow2 virtual disk size.


Additional info:

# ls -l Fedora-Cloud-Base-32-1.6.x86_64.qcow2 
-rw-r--r--. 1 root root 302841856 sep 29 11:12 Fedora-Cloud-Base-32-1.6.x86_64.qcow2

# qemu-img info Fedora-Cloud-Base-32-1.6.x86_64.qcow2
image: Fedora-Cloud-Base-32-1.6.x86_64.qcow2
file format: qcow2
virtual size: 4.0G (4294967296 bytes) <--------
disk size: 289M
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16
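
For reference, the virtual size can also be read programmatically. A minimal sketch (an illustration, not the module's actual code) that shells out to qemu-img and parses its JSON output:

# detect_virtual_size.py - hypothetical helper, not part of ovirt_disk
import json
import subprocess

def qcow2_virtual_size(path):
    """Return the virtual size of a disk image in bytes."""
    out = subprocess.check_output(["qemu-img", "info", "--output", "json", path])
    return json.loads(out)["virtual-size"]

# For the image above this prints 4294967296 (4 GiB), not the 302841856
# bytes reported by ls.
print(qcow2_virtual_size("Fedora-Cloud-Base-32-1.6.x86_64.qcow2"))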

Comment 3 Pavol Brilla 2021-01-12 14:39:06 UTC
From playbook:
TASK [ovirt_disk] *******************************************************************************************************************************************************************************************************************************************************************************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: Exception: Error occurred while uploading image. The transfer is in finalizing_failure
fatal: [localhost]: FAILED! => {"changed": false, "msg": "Error occurred while uploading image. The transfer is in finalizing_failure"}


From engine:
VDSM host command VerifyUntrustedVolumeVDS failed: Image verification failed: 'reason=Image virtual size 734003200 is bigger than volume size 8060928'

Comment 4 Pavol Brilla 2021-01-12 14:51:01 UTC
# yum list ovirt-ansible*
Last metadata expiration check: 0:00:09 ago on Tue 12 Jan 2021 03:47:54 PM CET.
Installed Packages
ovirt-ansible-collection.noarch                                                                                                                            1.2.4-1.el8ev

Comment 5 Pavol Brilla 2021-01-12 15:13:38 UTC
Disregard comment 3, I forgot to put collections: redhat.rhv into the playbook.


The full traceback is:
Traceback (most recent call last):
  File "/tmp/ansible_ovirt_disk_payload_pl6v2c12/ansible_ovirt_disk_payload.zip/ansible_collections/redhat/rhv/plugins/modules/ovirt_disk.py", line 881, in main
  File "/tmp/ansible_ovirt_disk_payload_pl6v2c12/ansible_ovirt_disk_payload.zip/ansible_collections/redhat/rhv/plugins/module_utils/ovirt.py", line 629, in create
    **kwargs
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/services.py", line 7014, in add
    return self._internal_add(attachment, headers, query, wait)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 232, in _internal_add
    return future.wait() if wait else future
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 55, in wait
    return self._code(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 229, in callback
    self._check_fault(response)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 132, in _check_fault
    self._raise_error(response, body)
  File "/usr/lib64/python3.6/site-packages/ovirtsdk4/service.py", line 118, in _raise_error
    raise error
ovirtsdk4.Error: Fault reason is "Operation Failed". Fault detail is "[Cannot attach Virtual Disk: Disk is locked. Please try again later.]". HTTP response code is 409.
[WARNING]: Module did not set no_log for pass_discard
fatal: [localhost]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "activate": true,
            "auth": {
                "ca_file": null,
                "compress": true,
                "headers": null,
                "insecure": true,
                "kerberos": false,
                "timeout": 0,
                "token": "xQutjv26CZlOyyjN1HEyOV9ZVKrippUl75RP24WwpM7qO8fWiVotP1C6HOrHMl_fshI7VxDzgCjNKbzuYsqsBQ",
                "url": "https://summit-demo.rhev.lab.eng.brq.redhat.com/ovirt-engine/api"
            },
            "backup": null,
            "bootable": null,
            "content_type": "data",
            "description": null,
            "download_image_path": null,
            "fetch_nested": false,
            "force": false,
            "format": "cow",
            "host": null,
            "id": "4db23390-23f6-463e-89e5-5c73512875af",
            "image_path": "/root/xxx.qcow2",
            "image_provider": null,
            "interface": "virtio",
            "logical_unit": null,
            "name": "pokus",
            "nested_attributes": [],
            "openstack_volume_type": null,
            "pass_discard": null,
            "poll_interval": 3,
            "profile": null,
            "propagate_errors": null,
            "quota_id": null,
            "scsi_passthrough": null,
            "shareable": null,
            "size": "700MiB",
            "sparse": null,
            "sparsify": null,
            "state": "present",
            "storage_domain": "brq_storage",
            "storage_domains": null,
            "timeout": 18000,
            "upload_image_path": "/root/xxx.qcow2",
            "uses_scsi_reservation": null,
            "vm_id": null,
            "vm_name": "pokus_vm2",
            "wait": true,
            "wipe_after_delete": null
        }
    },
    "msg": "Fault reason is \"Operation Failed\". Fault detail is \"[Cannot attach Virtual Disk: Disk is locked. Please try again later.]\". HTTP response code is 409."
}

PLAY RECAP **************************************************************************************************************************************************************************************************************************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   


Size is recognized correctly, but the play failed before the disk reached OK status (when I clicked into the engine GUI, I saw the disk go from Complete status to OK about 10 seconds later). The whole upload took less than 2 minutes, so this is not a timeout issue.

Comment 6 Pavol Brilla 2021-01-12 15:15:09 UTC
Attaching also playbook:

---
- name: Ansible worker playbook
  hosts: local

  pre_tasks:
    - ovirt_auth:
        url: "{{engine_url}}"
        username: "{{engine_user}}"
        password: "{{engine_password}}"
        state: present
        insecure: true

  post_tasks:
    - ovirt_auth:
        state: absent
        ovirt_auth: "{{ ovirt_auth }}"

  tasks:
     - ovirt_disk:
         name: pokus
         vm_name: pokus_vm2
         interface: virtio
         format: cow
         image_path: /root/xxx.qcow2
         storage_domain: brq_storage
         auth: "{{ ovirt_auth }}"
         timeout: 18000

  collections:
    - redhat.rhv

Comment 8 Martin Necas 2021-02-02 15:33:09 UTC
I was able to find the issue with Pavol's run.
When the user uploads an image and immediately attaches the disk to the VM, the disk reports status OK in the API at that moment, but the attach then fails because the disk is locked.
I added a small delay to work around this, but I think it should be fixed somewhere else and I would consider this patch temporary.
Another possible workaround is to split the upload and the attachment of the disk into two tasks.
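
For anyone hitting this before the fix, the "wait before attaching" idea can be sketched with ovirtsdk4 roughly as follows (connection details are the placeholders from the playbook in the description; this is an illustration, not the patch itself):

import time
import ovirtsdk4 as sdk
import ovirtsdk4.types as types

connection = sdk.Connection(
    url="https://rhvm.example.com/ovirt-engine/api",
    username="admin@internal",
    password="1234",
    ca_file="/etc/pki/ovirt-engine/ca.pem",
)

# Disk id reported by the upload task in the description.
disk_service = connection.system_service().disks_service().disk_service(
    "0beb0291-8157-472d-b801-c94d1eb9631c")

# Poll until the disk is really unlocked before attaching it to the VM
# in a separate task / API call.
while disk_service.get().status != types.DiskStatus.OK:
    time.sleep(3)

connection.close()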

Comment 9 Martin Perina 2021-02-03 06:22:35 UTC
(In reply to Martin Necas from comment #8)
> I was able to find the issue with Pavols' run.
> When the user uploads an image and immediately attaches the disk to the VM,
> at that moment the disk has in API status OK but when it tries to attach it
> fails with status locked.
> I added a little bit of delay to fix this issue but I think it should be
> fixed somewhere else and would consider this patch temporary.
> Another possible workaround for this: split the upload and attachment of the
> disk into 2 tasks

Hmm, that seems quite hacky, shouldn't this be fixed inside engine backend code?

Comment 10 Eyal Shenitzky 2021-02-08 12:18:00 UTC
I am confused here.

So the upload ended successfully but the disk failed to attach to a VM since it was locked?

Can you please try to add the following under ovirt_disk:
state: attached

According to the Ansible documentation, the task will wait for that status.

If it does not solve the issue, I believe that the fix should be in the Ansible ovirt_disk module,
we should wait for the disk status to be 'OK' before attaching it to a VM.

Comment 11 Martin Perina 2021-02-08 13:17:53 UTC
Isn't the remaining issue the same as raised in BZ https://bugzilla.redhat.com/show_bug.cgi?id=1849861?

Comment 12 Martin Necas 2021-02-08 21:14:30 UTC
(In reply to Eyal Shenitzky from comment #10)
> I am confused here.
> 
> So the upload ended successfully but the disk failed to attach to a VM since
> it was locked?
yes
> 
> Can you please try to add the following under ovirt_disk:
> state: attached
In this case, the `attached` and `present` mean the same thing.
> 
> According to the Ansible documentation, the task will wait for that status.
Yes, it is waiting for the disk status (DiskStatus.OK), and at that moment the disk does report DiskStatus.OK, but when we try to attach it, the attach fails.
We are waiting for this right after the disk upload: https://github.com/oVirt/ovirt-ansible-collection/blob/master/plugins/modules/ovirt_disk.py#L476-L481
> 
> If it does not solve the issue, I believe that the fix should be in the
> Ansible ovirt_disk module,
> we should wait for the disk status to be 'OK' before attaching it to a VM.

Comment 13 Eyal Shenitzky 2021-02-11 08:38:29 UTC
(In reply to Martin Perina from comment #11)
> Isn't the remaining issue the same as raised in
> BZ https://bugzilla.redhat.com/show_bug.cgi?id=1849861?

According to comment #12 it looks like the same issue.

Comment 15 Eyal Shenitzky 2021-07-06 10:09:16 UTC
We need to fix the Ansible playbook now: it should consume the engine fix and use the image transfer session phase, instead of waiting for the disk status, as the indication that the transfer session has ended.

Moving back to infra team.
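
In ovirtsdk4 terms the suggested check looks roughly like the sketch below (a simplified illustration, not the actual module change; it assumes an engine new enough to report the final transfer phases):

import time
import ovirtsdk4.types as types

def wait_for_transfer_end(connection, transfer_id, poll_interval=3):
    # Decide whether the upload finished by polling the image transfer
    # phase instead of the disk status.
    transfer_service = (connection.system_service()
                        .image_transfers_service()
                        .image_transfer_service(transfer_id))
    while True:
        phase = transfer_service.get().phase
        if phase == types.ImageTransferPhase.FINISHED_SUCCESS:
            return
        if phase == types.ImageTransferPhase.FINISHED_FAILURE:
            raise RuntimeError("Image transfer %s failed" % transfer_id)
        time.sleep(poll_interval)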

Comment 16 Martin Perina 2021-07-12 13:15:29 UTC
(In reply to Eyal Shenitzky from comment #15)
> We need to fix the ansible playbook now, it should consume the engine fix
> and use the image transfer session phase instead of waiting to the disk
> status as an indication for the ending of the transfer session.
> 
> Moving back to infra team.

Feel free to perform any changes in the ovirt_disk module to fix that issue; the infra team doesn't have enough storage knowledge to fix it properly.

Comment 17 Eyal Shenitzky 2021-10-18 06:23:34 UTC
(In reply to Martin Perina from comment #16)
> (In reply to Eyal Shenitzky from comment #15)
> > We need to fix the ansible playbook now, it should consume the engine fix
> > and use the image transfer session phase instead of waiting to the disk
> > status as an indication for the ending of the transfer session.
> > 
> > Moving back to infra team.
> 
> Feel free to perform any changes in ovirt_disk module to fix that issue,
> infra team don't have enough storage knowledge to fix that properly

I see that Martin's patch changes the ovirt_disk module and fixes those issues:
https://github.com/oVirt/ovirt-ansible-collection/pull/358

Moving back to the Infra team.
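
For context, the imageio-client style of upload referenced by that pull request looks roughly like this (a condensed, hedged sketch based on the public oVirt SDK upload examples, not the PR itself; "transfer" is an already created ovirtsdk4.types.ImageTransfer for the target disk):

from ovirt_imageio import client

def upload_qcow2(transfer, image_path, ca_file):
    # The imageio client streams the file to the transfer's daemon URL,
    # handling the HTTP transfer details itself.
    client.upload(image_path, transfer.transfer_url, ca_file)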

Comment 21 Barbora Dolezalova 2022-05-04 12:33:12 UTC
Verified in ovirt-ansible-collection-2.0.3-1.el8ev.noarch

Comment 26 errata-xmlrpc 2022-05-26 17:25:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV Engine and Host Common Packages security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4712

