Bug 1910338

Summary: OVA export might fail with: losetup: /var/tmp/ova_vm.ova.tmp: failed to set up loop device: Resource temporarily unavailable
Product: [oVirt] ovirt-engine Reporter: Yedidyah Bar David <didi>
Component: BLL.Virt Assignee: Yedidyah Bar David <didi>
Status: CLOSED CURRENTRELEASE QA Contact: Nisim Simsolo <nsimsolo>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 4.4.4 CC: ahadas, bugs, nsimsolo
Target Milestone: ovirt-4.4.5 Flags: pm-rhel: ovirt-4.4+
pm-rhel: planning_ack+
ahadas: devel_ack+
mavital: testing_ack+
Target Release: 4.4.5   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-4.4.5-0.11 Doc Type: Bug Fix
Doc Text:
Under certain conditions, exporting a VM as OVA failed with this error in ansible-runner-service.log:

losetup: /var/tmp/ova_vm.ova.tmp: failed to set up loop device: Resource temporarily unavailable

With this release, pack_ova.py flushes buffered written data to the file before calling losetup, which should prevent this failure.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-03-18 15:15:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
post-004_basic_sanity_pytest.py.tar.gz none

Description Yedidyah Bar David 2020-12-23 14:02:07 UTC
Created attachment 1741561 [details]
post-004_basic_sanity_pytest.py.tar.gz

Description of problem:

I ran OST locally (he-basic-suite-master) with something similar to [1], and it failed in test_import_vm1. This test does not exist yet in the HE suites, but [1] includes it. I do not think this is specific to OST or HE, other than perhaps timing/load issues. Full logs are attached. The root cause seems to be the call to losetup inside pack_ova.py, which fails with:

ansible-runner-service.log:

losetup: /var/tmp/ova_vm.ova.tmp: failed to set up loop device: Resource temporarily unavailable

/var/log/messages on host-0 has:

Dec 23 12:37:47 lago-he-basic-suite-master-host-0 kernel: loop: module loaded
Dec 23 12:37:47 lago-he-basic-suite-master-host-0 su[57218]: (to vdsm) root on pts/0
Dec 23 12:37:47 lago-he-basic-suite-master-host-0 kernel: loop_set_status: loop1 () has still dirty pages (nrpages=1)
Dec 23 12:37:47 lago-he-basic-suite-master-host-0 python3[57209]: detected unhandled Python exception in '/root/.ansible/tmp/ansible-tmp-1608723466.6123397-5325-41317927430391/pack_ova.py'

So my guess is that data was still being written to the OVA file when we called losetup, and this caused it to fail.

Not sure what the best way to fix this is (perhaps call 'sync' somewhere, or call ova_file.flush() before os.fsync). Also not sure whether we have a systematic way to reproduce it.
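
To illustrate the flush-before-fsync idea, here is a minimal sketch in Python. It is not the actual pack_ova.py code: the function name write_and_sync and the demo payload are hypothetical, and the sketch stops short of calling losetup, which needs root. It only shows the ordering: flush() hands Python's buffered data to the kernel, then os.fsync() asks the kernel to write that file's dirty pages to storage.

```python
import os
import tempfile


def write_and_sync(path, payload):
    """Write payload to path, then flush and fsync before returning.

    Hypothetical helper for illustration only; not taken from pack_ova.py.
    """
    with open(path, "wb") as ova_file:
        ova_file.write(payload)
        # flush() moves data from Python's userspace buffer to the kernel.
        ova_file.flush()
        # fsync() asks the kernel to write the file's dirty pages to storage.
        # Without the flush() above, buffered data might not be included.
        os.fsync(ova_file.fileno())
    return os.path.getsize(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        # The file name mirrors the one in the error message; the content is dummy data.
        demo_path = os.path.join(tmp_dir, "ova_vm.ova.tmp")
        print(write_and_sync(demo_path, b"\0" * 4096))
```

A caller would run something like this on the OVA file before attaching it to a loop device, so that no buffered writes are still pending at that point.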

[1] https://gerrit.ovirt.org/c/ovirt-system-tests/+/112661

Comment 1 Nisim Simsolo 2021-02-24 11:36:19 UTC
Verified:
ovirt-engine-4.4.5.6-0.11.el8ev
vdsm-4.40.50.6-1.el8ev.x86_64
libvirt-daemon-6.6.0-13.module+el8.3.1+9548+0a8fede5.x86_64
qemu-kvm-5.1.0-20.module+el8.3.1+9918+230f5c26.x86_64

Verification scenario:
1. Create CPU/memory load on both host and engine
2. Export/Import OVA in a loop.
3. Verify export succeeds.
4. Verify import succeeds, run imported VM and verify VM is running.

Comment 2 Sandro Bonazzola 2021-03-18 15:15:06 UTC
This bugzilla is included in oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 3 Sandro Bonazzola 2021-03-22 12:55:41 UTC
This bugzilla is included in oVirt 4.4.5 release, published on March 18th 2021.

Since the problem described in this bug report should be resolved in oVirt 4.4.5 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.