Bug 1970372 - Virt-handler fails to verify container-disk
Summary: Virt-handler fails to verify container-disk
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 2.6.4
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 2.6.6
Assignee: sgott
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks: 1970454
Reported: 2021-06-10 11:52 UTC by lpivarc
Modified: 2021-08-10 17:33 UTC (History)

Fixed In Version: virt-operator-container-v2.6.6-4 hco-bundle-registry-container-v2.6.6-31
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1970454
Environment:
Last Closed: 2021-08-10 17:33:37 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHSA-2021:3119  0        None      None    None     2021-08-10 17:33:48 UTC

Description lpivarc 2021-06-10 11:52:18 UTC
Description of problem:
When we create new VMs, we can see the warning `runtime: cannot allocate memory`. It is caused by verifying container disk ownership, so only VMs with a container disk are affected. As far as I know the issue happens at random, but if we get unlucky it could block creation of a VM. Otherwise the handler should retry the operation and the VM should start.
The root cause is a behaviour change in Go; the issue can only be seen with Go 1.14+.


Version-Release number of selected component (if applicable):
2.6.z, 4.8


How reproducible:
Create VM and observe logs of virt-handler.


Steps to Reproduce:
1.
2.
3.

Actual results:
Log of virt-handler:
failed to get image info: failed to invoke qemu-img: exit status 2: 'fatal error: runtime: cannot allocate memory


Expected results:
Nothing in log.


Additional info:

Comment 2 Fabian Deutsch 2021-06-10 12:28:01 UTC
How reproducible is this bug?

Comment 3 lpivarc 2021-06-10 12:34:40 UTC
I saw 2-4 failures in the whole test suite. So something like a 2/600 probability?

Comment 4 Fabian Deutsch 2021-06-10 12:38:58 UTC
Okay, although I can't tell whether all 600 cases used a containerDisk.

What will happen if the bug is triggered? That is, what happens to a VM that runs into this bug? Will it resolve itself, or does it need admin attention?

Comment 5 lpivarc 2021-06-10 12:45:21 UTC
This operation will be retried in the next `sync` of the handler. The VM should eventually start as long as we don't hit the timeout (read: as long as the error does not occur too often).

Comment 6 Fabian Deutsch 2021-06-10 13:03:55 UTC
Okay, due to the low level of reproducibility, and because it might eventually fix itself, I'm not considering this to be a blocker.

Comment 12 zhe peng 2021-07-26 06:41:28 UTC
After checking the test suite results, this issue no longer occurs; moving this to VERIFIED.
Verified build:
hco-bundle-registry-container-v2.6.6-35

Comment 17 errata-xmlrpc 2021-08-10 17:33:37 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.6 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3119

