Description of problem:
OpenShift nodes default to fs.aio-max-nr = 65536. This is low: a host running many VMs can hit the limit, and VM startup then fails with "Resource temporarily unavailable" because io_setup() fails with EAGAIN. On other virtualization platforms such as RHV and OpenStack, libvirtd raises this limit to 1048576 with a custom sysctl.conf [1], which makes it hard to hit. Since libvirtd is containerized in OpenShift Virtualization, that conf is not applied. In my tests, fs.aio-nr grows by 1024 for every VM that has disks with aio=native (the per-context nr_events was raised from 128 to 1024 in QEMU [2]), and an strace of qemu-kvm shows io_setup() called with nr_events=1024. So with the default 65536 it is easy to hit the limit on a host with many VMs.

[1] https://github.com/libvirt/libvirt/commit/5298551e07a9839c046e0987b325e03f8ba801e5
[2] https://github.com/qemu/qemu/commit/2558cb8dd4150512bc8ae6d505cdcd10d0cc46bb

Version-Release number of selected component (if applicable):
OpenShift Virtualization 4.10.4

How reproducible:
100%

Steps to Reproduce:
1. Start a VM with disk aio=native. Use preallocated file or block disks, which use aio=native by default.
2. fs.aio-nr is incremented by 1024 for every new VM.
3. With many VMs running on the host, fs.aio-nr hits fs.aio-max-nr.

Actual results:
The virtual machine fails to start with the error "Unable to use native AIO: failed to create linux AIO context: Resource temporarily unavailable"

Expected results:
Raise the default fs.aio-max-nr so that the host can run more VMs.

Additional info:
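For context, a minimal standalone C sketch of the failure mode (illustrative code, not CNV or QEMU code; the nr_events=1024 figure is taken from the strace above). It keeps creating 1024-event AIO contexts, as qemu-kvm does per aio=native disk, until the kernel refuses:

/* Demo: exhaust fs.aio-nr with 1024-event AIO contexts. */
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/aio_abi.h>   /* aio_context_t */

int main(void)
{
    int contexts = 0;

    for (;;) {
        aio_context_t ctx = 0;
        /* Each successful io_setup() reserves nr_events=1024 slots,
         * bumping fs.aio-nr by 1024, exactly like one aio=native disk. */
        if (syscall(SYS_io_setup, 1024, &ctx) < 0) {
            printf("io_setup failed after %d contexts: %s\n",
                   contexts, strerror(errno));
            return 0;
        }
        contexts++;
    }
}

With fs.aio-max-nr = 65536 and fs.aio-nr starting at 0, this reports the failure ("Resource temporarily unavailable") after 64 contexts, since 64 * 1024 = 65536; that is why a few dozen aio=native disks are enough to exhaust the default limit.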
Hi,

Unlike RHV, in the OCP case CNV doesn't manage the nodes directly; CNV is an operator installed as an add-on on the cluster. Because of that, tuning of the RHCOS nodes needs to be done at the OCP level using a MachineConfig or the Node Tuning Operator. This cannot be done automatically as part of the CNV deployment, since that would cause issues with HyperShift support. Perhaps what we can do is produce some kind of event indicating that the CNV worker nodes are not tuned.
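For reference, the MachineConfig approach could look like the sketch below. This is a hedged illustration only (the object name, role label, file path, and Ignition version are assumptions, not the fix that was eventually shipped); it drops the same setting libvirtd's sysctl.conf applies on RHEL hosts onto the worker nodes:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-fs-aio-max-nr   # illustrative name
spec:
  config:
    ignition:
      version: 3.2.0
    storage:
      files:
        # Same value libvirtd's sysctl.conf sets on RHEL hosts [1]
        - path: /etc/sysctl.d/10-fs-aio-max-nr.conf
          mode: 420   # 0644
          overwrite: true
          contents:
            # "fs.aio-max-nr = 1048576" as a percent-encoded data URI
            source: data:,fs.aio-max-nr%20%3D%201048576%0A

Once the MachineConfigPool rolls the change out, sysctl picks the file up on boot and fs.aio-max-nr reads 1048576 on those nodes.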
Also, the aio-nr consumed per VM depends on the VM's IO threads policy: it is 1024 * number of disks if the disks have dedicated IO threads. The High-Performance VM templates set "ioThreadsPolicy: shared", which enables one IO thread; for those VMs it is 2048 per VM. (See the illustrative spec below.)
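To show where those knobs live, here is a hedged KubeVirt VMI sketch (the VMI, disk, and PVC names are made up; the field names are from the KubeVirt API; it combines both settings in one spec purely for illustration):

apiVersion: kubevirt.io/v1
kind: VirtualMachineInstance
metadata:
  name: aio-demo
spec:
  domain:
    # One shared IO thread for the VM; per the comment above, such a
    # VM consumes about 2048 of fs.aio-nr.
    ioThreadsPolicy: shared
    devices:
      disks:
        - name: datadisk
          # A disk with a dedicated IO thread adds its own
          # 1024-event AIO context on top.
          dedicatedIOThread: true
          disk:
            bus: virtio
    resources:
      requests:
        memory: 1Gi
  volumes:
    - name: datadisk
      persistentVolumeClaim:
        claimName: aio-demo-pvc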
We've found a way to make this tuning part of the default OCP installation; details are in this ticket: https://issues.redhat.com/browse/PSAP-900
Tested and verified the following scenario in a lab with OCP 4.12.5 and CNV 4.12.0.

Checked aio on the node before starting VMs:
sudo sysctl -a | grep aio
fs.aio-max-nr = 1048576
fs.aio-nr = 2048

1. Started a VM with disk aio=native. Used ODF local block storage.
2. fs.aio-nr is incremented by 1024 for every new VM.
3. Started 81 VMs on a single node.

Checked aio on the node with the VMs running:
sudo sysctl -a | grep aio
fs.aio-max-nr = 1048576
fs.aio-nr = 84992

That matches the expected accounting: 2048 (initial) + 81 * 1024 = 84992, well below the raised limit of 1048576.
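A quick way to check remaining headroom on a node, assuming each additional aio=native disk costs 1024 of fs.aio-nr as above (a hypothetical one-liner, not part of the verification run):

echo $(( ( $(sysctl -n fs.aio-max-nr) - $(sysctl -n fs.aio-nr) ) / 1024 ))

With the values above this prints 941, i.e. roughly 941 more single-disk aio=native VMs would fit before hitting the limit.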
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 4.13.0 Images security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:3205