Bug 2016988 - NTO does not set io_timeout and max_retries for AWS Nitro instances
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node Tuning Operator
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Jiří Mencák
QA Contact: Simon
URL:
Whiteboard:
Depends On:
Blocks: 2017066
Reported: 2021-10-25 11:18 UTC by Jiří Mencák
Modified: 2022-03-10 16:22 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-10 16:21:40 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Github openshift cluster-node-tuning-operator pull 283 0 None open Bug 2016988: openshift profile: fix malformed patch 2021-10-25 11:19:32 UTC
Red Hat Product Errata RHSA-2022:0056 0 None None None 2022-03-10 16:22:01 UTC

Description Jiří Mencák 2021-10-25 11:18:40 UTC
Description of problem:
AWS Nitro instances need special tuning for NVMe devices (the nvme_core io_timeout and max_retries parameters), but NTO does not apply it. See:
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/nvme-ebs-volumes.html#timeout-nvme-ebs-volumes

Version-Release number of selected component (if applicable):
4.9 and 4.10

How reproducible:
Always.

Steps to Reproduce:
1. echo "cat /sys/module/nvme_core/parameters/io_timeout" | oc debug node/<node_name>

Actual results:
The OS-provided default value, which is not equal to 4294967295.

Expected results:
4294967295
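
The comparison between the actual and expected results above can be sketched as a small shell helper. This is only an illustration, not part of the fix: the function name check_param and the variable EXPECTED_IO_TIMEOUT are made up for this sketch, and the expected value is the one given under "Expected results" (4294967295, that is 2^32 - 1). The commented-out usage assumes the value printed by the debug pod is the only all-digit line in its output.

```shell
#!/bin/sh
# Expected value taken from the "Expected results" field of this report.
EXPECTED_IO_TIMEOUT=4294967295

# check_param NAME ACTUAL EXPECTED  (hypothetical helper)
# Prints "OK ..." and returns 0 when ACTUAL equals EXPECTED,
# prints "MISMATCH ..." and returns 1 otherwise.
check_param() {
    name=$1
    actual=$2
    expected=$3
    if [ "$actual" = "$expected" ]; then
        echo "OK $name=$actual"
        return 0
    fi
    echo "MISMATCH $name=$actual (expected $expected)"
    return 1
}

# Usage against a live cluster (needs oc and sufficient privileges; not run here):
#   actual=$(echo "cat /sys/module/nvme_core/parameters/io_timeout" \
#       | oc debug node/<node_name> 2>/dev/null | grep -E '^[0-9]+$')
#   check_param io_timeout "$actual" "$EXPECTED_IO_TIMEOUT"
```

The same helper could be pointed at /sys/module/nvme_core/parameters/max_retries, with the expected value taken from the AWS documentation linked in the description.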

Additional info:
https://github.com/openshift/cluster-node-tuning-operator/pull/283

Comment 4 Simon 2021-10-26 17:48:50 UTC
$ for node in $(oc get nodes --no-headers | cut -f 1 -d ' ' ); do echo $node; echo ""; echo "cat /sys/module/nvme_core/parameters/io_timeout" | oc debug node/$node; done
ip-10-0-152-104.us-east-2.compute.internal

W1026 13:47:33.683325  129785 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/ip-10-0-152-104us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.152.104
If you don't see a command prompt, try pressing enter.
4294967295

Removing debug pod ...
ip-10-0-159-95.us-east-2.compute.internal

W1026 13:47:35.662635  129807 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/ip-10-0-159-95us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.159.95
If you don't see a command prompt, try pressing enter.
4294967295

Removing debug pod ...
ip-10-0-188-186.us-east-2.compute.internal

W1026 13:47:45.288153  129848 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/ip-10-0-188-186us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.188.186
If you don't see a command prompt, try pressing enter.
4294967295

Removing debug pod ...
ip-10-0-191-199.us-east-2.compute.internal

W1026 13:47:55.608726  129872 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/ip-10-0-191-199us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.191.199
If you don't see a command prompt, try pressing enter.
4294967295

Removing debug pod ...
ip-10-0-206-123.us-east-2.compute.internal

W1026 13:48:04.133077  129897 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/ip-10-0-206-123us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.206.123
If you don't see a command prompt, try pressing enter.
4294967295

Removing debug pod ...
ip-10-0-222-13.us-east-2.compute.internal

W1026 13:48:12.475668  129927 warnings.go:70] would violate "latest" version of "baseline" PodSecurity profile: host namespaces (hostNetwork=true, hostPID=true), hostPath volumes (volume "host"), privileged (container "container-00" must not set securityContext.privileged=true)
Starting pod/ip-10-0-222-13us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.222.13
If you don't see a command prompt, try pressing enter.
4294967295

Removing debug pod ...

$ oc get clusterversion
NAME      VERSION                         AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.ci-2021-10-26-082859   True        False         4h6m    Cluster version is 4.10.0-0.ci-2021-10-26-082859

Comment 7 errata-xmlrpc 2022-03-10 16:21:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056

