Bug 1773905 - [CNV deploy] nmstate pod 2.2.0-8 state is flakey
Summary: [CNV deploy] nmstate pod 2.2.0-8 state is flakey
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.2.0
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 2.2.0
Assignee: Quique Llorente
QA Contact: Yossi Segev
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-11-19 10:00 UTC by Tareq Alayan
Modified: 2023-09-14 05:47 UTC (History)

Fixed In Version: cluster-network-addons-operator-container-v2.2.0-5
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-01-30 16:27:30 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2020:0307 0 None None None 2020-01-30 16:27:42 UTC

Description Tareq Alayan 2019-11-19 10:00:06 UTC
Description of problem:
The nmstate pods are flaky: they keep cycling between the Ready and CrashLoopBackOff states.

{"level":"info","ts":1574152115.700579,"logger":"cmd","msg":"failed to initialize service object for metrics: pods \"nmstate-handler-4bzgg\" is forbidden: User \"system:serviceaccount:openshift-cnv:nmstate-handler\" cannot get resource \"pods\" in API group \"\" in the namespace \"openshift-cnv\""}



Version-Release number of selected component (if applicable):

registry-proxy.engineering.redhat.com/rh-osbs/container-native-virtualization-kubernetes-nmstate-handler-rhel8:v2.2.0-8
How reproducible:
always 

Steps to Reproduce:
1. Deploy CNV.

Actual results:


Expected results:


Additional info:

Comment 1 Quique Llorente 2019-11-19 10:58:57 UTC
The real issue is the OOM killer being triggered by the resource limits on the nmstate-handler pods.
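
One way to confirm this diagnosis on a running cluster is to look at the last termination reason of each container. The following is a minimal sketch, not the procedure used in this bug: the helper name oom_killed_pods is hypothetical, and the pod name prefix is assumed from the nmstate-handler-4bzgg pod in the log line in the description.

```shell
#!/bin/sh
# Hypothetical helper: reads "pod<TAB>reason" lines on stdin and prints the
# nmstate-handler pods whose last container termination reason is OOMKilled.
oom_killed_pods() {
  awk -F '\t' '$1 ~ /^nmstate-handler-/ && $2 ~ /OOMKilled/ { print $1 }'
}

# Query the cluster only when the oc client is available.
if command -v oc >/dev/null 2>&1; then
  oc get pods -n openshift-cnv \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].lastState.terminated.reason}{"\n"}{end}' \
    | oom_killed_pods
fi
```

Any pod name printed here was last terminated by the OOM killer, which would point at the memory limit rather than at the RBAC message in the log.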

The upstream fix is this https://github.com/kubevirt/cluster-network-addons-operator/pull/263

This can be fixed after deployment with the following command:

oc patch ds -n openshift-cnv nmstate-handler --patch '{"spec": {"template": {"spec":{ "containers": [{"name": "nmstate-handler", "resources": {"requests": {"cpu": "200m", "memory": "120Mi" }, "limits": {"cpu": "200m", "memory": "120Mi" }}}]}}}}'
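
For readability, the same patch can be kept in a variable and reviewed before it is applied. This is only a sketch of the workaround above: the values are those from the command, while the APPLY=1 switch is a hypothetical convention added here so the script does nothing to the cluster by default.

```shell
#!/bin/sh
# Same payload as the one-line oc patch command, split across lines.
# It sets identical requests and limits on the nmstate-handler container.
PATCH=$(cat <<'EOF'
{"spec": {"template": {"spec": {"containers": [{
  "name": "nmstate-handler",
  "resources": {
    "requests": {"cpu": "200m", "memory": "120Mi"},
    "limits":   {"cpu": "200m", "memory": "120Mi"}
  }
}]}}}}
EOF
)

# Print the payload so it can be checked first.
printf '%s\n' "$PATCH"

# Apply only when explicitly requested (APPLY=1 is an assumption of this sketch).
if [ "${APPLY:-0}" = "1" ]; then
  oc patch ds -n openshift-cnv nmstate-handler --patch "$PATCH"
  # Wait for the DaemonSet to roll out the pods with the new resources.
  oc rollout status ds/nmstate-handler -n openshift-cnv
fi
```

Run it once without APPLY to inspect the JSON, then with APPLY=1 to patch the DaemonSet.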

Comment 2 Dan Kenigsberg 2019-11-20 13:50:17 UTC
Waiting for CPaaS to pick up upstream, chew it, and push it to errata.

Comment 3 Quique Llorente 2019-11-21 15:13:44 UTC
The errata has it, and it is in QE now:
https://errata.devel.redhat.com/errata?search=CNV+2.2.0

Comment 4 Nelly Credi 2019-11-25 08:08:55 UTC
Please add the Fixed In Version.

Comment 5 Quique Llorente 2019-11-25 08:17:57 UTC
Added the CNAO version

Comment 6 Yossi Segev 2019-11-27 13:51:15 UTC
nmstate-handler pods are running without restart for more than 4 hours.
Verified in a cluster with OCP4.3/CNV2.2m with cluster-network-addons-operator:v2.2.0-5.

Comment 8 errata-xmlrpc 2020-01-30 16:27:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0307

Comment 9 Red Hat Bugzilla 2023-09-14 05:47:10 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

