Bug 2089716

Summary: [4.11][reliability]one worker node became NotReady on which ovnkube-node pod's memory increased sharply
Product: OpenShift Container Platform Reporter: Qiujie Li <qili>
Component: Networking Assignee: Martin Kennelly <mkennell>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: urgent CC: ffernand, gspence, mifiedle, mkennell, npinaeva, schoudha, vpickard
Version: 4.11 Keywords: TestBlocker, Triaged
Target Milestone: --- Flags: qili: needinfo-
qili: needinfo-
Target Release: 4.11.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
No doc update is needed: the OVS metrics feature was introduced in 4.11, and this issue was also fixed in 4.11.
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-08-10 11:13:40 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Comment 6 Mike Fiedler 2022-05-27 11:54:43 UTC
Marking this as a TestBlocker for reliability testing on OVN

Comment 30 Mohamed Mahmoud 2022-06-02 11:23:58 UTC
*** Bug 2090669 has been marked as a duplicate of this bug. ***

Comment 45 Martin Kennelly 2022-06-09 17:28:50 UTC
I added a PR to remove the high cardinality metrics: https://github.com/ovn-org/ovn-kubernetes/pull/3032

Comment 46 Martin Kennelly 2022-06-13 14:58:11 UTC
Waiting on upstream reviews. I expect all patches to be downstreamed by the end of the week at the latest.

Comment 47 Martin Kennelly 2022-06-13 17:05:03 UTC
Approved upstream. Waiting on downstream merge. Attached is the downstream PR.

Comment 49 Martin Kennelly 2022-06-14 09:50:27 UTC
Do you have time to re-run your reliability test to validate that this is fixed?

Comment 55 Mike Fiedler 2022-06-15 18:18:36 UTC
Assigning QA to @qili

Comment 57 Peter Hunt 2022-06-17 15:22:48 UTC
*** Bug 2095280 has been marked as a duplicate of this bug. ***

Comment 59 Martin Kennelly 2022-06-18 14:45:13 UTC
Great work Qiujie Li! Thank you!

Comment 60 errata-xmlrpc 2022-08-10 11:13:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069