Bug 2115479 - ovnkube direct-lists pods on a node when the node object changes
Summary: ovnkube direct-lists pods on a node when the node object changes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.12
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.12.0
Assignee: Dan Williams
QA Contact: Mike Fiedler
URL:
Whiteboard: perfscale-ovn
Depends On:
Blocks: 2108679 2115481
Reported: 2022-08-04 18:52 UTC by Dan Williams
Modified: 2023-01-17 19:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2115481
Environment:
Last Closed: 2023-01-17 19:54:14 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 1214 | 0 | None | Merged | Bug 2111534: Downstream Merge: 27-07-2022 | 2022-08-04 18:53:26 UTC
Red Hat Product Errata | RHSA-2022:7399 | 0 | None | None | None | 2023-01-17 19:54:35 UTC

Description Dan Williams 2022-08-04 18:52:43 UTC
2022-07-07T17:42:48.798533966Z I0707 17:42:48.798447      15 trace.go:205] Trace[17745800]: "List" url:/api/v1/pods,user-agent:ip-10-0-149-185/ovnkube@bd4f2094aeb5 (linux/amd64) kubernetes/,audit-id:6c114e16-15ac-4920-b8ed-a01cbd710909,client:10.0.196.114,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/2.0 (07-Jul-2022 17:42:36.762) (total time: 12035ms):
2022-07-07T17:42:48.889007528Z I0707 17:42:48.888869      15 trace.go:205] Trace[859805328]: "List" url:/api/v1/pods,user-agent:ip-10-0-149-185/ovnkube@bd4f2094aeb5 (linux/amd64) kubernetes/,audit-id:49ad7f2b-9c09-4d0b-9068-4e6fa606a4f8,client:10.0.196.114,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/2.0 (07-Jul-2022 17:42:36.761) (total time: 12127ms):
2022-07-07T17:42:49.272283449Z I0707 17:42:49.272192      15 trace.go:205] Trace[1792712204]: "List" url:/api/v1/pods,user-agent:ip-10-0-149-185/ovnkube@bd4f2094aeb5 (linux/amd64) kubernetes/,audit-id:e0269f80-38be-4c03-8317-6b63fd798ceb,client:10.0.196.114,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/2.0 (07-Jul-2022 17:42:36.723) (total time: 12548ms):
2022-07-07T17:42:49.596468418Z I0707 17:42:49.596368      15 trace.go:205] Trace[1248373466]: "List" url:/api/v1/pods,user-agent:ip-10-0-149-185/ovnkube@bd4f2094aeb5 (linux/amd64) kubernetes/,audit-id:9fb86c6a-4879-42de-bcce-c987b1c77931,client:10.0.196.114,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/2.0 (07-Jul-2022 17:42:36.763) (total time: 12832ms):
2022-07-07T17:42:49.771518271Z I0707 17:42:49.756590      15 trace.go:205] Trace[235168718]: "List" url:/api/v1/pods,user-agent:ip-10-0-149-185/ovnkube@bd4f2094aeb5 (linux/amd64) kubernetes/,audit-id:ea512964-5969-4612-b013-e9c63a4e72e0,client:10.0.196.114,accept:application/vnd.kubernetes.protobuf,application/json,protocol:HTTP/2.0 (07-Jul-2022 17:42:36.751) (total time: 13005ms):


ovnkube master would list pods directly from the API (not from a shared informer cache) whenever a node add/update event happened. Worse, those lists were not served from the apiserver's cache either, because the ListOptions did not set ResourceVersion:"0".

This places unnecessary load on the apiserver and etcd.
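
A minimal sketch of the two ideas in the description above, using only the Go standard library. It is not the ovn-kubernetes code: the `spec.nodeName` field selector is an assumption (the trace log does not show query parameters), and `podListPath`, `Pod`, and `nodePodCache` are hypothetical names that only stand in for a real shared informer cache. The first function shows how `resourceVersion=0` changes the list request so that the apiserver may answer from its own cache; the second part shows a per-node index kept up to date by watch events, so that a node update needs no list request at all.

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
	"sync"
)

// podListPath builds a list request for the pods on one node.
// Assumption: the list is narrowed with a spec.nodeName field selector.
// With allowCache set, resourceVersion=0 is added, which lets the apiserver
// serve the list from its cache instead of reading through to etcd.
func podListPath(nodeName string, allowCache bool) string {
	q := url.Values{}
	q.Set("fieldSelector", "spec.nodeName="+nodeName)
	if allowCache {
		q.Set("resourceVersion", "0")
	}
	return "/api/v1/pods?" + q.Encode()
}

// Pod is a hypothetical, trimmed-down stand-in for a Kubernetes pod.
type Pod struct{ Namespace, Name, NodeName string }

// nodePodCache is a hypothetical stand-in for a shared informer cache
// with a by-node index: node name -> "namespace/name" -> pod.
type nodePodCache struct {
	mu     sync.RWMutex
	byNode map[string]map[string]Pod
}

func newNodePodCache() *nodePodCache {
	return &nodePodCache{byNode: map[string]map[string]Pod{}}
}

// upsert would be called from pod add/update watch events.
func (c *nodePodCache) upsert(p Pod) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.byNode[p.NodeName] == nil {
		c.byNode[p.NodeName] = map[string]Pod{}
	}
	c.byNode[p.NodeName][p.Namespace+"/"+p.Name] = p
}

// remove would be called from pod delete watch events.
func (c *nodePodCache) remove(p Pod) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.byNode[p.NodeName], p.Namespace+"/"+p.Name)
}

// podsOnNode answers "which pods run on this node" from memory,
// sorted by namespace and name, without any request to the apiserver.
func (c *nodePodCache) podsOnNode(node string) []Pod {
	c.mu.RLock()
	defer c.mu.RUnlock()
	pods := make([]Pod, 0, len(c.byNode[node]))
	for _, p := range c.byNode[node] {
		pods = append(pods, p)
	}
	sort.Slice(pods, func(i, j int) bool {
		if pods[i].Namespace != pods[j].Namespace {
			return pods[i].Namespace < pods[j].Namespace
		}
		return pods[i].Name < pods[j].Name
	})
	return pods
}

func main() {
	fmt.Println(podListPath("node-1", false)) // uncached list
	fmt.Println(podListPath("node-1", true))  // list the apiserver may serve from cache

	cache := newNodePodCache()
	cache.upsert(Pod{"default", "web-0", "node-1"})
	cache.upsert(Pod{"default", "web-1", "node-2"})
	fmt.Println(len(cache.podsOnNode("node-1")), "pod(s) on node-1, served from memory")
}
```

In real controller code the same role would normally be played by a client-go shared informer and its lister rather than a hand-written map; the sketch only illustrates why a node event handled from such a cache costs the apiserver and etcd nothing.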

Comment 1 Mike Fiedler 2022-08-17 21:30:31 UTC
Verified on 4.12.0-0.nightly-2022-08-15-150248

- Comparison of 4.11.rc1 vs 4.12.0-0.nightly-2022-08-15-150248
- cluster-density workload with 1500 iterations on 120 nodes on AWS

4.11.rc1 - api-server cpu regular spikes to 10 cores and max 19GB rss memory
         - etcd cpu regular spikes to 1.5 cores and max 3GB rss memory

4.12.nightly - api-server cpu regular spikes to 2 cores and max 12GB memory
             - etcd cpu regular spikes to 0.7 cores and max 1.4GB rss memory

The cluster-density workload succeeded on both versions, with a significant reduction in api-server and etcd CPU and memory usage on 4.12 with this fix.
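
To put rough numbers on that reduction, the short sketch below computes the relative drop for each pair of peak figures quoted above. The helper name `reductionPct` is hypothetical, and the inputs are the approximate spike and maximum values from this comment, not precise measurements.

```go
package main

import "fmt"

// reductionPct returns how much smaller after is than before, in percent.
func reductionPct(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Peak figures copied from the comparison above (4.11.rc1 vs 4.12 nightly).
	rows := []struct {
		name          string
		before, after float64
	}{
		{"api-server cpu spikes (cores)", 10, 2},
		{"api-server max memory (GB)", 19, 12},
		{"etcd cpu spikes (cores)", 1.5, 0.7},
		{"etcd max rss memory (GB)", 3, 1.4},
	}
	for _, r := range rows {
		fmt.Printf("%-30s %4.1f -> %4.1f  (%.0f%% lower)\n",
			r.name, r.before, r.after, reductionPct(r.before, r.after))
	}
}
```

With these inputs the drops come out at roughly 80% for api-server cpu, 37% for api-server memory, and 53% for both etcd cpu and etcd memory.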

Comment 4 errata-xmlrpc 2023-01-17 19:54:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

