Bug 1906151 - High CPU/Memory usage of Kube API server following a CNV installation
Summary: High CPU/Memory usage of Kube API server following a CNV installation
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Networking
Version: 2.4.3
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 4.10.0
Assignee: Quique Llorente
QA Contact: Meni Yakove
URL:
Whiteboard:
Depends On: 1845558
Blocks:
 
Reported: 2020-12-09 18:25 UTC by Sai Sindhur Malleni
Modified: 2022-03-16 15:51 UTC (History)
CC: 17 users

Fixed In Version: cluster-network-addons-operator-container-v4.9.0-18
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-16 15:50:56 UTC
Target Upstream Version:
Embargoed:


Attachments
CPU usage (876.87 KB, image/png)
2020-12-09 18:25 UTC, Sai Sindhur Malleni
Memory Usage (310.49 KB, image/png)
2020-12-09 18:26 UTC, Sai Sindhur Malleni
CPU consumption from comment3 (666.51 KB, image/png)
2020-12-14 16:32 UTC, Sai Sindhur Malleni


Links:
Red Hat Product Errata RHSA-2022:0947 (last updated 2022-03-16 15:51:05 UTC)

Description Sai Sindhur Malleni 2020-12-09 18:25:57 UTC
Created attachment 1737980 [details]
CPU usage

Description of problem:
I'm not really convinced this falls under the installation component, but for lack of a better place I will open it against that for now. Please feel free to move it where it needs to be.

Immediately following the install of CNV on a 500-node baremetal cluster (3 masters + 497 workers), we see 3511 CNV-related pods brought up, which is quite a lot. Along with that, there is increased consumption of kube-apiserver memory and CPU coinciding with the install of CNV, and it continues to be high. It seems like we need to investigate whether these pods are doing any unwanted polling or holding too many watches.

I will update the bug with more info as I find. Also happy to provide access to the cluster while it's up.

Version-Release number of selected component (if applicable):
2.4.3 (OCP 4.6.4)

How reproducible:
100%, following a CNV install on a large-scale environment.

Steps to Reproduce:
1. Install a large OCP cluster (500 nodes)
2. Install CNV
3.

Actual results:

API server on all masters is overloaded

Expected results:

The increase in load on API server should not be that significant
Additional info:
[kni@e16-h18-b03-fc640 ansible]$  oc get csv -n openshift-cnv
NAME                                      DISPLAY                    VERSION   REPLACES                                  PHASE
kubevirt-hyperconverged-operator.v2.4.3   OpenShift Virtualization   2.4.3     kubevirt-hyperconverged-operator.v2.4.2   Installing
[kni@e16-h18-b03-fc640 ansible]$ oc get ds
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                   AGE
bridge-marker                  498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
hostpath-provisioner           495       495       495     495          495         <none>                          4h32m
kube-cni-linux-bridge-plugin   498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
kubevirt-node-labeller         495       495       495     495          495         <none>                          4h57m
nmstate-handler                498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
ovs-cni-amd64                  498       498       498     498          498         beta.kubernetes.io/arch=amd64   4h57m
virt-handler                   495       495       495     495          495         <none>                          4h53m


In the attached pictures, you can see around 13:30 when the CNV install first happened, API server CPU/Memory usage increased and continued to be high.

Comment 1 Sai Sindhur Malleni 2020-12-09 18:26:20 UTC
Created attachment 1737981 [details]
Memory Usage

Comment 2 Roman Mohr 2020-12-10 09:50:16 UTC
I would be happy to have a closer look at the cluster. Could you share the connection details with me over a private channel?

Comment 3 Sai Sindhur Malleni 2020-12-14 16:24:03 UTC
Here are a few observations:

Following a CNV install on this 499 worker node + 3 master node environment, there is a significant increase in the CPU and memory each kube-apiserver process consumes (the CNV operator install was initiated around 15:35 UTC):

CPU: https://snapshot.raintank.io/dashboard/snapshot/u7RdkJVT1sstpAGcZX03YQH1isjR68aw
Memory: https://snapshot.raintank.io/dashboard/snapshot/E46bUXkEee2zGNYzSj5uxJGTijAZyIHa

Looking at the API server requests, there is an obvious increase in the number of operations against the API server.

Requests rate increase: https://snapshot.raintank.io/dashboard/snapshot/0EJ1jYcAzZlkHzK50k6swD9e68EGWSzh?orgId=2

Also there is an increase in the total number of long-running requests (watches) across the three API server instances: 
https://snapshot.raintank.io/dashboard/snapshot/yNbi5YNhr445B461UWiyha9UzKxsrvfx

Breaking down the watches by instance and resource:
https://snapshot.raintank.io/dashboard/snapshot/MyQshIldkTfX7Nh2eqcYtq0EcXUdp00k
It seems like there is a big increase in the number of watches on secrets and configmaps.
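For reference, a rough command-line way to see the same watch counts straight from the apiserver metrics (this assumes the apiserver_longrunning_gauge metric exposed by the Kubernetes version in OCP 4.6; metric names vary across releases):

```
# long-running requests (mostly watches) currently registered, broken down by resource
oc get --raw /metrics | grep apiserver_longrunning_gauge | grep -E 'resource="(secrets|configmaps)"'
```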

Following the CNV install, the 99th percentile of the response size from the apiserver also shoots up drastically:
https://snapshot.raintank.io/dashboard/snapshot/LGyj3T0vMGtkwX8UdHrOE18ZBefq0CI5

Comment 4 Sai Sindhur Malleni 2020-12-14 16:32:42 UTC
Created attachment 1739002 [details]
CPU consumption from comment3

Comment 5 Roman Mohr 2020-12-14 17:23:36 UTC
We debugged the cluster and saw that the nmstate-handler pods use a significant amount of CPU all the time.


```
NAME                                                 CPU(cores)   MEMORY(bytes)
nmstate-handler-s9jq6                                1186m        161Mi
nmstate-handler-pbjhp                                579m         123Mi
nmstate-handler-fgk45                                577m         128Mi
nmstate-handler-5x96n                                577m         120Mi
nmstate-handler-n7rp7                                575m         127Mi
nmstate-handler-j9g9h                                574m         133Mi
nmstate-handler-nm42f                                573m         120Mi
nmstate-handler-gtzmx                                570m         122Mi
nmstate-handler-chbrx                                569m         127Mi
nmstate-handler-rjd5l                                567m         155Mi

```

In contrast, here is the virt-handler CPU consumption:


```
virt-handler-tsmxt                                   8m           119Mi
virt-handler-jx5vz                                   6m           119Mi
virt-handler-rsz56                                   6m           116Mi
virt-handler-lsqjl                                   6m           119Mi
virt-handler-z2qtj                                   6m           119Mi
virt-handler-qdhll                                   5m           111Mi
virt-handler-dkqds                                   5m           115Mi
virt-handler-jg5rn                                   5m           119Mi

```


The api-server usage is very high:


```
NAME                      CPU(cores)   MEMORY(bytes)
kube-apiserver-master-0   29080m       21303Mi
kube-apiserver-master-1   28151m       10823Mi
kube-apiserver-master-2   27660m       8020Mi

```

We scaled down HCO and the network-addons-operator and then deleted the nmstate-handler daemonset.
The apiserver load dropped by about a third. Scaling back up made the load return.
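A rough sketch of the scale-down steps described above (the deployment and daemonset names are assumed from a typical openshift-cnv install and may differ on your cluster):

```
# stop the operators so they don't recreate the daemonset
oc scale deployment hco-operator --replicas=0 -n openshift-cnv
oc scale deployment cluster-network-addons-operator --replicas=0 -n openshift-cnv
oc delete daemonset nmstate-handler -n openshift-cnv

# to restore, scale the operators back up and let them recreate the daemonset
oc scale deployment cluster-network-addons-operator --replicas=1 -n openshift-cnv
oc scale deployment hco-operator --replicas=1 -n openshift-cnv
```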


There are a lot of 'GET: nodes' and 'UPDATE: nodenetworkstates:status' operations going on which seem to cause the load.
The nmstate-handler logs show nothing after the controller loop startup.

Comment 6 Roman Mohr 2020-12-14 17:25:06 UTC
Not sure if there is awareness of this and whether it is fixed in 2.5 or 2.6, but raising severity and reassigning to Networking.

Comment 7 Quique Llorente 2020-12-15 08:40:22 UTC
This is not new; it already appeared in https://bugzilla.redhat.com/show_bug.cgi?id=1845558, but it stopped happening and the BZ got closed.

The issue is the amount of CPU Python needs to start up; right now the NodeNetworkState is updated every 5 seconds. The solution is to run nmstatectl as a daemon, using varlink to communicate between kubernetes-nmstate and nmstatectl, so we don't pay the Python startup cost on every refresh. I did a little PoC and CPU usage dropped drastically.

Also, varlink support was backported to nmstate 0.3 with this in mind; let me prepare a PR with the fix.

Comment 8 Roman Mohr 2020-12-15 09:03:42 UTC
(In reply to Quique Llorente from comment #7)
> This is not new it appears already at
> https://bugzilla.redhat.com/show_bug.cgi?id=1845558, but it was not
> happening anymore and bz get closed
> 
> The issue is the amount of CPU needed by python to startup, right now every
> 5 seconds the NodeNetworkState is updated, the solution is to run nmstatectl
> as a daemon using varlink to communicate between kubernetes-nmstate and
> nmstatectl this way we don't pay for python startup, I did a little PoC and
> cpu usage dropped drastically.
> 

But does that not only address the high CPU usage? I would highly recommend not updating NodeNetworkState every 5 seconds.
Note that the high CPU usage of nmstate-handler is just what made me suspect that nmstate-handler is the reason for the high resource consumption on the apiserver.
In a 500-node setup this daemonset is responsible for 1/3 of the whole apiserver load.
  

> Also varlink support was backported to nmstate 0.3 with this in mind, let me
> prepare a PR with the fix.

Comment 9 Quique Llorente 2020-12-15 09:08:52 UTC
(In reply to Roman Mohr from comment #8)
> (In reply to Quique Llorente from comment #7)
> > This is not new it appears already at
> > https://bugzilla.redhat.com/show_bug.cgi?id=1845558, but it was not
> > happening anymore and bz get closed
> > 
> > The issue is the amount of CPU needed by python to startup, right now every
> > 5 seconds the NodeNetworkState is updated, the solution is to run nmstatectl
> > as a daemon using varlink to communicate between kubernetes-nmstate and
> > nmstatectl this way we don't pay for python startup, I did a little PoC and
> > cpu usage dropped drastically.
> > 
> 
> But does that not only address the hight CPU usage? I would highly recommend
> to not update NodeNetworkState every 5 seconds.
> Note that the high cpu-usage on nmstate-handler made me just suspect that
> nmstate-handler is the reason for the high resource consumption on the
> apiserver.
> In a 500 node setup this daemonset is responsible for 1/3 of the whole
> apiserver load.
> 

To reduce the traffic to the apiserver we can update NodeNetworkState only if the state has changed: we can store the last version
in memory and compare.
  
> 
> > Also varlink support was backported to nmstate 0.3 with this in mind, let me
> > prepare a PR with the fix.

Comment 10 Roman Mohr 2020-12-15 10:06:13 UTC
Thanks for putting it back to ASSIGNED. Let me highlight again that this bug is about the apiserver being under pressure; the high CPU usage of nmstate-handler was just a symptom that helped find the cause, and while it needs a fix, it is not the main issue.

Indeed something like

> To reduce the traffic to apiserver we can update NodeNetworkState only if state has changed, we can store last version 
at memory and compare.

will have to happen.

Comment 11 Sai Sindhur Malleni 2020-12-15 13:28:00 UTC
Echoing Roman's words here: while nmstate could be one of the reasons, there might be other reasons too. The fact that apiserver CPU usage is 8-10 cores without CNV and it shoots up to around 30 cores with CNV, only to fall back to 20-ish cores with nmstate disabled, hints at other potential causes.

Comment 12 Roman Mohr 2020-12-16 07:33:34 UTC
(In reply to Sai Sindhur Malleni from comment #11)
> Echoing Roman's words here, while nmstate could be one of the reasons, thre
> might be other reasons too. The fact that apiserver CPU usage is 8-10 cores
> without cnv and it shoot up to around 30 cores with cnv, only to fall back
> to 20ush cores with nmstate disabled, hints at other potential causes.

After further investigation, it is not clear what makes the apiserver still use 20 cores when nmstate is disabled.
We uninstalled CNV and it stayed around 20 cores. Sai and I agreed to take another look on a fresh scale cluster the next time they install CNV.

Comment 13 Quique Llorente 2020-12-17 10:30:07 UTC
I have put up a PR to reduce the number of GET Node and UPDATE NodeNetworkState requests: https://github.com/nmstate/kubernetes-nmstate/pull/664

Numbers of UPDATE NodeNetworkState:

cmd: ./cluster/kubectl.sh get --raw https://127.0.0.1:49267/metrics |grep nmstate |grep nodenetworkstates |grep apiserver |grep UPDATE |grep apiserver_request_total

Without the PR the number increases, so clearly NNSs are updated every 5 seconds; with the PR the number is static, so we have a winner here.



Numbers of GET Node:

cmd: ./cluster/kubectl.sh get --raw https://127.0.0.1:49267/metrics |grep nmstate |grep nodenetworkstates |grep apiserver |grep UPDATE |grep apiserver_request_total

The only metric that increases over time is

apiserver_request_total{code="200",component="apiserver",contentType="application/vnd.kubernetes.protobuf",dry_run="",group="",resource="nodes",scope="cluster",subresource="",verb="GET",version="v1"}

And curiously this is happening even without kubernetes-nmstate installed, and at the same frequency with or without the PR.


As for CPU usage, I don't have access to a huge cluster so I don't see much difference. On a huge cluster the numbers above will be more evident.

Comment 14 Quique Llorente 2020-12-17 10:31:45 UTC
(In reply to Quique Llorente from comment #13)
> I have put in place a PR to reduce the number of GET Node and UPDATE
> NodeNetworkState (https://github.com/nmstate/kubernetes-nmstate/pull/664)
> 
> 
> 
> 
> Numbers of UPDATE NodeNetworkState:
> 
> cmd: ./cluster/kubectl.sh get --raw https://127.0.0.1:49267/metrics |grep
> nmstate |grep nodenetworkstates |grep apiserver |grep UPDATE |grep
> apiserver_request_total
> 
> Without the PR the number increase so clearly NNSs are updated every 5
> seconds, but with the PR the number is static so we have a winner for this.
> 
> 
> 
> Numbers of GET Node:
> 
> cmd: ./cluster/kubectl.sh get --raw https://127.0.0.1:49267/metrics |grep
> nmstate |grep nodenetworkstates |grep apiserver |grep UPDATE |grep
> apiserver_request_total

Correction: ./cluster/kubectl.sh get --raw https://127.0.0.1:49267/metrics |grep nodes |grep apiserver |grep GET |grep apiserver_request_total

> 
> The only metric that increase over time is 
> 
> apiserver_request_total{code="200",component="apiserver",
> contentType="application/vnd.kubernetes.protobuf",dry_run="",group="",
> resource="nodes",scope="cluster",subresource="",verb="GET",version="v1"}
> 
> And curiously this is happening even without kubernetes-nmstate installed
> and as the same frequency as with or without the PR.
> 
> 
> About CPU usage I don't have access to a huge cluster so I don't see much
> difference. Also at a huge cluster the numbers above will be more evident.

Comment 15 Petr Horáček 2020-12-17 11:53:27 UTC
Israel, I hope you don't mind me moving QE of this to you as this issue is not limited to network and will require scale testing.

Comment 16 Israel Pinto 2020-12-17 13:33:24 UTC
On a CNV 2.6 cluster this is the memory and CPU usage (3 masters, 3 workers):

$ oc adm top pods -A | grep nmstate-handler
openshift-cnv                                      nmstate-handler-bdtq6                                             101m         35Mi            
openshift-cnv                                      nmstate-handler-brkj4                                             94m          39Mi            
openshift-cnv                                      nmstate-handler-gdbjv                                             115m         58Mi            
openshift-cnv                                      nmstate-handler-p2mjg                                             85m          25Mi            
openshift-cnv                                      nmstate-handler-sxjbd                                             110m         44Mi            
openshift-cnv                                      nmstate-handler-w75vt                                             110m         56Mi            

$ oc adm top pods -A | grep virt-api
openshift-cnv                                      virt-api-66cdcd649f-9wnxr                                         4m           102Mi           
openshift-cnv                                      virt-api-66cdcd649f-k65hd                                         5m           109Mi           

$ oc adm top pods -A | grep kube-apiserver
openshift-kube-apiserver                           kube-apiserver-ssp02-2mlw2-master-0                               153m         1299Mi          
openshift-kube-apiserver                           kube-apiserver-ssp02-2mlw2-master-1                               469m         2171Mi          
openshift-kube-apiserver                           kube-apiserver-ssp02-2mlw2-master-2                               222m         2048Mi          

$ oc adm top pods -A | grep virt-handler
openshift-cnv                                      virt-handler-26bjn                                                7m           104Mi           
openshift-cnv                                      virt-handler-65czz                                                3m           120Mi           
openshift-cnv                                      virt-handler-q2jcm                                                3m           116Mi

Comment 17 Quique Llorente 2020-12-23 10:14:17 UTC
Hi, a PR was merged in kubernetes-nmstate (https://github.com/nmstate/kubernetes-nmstate/pull/664); we have to test again with that fix in place.

Comment 18 Quique Llorente 2020-12-23 10:15:27 UTC
Also, the latest release of CNAO includes it: https://github.com/kubevirt/cluster-network-addons-operator/releases/tag/v0.44.0

Comment 19 Quique Llorente 2020-12-23 10:48:02 UTC
To test that the fix is in place we can monitor "/metrics" for the total number of NNS updates and also monitor the update timestamp in the NNS status.
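A rough sketch of those checks (resource short name and status field as described later in this BZ; adjust if they differ in your build):

```
# total UPDATE requests on NodeNetworkState seen by the apiserver; should stay flat while the node network is idle
oc get --raw /metrics | grep apiserver_request_total | grep nodenetworkstates | grep UPDATE

# last time the handler refreshed a given NodeNetworkState status (<node-name> is a placeholder)
oc get nns <node-name> -o jsonpath='{.status.lastSuccessfulUpdateTime}{"\n"}'
```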

Comment 20 Petr Horáček 2021-01-06 11:51:44 UTC
The nmstate resource consumption improvement is available on D/S. As mentioned in this BZ, nmstate was not the only cause of this issue. I'm moving this to QE to get results for the current D/S build and identify the next culprit.

Comment 21 Sai Sindhur Malleni 2021-01-06 17:48:55 UTC
Can someone let me know in which "channel" a CNV build with this fix is available?

Comment 23 Israel Pinto 2021-01-07 11:57:10 UTC
With CNV 2.6 + OCP 4.7 and kubernetes-nmstate-handler-container-v2.6.0-14,
nmstate-handler looks more relaxed on CPU usage:
$ oc adm top pods -A | grep nmstate-handler
openshift-cnv                                      nmstate-handler-5vlxm                                             110m         46Mi            
openshift-cnv                                      nmstate-handler-6fm9r                                             95m          26Mi            
openshift-cnv                                      nmstate-handler-6h9dm                                             93m          23Mi            
openshift-cnv                                      nmstate-handler-czb7n                                             77m          25Mi            
openshift-cnv                                      nmstate-handler-djcts                                             172m         25Mi            
openshift-cnv                                      nmstate-handler-k8b9h                                             67m          23Mi 

For the other API servers:
openshift-cnv namespace 
cdi-apiserver-58ccb6859d-6tbmv                        15m          21Mi            
virt-api-6598bfdcd6-cjqps                             1m           87Mi            
virt-api-6598bfdcd6-xvxkj                             4m           86Mi   

Outside of openshift-cnv:
openshift-apiserver                                apiserver-5dd646db98-kx59n                                        32m          250Mi           
openshift-apiserver                                apiserver-5dd646db98-qmdn6                                        15m          298Mi           
openshift-apiserver                                apiserver-5dd646db98-vcg5w                                        23m          287Mi           
openshift-apiserver-operator                       openshift-apiserver-operator-55fbff96-84ssh                       17m          145Mi           
openshift-kube-apiserver                           kube-apiserver-verify-26-jw5hm-master-0                           358m         2174Mi          
openshift-kube-apiserver                           kube-apiserver-verify-26-jw5hm-master-1                           71m          1425Mi          
openshift-kube-apiserver                           kube-apiserver-verify-26-jw5hm-master-2                           217m         1966Mi          
openshift-kube-apiserver-operator                  kube-apiserver-operator-5b574b9476-2vm74                          65m          154Mi           
openshift-machine-api                              cluster-autoscaler-operator-84c7c78f7c-ds7g7                      0m           60Mi            
openshift-machine-api                              cluster-baremetal-operator-77cf96f46b-5qcm5                       0m           71Mi            
openshift-machine-api                              machine-api-controllers-68cfd48-29f68                             1m           162Mi           
openshift-machine-api                              machine-api-operator-5c7456b446-qqcwt                             0m           62Mi            
openshift-oauth-apiserver                          apiserver-58b88b85c-2gjb8                                         3m           61Mi            
openshift-oauth-apiserver                          apiserver-58b88b85c-s8xsz                                         6m           72Mi            
openshift-oauth-apiserver                          apiserver-58b88b85c-xb4vh                                         6m           67Mi  

I see that kube-apiserver has the highest usage.
In the logs I see a lot of connection errors, see:
I0107 07:33:05.128468      21 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc0214a8980, {CONNECTING <nil>}
I0107 07:33:05.138002      21 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc0214a8980, {READY <nil>}
I0107 07:33:05.139040      21 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
I0107 07:33:06.259477      21 client.go:360] parsed scheme: "passthrough"
I0107 07:33:06.259527      21 passthrough.go:48] ccResolverWrapper: sending update to cc: {[{https://192.168.0.24:2379  <nil> 0 <nil>}] <nil> <nil>}
I0107 07:33:06.259536      21 clientconn.go:948] ClientConn switching balancer to "pick_first"
I0107 07:33:06.259644      21 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc023f546c0, {CONNECTING <nil>}
I0107 07:33:06.269286      21 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc023f546c0, {READY <nil>}
I0107 07:33:06.270159      21 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
W0107 07:33:07.424630      21 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
I0107 07:33:15.744069      21 client.go:360] parsed scheme: "endpoint"

It seems to be related to this BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1844448

Sai, do you plan to retest by deploying a large cluster (500 nodes)?

Comment 24 Israel Pinto 2021-01-07 16:36:27 UTC
Moving to verify.

Comment 25 Sai Sindhur Malleni 2021-01-08 14:56:03 UTC
I got a chance to test this at 500 node scale on OCP 4.6.4 with a 2.6 nightly of CNV

[kni@e16-h18-b03-fc640 clusterconfigs]$ oc describe sub/hco-operatorhub | grep 2.6
      UID:               261f3260-da9b-4a8a-889a-b5e1216fbbd3
  Current CSV:              kubevirt-hyperconverged-operator.v2.6.0
  Installed CSV:       kubevirt-hyperconverged-operator.v2.6.0


nmstate-handler pods are still hot on CPU, and overall kube-apiserver CPU usage is around 30 cores (per apiserver) with memory consumption around 40G per apiserver. Not really sure if nmstate-handler is the only culprit here, but the CPU consumption is still as high as noted earlier.

Comment 26 Sai Sindhur Malleni 2021-01-08 14:56:50 UTC
nmstate-handler top output
https://gist.github.com/smalleni/75a8974766e62c234da42beda25165f8

Comment 27 Sai Sindhur Malleni 2021-01-08 15:40:19 UTC
What is the right status to put this bug in to reopen it?

Comment 28 Quique Llorente 2021-01-11 09:28:09 UTC
(In reply to Sai Sindhur Malleni from comment #25)
> I got a chance to test this at 500 node scale on OCP 4.6.4 with a 2.6
> nightly of CNV
> 
> [kni@e16-h18-b03-fc640 clusterconfigs]$ oc describe sub/hco-operatorhub |
> grep 2.6
>       UID:               261f3260-da9b-4a8a-889a-b5e1216fbbd3
>   Current CSV:              kubevirt-hyperconverged-operator.v2.6.0
>   Installed CSV:       kubevirt-hyperconverged-operator.v2.6.0
> 
> 
> nmstate-handler pods are still hot on CPU and over kube-apiserver CPU usage
> is around 30 cores (per apiserver) and memory consumption is around 40G per
> apiserver. Not really sure if nmstate-handler is the only culprit here but
> the CPU consumption is still high as noted earlier.

Can you do something similar to what was done in https://bugzilla.redhat.com/show_bug.cgi?id=1906151#c13 so we can verify that the number of updates on NodeNetworkState is reduced? In the end that is more or less what has been done.

Comment 29 Quique Llorente 2021-01-11 12:25:18 UTC
@smalleni Can you also verify on the 500-worker cluster that the NodeNetworkState is not being updated? There is a "lastSuccessfulUpdateTime" field in "status" that gets refreshed every time a change in the node network is detected.

Comment 30 Quique Llorente 2021-01-12 09:24:47 UTC
We are preparing a version with a 1-minute NNS refresh period, to see if it helps, as part of the effort for 2.6: https://github.com/nmstate/kubernetes-nmstate/pull/682.

Comment 31 Sai Sindhur Malleni 2021-01-13 17:31:38 UTC
Removing needinfo for now as I clarified over email/chat that the environment is currently not around. A dedicated CNV scale test might be the best way to reproduce this and dig deeper.

Comment 32 Petr Horáček 2021-01-22 08:12:11 UTC
We tried to get the CPU consumption of nmstate even lower.

Please check whether the nmstate issue persists on scaled-up clusters and whether there is any other process consuming too much.

Comment 33 Petr Horáček 2021-01-22 08:36:58 UTC
Moving back to dev to clarify a couple of questions and see if we can make it even better.

Comment 34 Petr Horáček 2021-01-22 15:56:28 UTC
We identified other issues in kubernetes-nmstate's controller mechanism that will cause performance issues on midsize and big clusters. They are rooted deeply in the code and will require serious changes; we won't be able to address them in 2.6. I believe that this specific nmstate issue is not a regression introduced in 2.6.

I suggest deferring this to 4.8.

Comment 36 Sai Sindhur Malleni 2021-01-25 20:12:49 UTC
Petr,

This bug was identified by me in our internal scale lab testing, if that helps. I'm more than happy to try out the bits, but hardware resources are in high contention and we are currently using the cluster for some upgrade testing, so I will try to identify a slot to try this. Do I have to be on OCP 4.7 to try CNV 2.6 nightlies, or is 4.6 fine too? And the way to get these fixes is through the CNV nightly operator, correct?

I will leave a needinfo on me.

Comment 37 Petr Horáček 2021-01-28 10:48:43 UTC
4.7 would be strongly preferred. It may work fine with 4.6, but there is a chance we would hit some incompatibility and I would not like to waste your time asking for suboptimal and risky solutions.

Yes, the nightly operator should deliver our partial fixes.

Comment 38 Roman Mohr 2021-01-28 11:12:12 UTC
(In reply to Petr Horáček from comment #34)
> We identified other issues in kubernetes-nmstate's controllers mechanism
> that will cause performance issues on midsize and big clusters. They are
> rooted deeply in the code and will require serious changes, we won't be able
> to address them in 2.6. I believe that this specific nmstate issue is not a
> regression introduced in 2.6.
> 
> I suggest to defer this to 4.8.

Let's be a little more specific to help the scale people understand what is going on: the issue is that nmstate delivers a daemonset which starts pods on every node, and every pod on every node watches all nodes.
Considering that kubevirt does 1 or 2 node updates per minute and the kubelet by default does 6 updates per minute on every node, we transfer every node object to every nmstate pod 6-8 times per minute.
That is where the main issue is coming from. To do the math: let's suppose 500 nodes, with 400 nmstate pods, and default node update intervals. We have 500 * 400 * 7 node objects read from the apiserver per minute, which makes 14 million objects transferred per minute or 23 000 node objects transferred per second.

Comment 39 Roman Mohr 2021-01-28 11:13:15 UTC
> 14 million objects transferred per minute or 23 000 node objects transferred
> per second.

Sorry 1.4 million objects.
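(Restated: 500 nodes * 400 nmstate pods * 7 updates per node per minute = 1,400,000 node objects per minute, which still works out to roughly 23,000 node objects transferred per second.)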

Comment 40 Quique Llorente 2021-02-03 09:22:54 UTC
(In reply to Sai Sindhur Malleni from comment #3)
> Also there is an increase in the total number of long-running requests
> (watches) across the three API server instances: 
> https://snapshot.raintank.io/dashboard/snapshot/
> yNbi5YNhr445B461UWiyha9UzKxsrvfx


> 
> breaking down the watches by instance and resource
> https://snapshot.raintank.io/dashboard/snapshot/
> MyQshIldkTfX7Nh2eqcYtq0EcXUdp00k
> Seems like there is a big increase in the number of watches on secrets and
> configmaps

Apart from that, I have seen that since we mount a configmap on every handler pod for the FILTER_INTERFACES env var, we have one watcher at the apiserver per node, so that would be 500 watchers just for that on the big cluster.

Also, the service account token is watched by every kubernetes-nmstate-handler since the pod mounts it, so that makes 500 secret watchers there.

For the first one we can easily put a solution in place, but the second one is going to be there whenever the pod has a service account.
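A rough way to confirm these watcher counts on the cluster (assuming the standard apiserver_registered_watchers metric is available on this version):

```
# currently registered watchers per kind, as seen by the apiserver
oc get --raw /metrics | grep apiserver_registered_watchers | grep -E 'kind="(ConfigMap|Secret)"'
```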

Comment 42 Quique Llorente 2021-02-04 12:57:00 UTC
(In reply to Roman Mohr from comment #38)
> (In reply to Petr Horáček from comment #34)
> > We identified other issues in kubernetes-nmstate's controllers mechanism
> > that will cause performance issues on midsize and big clusters. They are
> > rooted deeply in the code and will require serious changes, we won't be able
> > to address them in 2.6. I believe that this specific nmstate issue is not a
> > regression introduced in 2.6.
> > 
> > I suggest to defer this to 4.8.
> 
> Lets be a little bit more specific to help the scale people understand what
> is going on: The issue is that nmstate delivers a daemonset which starts
> pods on every node. And every pod on every node watches all nodes.
> Considering that kubevirt does 1 or 2 node updates per minute and the
> kubelet by default does 6 updates per minute on every node, we transfer
> every node object to every nmstate pod 6-8 times per minute.
> That is where the main issue is coming from. To do the match: Let's suppose
> 500 nodes, with 400 nmstate pods and default node update intervals, we have
> : 500 * 400 * 7 node objects read from the apiserver per minute, which makes
> 14 million objects transferred per minute or 23 000 node objects transferred
> per second.

Just to verify this I have put in place some traces in the controller-runtime vendor code with https://github.com/nmstate/kubernetes-nmstate/pull/694

And it's clear that all handlers receive all the Nodes and all the NodeNetworkStates (NodeNetworkConfigurationPolicies do not increase with cluster size).
[ellorent@localhost kubernetes-nmstate]$ ./cluster/kubectl.sh logs -n nmstate nmstate-handler-ct5vh |grep event 
{"level":"info","ts":1612442369.534102,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442369.5341187,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442369.5341403,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442369.534188,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442369.534203,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442369.534318,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442375.4340763,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442375.4341297,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442375.4341464,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442376.5333335,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442376.5543258,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442376.5719728,"logger":"controller-runtime.source.EventHandler.OnAdd","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442376.5807424,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node01","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442397.385236,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442397.3853154,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}
{"level":"info","ts":1612442397.3853505,"logger":"controller-runtime.source.EventHandler.OnUpdate","msg":"event received","name":"node02","kind":"&TypeMeta{Kind:,APIVersion:,}"}

Reading the controller-runtime cache code, it creates an informer for each "Watch" source with a resync of 10 hours, so what you say makes total sense.

Comment 43 Petr Horáček 2021-02-08 13:24:53 UTC
We are currently investigating this issue on a performance cluster. This won't make it to 2.6, deferring to the upcoming 4.8.

Comment 44 Quique Llorente 2021-02-16 09:30:59 UTC
So we have access to a 100-node cluster, and Sarah and Rai installed CNV 2.4 and Grafana on it.

Below are the metrics, with the following events:

- 22:00 UTC 02/14/2021 - Installation of CNV 2.4
- 15:00 UTC 02/15/2021 - Deactivation of CNAO and kubernetes-nmstate

CPU: https://snapshot.raintank.io/dashboard/snapshot/kAV0sUY0Ptl1HOIhg5RTuAvmr2e9RERP?orgId=2

Memory: https://snapshot.raintank.io/dashboard/snapshot/WFJ2Xwfjo5Prr5i7GOyMhgpZ8DgoZ7jB

Long running request (watchers): https://snapshot.raintank.io/dashboard/snapshot/8srTE6TsJNJwnzGgJAmM35oyVnpyTqQ3

Response size: https://snapshot.raintank.io/dashboard/snapshot/Ur2JgjaNUjDsXgDf0m46foqkwa6iA4W7

Request rate (read vs write): https://snapshot.raintank.io/dashboard/snapshot/Ur2JgjaNUjDsXgDf0m46foqkwa6iA4W7

Network (total number of bytes sent and received by gRPC clients): https://snapshot.raintank.io/dashboard/snapshot/wTGfSgYTkZA0bzk5E6g18iiXzrcFVz9z

Client traffic in: https://snapshot.raintank.io/dashboard/snapshot/Xb6XEVieFppz6i2nGrtmNqt49gVdLhRI

Peer traffic in: https://snapshot.raintank.io/dashboard/snapshot/jqrQBCE99rnGhBn5Ngb38JYRSQ3I9IMl

Peer traffic out: https://snapshot.raintank.io/dashboard/snapshot/wTGfSgYTkZA0bzk5E6g18iiXzrcFVz9z

Network utilization at worker: https://snapshot.raintank.io/dashboard/snapshot/Ks6Z8k0zxW4FcyrmpxKFeLLWKYYcpVk2


Top:

[kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-cnv -l component=kubernetes-nmstate-handler
NAME                    CPU(cores)   MEMORY(bytes)   
nmstate-handler-2699f   138m         35Mi            
nmstate-handler-2fh29   121m         37Mi            
nmstate-handler-2hw2t   123m         36Mi            
nmstate-handler-2lhvc   110m         35Mi            
nmstate-handler-2psph   119m         37Mi            
nmstate-handler-44q5x   87m          39Mi            
nmstate-handler-477bb   119m         38Mi            
nmstate-handler-4cjsv   162m         38Mi            
nmstate-handler-4t2xw   122m         37Mi            
...


[kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-cnv -l kubevirt.io=virt-handler
NAME                 CPU(cores)   MEMORY(bytes)   
virt-handler-28cvv   0m           80Mi            
virt-handler-29w29   0m           77Mi            
virt-handler-2blb4   0m           79Mi            
virt-handler-2cp2z   0m           76Mi            
virt-handler-2kv6j   0m           77Mi            
virt-handler-2wdlz   0m           78Mi            
virt-handler-4cg5p   0m           85Mi            


api server with knmstate running

kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   359m         2152Mi          
kube-apiserver-master-1   1802m        3078Mi          
kube-apiserver-master-2   1586m        3620Mi          

api server without knmstate running

[kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   93m          2174Mi          
kube-apiserver-master-1   1284m        3134Mi          
kube-apiserver-master-2   2373m        3543Mi          


The good thing is that we are able to reproduce the issue with a smaller cluster; the bad news is that
kubernetes-nmstate looks like part of the problem, but not all of it.

So from the kubernetes-nmstate side we have a couple of big issues:
- The refresh of NodeNetworkState: this has been mitigated in CNV 2.6 by reducing the frequency to 1 minute (but forcing a refresh when an NNCP is applied) and comparing before updating the NNS.
- The controller-runtime cache does not filter by field or label, so all the nmstate-handlers retrieve all the nodes/NNS every time they change, or after 10 hours; to fix this we have a couple of PoCs we want to test.

I suppose Sai will be able to add much more info to this.

Comment 45 Quique Llorente 2021-02-16 11:38:25 UTC
I have continued looking at the issue without kubernetes-nmstate, and it looks like the number of "GET" operations for nodes is quite high, so I have taken a look at daemonsets that do something with nodes, like bridge-marker.

So, after disabling bridge-marker (11:00 UTC 02/16/2021), the number of GET operations on nodes drops:

https://snapshot.raintank.io/dashboard/snapshot/S0LGUWt3T53PetORZRkz0lhTKBVGG1r8

Although the apiservers are not affected that much:

[kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   79m          2328Mi          
kube-apiserver-master-1   1062m        3154Mi          
kube-apiserver-master-2   2552m        3638Mi          


And the same thing happens with ovs-cni, since it's the analogue of bridge-marker but for OVS bridges.

After removing it (11:30 UTC 02/16/2021) we have a drop in node GET operations:
https://snapshot.raintank.io/dashboard/snapshot/kjEjlIE5nVD3HJAiB6NDmEgp9PMgGPLj

But still not much difference on the apiservers:
[kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   90m          2258Mi          
kube-apiserver-master-1   1169m        3147Mi          
kube-apiserver-master-2   2435m        3573Mi          

Network graphs for the etcd cluster do not show much difference.

Comment 46 Quique Llorente 2021-02-16 12:57:44 UTC
Checking which API requests stay at the top, the next candidate is configmaps.

After killing virt-handler we get a drop in GET configmaps:
https://snapshot.raintank.io/dashboard/snapshot/LQYwBEMfslWxlMovjIKpy3MDKiY8uh16


The apiserver continues with the same load/memory:
[kni@f08-h17-b01-5039ms ~]$ oc adm top pods -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   109m         2342Mi          
kube-apiserver-master-1   1191m        3142Mi          
kube-apiserver-master-2   2179m        3575Mi

Comment 47 Sai Sindhur Malleni 2021-02-16 17:09:39 UTC
One thing to note is that after the CNV install the 99th percentile API response size becomes very large. Do we know why? Are there requests that return a large amount of data now?

This graph captures the increase in API server CPU usage after the install of CNV around 22:00 UTC.

Theoretically, if we uninstall CNV completely, the API server usage should go back to previous levels; is that happening?

Comment 48 Quique Llorente 2021-02-17 07:10:11 UTC
I also see that the response size does not change after disabling knmstate; it has to be something else.

Let's start tackling what we know:

- Try to check the latest knmstate improvements, either with CNV 2.6 or CNV 4.8 (I think the latter is not ready yet)
- Deploy the kubernetes-nmstate PoC that refreshes NNS from a Deployment instead of a DaemonSet, so we don't face the N^2 issue retrieving nodes.

We also have to improve bridge-marker and ovs-cni so they don't retrieve Nodes so often, but this is not as impactful.

The other little issue is that we have some configmap GETs from either virt-operator or virt-handler, but we haven't found where.

Comment 49 Quique Llorente 2021-02-17 07:18:43 UTC
I am going to try to deploy the u/s kubernetes-nmstate version with the PoC on top of the OCP cluster we have, so we get results earlier.

Comment 50 Quique Llorente 2021-02-17 11:13:38 UTC
I have been able to deploy the latest u/s kubernetes-nmstate with the latest fixes, implementing the PoC
to refresh NNS from a deployment (Wed Feb 17 11:05:38 UTC 2021).

Numbers look better in top:

nmstate-handler-v4vq8               0m           24Mi            
nmstate-handler-w8ngq               12m          24Mi            
nmstate-handler-wl5pz               12m          24Mi            
nmstate-handler-wmv6g               0m           23Mi            
nmstate-handler-xgrnt               12m          24Mi            
nmstate-handler-z74g5               0m           26Mi            
nmstate-handler-zfvhx               12m          26Mi            
nmstate-handler-zpgr9               0m           24Mi            
nmstate-monitor-9489cd4f9-6dcfs     26m          225Mi           
nmstate-operator-665949d7cb-5k8wg   0m           31Mi            
nmstate-webhook-6548b4b9d9-jd5l8    0m           33Mi            
nmstate-webhook-6548b4b9d9-pp4kh    1m           69Mi

Now the workers are doing less and all the load is in "nmstate-monitor"; the 12m at "nmstate-handler" is related to "nmstate-monitor" calling "nmstatectl show" there.

The apiserver looks better too

[kni@f08-h17-b01-5039ms knmstate]$ kubectl top pod -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   122m         1839Mi          
kube-apiserver-master-1   1335m        2778Mi          
kube-apiserver-master-2   2732m        3708Mi          
          
@smalleni Can you characterize the cluster with this installed ?

Comment 51 Sai Sindhur Malleni 2021-02-19 17:32:35 UTC
OK, just checked on the cluster; it seems like nmstate-handler isn't running now. While the nmstate-handler-induced CPU/network busyness is definitely solved, having a CNV install on the cluster still seems to add some load on the API server. While not so pronounced on the 100-node cluster, this might cause problems on larger clusters. We also need to define what % increase in API server load is acceptable with CNV.

Comment 52 David Vossel 2021-02-19 21:58:57 UTC
> While definitely the nmstate-handler induced CPU/network business is solved, having a CNV install on the cluster still seems to add some load on the API server. 

Yeah, I think we have more issues. I noticed recently that when introducing new VMIs into a cluster, every virt-handler is watching every VMI. That's a global VMI watcher on every node in the cluster that a VMI can run on. This will likely impact performance on large clusters as the density of VMIs increases. For instance, if we introduce a single VMI, then every virt-handler on every node is sent that VMI. Instead, only the virt-handler on the node the VMI is scheduled to should receive that information.

There's a PR here to address that specific issue upstream. https://github.com/kubevirt/kubevirt/pull/5060
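If useful for verification, a rough check along the same lines as the earlier metrics queries (again assuming the apiserver_registered_watchers metric is available):

```
# registered watchers on VMIs; with the fix this should no longer scale with the number of nodes
oc get --raw /metrics | grep apiserver_registered_watchers | grep -i virtualmachineinstance
```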

> We also need to define what % of increase in API server load is acceptabled with CNV.

Does it make sense to define this at both an idle state and at load? At an idle state, my expectation is our controllers should only result in a slight increase of API calls. Under load, I'd want to see a linear progression that correlates to how many VMIs are introduced into the system.

Comment 53 Roman Mohr 2021-02-20 07:51:32 UTC
(In reply to David Vossel from comment #52)
> > While definitely the nmstate-handler induced CPU/network business is solved, having a CNV install on the cluster still seems to add some load on the API server. 
> 
> Yeah, i think we have more issues. I noticed recently that when introducing
> new VMIs into a cluster, that every virt-handler is watching every VMI.
> That's a global vmi watcher on every node in the cluster that a vmi can run
> on. This will likely impact performance on large clusters as the density of
> VMIs increase. For instance, if we introduce a single VMI then every virt
> handler on every node is sent that VMI. Instead only the virt-handler
> present on the node the VMI is scheduled to should receive that information.
> 
> There's a PR here to address that specific issue upstream.
> https://github.com/kubevirt/kubevirt/pull/5060

Whoa, that is even worse than the nmstate issue. I would propose porting that to 2.5 and 2.6. It is a very simple fix.

> 
> > We also need to define what % of increase in API server load is acceptabled with CNV.
> 
> Does it make sense to define this at both an idle state and at load? At an
> idle state, my expectation is our controllers should only result in a slight
> increase of API calls. Under load, I'd want to see a linear progression that
> correlates to how many VMIs are introduced into the system.

For me that makes absolute sense.

Comment 54 Quique Llorente 2021-02-22 07:10:41 UTC
(In reply to Roman Mohr from comment #53)
> (In reply to David Vossel from comment #52)
> > > While definitely the nmstate-handler induced CPU/network business is solved, having a CNV install on the cluster still seems to add some load on the API server. 
> > 
> > Yeah, i think we have more issues. I noticed recently that when introducing
> > new VMIs into a cluster, that every virt-handler is watching every VMI.
> > That's a global vmi watcher on every node in the cluster that a vmi can run
> > on. This will likely impact performance on large clusters as the density of
> > VMIs increase. For instance, if we introduce a single VMI then every virt
> > handler on every node is sent that VMI. Instead only the virt-handler
> > present on the node the VMI is scheduled to should receive that information.
> > 
> > There's a PR here to address that specific issue upstream.
> > https://github.com/kubevirt/kubevirt/pull/5060
> 
> Whoa, that is even worse than the nmstate issue. I would propose porting
> that to 2.5 and 2.6. It is a very simple fix.
> 
> > 
> > > We also need to define what % of increase in API server load is acceptabled with CNV.
> > 
> > Does it make sense to define this at both an idle state and at load? At an
> > idle state, my expectation is our controllers should only result in a slight
> > increase of API calls. Under load, I'd want to see a linear progression that
> > correlates to how many VMIs are introduced into the system.
> 
> For me that makes absolute sense.

But in this bugzilla we have to focus on the idle state and try to understand what else we may need to fix. Roman also
pointed to https://github.com/kubevirt/kubevirt/pull/5061, but I have stopped virt-handler and the difference is not
that much.

Comment 55 Quique Llorente 2021-02-22 07:33:23 UTC
(In reply to Sai Sindhur Malleni from comment #51)
> OK just checked on the cluster, it seems like nmstate-handler isn't running
> now. While definitely the nmstate-handler induced CPU/network business is
> solved, having a CNV install on the cluster still seems to add some load on
> the API server. While not so pronounced on the 100 node cluster, this might
> cause problems on larger clusters. We also need to define what % of increase
> in API server load is acceptabled with CNV.

The nmstate pods are running under a different namespace, "nmstate"; that's why you don't see them.
It's the u/s nmstate project.

Comment 56 Quique Llorente 2021-02-22 12:10:41 UTC
After some investigation, the kubernetes-nmstate PoC gives us better results on CPU/memory, but it increases the request duration time,
since the network configuration is passed from the handler to the monitor.

The second PoC hacks the controller-runtime cache so it filters by field name; it was deployed at Mon Feb 22 12:07:13 UTC 2021.

Top is similar to the first PoC:
[kni@f08-h17-b01-5039ms manifest-poc-nmstate-monitor]$ kubectl top pod -n openshift-kube-apiserver
NAME                      CPU(cores)   MEMORY(bytes)   
kube-apiserver-master-0   162m         2446Mi          
kube-apiserver-master-1   979m         3332Mi          
kube-apiserver-master-2   2354m        3821Mi          

And the nmstate numbers look similar:
nmstate-handler-vvjgt               12m          25Mi            
nmstate-handler-w56pt               0m           23Mi            
nmstate-handler-x4q5r               0m           29Mi            
nmstate-handler-xbgtm               0m           23Mi            
nmstate-handler-xh2qw               12m          25Mi            
nmstate-handler-xpvpt               12m          26Mi            
nmstate-handler-xq95g               0m           25Mi            
nmstate-handler-zc7g6               13m          26Mi            
nmstate-handler-znmrd               0m           25Mi            
nmstate-operator-665949d7cb-bcxtn   0m           28Mi            
nmstate-webhook-6548b4b9d9-9k87s    2m           68Mi            
nmstate-webhook-6548b4b9d9-g7ggb    0m           35Mi

Comment 57 Quique Llorente 2021-02-22 12:26:58 UTC
Also disabling virt-handler at Mon Feb 22 12:26:25 UTC 2021

Comment 58 Quique Llorente 2021-02-22 15:11:50 UTC
enabling virt-handler at Mon Feb 22 15:11:20 UTC 2021, not much differences.

Comment 62 Quique Llorente 2021-02-23 11:44:25 UTC
After removing HCO and the operators, it looks like the cluster does not go back to the request duration it had before installing CNV 2.4:

request duration read vs write: https://snapshot.raintank.io/dashboard/snapshot/xgmamp3xavxd7Igp1LYr0kOtrTPGIpJV?orgId=2

Maybe the monitoring is adding some noise?

Comment 64 Sai Sindhur Malleni 2021-03-31 17:47:20 UTC
Removing testblocker as we are not actively testing this scenario currently; we will get back to it when the full set of changes is in. I am adding the CNV perf/scale team to the CC list.

Comment 71 Yossi Segev 2021-12-23 09:44:20 UTC
Petr,

The bug was retargeted to 4.10, but the fixed-in-version is CNAO v4.9.0-18.
Before attempting to verify it, I would like to ensure that the fix is available in a 4.10 CNAO.
If it is, please specify exactly in which version, so we know we are attempting to verify on a correct cluster.

Thanks,
Yossi

Comment 72 Dan Kenigsberg 2021-12-28 08:45:07 UTC
Yossi, the code is already in 4.9.0. The bug was postponed to 4.10 time frame only due to QE capacity. Please verify it with the 500-node cluster currently used by Guy Chen.

Comment 73 Yossi Segev 2021-12-28 09:12:19 UTC
> Yossi, the code is already in 4.9.0. The bug was postponed to 4.10 time frame only due to QE capacity. Please verify it with the 500-node cluster currently used by Guy Chen.

My question still stands: the bug is targeted to 4.10, so I need to know in which CNAO *4.10.x-y* version the fix is included.
I guess it's not available from CNAO 4.10.0-0 and up, and I wouldn't want to waste any time checking a version which doesn't include the fix.

Comment 74 Yossi Segev 2021-12-28 12:21:55 UTC
Verified on a 500-node cluster from CNV SSP QE.
CNV Version: 4.9.1
CNAO version: v4.9.1-9

nmstate-handler, which was found to be responsible for the heavy load on kube-apiserver, now shows reasonable CPU values:
[kni@e16-h12-b02-fc640 ~]$ kubectl top pods -nopenshift-cnv | grep nmstate-handler
NAME                                                 CPU(cores)   MEMORY(bytes)
nmstate-handler-24h2l                                 2m           48Mi            
nmstate-handler-24lqv                                 35m          60Mi            
nmstate-handler-257hq                                 18m          52Mi            
nmstate-handler-2626d                                 18m          60Mi            
nmstate-handler-294b5                                 17m          62Mi            
nmstate-handler-2knms                                 18m          54Mi            
nmstate-handler-2lnsv                                 33m          47Mi            
nmstate-handler-2pkv4                                 20m          50Mi            
nmstate-handler-2ppmn                                 2m           52Mi            

in comparison to the values presented in comment #5.

However, kube-apiserver still shows high CPU levels.
[kni@e16-h12-b02-fc640 ~]$ kubectl top pods -A | grep kube-apiserver
NAMESPACE                                          NAME                                                         CPU(cores)   MEMORY(bytes)
openshift-kube-apiserver                           kube-apiserver-master-0                                      23724m       28373Mi         
openshift-kube-apiserver                           kube-apiserver-master-1                                      21347m       26880Mi         
openshift-kube-apiserver                           kube-apiserver-master-2                                      14941m       27082Mi         

We will now check whether kube-apiserver still consumes a large amount of CPU after uninstalling CNV; if it doesn't, we will open a new bug.

Comment 79 errata-xmlrpc 2022-03-16 15:50:56 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947

