1920995 – kuryr-cni pods using unreasonable amount of CPU

Summary: kuryr-cni pods using unreasonable amount of CPU

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Networking
Sub Component:
Version:	4.7
Hardware:	All
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.6.z
Assignee:	Michał Dulko
QA Contact:	GenadiC
Docs Contact:
URL:
Whiteboard:
Depends On:	1920481
Blocks:

Reported:	2021-01-27 11:45 UTC by OpenShift BugZilla Robot
Modified:	2024-06-14 00:03 UTC
CC List:	2 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2021-02-08 13:51:26 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift kuryr-kubernetes pull 440	0	None	closed	[release-4.6] Bug 1920995: Decrease CPU usage of Prometheus exporter	2021-02-10 16:39:38 UTC
Red Hat Product Errata	RHSA-2021:0308	0	None	None	None	2021-02-08 13:51:57 UTC

Comment 3 rlobillo 2021-01-28 11:22:43 UTC

Verified on 4.6.0-0.nightly-2021-01-28-042639 over OSP16.1 (RHOS-16.1-RHEL-8-20201214.n.3) with OVN-Octavia.

The Prometheus Exporter worker process consumes ~0% CPU in the kuryr-cni pods while the cluster is idle, and the Prometheus service still responds normally.

$ oc get pods -n openshift-kuryr -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP             NODE                          NOMINATED NODE   READINESS GATES
kuryr-cni-d8t7r                    1/1     Running   0          47m   10.196.3.114   ostest-mjts6-master-2         <none>           <none>
kuryr-cni-fbwtb                    1/1     Running   0          27m   10.196.2.57    ostest-mjts6-worker-0-2z569   <none>           <none>
kuryr-cni-gf2td                    1/1     Running   0          47m   10.196.0.57    ostest-mjts6-master-1         <none>           <none>
kuryr-cni-k9vfz                    1/1     Running   0          26m   10.196.0.162   ostest-mjts6-worker-0-n9bbl   <none>           <none>
kuryr-cni-l88nt                    1/1     Running   0          47m   10.196.2.163   ostest-mjts6-master-0         <none>           <none>
kuryr-cni-nrwbs                    1/1     Running   0          25m   10.196.0.216   ostest-mjts6-worker-0-tnz55   <none>           <none>
kuryr-controller-ddb697794-69b8l   1/1     Running   1          47m   10.196.2.163   ostest-mjts6-master-0         <none>           <none>

#kuryr-cni pod on master-0:

$ oc rsh pod -n openshift-kuryr kuryr-cni-l88nt
Error from server (NotFound): pods "pod" not found
[stack@undercloud-0 ~]$ oc rsh -n openshift-kuryr kuryr-cni-l88nt
sh-4.4# top -b -c -n 1
top - 11:10:22 up 51 min,  0 users,  load average: 7.04, 6.20, 4.81
Tasks:   8 total,   1 running,   7 sleeping,   0 stopped,   0 zombie
%Cpu(s): 26.7 us,  8.3 sy,  0.0 ni, 60.0 id,  0.0 wa,  3.3 hi,  1.7 si,  0.0 st
MiB Mem :  16034.3 total,   5534.2 free,   5327.1 used,   5173.1 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  11174.5 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1 root      20   0  386868  79084  13464 S   0.0   0.5   0:03.26 kuryr-daemon: master process [/usr/bin/kuryr-daemon --config-file /etc/kuryr/kuryr.conf]
     17 root      20   0  953152  74408   7588 S   0.0   0.5   0:03.72 kuryr-daemon: master process [/usr/bin/kuryr-daemon --config-file /etc/kuryr/kuryr.conf]
     27 root      20   0  474852  77980   8624 S   0.0   0.5   0:08.22 kuryr-daemon: watcher worker(0)
     33 root      20   0  395328  73600   5752 S   0.0   0.4   0:01.19 kuryr-daemon: server worker(0)
     37 root      20   0  761280  84900  10084 S   0.0   0.5   0:08.70 kuryr-daemon: health worker(0)
     42 root      20   0  542916  76820   8468 S   0.0   0.5   0:02.16 kuryr-daemon: Prometheus Exporter worker(0)
   3232 root      20   0   12020   3092   2664 S   0.0   0.0   0:00.02 /bin/sh
   3244 root      20   0   51020   3840   3288 R   0.0   0.0   0:00.00 top -b -c -n 1
sh-4.4# exit
exit

#kuryr-cni pod on worker-0-2z569:


[stack@undercloud-0 ~]$ oc rsh -n openshift-kuryr kuryr-cni-fbwtb
sh-4.4# top -b -c -n 1
top - 11:10:59 up 29 min,  0 users,  load average: 5.42, 3.81, 2.60
Tasks:   8 total,   1 running,   7 sleeping,   0 stopped,   0 zombie
%Cpu(s):  6.7 us,  6.7 sy,  0.0 ni, 86.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  16034.3 total,   4899.5 free,   6261.7 used,   4873.1 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  10080.6 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1 root      20   0  386940  79224  13608 S   0.0   0.5   0:03.73 kuryr-daemon: master process [/usr/bin/kuryr-daemon --config-file /etc/kuryr/kuryr.conf]
     17 root      20   0 1739400  74560   7556 S   0.0   0.5   0:02.61 kuryr-daemon: master process [/usr/bin/kuryr-daemon --config-file /etc/kuryr/kuryr.conf]
     27 root      20   0  473864  75112   8552 S   0.0   0.5   0:04.83 kuryr-daemon: watcher worker(0)
     33 root      20   0  395400  71600   5760 S   0.0   0.4   0:00.75 kuryr-daemon: server worker(0)
     37 root      20   0  695304  82612  10152 S   0.0   0.5   0:04.72 kuryr-daemon: health worker(0)
     40 root      20   0  542988  74712   8492 S   0.0   0.5   0:01.49 kuryr-daemon: Prometheus Exporter worker(0)
   2214 root      20   0   12020   3116   2688 S   0.0   0.0   0:00.01 /bin/sh
   2225 root      20   0   51020   3844   3292 R   0.0   0.0   0:00.00 top -b -c -n 1
sh-4.4# exit
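
The per-pod check above could be repeated across every kuryr-cni pod with something like the sketch below. It is only an illustration, not part of the original verification: the function names `exporter_cpu` and `check_kuryr_cni` are hypothetical, the field positions assume the `top` column layout shown in the transcripts (%CPU in column 9, TIME+ in column 11, formatted as minutes:seconds), and `-w 512` is added on the assumption that the pod's `top` is procps-ng and might otherwise truncate the COMMAND column when its output is piped.

```shell
#!/bin/sh
# Read `top -b -c -n 1` output on stdin and print, for the Prometheus
# Exporter worker, its instantaneous %CPU and its cumulative CPU time
# in seconds (TIME+ is assumed to be in M:SS.hh form).
exporter_cpu() {
  awk '/Prometheus Exporter worker/ {
    split($11, t, ":")
    printf "%s %.2f\n", $9, t[1] * 60 + t[2]
  }'
}

# Run the check in every kuryr-cni pod of the openshift-kuryr namespace.
# Requires a logged-in `oc` client; nothing is executed until this
# function is called.
check_kuryr_cni() {
  for pod in $(oc get pods -n openshift-kuryr -o name | grep '/kuryr-cni-'); do
    stats=$(oc exec -n openshift-kuryr "$pod" -- top -b -c -n 1 -w 512 | exporter_cpu)
    echo "$pod ${stats:-exporter worker not found}"
  done
}
```

Calling `check_kuryr_cni` prints one line per pod. The cumulative time is arguably the more telling number, since a single `top` sample can miss short bursts: on master-0 the exporter had accumulated 0:02.16 of CPU time in a pod that had been running for 47 minutes.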

Prometheus service is provided normally:

$ token=`oc sa get-token prometheus-k8s -n openshift-monitoring`
$ curl -sk -H "Authorization: Bearer $token" 'https://prometheus-k8s-openshift-monitoring.apps.ostest.shiftstack.com/api/v1/alerts' | jq '.data.alerts[] | select(.labels.alertname) | .labels.alertname'
"AlertmanagerReceiversNotConfigured"
"Watchdog"

Comment 6 errata-xmlrpc 2021-02-08 13:51:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform
4.6.16 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0308
