Description of problem:

OpenShift Container Platform 4.7.9 is constantly reporting the warning message below in the logs of all 3 kube-apiserver pods. So far there is no trace of what is causing it or whether it is doing any harm.

~~~
2021-05-26T14:31:55.510328426Z W0526 14:31:55.510221 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:31:55.753492861Z W0526 14:31:55.753287 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:31:55.770704006Z W0526 14:31:55.770576 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:31:58.819233215Z W0526 14:31:58.819153 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
[...]
2021-05-26T14:32:10.387368002Z W0526 14:32:10.386344 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:32:10.712314068Z W0526 14:32:10.712221 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:32:11.505397096Z W0526 14:32:11.505271 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:32:11.713592085Z W0526 14:32:11.713484 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:32:11.972648154Z W0526 14:32:11.972529 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-05-26T14:32:12.384759006Z W0526 14:32:12.384576 19 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
~~~

Version-Release number of selected component (if applicable):

OpenShift Container Platform 4.7.9

How reproducible:

N/A

Steps to Reproduce:
1. N/A

Actual results:

The warning "ClusterQuotaAdmission received non-meta object *unstructured.Unstructured" is issued constantly, with no information about why or what is causing it.

Expected results:

No such warning being reported.

Additional info:
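For reference, a rough way to quantify how often the warning hits a given kube-apiserver pod is to count occurrences per minute. This is only a sketch; `<kube-apiserver-pod>` is a placeholder for an actual pod name from `oc get pods -n openshift-kube-apiserver`:

```
# Count "ClusterQuotaAdmission received non-meta object" warnings per minute in one pod.
# <kube-apiserver-pod> is a placeholder; replace it with a real pod name.
oc logs --timestamps -n openshift-kube-apiserver <kube-apiserver-pod> \
  | grep 'ClusterQuotaAdmission received non-meta object' \
  | awk '{print substr($1, 1, 16)}' \
  | sort | uniq -c
```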
I’m adding UpcomingSprint, because I was occupied by fixing bugs with higher priority/severity, developing new features with higher priority, or developing new features to improve stability at a macro level. I will revisit this bug next sprint.
We have a customer facing the same issue: the kube-apiserver is consuming the maximum of its CPU resources, and the kube-apiserver pod logs show this output:

~~~
ccResolverWrapper: sending update to cc: {[{https://localhost:2379 <nil> 0 <nil>}] <nil> <nil>}
2021-07-09T06:36:50.297990740Z I0709 06:36:50.297885 17 clientconn.go:948] ClientConn switching balancer to "pick_first"
2021-07-09T06:36:50.300944216Z I0709 06:36:50.298272 17 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc02705cd60, {CONNECTING <nil>}
2021-07-09T06:36:50.318796017Z I0709 06:36:50.313817 17 balancer_conn_wrappers.go:78] pickfirstBalancer: HandleSubConnStateChange: 0xc02705cd60, {READY <nil>}
2021-07-09T06:36:50.318796017Z I0709 06:36:50.315980 17 controlbuf.go:508] transport: loopyWriter.run returning. connection error: desc = "transport is closing"
2021-07-09T06:36:52.532112913Z W0709 06:36:52.531817 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:52.582434039Z W0709 06:36:52.582254 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:52.731282776Z W0709 06:36:52.731187 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:52.931948178Z W0709 06:36:52.931846 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:53.131199569Z W0709 06:36:53.131091 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:53.330927116Z W0709 06:36:53.330835 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
2021-07-09T06:36:53.531801993Z W0709 06:36:53.531689 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
~~~

Can you suggest a workaround for this? If more information is required, please let me know and I'll update the bug.
Hello Team, Any update on this? Thanks.
Hi, I am really sorry but I haven't had time to dig into this issue and I cannot give you any ETA.
BTW: How do you know that entries like "ClusterQuotaAdmission received non-meta object *unstructured.Unstructured" spike the CPU?
As per the screenshot of the console provided by the customer, the kube-apiserver was consuming the maximum amount of CPU resources and the kube-apiserver pod logs were flooded with these entries:

~~~
2021-07-09T06:36:52.532112913Z W0709 06:36:52.531817 17 admission.go:98] ClusterQuotaAdmission received non-meta object *unstructured.Unstructured
~~~

Is there any way to confirm whether the increased load on the kube-apiserver is due to this log spam, or whether it is caused by something else? Shall I ask for Prometheus metrics?
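In case it helps, one way to correlate the CPU usage with the pod that is emitting these messages is to query per-pod CPU rates through the thanos-querier route. This is only a sketch, assuming the default openshift-monitoring stack and a token with permission to query cluster metrics:

```
# Per-pod CPU usage rate of the kube-apiserver containers over the last 5 minutes.
# Assumes the default openshift-monitoring stack and sufficient query permissions.
TOKEN=$(oc whoami -t)
THANOS=$(oc get route thanos-querier -n openshift-monitoring -o jsonpath='{.spec.host}')
curl -sk -H "Authorization: Bearer $TOKEN" "https://$THANOS/api/v1/query" \
  --data-urlencode 'query=sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="openshift-kube-apiserver",container="kube-apiserver"}[5m]))'
```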
I have the issue on my cluster, 4.7.22. Interestingly:

the pod on master-1 only consumes 80-100 mCores
the pod on master-0 consumes 230-150 mCores
the pod on master-2 consumes 380-400 mCores

Only the log on master-2 contains "ClusterQuotaAdmission received non-meta object" messages.
+1:

```
for pod in kube-apiserver-ilp-dts-g-c01ocp01 kube-apiserver-ilp-dts-g-c01ocp02 kube-apiserver-ilp-dts-g-c01ocp03; do k logs $pod | grep ClusterQuotaAdmission | wc -l; done
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/kube-apiserver-ilp-dts-g-c01ocp01. Please use `kubectl.kubernetes.io/default-container` instead
995
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/kube-apiserver-ilp-dts-g-c01ocp02. Please use `kubectl.kubernetes.io/default-container` instead
54025
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/kube-apiserver-ilp-dts-g-c01ocp03. Please use `kubectl.kubernetes.io/default-container` instead
6076
```

```
k top pods
NAME                                CPU(cores)   MEMORY(bytes)
kube-apiserver-ilp-dts-g-c01ocp01   198m         4068Mi
kube-apiserver-ilp-dts-g-c01ocp02   480m         4164Mi
kube-apiserver-ilp-dts-g-c01ocp03   424m         4475Mi
```
4.7.23
```
k get pods
NAME                                READY   STATUS    RESTARTS   AGE
kube-apiserver-ilp-dts-g-c01ocp01   5/5     Running   0          2d8h
kube-apiserver-ilp-dts-g-c01ocp02   5/5     Running   0          2d8h
kube-apiserver-ilp-dts-g-c01ocp03   5/5     Running   0          2d8h
```

so they have all been alive for about the same time...
This BZ was initially created to resolve only the log entries like "ClusterQuotaAdmission received non-meta object *unstructured.Unstructured". I haven't had time to look into it yet, but it is very unlikely that these entries would cause increased CPU usage. Has anyone done an analysis that would explain the increased CPU?
The messages appear because of superfluous logging when parsing some types (mainly CRDs like monitoring.coreos.com/v1 ServiceMonitor or PrometheusRule) in the ClusterResourceQuota admission controller. This should not be the reason for high CPU usage: a high number of objects processed by the apiserver causes the logging to appear more often, but the logging itself does not indicate any error. I have posted a fix to disable the logging.

Regarding the high CPU usage:

> must-gather from customer Cluster where this message is reported can be found on https://drive.google.com/file/d/1a6BPJdvCzQR77nwXa1AHwtzT8DRi1RDW/view?usp=sharing

I did not find anything unusual in this must-gather (focused on kube-apiserver).

@jkaur I could not access your must-gather logs, and the other evidence uploaded to this BZ is not very useful.
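For anyone who wants to see which requests are actually exercising that admission path, the kube-apiserver audit logs can be filtered for writes to the CRD-backed resources mentioned above. A rough sketch, assuming the default audit profile (which records metadata for these requests) and jq available locally:

```
# List the kube-apiserver audit log files available on the control-plane nodes
oc adm node-logs --role=master --path=kube-apiserver/

# Count write requests against monitoring.coreos.com resources (e.g. ServiceMonitor, PrometheusRule);
# a high rate here would match a high rate of the ClusterQuotaAdmission warning
oc adm node-logs --role=master --path=kube-apiserver/audit.log \
  | jq -r 'select(.verb == "create" or .verb == "update" or .verb == "patch")
           | select(.objectRef.apiGroup == "monitoring.coreos.com")
           | .objectRef.resource' \
  | sort | uniq -c
```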
@fkrepins I have another must-gather https://drive.google.com/drive/folders/1_QO0_4HTmpur0mLY0ynvMwKUXpwJ3urY?usp=sharing and it happens on my newly created 4.8.11 cluster as well.
Thanks, I looked at these but couldn't find anything abnormal with regard to this issue. There is probably some other reason why the cluster became unhealthy.
Verification steps:

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-09-29-021903   True        False         58m     Cluster version is 4.10.0-0.nightly-2021-09-29-021903

- There is no warning message 'ClusterQuotaAdmission received non-meta object *unstructured.Unstructured' in the openshift-kube-apiserver log files:

$ for kas_pod in $(oc get pods -n openshift-kube-apiserver -l apiserver | sed '1d' | awk '{print $1}'); do oc logs $kas_pod -n openshift-kube-apiserver | grep ClusterQuotaAdmission | wc -l; done
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/kube-apiserver-kewang-29410g1-7flqf-master-0.c.openshift-qe.internal. Please use `kubectl.kubernetes.io/default-container` instead
0
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/kube-apiserver-kewang-29410g1-7flqf-master-1.c.openshift-qe.internal. Please use `kubectl.kubernetes.io/default-container` instead
0
Using deprecated annotation `kubectl.kubernetes.io/default-logs-container` in pod/kube-apiserver-kewang-29410g1-7flqf-master-2.c.openshift-qe.internal. Please use `kubectl.kubernetes.io/default-container` instead
0

- Check the nodes' workloads, they are normal.

$ for node in $(oc get no | sed '1d' | awk '{print $1}'); do echo "$node --->"; oc describe node/$node | sed -n '/^Allocated resources/,/hugepages-2Mi/p'; echo; done
kewang-29410g1-7flqf-master-0.c.openshift-qe.internal --->
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                2020m (57%)   0 (0%)
  memory             6837Mi (49%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)

kewang-29410g1-7flqf-master-1.c.openshift-qe.internal --->
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1755m (50%)   0 (0%)
  memory             5512Mi (39%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)

kewang-29410g1-7flqf-master-2.c.openshift-qe.internal --->
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                1809m (51%)   0 (0%)
  memory             5607Mi (40%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)

kewang-29410g1-7flqf-worker-a-95kb7.c.openshift-qe.internal --->
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                727m (20%)    0 (0%)
  memory             3333Mi (23%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)

kewang-29410g1-7flqf-worker-b-wwkbp.c.openshift-qe.internal --->
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests      Limits
  --------           --------      ------
  cpu                670m (19%)    0 (0%)
  memory             2980Mi (21%)  0 (0%)
  ephemeral-storage  0 (0%)        0 (0%)
  hugepages-1Gi      0 (0%)        0 (0%)
  hugepages-2Mi      0 (0%)        0 (0%)
I think there is no plan to backport this, unless @maszulik or @sttts have a different opinion.
(In reply to Filip Krepinsky from comment #34)
> I think there is no plan to backport this, unless @maszulik or
> @sttts have a different opinion.

My default answer is not to backport, unless there's a high demand and the impact is pretty bad, which is not the case with this fix.
@sprashar The error should be inconsequential and should not cause higher CPU usage, so no workaround is needed apart from not looking at the logs. Nevertheless, if you really think you have a case of unusual CPU utilization, I advise you to open a new bug and attach a measurement/profile of the CPU utilization, metrics, an sosreport and a must-gather.
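For reference, this is roughly the kind of evidence that could be attached to such a new bug. Only a sketch: it assumes profiling is enabled on the kube-apiserver (the upstream default) and a cluster-admin user, and the CPU profile is taken from whichever apiserver instance the load balancer routes the request to:

```
# 30-second CPU profile of a kube-apiserver instance (whichever one answers via the load balancer)
oc get --raw '/debug/pprof/profile?seconds=30' > kube-apiserver-cpu.pprof

# Current CPU/memory consumption of the kube-apiserver pods
oc adm top pods -n openshift-kube-apiserver

# Standard must-gather to attach to the new bug
oc adm must-gather
```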
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056
Hi, there has not been any change in the direction and there are no plans to backport this.
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days