1904028 – [release-4.6] The quota controllers should resync on new resources and make progress

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	kube-controller-manager
Sub Component:
Version:	4.6
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	high
Target Milestone:	---
Target Release:	4.6.z
Assignee:	Lukasz Szaszkiewicz
QA Contact:	zhou ying
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	1906649
Depends On:	1904026
Blocks:	1904030 1904032 1906649

Reported:	2020-12-03 11:53 UTC by Lukasz Szaszkiewicz
Modified:	2020-12-21 13:24 UTC
CC List:	4 users
Fixed In Version:
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:	1904026
Clones:	1904030
Environment:
Last Closed:	2020-12-21 13:23:59 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift cluster-policy-controller pull 50	0	None	closed	Bug 1904028: The quota controllers should resync on new resources and make progress.	2021-02-01 18:50:00 UTC
Red Hat Product Errata	RHSA-2020:5614	0	None	None	None	2020-12-21 13:24:24 UTC

Description Lukasz Szaszkiewicz 2020-12-03 11:53:57 UTC

+++ This bug was initially created as a clone of Bug #1904026 +++

The quota controllers act on the resources returned by the discovery endpoint, which may contain only a fraction of all resources because of a network error.

Both controllers should periodically resync when new resources are observed from discovery.

Additionally, the CRQ (ClusterResourceQuota) controller should always ensure that the current set of monitors is running.
It should not block (deadlock) when new resources are observed.
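
The intended behavior can be illustrated with a minimal Go sketch. This is not the code from the linked pull request: monitorSet, sync, resyncOnce, and the simulated discover function are hypothetical names, and a real controller would run the resync on a timer rather than in a fixed loop. The sketch assumes discovery can return a partial list together with an error, and that the controller should act on that partial list and start the missing monitors on a later round.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// monitorSet tracks which resources currently have a running monitor.
// Hypothetical type for illustration only.
type monitorSet struct {
	running map[string]bool
}

func newMonitorSet() *monitorSet {
	return &monitorSet{running: map[string]bool{}}
}

// sync starts a monitor for every discovered resource that does not
// have one yet and returns the newly started resources, sorted.
// Removal of stale monitors is left out for brevity.
func (m *monitorSet) sync(discovered []string) []string {
	var started []string
	for _, r := range discovered {
		if !m.running[r] {
			m.running[r] = true
			started = append(started, r)
		}
	}
	sort.Strings(started)
	return started
}

// resyncOnce runs one discovery round. An error does not abort the
// round: whatever was returned is used, and the next round retries.
func resyncOnce(m *monitorSet, discover func() ([]string, error)) []string {
	resources, err := discover()
	if err != nil {
		fmt.Println("discovery incomplete:", err)
	}
	return m.sync(resources)
}

func main() {
	// Simulated discovery: the first call returns a partial list plus an
	// error (as if an aggregated API server were down); the second call
	// returns the complete list.
	calls := 0
	discover := func() ([]string, error) {
		calls++
		if calls == 1 {
			return []string{"pods", "secrets"}, errors.New("partial discovery")
		}
		return []string{"pods", "secrets", "imagestreams"}, nil
	}

	m := newMonitorSet()
	for round := 1; round <= 2; round++ {
		started := resyncOnce(m, discover)
		fmt.Printf("round %d: started %v\n", round, started)
	}
}
```

In this sketch the second round starts a monitor only for imagestreams, the resource that was missing from the first, partial discovery result.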

Comment 2 Lukasz Szaszkiewicz 2020-12-11 08:25:10 UTC

*** Bug 1906649 has been marked as a duplicate of this bug. ***

Comment 3 zhou ying 2020-12-14 08:46:24 UTC

Confirmed with the latest payload, 4.6.0-0.nightly-2020-12-13-230909; the issue has been fixed:

[root@dhcp-140-138 ~]#  oc get clusterversion 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-12-13-230909   True        False         17m     Cluster version is 4.6.0-0.nightly-2020-12-13-230909


Steps followed:
1) Scale down the CVO to replicas==0;
2) Change the openshiftapiservers to the Unmanaged state;
3) Scale down the openshift-apiserver to 0 to turn it off;
4) Restart all KCMs;
5) Turn the openshift-apiserver back on;
6) Create a test project, an imagestream resource, and a quota on imagestreams;
7) Delete all the imagestreams and check the quota:

[root@dhcp-140-138 ~]# oc get quota test1
NAME    AGE   REQUEST                           LIMIT
test1   15s   openshift.io/imagestreams: 1/10   
[root@dhcp-140-138 ~]# oc get is 
NAME                       IMAGE REPOSITORY                                                                   TAGS   UPDATED
rails-postgresql-example   image-registry.openshift-image-registry.svc:5000/zhouyt/rails-postgresql-example          
[root@dhcp-140-138 ~]# oc delete all --all 
......
[root@dhcp-140-138 ~]# oc get is 
No resources found in zhouyt namespace.
[root@dhcp-140-138 ~]# oc get quota test1
NAME    AGE   REQUEST                           LIMIT
test1   90s   openshift.io/imagestreams: 0/10

Comment 6 errata-xmlrpc 2020-12-21 13:23:59 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.6.9 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5614
