Bug 1904026 - The quota controllers should resync on new resources and make progress
Summary: The quota controllers should resync on new resources and make progress
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Lukasz Szaszkiewicz
QA Contact: zhou ying
URL:
Whiteboard:
Depends On:
Blocks: 1904028 1904032
Reported: 2020-12-03 11:46 UTC by Lukasz Szaszkiewicz
Modified: 2021-02-24 15:37 UTC

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Clones: 1904028
Environment:
Last Closed: 2021-02-24 15:37:28 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-policy-controller pull 48 | 0 | None | closed | the quota controllers should resync on new resources and make progress | 2021-01-11 13:25:19 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:37:55 UTC

Description Lukasz Szaszkiewicz 2020-12-03 11:46:45 UTC
The quota controllers act on the resources retrieved from the discovery endpoint, which might return only a fraction of all resources if a network error occurs.

Both controllers should periodically resync when new resources are observed from discovery.

Additionally, the ClusterResourceQuota (CRQ) controller should always ensure that the current set of monitors is running.
The CRQ controller should not block when new resources are observed (deadlock).
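
A minimal Go sketch of the behavior requested above. It is an illustration only, not the fix from the linked pull request: `monitorManager`, `resourceSet`, and `resync` are hypothetical names, and starting a monitor is reduced to setting a flag. It assumes that each resync pass receives the latest discovery result and that monitors are never stopped during a pass, so a partial discovery result cannot remove monitors that are already running.

```go
package main

import (
	"fmt"
	"sort"
)

// resourceSet holds resource names as reported by discovery.
type resourceSet map[string]struct{}

// monitorManager is a hypothetical stand-in for the controllers' monitor
// bookkeeping; it is not the real type used by the quota controllers.
type monitorManager struct {
	running map[string]bool // resource name -> monitor is running
}

func newMonitorManager() *monitorManager {
	return &monitorManager{running: map[string]bool{}}
}

// resync ensures a monitor is running for every discovered resource.
// It never waits on anything, so a newly observed resource cannot block it,
// and it is safe to call on every tick. It returns the names of the
// monitors started in this pass, sorted for stable output.
func (m *monitorManager) resync(discovered resourceSet) []string {
	var started []string
	for name := range discovered {
		if !m.running[name] {
			m.running[name] = true // a real controller would start a watch here
			started = append(started, name)
		}
	}
	sort.Strings(started)
	return started
}

func main() {
	m := newMonitorManager()

	// First discovery call is partial, e.g. openshift-apiserver was down.
	partial := resourceSet{"pods": {}}
	// A later call sees the aggregated API resources as well.
	full := resourceSet{"pods": {}, "imagestreams.image.openshift.io": {}}

	fmt.Println("first pass started:", m.resync(partial))
	fmt.Println("second pass started:", m.resync(full))
	fmt.Println("third pass started:", m.resync(full))
}
```

Because `resync` is idempotent, calling it periodically covers both requirements: resources missed by an earlier, partial discovery result are picked up on a later pass, and a pass that finds nothing new simply returns without doing any work.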

Comment 2 zhou ying 2020-12-08 12:42:07 UTC
Reproduced with payload 4.7.0-0.nightly-2020-12-03-083300 using the following steps:
1) Scale down the CVO to replicas==0;
2) Turn off openshift-apiserver;
3) Change the kubecontrollermanagers cluster resource to restart the KCMs;
4) Turn on openshift-apiserver;
5) Create a test project and create an imagestream;
6) Create a quota for imagestreams: `oc create quota test1 --hard=openshift.io/imagestreams=10`
7) Create and delete imagestream resources. The issue reproduces: the quota usage only increases and never decreases.

Comment 3 zhou ying 2020-12-10 02:41:22 UTC
Confirmed with the latest payload; the issue can no longer be reproduced:
[root@dhcp-140-138 ~]# oc get clusterversion 
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2020-12-09-112139   True        False         53m     Cluster version is 4.7.0-0.nightly-2020-12-09-112139

Steps followed:
1) Scale down the CVO to replicas==0;
2) Change the openshiftapiservers to Unmanaged status;
3) Scale down the openshift-apiserver to 0 to turn it off;
4) Restart all KCMs;
5) Turn on the openshift-apiserver;
6) Create a test project, imagestream resources, and a quota for imagestreams;
7) Delete all the imagestreams and check the quota:

[root@dhcp-140-138 ~]# oc get is 
NAME                       IMAGE REPOSITORY                                                                  TAGS   UPDATED
httpd-example              image-registry.openshift-image-registry.svc:5000/zhouy/httpd-example                     
rails-postgresql-example   image-registry.openshift-image-registry.svc:5000/zhouy/rails-postgresql-example          
[root@dhcp-140-138 ~]# oc describe quota 
Name:                      test1
Namespace:                 zhouy
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  2     10
[root@dhcp-140-138 ~]# oc delete all --all
[root@dhcp-140-138 ~]# oc get is 
No resources found in zhouy namespace.
[root@dhcp-140-138 ~]# oc describe quota 
Name:                      test1
Namespace:                 zhouy
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  0     10

Comment 6 errata-xmlrpc 2021-02-24 15:37:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

