Bug 1910096 - [release-4.4] The quota controllers should resync on new resources and make progress
Summary: [release-4.4] The quota controllers should resync on new resources and make p...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: kube-controller-manager
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.4.z
Assignee: Lukasz Szaszkiewicz
QA Contact: zhou ying
URL:
Whiteboard:
Duplicates: 1904032
Depends On: 1904030
Blocks:
Reported: 2020-12-22 16:30 UTC by OpenShift BugZilla Robot
Modified: 2021-02-03 10:12 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The quota controllers are now periodically resynced when new resources are observed from discovery. Previously, a network error on startup could cause them to miss some resources.
Clone Of:
Environment:
Last Closed: 2021-02-03 10:11:43 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift cluster-policy-controller pull 52 | 0 | None | closed | Bug 1910096: The quota controllers should resync on new resources and make progress | 2021-02-01 18:29:35 UTC
Red Hat Product Errata | RHSA-2021:0281 | 0 | None | None | None | 2021-02-03 10:12:10 UTC

Description OpenShift BugZilla Robot 2020-12-22 16:30:22 UTC
+++ This bug was initially created as a clone of Bug #1904030 +++

+++ This bug was initially created as a clone of Bug #1904028 +++

+++ This bug was initially created as a clone of Bug #1904026 +++

The quota controllers act on resources retrieved from the discovery endpoint, which might return only a fraction of all resources because of a network error.

Both controllers should periodically resync when new resources are observed from the discovery.

Additionally, the CRQ should always ensure that the current set of monitors is running.
The CRQ should not block when new resources are observed (deadlock).
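
The behaviour requested above can be illustrated with a minimal Go sketch. It is not the code from the linked pull request: the quotaSyncer type, its fields, and the resource names are hypothetical and exist only for illustration. It assumes a controller that keeps a set of monitored resources, compares that set with each fresh discovery result, and wakes its workers with a non-blocking send so that a pending wake-up can never stall the resync loop.

```go
package main

import (
	"fmt"
	"sort"
)

// toSet builds a set of resource names from a list.
func toSet(names ...string) map[string]struct{} {
	set := make(map[string]struct{}, len(names))
	for _, n := range names {
		set[n] = struct{}{}
	}
	return set
}

// quotaSyncer is a hypothetical stand-in for a quota controller. It tracks
// which resources it monitors and exposes a channel that wakes its workers.
type quotaSyncer struct {
	monitored map[string]struct{}
	wake      chan struct{} // buffered with capacity 1
}

func newQuotaSyncer() *quotaSyncer {
	return &quotaSyncer{
		monitored: map[string]struct{}{},
		wake:      make(chan struct{}, 1),
	}
}

// resync compares a fresh discovery result with the monitored set, starts
// monitoring anything new, and returns the newly added names in sorted order.
func (q *quotaSyncer) resync(discovered map[string]struct{}) []string {
	var added []string
	for name := range discovered {
		if _, ok := q.monitored[name]; !ok {
			q.monitored[name] = struct{}{}
			added = append(added, name)
		}
	}
	sort.Strings(added)
	if len(added) > 0 {
		q.notify()
	}
	return added
}

// notify signals the workers without blocking: if a wake-up is already
// pending, the send is skipped instead of stalling the resync loop.
func (q *quotaSyncer) notify() {
	select {
	case q.wake <- struct{}{}:
	default:
	}
}

func main() {
	q := newQuotaSyncer()
	// First discovery is partial, as if the openshift-apiserver were down.
	fmt.Println(q.resync(toSet("pods", "services")))
	// A later discovery also returns imagestreams; only the new name is added,
	// and the second notify does not block even though nobody drained wake.
	fmt.Println(q.resync(toSet("pods", "services", "imagestreams")))
}
```

In a real controller, resync would presumably be called from a periodic loop with the latest discovery result; the sketch calls it directly to keep the example short.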

--- Additional comment from yinzhou on 2020-12-15 08:21:21 UTC ---

Confirmed with payload 4.5.0-0.ci.test-2020-12-15-072346-ci-ln-x6p47n2; the issue is fixed:


[root@dhcp-140-138 roottest]# oc get clusterversion 
NAME      VERSION                                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.ci.test-2020-12-15-072346-ci-ln-x6p47n2   True        False         5m16s   Cluster version is 4.5.0-0.ci.test-2020-12-15-072346-ci-ln-x6p47n2

Steps followed:
1) Scale down the CVO to replicas==0;
2) Change the openshiftapiservers to Unmanaged status;
3) Scale down the openshift-apiserver to 0 to turn it off;
4) Restart all KCMs;
5) Turn on the openshift-apiserver;
6) Create a test project, an imagestream resource, and a quota on imagestreams;
7) Delete all the imagestreams and check the quota:


[root@dhcp-140-138 roottest]# oc get is
NAME                       IMAGE REPOSITORY                                                                  TAGS   UPDATED
rails-postgresql-example   image-registry.openshift-image-registry.svc:5000/zhouy/rails-postgresql-example          
[root@dhcp-140-138 roottest]#  oc create quota test1 --hard=openshift.io/imagestreams=10
resourcequota/test1 created
[root@dhcp-140-138 roottest]# oc describe quota test1
Name:                      test1
Namespace:                 zhouy
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  1     10
[root@dhcp-140-138 roottest]# oc delete all --all
......
[root@dhcp-140-138 roottest]# oc get is
No resources found in zhouy namespace.
[root@dhcp-140-138 roottest]# oc describe quota test1
Name:                      test1
Namespace:                 zhouy
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  0     10

Comment 1 zhou ying 2021-01-11 07:52:23 UTC
Confirmed with payload 4.4.0-0.ci.test-2021-01-11-064333-ci-ln-bgbiyzk; the issue is fixed:

[root@dhcp-140-138 ~]# oc get  clusterversion 
NAME      VERSION                                           AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.ci.test-2021-01-11-064333-ci-ln-bgbiyzk   True        False         25m     Cluster version is 4.4.0-0.ci.test-2021-01-11-064333-ci-ln-bgbiyzk

Steps followed:
1) Scale down the CVO to replicas==0;
2) Change the openshiftapiservers to Unmanaged status;
3) Scale down the openshift-apiserver to 0 to turn it off;
4) Restart all KCMs;
5) Turn on the openshift-apiserver;
6) Create a test project, imagestream resources, and a quota on imagestreams;
7) Delete all the imagestreams and check the quota:

[root@dhcp-140-138 ~]# oc create quota test1 --hard=openshift.io/imagestreams=10
resourcequota/test1 created
[root@dhcp-140-138 ~]# oc describe quota test1
Name:                      test1
Namespace:                 zhouyt
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  0     10
[root@dhcp-140-138 ~]# oc new-app rails-postgresql-example
...
[root@dhcp-140-138 ~]# oc get is 
NAME                       IMAGE REPOSITORY                                                                   TAGS   UPDATED
rails-postgresql-example   image-registry.openshift-image-registry.svc:5000/zhouyt/rails-postgresql-example          
[root@dhcp-140-138 ~]# oc describe quota test1
Name:                      test1
Namespace:                 zhouyt
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  1     10
[root@dhcp-140-138 ~]# oc new-app httpd-example
...
[root@dhcp-140-138 ~]# oc get is
NAME                       IMAGE REPOSITORY                                                                   TAGS   UPDATED
httpd-example              image-registry.openshift-image-registry.svc:5000/zhouyt/httpd-example                     
rails-postgresql-example   image-registry.openshift-image-registry.svc:5000/zhouyt/rails-postgresql-example          
[root@dhcp-140-138 ~]# oc describe quota test1
Name:                      test1
Namespace:                 zhouyt
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  2     10
[root@dhcp-140-138 ~]# oc delete all --all 
...
[root@dhcp-140-138 ~]# oc get is
No resources found in zhouyt namespace.
[root@dhcp-140-138 ~]# oc describe quota test1
Name:                      test1
Namespace:                 zhouyt
Resource                   Used  Hard
--------                   ----  ----
openshift.io/imagestreams  0     10

Comment 2 Lukasz Szaszkiewicz 2021-01-11 08:25:31 UTC
*** Bug 1904032 has been marked as a duplicate of this bug. ***

Comment 6 errata-xmlrpc 2021-02-03 10:11:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.4.33 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0281

