Bug 1858798
| Summary: | rangeallocations.data is never updated when a project is removed | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Maciej Szulik <maszulik> | |
| Component: | kube-controller-manager | Assignee: | Maciej Szulik <maszulik> | |
| Status: | CLOSED ERRATA | QA Contact: | RamaKasturi <knarra> | |
| Severity: | urgent | Docs Contact: | ||
| Priority: | urgent | |||
| Version: | 4.2.z | CC: | aaleman, aos-bugs, arghosh, bleanhar, bmilne, bshirren, calfonso, chuffman, fhirtz, knarra, maszulik, mfojtik, oarribas, pmuller, rhowe, tnozicka, travi, vlaad, wking, yinzhou | |
| Target Milestone: | --- | |||
| Target Release: | 4.5.z | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | Bug Fix | ||
| Doc Text: |
Cause:
The UID range allocation was never updated when a project was removed; only restarting the kube-controller-manager pod triggered the repair procedure that cleared the range.
Consequence:
It is possible to exhaust the UID range on a cluster with high namespace create+remove turnover.
Fix:
Periodically run the repair job.
Result:
The UID range allocation should be freed periodically (currently every 8 hours), without requiring additional kube-controller-manager restarts. This should also ensure that the range is not exhausted.
|
Story Points: | --- | |
| Clone Of: | 1808588 | |||
| : | 1858800 (view as bug list) | Environment: | ||
| Last Closed: | 2020-08-10 13:50:20 UTC | Type: | --- | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 1808588 | |||
| Bug Blocks: | 1858800 | |||
|
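The verification below gauges UID range usage by counting "/" separators in the scc-uid range allocation YAML (each allocated block appears as a `start/length` pair). A minimal sketch of that check as a reusable helper; the name `count_allocated` is hypothetical, and the YAML is read from stdin:

```shell
#!/usr/bin/env bash

# Hypothetical helper mirroring the QE check: count allocated UID blocks
# by counting "/" separators in range-allocation YAML fed on stdin.
count_allocated() {
  grep -o "/" | wc -l | tr -d ' '
}

# Against a live cluster (assumes the scc-uid rangeallocation exists):
#   oc get rangeallocations scc-uid -o yaml | count_allocated
```

On a freshly installed cluster this count is small (16 in the verification below) and grows as projects are created.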
Comment 3
RamaKasturi
2020-08-02 16:56:12 UTC
We synced offline; there is an 8-hour reclaim period.

Verified the bug with the payload below:

[ramakasturinarra@dhcp35-60 cucushift]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-07-29-051236   True        False         35h     Cluster version is 4.5.0-0.nightly-2020-07-29-051236

Below are the steps I followed to verify the bug:
===========================================================
1) Install a 4.5 cluster with the above payload.
2) Check rangeallocations.data by running the command below:

[ramakasturinarra@dhcp35-60 cucushift]$ oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l
16

3) Create 10K projects.
4) Run the command below to check rangeallocations.data:

[ramakasturinarra@dhcp35-60 cucushift]$ oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l
1707

5) Delete all 10K projects; rangeallocations.data goes back down to its original value of 16 about 4 hours after the project deletions:

[ramakasturinarra@dhcp35-60 cucushift]$ oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l
16

P.S.: The reclaim period for rangeallocations.data to come back to its original value once all created projects are deleted appears to be 8 hours (for more info please refer to comment 4); in the QE test it took around 4-5 hours. Based on the above, moving the bug to verified state, thanks!!

Tried to test the bug again, and this time I do not see any delay, i.e. there is no need to wait 8 hours for rangeallocations.data to be updated once all projects are deleted.
Before 10K project creation:
=================================
[ramakasturinarra@dhcp35-60 cucushift]$ oc get projects | wc -l
58
[ramakasturinarra@dhcp35-60 cucushift]$ oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l
16
[ramakasturinarra@dhcp35-60 cucushift]$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.5.0-0.nightly-2020-08-01-204100   True        False         12m     Cluster version is 4.5.0-0.nightly-2020-08-01-204100

After 10K project creation:
===================================
[ramakasturinarra@dhcp35-60 cucushift]$ oc get projects | wc -l
10099
[ramakasturinarra@dhcp35-60 cucushift]$ oc get projects | grep knarra | wc -l
10042
[ramakasturinarra@dhcp35-60 cucushift]$ oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l
2011

After 10K project deletion:
=================================
[ramakasturinarra@dhcp35-60 cucushift]$ oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l
19
[ramakasturinarra@dhcp35-60 cucushift]$ oc get projects | wc -l
58

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5.5 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3188
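The wait-for-reclaim step in the verification above can be sketched as a polling loop. This is a sketch, not part of the original test: `wait_for_baseline` is a hypothetical helper, and the interval and retry counts are illustrative, not tied to the controller's actual 8-hour repair period.

```shell
#!/usr/bin/env bash

# Hypothetical helper: poll a count-producing command until its output
# drops to (or below) the expected baseline, or give up after max_tries.
wait_for_baseline() {
  local baseline=$1 interval=$2 max_tries=$3
  shift 3
  local i count
  for ((i = 0; i < max_tries; i++)); do
    count=$("$@")                        # run the supplied count command
    if [ "$count" -le "$baseline" ]; then
      echo "reclaimed: count=$count"
      return 0
    fi
    sleep "$interval"
  done
  return 1                               # baseline never reached
}

# Usage against a live cluster (not run here): poll every 10 minutes,
# up to 60 times, for the count to return to the pre-test baseline of 16.
#   wait_for_baseline 16 600 60 \
#     sh -c 'oc get rangeallocations scc-uid -o yaml | grep -o "/" | wc -l'
```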