Bug 2051457
| Summary: | [RFE] PDB for cloud-controller-manager to avoid going too many replicas down | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jan Chaloupka <jchaloup> |
| Component: | Cloud Compute | Assignee: | dmoiseev |
| Cloud Compute sub component: | Cloud Controller Manager | QA Contact: | Huali Liu <huliu> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | ddonati, zhsun |
| Version: | 4.11 | ||
| Target Milestone: | --- | ||
| Target Release: | 4.11.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-08-10 10:47:50 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 2099499 | ||
Description
Jan Chaloupka
2022-02-07 10:01:06 UTC
If necessary, please consider backporting the PDB into 4.10 as well.
This is definitely something we should be adding for the CCCMO, and I think we should most likely backport it to 4.10.
Some feedback has been left on the PR to fix this; it will need to be updated before we can merge.
Checked on alibaba and aws: pdb is there, labels and selectors are correct. Will check more providers tomorrow.
on alibaba:
1. Install a fresh cluster
liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-16-163450 True False 5h58m Cluster version is 4.11.0-0.nightly-2022-04-16-163450
liuhuali@Lius-MacBook-Pro huali-test %
2. Check pdb is there
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
alibabacloud-cloud-controller-manager 1 N/A 1 6h18m
liuhuali@Lius-MacBook-Pro huali-test %
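For anyone reading the ALLOWED DISRUPTIONS column: with a minAvailable-style PDB it is the number of healthy pods minus MIN AVAILABLE, never below zero. A minimal sketch of that arithmetic follows; `allowed_disruptions` is a hypothetical helper written for this comment, not an oc subcommand, and the value of 2 healthy pods is taken from the 2/2 deployment listed in the next step.

```shell
#!/bin/sh
# Hypothetical helper, not part of oc: how many voluntary disruptions a
# minAvailable-style PDB allows for a given number of healthy pods.
allowed_disruptions() {
  healthy=$1        # healthy pods matched by the PDB selector
  min_available=$2  # the PDB's MIN AVAILABLE as an absolute number
  allowed=$((healthy - min_available))
  # The budget never goes negative.
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

# Two ready replicas and MIN AVAILABLE 1, as reported for alibaba:
allowed_disruptions 2 1
```

If one replica is already down, the same calculation gives 0, so a drain would have to wait rather than take the second replica down too, which is the behaviour this RFE asks for.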
3. Check labels and selectors are correct
liuhuali@Lius-MacBook-Pro huali-test % oc get all -n openshift-cloud-controller-manager
NAME READY STATUS RESTARTS AGE
pod/alibaba-cloud-controller-manager-69bd7cbd9c-d4qrp 1/1 Running 0 6h21m
pod/alibaba-cloud-controller-manager-69bd7cbd9c-lkstz 1/1 Running 0 6h21m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/alibaba-cloud-controller-manager 2/2 2 2 6h21m
NAME DESIRED CURRENT READY AGE
replicaset.apps/alibaba-cloud-controller-manager-69bd7cbd9c 2 2 2 6h21m
liuhuali@Lius-MacBook-Pro huali-test % oc edit deploy alibaba-cloud-controller-manager -n openshift-cloud-controller-manager
Edit cancelled, no changes made.
...
labels:
infrastructure.openshift.io/cloud-controller-manager: AlibabaCloud
k8s-app: alibaba-cloud-controller-manager
...
selector:
matchLabels:
infrastructure.openshift.io/cloud-controller-manager: AlibabaCloud
k8s-app: alibaba-cloud-controller-manager
...
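Putting the alibaba output together, the PDB presumably looks roughly like the manifest printed by the sketch below. This is an assumption reconstructed from the output above, not the manifest shipped by the operator: the name, namespace and MIN AVAILABLE come from `oc get pdb`, while the selector labels are assumed to be the same two labels shown on the deployment. `render_pdb` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print a PodDisruptionBudget manifest consistent with
# the verification output. The selector labels are an assumption copied from
# the deployment; the real operator manifest may differ.
render_pdb() {
  pdb_name=$1   # e.g. alibabacloud-cloud-controller-manager
  platform=$2   # e.g. AlibabaCloud
  k8s_app=$3    # e.g. alibaba-cloud-controller-manager
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: ${pdb_name}
  namespace: openshift-cloud-controller-manager
spec:
  minAvailable: 1
  selector:
    matchLabels:
      infrastructure.openshift.io/cloud-controller-manager: ${platform}
      k8s-app: ${k8s_app}
EOF
}

render_pdb alibabacloud-cloud-controller-manager AlibabaCloud alibaba-cloud-controller-manager
```

The printed manifest is meant for reading and comparison only; the real object is managed by the operator and should not be applied by hand.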
on aws:
1. Install a fresh cluster
liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.11.0-0.nightly-2022-04-16-163450 True False 92m Cluster version is 4.11.0-0.nightly-2022-04-16-163450
liuhuali@Lius-MacBook-Pro huali-test %
2. Edit the featuregate, then wait for the nodes to restart until all of them report Ready status
change
spec: {}
to
spec:
featureSet: TechPreviewNoUpgrade
liuhuali@Lius-MacBook-Pro huali-test % oc edit featuregate cluster
featuregate.config.openshift.io/cluster edited
liuhuali@Lius-MacBook-Pro huali-test % oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-128-92.us-east-2.compute.internal Ready worker 103m v1.23.3+54654d2
ip-10-0-134-118.us-east-2.compute.internal Ready master 116m v1.23.3+54654d2
ip-10-0-182-199.us-east-2.compute.internal Ready master 116m v1.23.3+54654d2
ip-10-0-185-140.us-east-2.compute.internal Ready worker 110m v1.23.3+54654d2
ip-10-0-212-38.us-east-2.compute.internal Ready worker 109m v1.23.3+54654d2
ip-10-0-213-209.us-east-2.compute.internal Ready master 116m v1.23.3+54654d2
liuhuali@Lius-MacBook-Pro huali-test %
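The interactive edit in step 2 could also be scripted. A minimal sketch, assuming `oc` is logged in as a cluster admin; the function names and the 30m default timeout are hypothetical choices, not something used in this verification.

```shell
#!/bin/sh
# Sketch only. Note that TechPreviewNoUpgrade cannot be undone and prevents
# upgrades, so this belongs on throwaway test clusters.
enable_tech_preview() {
  # Merge-patch the cluster featuregate from "spec: {}" to the tech preview set.
  oc patch featuregate cluster --type=merge \
    -p '{"spec":{"featureSet":"TechPreviewNoUpgrade"}}'
}

wait_for_nodes_ready() {
  timeout=${1:-30m}  # hypothetical default, adjust to the cluster size
  # Caveat: if this runs before the nodes start restarting, it can return
  # immediately, so re-check with "oc get nodes" afterwards.
  oc wait --for=condition=Ready node --all --timeout="$timeout"
}

# Usage:
#   enable_tech_preview
#   wait_for_nodes_ready 45m
```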
3. Check pdb is there
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
aws-cloud-controller-manager 1 N/A 1 24m
liuhuali@Lius-MacBook-Pro huali-test %
4. Check labels and selectors are correct
liuhuali@Lius-MacBook-Pro huali-test % oc get all -n openshift-cloud-controller-manager
NAME READY STATUS RESTARTS AGE
pod/aws-cloud-controller-manager-846dd9f85c-64kp6 1/1 Running 0 25m
pod/aws-cloud-controller-manager-846dd9f85c-m6kdh 1/1 Running 0 25m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/aws-cloud-controller-manager 2/2 2 2 25m
NAME DESIRED CURRENT READY AGE
replicaset.apps/aws-cloud-controller-manager-846dd9f85c 2 2 2 25m
liuhuali@Lius-MacBook-Pro huali-test % oc edit deploy aws-cloud-controller-manager -n openshift-cloud-controller-manager
Edit cancelled, no changes made.
...
labels:
infrastructure.openshift.io/cloud-controller-manager: AWS
k8s-app: aws-cloud-controller-manager
...
selector:
matchLabels:
infrastructure.openshift.io/cloud-controller-manager: AWS
k8s-app: aws-cloud-controller-manager
...
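The label and selector check above was done by eye in `oc edit`. A non-interactive variant might look like the sketch below. It assumes the PDB and the deployment use identical matchLabels, which is stricter than necessary: a PDB that selects on a subset of the labels would also work but would be reported as a mismatch here. `selector_of` and `pdb_matches_deploy` are hypothetical helper names.

```shell
#!/bin/sh
NS=openshift-cloud-controller-manager

# Print the matchLabels of a pdb or deploy as rendered by jsonpath.
selector_of() {
  kind=$1
  name=$2
  oc get "$kind" "$name" -n "$NS" -o jsonpath='{.spec.selector.matchLabels}'
}

# Compare the PDB selector with the deployment selector (strict equality).
pdb_matches_deploy() {
  pdb=$1
  deploy=$2
  pdb_sel=$(selector_of pdb "$pdb")
  deploy_sel=$(selector_of deploy "$deploy")
  if [ -n "$pdb_sel" ] && [ "$pdb_sel" = "$deploy_sel" ]; then
    echo "match: $pdb_sel"
  else
    echo "mismatch: pdb=$pdb_sel deploy=$deploy_sel"
    return 1
  fi
}

# Usage:
#   pdb_matches_deploy aws-cloud-controller-manager aws-cloud-controller-manager
```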
Checked on gcp, azure, ibm, vsphere and openstack: pdb is there, labels and selectors are correct. Moving this to Verified.
clusterversion: 4.11.0-0.nightly-2022-04-16-163450
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
gcp-cloud-controller-manager 1 N/A 1 70m
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
azure-cloud-controller-manager 1 N/A 1 67m
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
ibmcloud-cloud-controller-manager 1 N/A 1 111m
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
vsphere-cloud-controller-manager 1 N/A 1 82m
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
openstack-cloud-controller-manager 1 N/A 1 33m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:5069