Bug 2051457
Summary: | [RFE] PDB for cloud-controller-manager to avoid going too many replicas down | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jan Chaloupka <jchaloup> |
Component: | Cloud Compute | Assignee: | dmoiseev |
Cloud Compute sub component: | Cloud Controller Manager | QA Contact: | Huali Liu <huliu> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | ddonati, zhsun |
Version: | 4.11 | ||
Target Milestone: | --- | ||
Target Release: | 4.11.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2022-08-10 10:47:50 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 2099499 |
Description
Jan Chaloupka
2022-02-07 10:01:06 UTC
If necessary, please consider backporting the PDB into 4.10 as well.

This is definitely something we should be adding for the CCCMO, and I think we will most likely backport it to 4.10.

Some feedback has been left on the PR to fix this; it will need to be updated before we can merge.

Checked on alibaba and aws: pdb is there, labels and selectors are correct. Will check more providers tomorrow.

on alibaba:
1. Install a fresh cluster
liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-04-16-163450   True        False         5h58m   Cluster version is 4.11.0-0.nightly-2022-04-16-163450
liuhuali@Lius-MacBook-Pro huali-test %

2. Check pdb is there
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                                    MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
alibabacloud-cloud-controller-manager   1               N/A               1                     6h18m
liuhuali@Lius-MacBook-Pro huali-test %

3. Check labels and selectors are correct
liuhuali@Lius-MacBook-Pro huali-test % oc get all -n openshift-cloud-controller-manager
NAME                                                    READY   STATUS    RESTARTS   AGE
pod/alibaba-cloud-controller-manager-69bd7cbd9c-d4qrp   1/1     Running   0          6h21m
pod/alibaba-cloud-controller-manager-69bd7cbd9c-lkstz   1/1     Running   0          6h21m

NAME                                               READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/alibaba-cloud-controller-manager   2/2     2            2           6h21m

NAME                                                          DESIRED   CURRENT   READY   AGE
replicaset.apps/alibaba-cloud-controller-manager-69bd7cbd9c   2         2         2       6h21m
liuhuali@Lius-MacBook-Pro huali-test % oc edit deploy alibaba-cloud-controller-manager -n openshift-cloud-controller-manager
Edit cancelled, no changes made.
...
labels:
  infrastructure.openshift.io/cloud-controller-manager: AlibabaCloud
  k8s-app: alibaba-cloud-controller-manager
...
selector:
  matchLabels:
    infrastructure.openshift.io/cloud-controller-manager: AlibabaCloud
    k8s-app: alibaba-cloud-controller-manager
...

on aws:
1. Install a fresh cluster
liuhuali@Lius-MacBook-Pro huali-test % oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.11.0-0.nightly-2022-04-16-163450   True        False         92m     Cluster version is 4.11.0-0.nightly-2022-04-16-163450
liuhuali@Lius-MacBook-Pro huali-test %

2. Edit featuregate, then wait for the nodes to restart successfully until all nodes get READY status
change
spec: {}
to
spec:
  featureSet: TechPreviewNoUpgrade
liuhuali@Lius-MacBook-Pro huali-test % oc edit featuregate cluster
featuregate.config.openshift.io/cluster edited
liuhuali@Lius-MacBook-Pro huali-test % oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-128-92.us-east-2.compute.internal    Ready    worker   103m   v1.23.3+54654d2
ip-10-0-134-118.us-east-2.compute.internal   Ready    master   116m   v1.23.3+54654d2
ip-10-0-182-199.us-east-2.compute.internal   Ready    master   116m   v1.23.3+54654d2
ip-10-0-185-140.us-east-2.compute.internal   Ready    worker   110m   v1.23.3+54654d2
ip-10-0-212-38.us-east-2.compute.internal    Ready    worker   109m   v1.23.3+54654d2
ip-10-0-213-209.us-east-2.compute.internal   Ready    master   116m   v1.23.3+54654d2
liuhuali@Lius-MacBook-Pro huali-test %

3. Check pdb is there
liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                           MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
aws-cloud-controller-manager   1               N/A               1                     24m
liuhuali@Lius-MacBook-Pro huali-test %

4. Check labels and selectors are correct
liuhuali@Lius-MacBook-Pro huali-test % oc get all -n openshift-cloud-controller-manager
NAME                                                READY   STATUS    RESTARTS   AGE
pod/aws-cloud-controller-manager-846dd9f85c-64kp6   1/1     Running   0          25m
pod/aws-cloud-controller-manager-846dd9f85c-m6kdh   1/1     Running   0          25m

NAME                                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/aws-cloud-controller-manager   2/2     2            2           25m

NAME                                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/aws-cloud-controller-manager-846dd9f85c   2         2         2       25m
liuhuali@Lius-MacBook-Pro huali-test % oc edit deploy aws-cloud-controller-manager -n openshift-cloud-controller-manager
Edit cancelled, no changes made.
...
labels:
  infrastructure.openshift.io/cloud-controller-manager: AWS
  k8s-app: aws-cloud-controller-manager
...
selector:
  matchLabels:
    infrastructure.openshift.io/cloud-controller-manager: AWS
    k8s-app: aws-cloud-controller-manager
...

Checked on gcp, azure, ibm, vsphere and openstack: pdb is there, labels and selectors are correct. Moving this to Verified.
clusterversion: 4.11.0-0.nightly-2022-04-16-163450

liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                           MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
gcp-cloud-controller-manager   1               N/A               1                     70m

liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                             MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
azure-cloud-controller-manager   1               N/A               1                     67m

liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                                MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
ibmcloud-cloud-controller-manager   1               N/A               1                     111m

liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                               MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
vsphere-cloud-controller-manager   1               N/A               1                     82m

liuhuali@Lius-MacBook-Pro huali-test % oc get pdb -n openshift-cloud-controller-manager
NAME                                 MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
openstack-cloud-controller-manager   1               N/A               1                     33m

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069
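
The verification steps in this report (confirm the PDB exists, then confirm its selector lines up with the deployment's labels) can be scripted. What follows is a minimal sketch, not the QA procedure itself. The function names `allowed_disruptions` and `show_pdb_and_labels` are hypothetical, and the second one assumes a logged-in `oc` session. The sketch also assumes that ALLOWED DISRUPTIONS for a `minAvailable` PDB equals healthy pods minus `minAvailable`, floored at zero, which agrees with the outputs in this report (2 ready pods, MIN AVAILABLE 1, ALLOWED DISRUPTIONS 1).

```shell
#!/bin/sh
# Namespace taken from the commands in this report.
NS=openshift-cloud-controller-manager

# Hypothetical helper: expected ALLOWED DISRUPTIONS for a minAvailable PDB,
# assumed to be healthy pods minus minAvailable, never below zero.
allowed_disruptions() {
  healthy=$1
  min_available=$2
  n=$((healthy - min_available))
  if [ "$n" -lt 0 ]; then
    n=0
  fi
  echo "$n"
}

# Hypothetical helper: print the PDB selector and the deployment's pod
# template labels one after the other for manual comparison.
# Needs a logged-in oc session; nothing here runs until it is called.
show_pdb_and_labels() {
  pdb=$1
  deploy=$2
  echo "PDB selector:"
  oc get pdb "$pdb" -n "$NS" \
    -o jsonpath='{.spec.selector.matchLabels}{"\n"}'
  echo "Pod template labels:"
  oc get deployment "$deploy" -n "$NS" \
    -o jsonpath='{.spec.template.metadata.labels}{"\n"}'
}
```

On an AWS cluster, for instance, `show_pdb_and_labels aws-cloud-controller-manager aws-cloud-controller-manager` would print both label sets for comparison. With two replicas and MIN AVAILABLE 1, `allowed_disruptions 2 1` prints `1`, so a node drain may evict one pod at a time but not both.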