Bug 1978340
Summary: | packageserver isn't following the OpenShift HA conventions | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Damien Grisonnet <dgrisonn> |
Component: | OLM | Assignee: | tflannag |
OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | tflannag |
Version: | 4.9 | Keywords: | Triaged |
Target Milestone: | --- | ||
Target Release: | 4.9.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2021-10-18 17:37:30 UTC | Type: | Bug |
Description
Damien Grisonnet
2021-07-01 15:31:30 UTC
The PDB resource is documented after the High Availability section as part of the Upgrade and Reconfiguration section [1].

[1] https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#upgrade-and-reconfiguration

1. Create an HA cluster.

mac:openshift-tests-private jianzhang$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-30-232019   True        False         137m    Cluster version is 4.9.0-0.nightly-2021-08-30-232019

mac:openshift-tests-private jianzhang$ oc exec catalog-operator-5556959747-b58n4 -- olm --version
OLM version: 0.18.3
git commit: 01e1cf8ca9b4ec532d4b134b11e09bed8efc5b60

mac:openshift-tests-private jianzhang$ oc get nodes
NAME                                         STATUS   ROLES    AGE    VERSION
ip-10-0-129-64.us-east-2.compute.internal    Ready    master   166m   v1.22.0-rc.0+249ab87
ip-10-0-136-66.us-east-2.compute.internal    Ready    worker   162m   v1.22.0-rc.0+249ab87
ip-10-0-169-191.us-east-2.compute.internal   Ready    worker   162m   v1.22.0-rc.0+249ab87
ip-10-0-178-145.us-east-2.compute.internal   Ready    master   167m   v1.22.0-rc.0+249ab87
ip-10-0-195-135.us-east-2.compute.internal   Ready    worker   158m   v1.22.0-rc.0+249ab87
ip-10-0-206-170.us-east-2.compute.internal   Ready    master   168m   v1.22.0-rc.0+249ab87

2. Check the pod anti-affinity configuration and confirm that the two pods are on different nodes.

mac:openshift-tests-private jianzhang$ oc get deployment packageserver -o yaml
...
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: packageserver
            topologyKey: kubernetes.io/hostname

mac:openshift-tests-private jianzhang$ oc get pods -l app=packageserver -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP            NODE                                         NOMINATED NODE   READINESS GATES
packageserver-bdbb545d6-55mhs   1/1     Running   0          170m   10.129.0.5    ip-10-0-178-145.us-east-2.compute.internal   <none>           <none>
packageserver-bdbb545d6-72ppk   1/1     Running   0          170m   10.128.0.41   ip-10-0-206-170.us-east-2.compute.internal   <none>           <none>

3. Recreate one of the packageserver pods.

mac:openshift-tests-private jianzhang$ oc delete pods packageserver-bdbb545d6-55mhs
pod "packageserver-bdbb545d6-55mhs" deleted

mac:openshift-tests-private jianzhang$ oc get pods -l app=packageserver -o wide
NAME                            READY   STATUS              RESTARTS   AGE    IP            NODE                                         NOMINATED NODE   READINESS GATES
packageserver-bdbb545d6-72ppk   1/1     Running             0          171m   10.128.0.41   ip-10-0-206-170.us-east-2.compute.internal   <none>           <none>
packageserver-bdbb545d6-hqh4m   0/1     ContainerCreating   0          7s     <none>        ip-10-0-129-64.us-east-2.compute.internal    <none>           <none>

mac:openshift-tests-private jianzhang$ oc get pods -l app=packageserver -o wide
NAME                            READY   STATUS    RESTARTS   AGE    IP            NODE                                         NOMINATED NODE   READINESS GATES
packageserver-bdbb545d6-72ppk   1/1     Running   0          171m   10.128.0.41   ip-10-0-206-170.us-east-2.compute.internal   <none>           <none>
packageserver-bdbb545d6-hqh4m   1/1     Running   0          20s    10.130.0.39   ip-10-0-129-64.us-east-2.compute.internal    <none>           <none>

LGTM, the packageserver pods never run on the same node.

4. Create a non-HA cluster, then check the pods and the PDB.

[cloud-user@preserve-olm-env jian]$ oc get nodes
NAME                                        STATUS   ROLES           AGE    VERSION
ip-10-0-159-81.us-east-2.compute.internal   Ready    master,worker   133m   v1.22.0-rc.0+249ab87

[cloud-user@preserve-olm-env jian]$ oc get deployment packageserver -o yaml
apiVersion: apps/v1
...
    spec:
      affinity: {}

I guess we don't need to support the PDB on SNO, since SNO doesn't support HA. However, the PDB has no negative impact on SNO because it sets "maxUnavailable=1". I will verify it; please let me know if there is any problem, thanks!

[cloud-user@preserve-olm-env jian]$ oc get pdb
NAME                MIN AVAILABLE   MAX UNAVAILABLE   ALLOWED DISRUPTIONS   AGE
packageserver-pdb   N/A             1                 1                     134m

[cloud-user@preserve-olm-env jian]$ oc get pdb packageserver-pdb -o yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-08-31T03:47:52Z"
  generation: 1
  name: packageserver-pdb
  namespace: openshift-operator-lifecycle-manager
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 824bda17-3f7d-4665-81a4-4f825b612de8
  resourceVersion: "6737"
  uid: 7ae3700d-3b0b-4885-a3a7-ca0655bd0fb9
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: packageserver
status:
  conditions:
  - lastTransitionTime: "2021-08-31T03:50:24Z"
    message: ""
    observedGeneration: 1
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 1
  desiredHealthy: 0
  disruptionsAllowed: 1
  expectedPods: 1
  observedGeneration: 1

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
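For readers checking the SNO result above, the PDB status values (expectedPods: 1, desiredHealthy: 0, disruptionsAllowed: 1) can be reproduced with a little arithmetic. The sketch below assumes the usual Kubernetes behaviour for an integer maxUnavailable: desiredHealthy is expectedPods minus maxUnavailable, and disruptionsAllowed is currentHealthy minus desiredHealthy, each floored at zero. The function name `pdb_status` is hypothetical, and the two-replica inputs are an illustration rather than output captured in this bug.

```python
def pdb_status(expected_pods: int, current_healthy: int, max_unavailable: int) -> dict:
    """Approximate PDB status fields for an integer maxUnavailable.

    Assumption: mirrors the commonly documented disruption-budget arithmetic;
    it is not taken from the OLM or Kubernetes source.
    """
    # Pods that must stay healthy for the budget to be respected.
    desired_healthy = max(expected_pods - max_unavailable, 0)
    # Voluntary evictions that can currently be granted.
    disruptions_allowed = max(current_healthy - desired_healthy, 0)
    return {
        "desiredHealthy": desired_healthy,
        "disruptionsAllowed": disruptions_allowed,
    }


# Single-node cluster from the report: one packageserver pod, maxUnavailable=1.
sno = pdb_status(expected_pods=1, current_healthy=1, max_unavailable=1)
print("SNO:", sno)

# Illustrative HA case (values assumed): two healthy replicas, maxUnavailable=1.
ha = pdb_status(expected_pods=2, current_healthy=2, max_unavailable=1)
print("HA:", ha)
```

With one pod and maxUnavailable=1, the budget still allows one disruption, which is consistent with the observation that the PDB does not block draining the single node.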