Bug 1978340
| Summary: | packageserver isn't following the OpenShift HA conventions | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Damien Grisonnet <dgrisonn> |
| Component: | OLM | Assignee: | tflannag |
| OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | ||
| Priority: | medium | CC: | tflannag |
| Version: | 4.9 | Keywords: | Triaged |
| Target Milestone: | --- | ||
| Target Release: | 4.9.0 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-10-18 17:37:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Damien Grisonnet
2021-07-01 15:31:30 UTC
The PDB resource is documented after the High Availability section as part of the Upgrade and Reconfiguration section [1].
[1] https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#upgrade-and-reconfiguration

1, Create an HA cluster:
mac:openshift-tests-private jianzhang$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.nightly-2021-08-30-232019 True False 137m Cluster version is 4.9.0-0.nightly-2021-08-30-232019
mac:openshift-tests-private jianzhang$ oc exec catalog-operator-5556959747-b58n4 -- olm --version
OLM version: 0.18.3
git commit: 01e1cf8ca9b4ec532d4b134b11e09bed8efc5b60
mac:openshift-tests-private jianzhang$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-129-64.us-east-2.compute.internal Ready master 166m v1.22.0-rc.0+249ab87
ip-10-0-136-66.us-east-2.compute.internal Ready worker 162m v1.22.0-rc.0+249ab87
ip-10-0-169-191.us-east-2.compute.internal Ready worker 162m v1.22.0-rc.0+249ab87
ip-10-0-178-145.us-east-2.compute.internal Ready master 167m v1.22.0-rc.0+249ab87
ip-10-0-195-135.us-east-2.compute.internal Ready worker 158m v1.22.0-rc.0+249ab87
ip-10-0-206-170.us-east-2.compute.internal Ready master 168m v1.22.0-rc.0+249ab87
2, Check the pod anti-affinity configuration and confirm that the two pods are on different nodes.
mac:openshift-tests-private jianzhang$ oc get deployment packageserver -o yaml
...
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: packageserver
        topologyKey: kubernetes.io/hostname
mac:openshift-tests-private jianzhang$ oc get pods -l app=packageserver -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
packageserver-bdbb545d6-55mhs 1/1 Running 0 170m 10.129.0.5 ip-10-0-178-145.us-east-2.compute.internal <none> <none>
packageserver-bdbb545d6-72ppk 1/1 Running 0 170m 10.128.0.41 ip-10-0-206-170.us-east-2.compute.internal <none> <none>
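The anti-affinity check in this step could also be scripted. The following is a minimal sketch, not part of the original verification: it assumes the pod spec has already been loaded into a Python dict (for example from `oc get deployment packageserver -o json`), and the helper name `has_required_hostname_anti_affinity` is hypothetical.

```python
def has_required_hostname_anti_affinity(pod_spec, app_label):
    """Return True if pod_spec has a required (hard) pod anti-affinity term
    that selects `app: <app_label>` and spreads pods by node hostname."""
    affinity = pod_spec.get("affinity") or {}
    anti_affinity = affinity.get("podAntiAffinity") or {}
    terms = anti_affinity.get(
        "requiredDuringSchedulingIgnoredDuringExecution") or []
    for term in terms:
        labels = (term.get("labelSelector") or {}).get("matchLabels") or {}
        if (labels.get("app") == app_label
                and term.get("topologyKey") == "kubernetes.io/hostname"):
            return True
    return False


# Pod spec fragment mirroring the HA deployment output above.
ha_spec = {
    "affinity": {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {"matchLabels": {"app": "packageserver"}},
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        }
    }
}
# A spec with an empty affinity stanza (illustrative input).
empty_affinity_spec = {"affinity": {}}

print(has_required_hostname_anti_affinity(ha_spec, "packageserver"))
print(has_required_hostname_anti_affinity(empty_affinity_spec, "packageserver"))
```

The helper only looks at the required (hard) terms, since a preferred term would not guarantee that the two replicas end up on different nodes.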
3, Recreate one of the packageserver pods:
mac:openshift-tests-private jianzhang$ oc delete pods packageserver-bdbb545d6-55mhs
pod "packageserver-bdbb545d6-55mhs" deleted
mac:openshift-tests-private jianzhang$ oc get pods -l app=packageserver -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
packageserver-bdbb545d6-72ppk 1/1 Running 0 171m 10.128.0.41 ip-10-0-206-170.us-east-2.compute.internal <none> <none>
packageserver-bdbb545d6-hqh4m 0/1 ContainerCreating 0 7s <none> ip-10-0-129-64.us-east-2.compute.internal <none> <none>
mac:openshift-tests-private jianzhang$ oc get pods -l app=packageserver -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
packageserver-bdbb545d6-72ppk 1/1 Running 0 171m 10.128.0.41 ip-10-0-206-170.us-east-2.compute.internal <none> <none>
packageserver-bdbb545d6-hqh4m 1/1 Running 0 20s 10.130.0.39 ip-10-0-129-64.us-east-2.compute.internal <none> <none>
LGTM, the packageserver pods never run on the same node.
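The "never on the same node" observation could be checked mechanically rather than by eye. This is a rough sketch under stated assumptions: it parses the plain-text output of `oc get pods -o wide`, and it assumes the last three whitespace-separated fields of each row are NODE, NOMINATED NODE and READINESS GATES, as in the output above. The function name `pods_on_distinct_nodes` is hypothetical.

```python
def pods_on_distinct_nodes(listing):
    """Return True if every pod row in the listing names a different node."""
    rows = listing.strip().splitlines()[1:]  # skip the header row
    # NODE is the third field from the end: NODE, NOMINATED NODE, READINESS GATES.
    nodes = [row.split()[-3] for row in rows if row.strip()]
    return len(nodes) == len(set(nodes))


# Output copied from the final `oc get pods` call above.
listing = """
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
packageserver-bdbb545d6-72ppk 1/1 Running 0 171m 10.128.0.41 ip-10-0-206-170.us-east-2.compute.internal <none> <none>
packageserver-bdbb545d6-hqh4m 1/1 Running 0 20s 10.130.0.39 ip-10-0-129-64.us-east-2.compute.internal <none> <none>
"""

print(pods_on_distinct_nodes(listing))
```

Counting fields from the end of the row keeps the sketch working even if an earlier column, such as RESTARTS, contains spaces. Reading `spec.nodeName` from `-o json` output would likely be more robust for real automation.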
4, Create a non-HA cluster and check the pods and PDB.
[cloud-user@preserve-olm-env jian]$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-159-81.us-east-2.compute.internal Ready master,worker 133m v1.22.0-rc.0+249ab87
[cloud-user@preserve-olm-env jian]$ oc get deployment packageserver -o yaml
apiVersion: apps/v1
...
spec:
  affinity: {}
I guess we don't need to support the PDB on SNO, since SNO doesn't support HA. However, the PDB has no negative impact on SNO because of "maxUnavailable=1".
I will verify it; please let me know if there is any problem, thanks!
[cloud-user@preserve-olm-env jian]$ oc get pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
packageserver-pdb N/A 1 1 134m
[cloud-user@preserve-olm-env jian]$ oc get pdb packageserver-pdb -o yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  annotations:
    include.release.openshift.io/ibm-cloud-managed: "true"
    include.release.openshift.io/self-managed-high-availability: "true"
    include.release.openshift.io/single-node-developer: "true"
  creationTimestamp: "2021-08-31T03:47:52Z"
  generation: 1
  name: packageserver-pdb
  namespace: openshift-operator-lifecycle-manager
  ownerReferences:
  - apiVersion: config.openshift.io/v1
    kind: ClusterVersion
    name: version
    uid: 824bda17-3f7d-4665-81a4-4f825b612de8
  resourceVersion: "6737"
  uid: 7ae3700d-3b0b-4885-a3a7-ca0655bd0fb9
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: packageserver
status:
  conditions:
  - lastTransitionTime: "2021-08-31T03:50:24Z"
    message: ""
    observedGeneration: 1
    reason: SufficientPods
    status: "True"
    type: DisruptionAllowed
  currentHealthy: 1
  desiredHealthy: 0
  disruptionsAllowed: 1
  expectedPods: 1
  observedGeneration: 1
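The status values above follow from simple arithmetic, which may help explain why "maxUnavailable=1" does not block disruptions on a single-node cluster. The sketch below is a simplified model and not the controller's actual code: it assumes an integer `maxUnavailable` (no percentages), and the function name `allowed_disruptions` is hypothetical.

```python
def allowed_disruptions(expected_pods, current_healthy, max_unavailable):
    """Simplified PDB arithmetic for an integer maxUnavailable.

    Returns (desired_healthy, disruptions_allowed), both floored at zero.
    """
    desired_healthy = max(0, expected_pods - max_unavailable)
    disruptions_allowed = max(0, current_healthy - desired_healthy)
    return desired_healthy, disruptions_allowed


# Single-node cluster from the status above: expectedPods=1, currentHealthy=1.
print(allowed_disruptions(expected_pods=1, current_healthy=1, max_unavailable=1))

# Illustrative HA case with two healthy replicas (computed, not observed output).
print(allowed_disruptions(expected_pods=2, current_healthy=2, max_unavailable=1))
```

With one replica the model yields `desiredHealthy: 0` and `disruptionsAllowed: 1`, matching the status shown, so an eviction of the only packageserver pod is still permitted. A budget written as `minAvailable: 1` would instead leave zero allowed disruptions for a single replica.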
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759