Bug 1985697
| Summary: | package-server-manager needs to handle 60 seconds downtime of API server gracefully in SNO | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Naga Ravi Chaitanya Elluri <nelluri> |
| Component: | OLM | Assignee: | tflannag |
| OLM sub component: | OLM | QA Contact: | Jian Zhang <jiazha> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | jiazha, nelluri |
| Version: | 4.9 | Keywords: | Triaged |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Linux | | |
| Whiteboard: | chaos | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-10-18 17:40:56 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1984730 | | |
Description
Naga Ravi Chaitanya Elluri
2021-07-25 03:59:57 UTC
Cluster version is 4.9.0-0.nightly-2021-08-06-020133
[cloud-user@preserve-olm-env jian]$ oc -n openshift-operator-lifecycle-manager exec catalog-operator-6cd746b48b-plwtn -- olm --version
OLM version: 0.18.3
git commit: 3a21821b786493b59da83ab4ce16d6ed16dcccad
1. Install an SNO (single-node OpenShift) cluster.
[cloud-user@preserve-olm-env jian]$ oc get infrastructure cluster -o=jsonpath='{.status.infrastructureTopology}'
SingleReplica
[cloud-user@preserve-olm-env jian]$ oc get nodes
NAME STATUS ROLES AGE VERSION
ip-10-0-133-45.us-east-2.compute.internal Ready master,worker 49m v1.21.1+8268f88
2. Trigger a kube-apiserver rollout or outage that lasts for at least 60 seconds.
[cloud-user@preserve-olm-env jian]$ oc patch kubeapiserver/cluster --type merge -p '{"spec":{"forceRedeploymentReason":"ITERATION1"}}'
kubeapiserver.operator.openshift.io/cluster patched
[cloud-user@preserve-olm-env jian]$ oc get pods -n openshift-kube-apiserver
NAME READY STATUS RESTARTS AGE
installer-10-ip-10-0-133-45.us-east-2.compute.internal 1/1 Running 0 42s
installer-5-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 47m
installer-7-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 46m
installer-8-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 40m
installer-9-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 39m
kube-apiserver-ip-10-0-133-45.us-east-2.compute.internal 5/5 Running 0 38m
kube-apiserver-startup-monitor-ip-10-0-133-45.us-east-2.compute.internal 0/1 Pending 0 32s
revision-pruner-10-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 49s
revision-pruner-7-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 44m
revision-pruner-8-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 41m
revision-pruner-9-ip-10-0-133-45.us-east-2.compute.internal 0/1 Completed 0 39m
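To confirm that the rollout in step 2 really produced an outage of the required length, the API server can be probed while the rollout runs. The following is a rough sketch, not part of the original report: `longest_outage` is a hypothetical helper name, and the choice of `oc get --raw /readyz` as the probe is an assumption. It reports the longest run of consecutive failed probes, so with a 1-second interval the number approximates the outage in seconds.

```shell
#!/bin/sh
# Hypothetical helper (not part of oc): run a probe command repeatedly and
# print the longest run of consecutive failures, counted in probes.
# usage: longest_outage <attempts> <interval_seconds> <probe command...>
longest_outage() {
    attempts=$1
    interval=$2
    shift 2
    current=0
    longest=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            # Probe succeeded: the current failure streak ends.
            current=0
        else
            # Probe failed: extend the streak and remember the maximum.
            current=$((current + 1))
            if [ "$current" -gt "$longest" ]; then
                longest=$current
            fi
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    echo "$longest"
}

# Example (assumes a logged-in oc session; run right after the patch command):
#   longest_outage 300 1 oc get --raw /readyz
```

The count ignores the time each probe itself takes, so treat the result as a lower bound on the outage duration.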
3. Observe the state of package-server-manager. The pod did not crash or restart (RESTARTS is 0), so the fix is verified.
[cloud-user@preserve-olm-env jian]$ oc get pods
NAME READY STATUS RESTARTS AGE
catalog-operator-6cd746b48b-plwtn 1/1 Running 0 54m
collect-profiles-27136980-g4448 0/1 Completed 0 40m
collect-profiles-27136995-zcqx6 0/1 Completed 0 25m
collect-profiles-27137010-q4nwq 0/1 Completed 0 10m
olm-operator-c9cb64896-5wlfq 1/1 Running 0 54m
package-server-manager-5856789494-6nhgc 1/1 Running 0 54m
packageserver-5ddb77696f-59nwr 1/1 Running 0 52m
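The check in step 3 can also be scripted instead of read off the pod listing. This is a minimal sketch, assuming the default `oc get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE); the function name `restarts_for` is hypothetical and not an `oc` feature.

```shell
#!/bin/sh
# Hypothetical helper: read `oc get pods` output (default columns) on stdin and
# print the summed RESTARTS count of all pods whose name starts with a prefix.
# Exits non-zero if no pod matches the prefix.
restarts_for() {
    awk -v prefix="$1" '
        $1 != "NAME" && index($1, prefix) == 1 { total += $4; found = 1 }
        END { if (!found) exit 1; print total + 0 }
    '
}

# Example (assumes a logged-in oc session):
#   oc -n openshift-operator-lifecycle-manager get pods \
#       | restarts_for package-server-manager-
```

Passing the prefix with its trailing hyphen keeps `package-server-manager-` from matching the separate `packageserver-` pod. A result of 0 after the kube-apiserver rollout corresponds to the "no crash" observation above.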
*** Bug 1989418 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:3759