Bug 1881938
| Summary: | migrator deployment doesn't tolerate masters | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Raif Ahmed <rahmed> |
| Component: | kube-storage-version-migrator | Assignee: | Luis Sanchez <sanchezl> |
| Status: | CLOSED ERRATA | QA Contact: | Ke Wang <kewang> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.5 | CC: | aos-bugs, cblecker, sanchezl, surbania, wking |
| Target Milestone: | --- | | |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-27 22:33:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Raif Ahmed
2020-09-23 12:38:30 UTC
*** Bug 1935347 has been marked as a duplicate of this bug. ***

$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.8.0-0.nightly-2021-03-16-221720 True False 151m Cluster version is 4.8.0-0.nightly-2021-03-16-221720
Check which master node the kube-storage pods are running on:
$ oc get pod -A -o wide | grep kube-storage
openshift-kube-storage-version-migrator-operator kube-storage-version-migrator-operator-564cdcc96c-xzprc 1/1 Running 0 4h7m 10.130.0.63 ip-10-0-176-115.us-east-2.compute.internal <none> <none>
openshift-kube-storage-version-migrator migrator-8bdb5f65f-22prn 1/1 Running 0 4h7m 10.130.0.62 ip-10-0-176-115.us-east-2.compute.internal <none> <none>
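Both pods are running on the master node ip-10-0-176-115.us-east-2.compute.internal. If needed, the set of master nodes can be cross-checked by label; a minimal sketch, using the same node-role.kubernetes.io/master label that the operator's Node-Selectors below reference:
$ oc get nodes -l node-role.kubernetes.io/master=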
The new deployment settings were applied to the pods:
$ oc describe pod -n openshift-kube-storage-version-migrator-operator kube-storage-version-migrator-operator-564cdcc96c-xzprc
Name: kube-storage-version-migrator-operator-564cdcc96c-xzprc
Namespace: openshift-kube-storage-version-migrator-operator
...
Node-Selectors: node-role.kubernetes.io/master=
Tolerations: node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 120s
node.kubernetes.io/unreachable:NoExecute op=Exists for 120s
Events: <none>
----------
$ oc describe pod -n openshift-kube-storage-version-migrator migrator-8bdb5f65f-22prn
Name: migrator-8bdb5f65f-22prn
Namespace: openshift-kube-storage-version-migrator
...
Node-Selectors: <none>
Tolerations: node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 120s
node.kubernetes.io/unreachable:NoExecute op=Exists for 120s
Events: <none>
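The Tolerations shown in these pod describes are rendered from the owning Deployments (plus defaults added at admission). As a cross-check, the master toleration can also be read directly from the Deployment objects; a minimal sketch, assuming the Deployment names match the pod-name prefixes above:
$ oc -n openshift-kube-storage-version-migrator get deployment migrator \
    -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'
$ oc -n openshift-kube-storage-version-migrator-operator get deployment kube-storage-version-migrator-operator \
    -o jsonpath='{.spec.template.spec.tolerations}{"\n"}'
The output of each command should include the node-role.kubernetes.io/master NoSchedule toleration shown above.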
Stop the kubelet service on the master node where the kube-storage pods are located:
$ oc debug node/ip-10-0-176-115.us-east-2.compute.internal
Starting pod/ip-10-0-176-115us-east-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.176.115
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# systemctl stop kubelet
Removing debug pod ...
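From a second terminal, the transition can be watched while waiting: the node goes NotReady once the kubelet stops reporting, and the pods are evicted after the 120s not-ready/unreachable tolerations shown above expire. A minimal sketch of such a watch:
$ oc get nodes --watch
$ oc get pod -A -o wide --watch | grep kube-storage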
About 2 minutes after the master whose kubelet was stopped went NotReady, the kube-storage pods running on that master changed to Terminating and were rescheduled to other nodes.
$ oc get no
NAME STATUS ROLES AGE VERSION
ip-10-0-140-229.us-east-2.compute.internal Ready master 5h11m v1.20.0+e1bc274
ip-10-0-157-114.us-east-2.compute.internal Ready worker 5h2m v1.20.0+e1bc274
ip-10-0-176-115.us-east-2.compute.internal NotReady master 5h6m v1.20.0+e1bc274
ip-10-0-185-50.us-east-2.compute.internal Ready worker 5h2m v1.20.0+e1bc274
ip-10-0-221-102.us-east-2.compute.internal Ready master 5h7m v1.20.0+e1bc274
$ date;echo;oc get pod -A -o wide | grep kube-storage
Wed Mar 17 05:19:24 EDT 2021
openshift-kube-storage-version-migrator-operator kube-storage-version-migrator-operator-564cdcc96c-8hv7z 1/1 Running 0 31s 10.129.0.89 ip-10-0-221-102.us-east-2.compute.internal <none> <none>
openshift-kube-storage-version-migrator-operator kube-storage-version-migrator-operator-564cdcc96c-xzprc 1/1 Terminating 0 4h42m 10.130.0.63 ip-10-0-176-115.us-east-2.compute.internal <none> <none>
openshift-kube-storage-version-migrator migrator-8bdb5f65f-22prn 1/1 Terminating 0 4h42m 10.130.0.62 ip-10-0-176-115.us-east-2.compute.internal <none> <none>
openshift-kube-storage-version-migrator migrator-8bdb5f65f-cq26s 1/1 Running 0 31s 10.128.2.155 ip-10-0-157-114.us-east-2.compute.internal <none> <none>
From the above, the results are as expected, so moving the bug to VERIFIED.
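To return the cluster to a clean state after verification, the stopped kubelet would typically be started again. Because oc debug cannot schedule a pod on the NotReady node, a sketch is shown below assuming direct SSH access to the node as the core user (the access path and user are assumptions, not part of the original report):
$ ssh core@ip-10-0-176-115.us-east-2.compute.internal 'sudo systemctl start kubelet'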
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438