Bug 2022308
| Summary: | Pods in unstable state on node shutdown | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Data Foundation | Reporter: | Sonia Garudi <sgarudi> |
| Component: | csi-driver | Assignee: | yati padia <ypadia> |
| Status: | CLOSED NOTABUG | QA Contact: | Elad <ebenahar> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.9 | CC: | dzaken, etamir, jrivera, madam, mmuench, muagarwa, ndevos, ocs-bugs, odf-bz-bot, svenkat, ypadia |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | ppc64le | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-11-23 00:41:01 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Sonia Garudi
2021-11-11 10:06:44 UTC
Link to must-gather logs: https://drive.google.com/drive/folders/1F9aqyIRiao4z5yTciYyGv0OxinFakXi4?usp=sharing

If you really need to stop the kubelet for some reason, you need to make sure that the PVCs are unmounted first. You can do this by draining the pods from the node (moving it into maintenance mode).

Because the application pods are still running, and their volumes are still mounted, when `systemctl stop kubelet.service` is run, the checks that detect whether the (Ceph RBD) volumes are in use will not allow mounting a volume on another worker node.

Can you explain the reason for running `systemctl stop kubelet.service` without stopping the application pods in advance? If moving the worker node into maintenance mode is an option, please close this bug.

Thanks!

@ndevos Thanks for your comment. We will add a step that runs 'oc drain xxx' to drain the pods from the specific node before shutting down the kubelet service to simulate node disruption. This BZ can be closed, and we will reopen it if the problem persists with the new procedure.
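
The drain-first procedure agreed on in this thread could be scripted roughly as follows. This is only a sketch under assumptions: it takes 'oc drain xxx' to mean `oc adm drain <node>`, the flags `--ignore-daemonsets` and `--delete-emptydir-data` are a choice made here and not something stated in the bug, and the helper names and the node name `worker-0` are hypothetical. It defaults to a dry run that only renders the command.

```python
import shlex
import subprocess
from typing import List


def drain_commands(node: str) -> List[List[str]]:
    """Cluster-side commands to run before stopping the kubelet on `node`."""
    return [
        # Drain cordons the node and evicts its pods, so the volumes are
        # unmounted before the kubelet goes away. Flags are an assumption.
        ["oc", "adm", "drain", node, "--ignore-daemonsets", "--delete-emptydir-data"],
    ]


def run_drain(node: str, dry_run: bool = True) -> List[str]:
    """Render the drain commands; execute them only when dry_run is False."""
    rendered = []
    for cmd in drain_commands(node):
        rendered.append(shlex.join(cmd))
        if not dry_run:
            # check=True aborts the procedure if the drain fails.
            subprocess.run(cmd, check=True)
    return rendered


if __name__ == "__main__":
    # "worker-0" is a placeholder node name.
    for line in run_drain("worker-0"):
        print(line)
    # Only after the drain succeeds, on the node itself:
    print("systemctl stop kubelet.service")
```

Once the disruption test is finished and the kubelet has been started again, the node would typically be made schedulable again with `oc adm uncordon <node>`.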