Bug 2040131
| Summary: | no pod use the PVC, but PVC status is still Bound if replicas number changed after upgrade | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Junqi Zhao <juzhao> |
| Component: | Monitoring | Assignee: | Brian Burt <bburt> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Junqi Zhao <juzhao> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.10 | CC: | amuller, anpicker, aos-bugs, brad.ison, erooth, hongyli, jsafrane, spasquie |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Known Issue |
| Doc Text: | For this release, the number of Alertmanager replicas in the monitoring stack was reduced from three to two. However, the persistent volume claim (PVC) for the removed third replica is not automatically removed as part of the upgrade process. After the upgrade, an administrator can remove this PVC manually from the Cluster Monitoring Operator. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-03-30 15:24:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Description
Junqi Zhao
2022-01-13 03:23:10 UTC
Kubernetes does not delete PVCs created by a StatefulSet when it gets scaled down. It does not know if the user is going to scale the StatefulSet back up. There is a KEP upstream to add automatic deletion as opt-in. It's alpha in 1.23, and it will take a few releases to reach GA. Moving to monitoring team to consider if they want to delete the PVC automatically in cluster-monitoring-operator during/after upgrade or just document it as a post-upgrade step.

Forgot a link to the upstream KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/1847-autoremove-statefulset-pvcs

IMO we need at least a note in the OCP documentation. Ideally the cluster monitoring operator should clean this up but:
1. It might be tricky to understand exactly which volume to delete. The best approach is probably to get the alertmanager-main-2 pod definition (if it exists), find the bound PVC and delete it before scaling down the StatefulSet.
2. The operator deleting user data automatically is a bit scary to me.

The 4.10 release notes mention the "issue" and explain how it should be fixed: https://docs.openshift.com/container-platform/4.10/release_notes/ocp-4-10-release-notes.html#ocp-4-10-monitoring-added-hard-anti-affinity-rules-and-pod-distruption-budgets

Junqi, I'm not sure how we want to proceed with this bug. Would you move it to VERIFIED directly?

Updated doc, added note for this issue.

Added the text in the Doc Text field above to the "Known Issues" section of the OCP 4.10 Release Notes: https://docs.openshift.com/container-platform/4.10/release_notes/ocp-4-10-release-notes.html#ocp-4-10-known-issues
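
The approach suggested in the thread (read the alertmanager-main-2 pod definition and find the PVC it is bound to) can be sketched roughly as follows. This is only an illustration, not what the cluster monitoring operator does: the function name `bound_pvc_names`, the sample pod definition, the volume names, and the claim name `alertmanager-main-db-alertmanager-main-2` are all assumptions made up for the example. It relies only on the standard pod field `spec.volumes[].persistentVolumeClaim.claimName`, and it assumes the pod definition has already been exported as JSON.

```python
from typing import List


def bound_pvc_names(pod: dict) -> List[str]:
    """Return the names of all PVCs referenced by a pod definition.

    `pod` is a pod object parsed from JSON. Volumes that are not backed
    by a PVC (secrets, config maps, emptyDir, ...) are skipped.
    """
    volumes = pod.get("spec", {}).get("volumes", [])
    return [
        volume["persistentVolumeClaim"]["claimName"]
        for volume in volumes
        if "persistentVolumeClaim" in volume
    ]


# Hypothetical, heavily trimmed pod definition. The volume and claim
# names are assumptions for illustration; check the real pod to find
# the actual claim name in a given cluster.
sample_pod = {
    "metadata": {
        "name": "alertmanager-main-2",
        "namespace": "openshift-monitoring",
    },
    "spec": {
        "volumes": [
            {
                "name": "alertmanager-main-db",
                "persistentVolumeClaim": {
                    "claimName": "alertmanager-main-db-alertmanager-main-2"
                },
            },
            {
                "name": "config-volume",
                "secret": {"secretName": "example-secret"},
            },
        ]
    },
}

for claim in bound_pvc_names(sample_pod):
    print(claim)
```

If the third replica's pod no longer exists after the upgrade, there is nothing to inspect this way, and the leftover PVC has to be identified by listing the PVCs in the monitoring namespace instead. Either way, an administrator would still want to confirm the claim name before deleting anything, given the concern raised above about removing user data automatically.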