Bug 2040131 - no pod use the PVC, but PVC status is still Bound if replicas number changed after upgrade
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: ---
Assignee: Brian Burt
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-01-13 03:23 UTC by Junqi Zhao
Modified: 2022-03-30 15:24 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
For this release, the number of Alertmanager replicas in the monitoring stack was reduced from three to two. However, the persistent volume claim (PVC) for the removed third replica is not automatically removed as part of the upgrade process. After the upgrade, an administrator can remove this PVC manually from the Cluster Monitoring Operator.
Clone Of:
Environment:
Last Closed: 2022-03-30 15:24:32 UTC
Target Upstream Version:
Embargoed:


Attachments
pv/pvc/sc info (9.65 KB, text/plain), 2022-01-13 03:23 UTC, Junqi Zhao


Links
System                  ID               Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   RHDEVDOCS-3646   0        None      None    None     2022-01-17 21:28:12 UTC

Description Junqi Zhao 2022-01-13 03:23:10 UTC
Created attachment 1850495
pv/pvc/sc info

Description of problem:
The number of Alertmanager replicas changed from 3 to 2 in 4.10.
4.9 has 3 replicas: https://github.com/openshift/cluster-monitoring-operator/blob/release-4.9/assets/alertmanager/alertmanager.yaml#L121
4.10 has 2 replicas: https://github.com/openshift/cluster-monitoring-operator/blob/release-4.10/assets/alertmanager/alertmanager.yaml#L151

On a 4.9.13 cluster, bind PVs for Alertmanager via the following ConfigMap, then upgrade to 4.10.0-fc.0:
******************************
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    alertmanagerMain:
      volumeClaimTemplate:
        metadata:
          name: alertmanager
        spec:
          volumeMode: Filesystem
          resources:
            requests:
              storage: 4Gi
******************************
# oc get sc
NAME                 PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
standard (default)   kubernetes.io/gce-pd    Delete          WaitForFirstConsumer   true                   23h
standard-csi         pd.csi.storage.gke.io   Delete          WaitForFirstConsumer   true                   23h


After the upgrade, PVC alertmanager-alertmanager-main-2 is still Bound, but no pod uses it:
# oc -n openshift-monitoring get pvc | grep alertmanager
alertmanager-alertmanager-main-0   Bound    pvc-8a3c4a5c-29b6-4a36-809d-a4850d78e2a7   4Gi        RWO            standard       20h
alertmanager-alertmanager-main-1   Bound    pvc-4da37a8e-f0e4-4f74-b838-c99adc031752   4Gi        RWO            standard       20h
alertmanager-alertmanager-main-2   Bound    pvc-215f87e7-0913-4163-8840-5b14784f38f3   4Gi        RWO            standard       20h

# oc -n openshift-monitoring get pod | grep alertmanager-main
alertmanager-main-0                           6/6     Running   0          18h
alertmanager-main-1                           6/6     Running   0          18h

# oc -n openshift-monitoring get pod alertmanager-main-0 -oyaml | grep persistentVolumeClaim -A1
    persistentVolumeClaim:
      claimName: alertmanager-alertmanager-main-0

# oc -n openshift-monitoring get pod alertmanager-main-1 -oyaml | grep persistentVolumeClaim -A1
    persistentVolumeClaim:
      claimName: alertmanager-alertmanager-main-1
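
The manual comparison above (PVC list versus the claimName of each running pod) can be sketched as a small shell helper. This is only an illustration: the function name unused_pvcs and the two input files are hypothetical, and the oc invocations in the comments are an assumption about how the files could be produced, not part of the original report.

```shell
# List PVCs that no pod references.
#   $1: file with one PVC name per line
#   $2: file with one claimName per line, as referenced by pod volumes
#
# Assumed ways to produce the inputs (not run here):
#   oc -n openshift-monitoring get pvc \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}' > pvcs.txt
#   oc -n openshift-monitoring get pods \
#     -o jsonpath='{range .items[*].spec.volumes[*]}{.persistentVolumeClaim.claimName}{"\n"}{end}' > claims.txt
unused_pvcs() {
  # Drop blank lines from the claim list so they cannot act as patterns,
  # then keep only the PVC names that match no claim exactly
  # (-F fixed strings, -x whole line, -v invert).
  grep -v '^$' "$2" | grep -Fxv -f - "$1"
}
```

With the three PVCs and the two claimName values shown above as input, the helper prints only alertmanager-alertmanager-main-2.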

Version-Release number of selected component (if applicable):
4.9.13 cluster, bind PVs for alertmanager via configmap and upgrade to 4.10.0-fc.0

How reproducible:
always

Steps to Reproduce:
1. On a 4.9.13 cluster, bind PVs for Alertmanager via the ConfigMap above and upgrade to 4.10.0-fc.0.

Actual results:
PVC alertmanager-alertmanager-main-2 is still Bound, but no pod uses it.

Expected results:
PVC alertmanager-alertmanager-main-2 should be recycled. The Bound status is confusing because it suggests that a pod is still using the PVC.

Master Log:

Node Log (of failed PODs):

PV Dump:
see the attached file

PVC Dump:
see the attached file

StorageClass Dump (if StorageClass used by PV/PVC):
see the attached file

Additional info:

Comment 1 Jan Safranek 2022-01-13 14:04:18 UTC
Kubernetes does not delete PVCs created by a StatefulSet when it gets scaled down, because it does not know whether the user is going to scale the StatefulSet back up. There is a KEP upstream to add automatic deletion as opt-in. It's alpha in 1.23, and it will take a few releases to reach GA.

Moving to the monitoring team to consider whether they want to delete the PVC automatically in cluster-monitoring-operator during/after upgrade or just document it as a post-upgrade step.
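
As a rough sketch of which claims a scale-down leaves behind: StatefulSet PVCs are named <volumeClaimTemplate name>-<StatefulSet name>-<ordinal>, so the leftover names can be derived from the old and new replica counts. The function name orphaned_pvc_names is hypothetical and only for illustration.

```shell
# Print the PVC names left behind when a StatefulSet is scaled down.
#   $1: volumeClaimTemplate name
#   $2: StatefulSet name
#   $3: old replica count
#   $4: new replica count
orphaned_pvc_names() {
  template=$1
  sts=$2
  old=$3
  i=$4
  # Ordinals new..old-1 belong to pods that no longer exist.
  while [ "$i" -lt "$old" ]; do
    echo "${template}-${sts}-${i}"
    i=$((i + 1))
  done
}

# Values from this report: template "alertmanager", StatefulSet
# "alertmanager-main", scaled from 3 to 2 replicas.
orphaned_pvc_names alertmanager alertmanager-main 3 2
```

For the values in this bug, the only name printed is alertmanager-alertmanager-main-2, which matches the leftover PVC in the description.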

Comment 2 Jan Safranek 2022-01-13 14:04:46 UTC
Forgot a link to the upstream KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/1847-autoremove-statefulset-pvcs

Comment 3 Simon Pasquier 2022-01-13 14:37:11 UTC
IMO we need at least a note in the OCP documentation. Ideally the cluster monitoring operator should clean this up, but:
1. It might be tricky to determine exactly which volume to delete. The best approach is probably to get the alertmanager-main-2 pod definition (if it exists), find the bound PVC, and delete it before scaling down the StatefulSet.
2. The operator deleting user data automatically is a bit scary to me.
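
Given both concerns, a manual cleanup could be guarded along these lines. This is a minimal sketch, not what the operator does: the function name delete_pvc_if_unused, the RUN variable, and the claims file are hypothetical. It only prints the oc command unless RUN=1 is set. Note that with a Delete reclaim policy, as on the storage classes in this report, deleting the PVC also removes the underlying volume and its data.

```shell
# Print (or, with RUN=1, execute) the delete for one PVC,
# refusing if a pod still references it.
#   $1: namespace
#   $2: PVC name
#   $3: file listing the claimNames currently used by pods, one per line
delete_pvc_if_unused() {
  # Exact whole-line match against the in-use list.
  if grep -Fxq -- "$2" "$3"; then
    echo "refusing: $2 is still claimed by a pod" >&2
    return 1
  fi
  if [ "${RUN:-0}" = "1" ]; then
    oc -n "$1" delete pvc "$2"
  else
    # Dry run: show the command instead of running it.
    echo "oc -n $1 delete pvc $2"
  fi
}
```

For example, delete_pvc_if_unused openshift-monitoring alertmanager-alertmanager-main-2 claims.txt prints the delete command when claims.txt lists only the main-0 and main-1 claims.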

Comment 5 Simon Pasquier 2022-03-16 09:43:15 UTC
The 4.10 release notes mention the "issue" and explain how it should be fixed:
https://docs.openshift.com/container-platform/4.10/release_notes/ocp-4-10-release-notes.html#ocp-4-10-monitoring-added-hard-anti-affinity-rules-and-pod-distruption-budgets

Junqi, I'm not sure how we want to proceed with this bug? Would you move it to VERIFIED directly?

Comment 6 Junqi Zhao 2022-03-21 03:09:03 UTC
Updated the doc and added a note for this issue.

Comment 7 Brian Burt 2022-03-30 15:24:32 UTC
Added the text in the Doc Text field above to the "Known Issues" section of the OCP 4.10 Release Notes: 

https://docs.openshift.com/container-platform/4.10/release_notes/ocp-4-10-release-notes.html#ocp-4-10-known-issues

