Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1323596

Summary: PersistentVolume stuck in 'Released' state
Product: OpenShift Container Platform
Reporter: Tomas Schlosser <tschloss>
Component: Storage
Assignee: Sami Wagiaalla <swagiaal>
Status: CLOSED DUPLICATE
QA Contact: Jianwei Hou <jhou>
Severity: medium
Priority: medium
Version: 3.2.0
CC: aos-bugs, avagarwa, bchilds, jokerman, lxia, maschmid, mmccomas, mmcgrath, tschloss
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2016-05-16 19:36:41 UTC
Type: Bug

Description Tomas Schlosser 2016-04-04 08:00:14 UTC
Description of problem:
When the pv-recycler pod is deleted (for example, by pod eviction from a restarted node), the PV is stuck in the 'Released' state and is not recycled until the next master restart.

Version-Release number of selected component (if applicable):
atomic-openshift-3.1.1.6-4.git.21.cd70c35.el7aos.x86_64
atomic-openshift-3.2.0.4-1.git.0.4717e23.el7.x86_64

How reproducible:
Every time on a clean environment.

Steps to Reproduce:
1. Remove the recycler image from all OSE nodes
2. Release the PV (delete the PVC bound to it)
3. Delete the recycler pod (or restart the node to cause pod eviction)

Actual results:
The recycler pod is deleted, the PV stays in the 'Released' state, and recycling is not attempted again.

Expected results:
Recycler pod is recreated OR
PV in 'Available' state OR
recycling is attempted again (periodically)

Additional info:
Deleting the recycler image on all nodes is required; otherwise, the PV can be scrubbed before you get the chance to delete the pod.
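
One way to observe the actual result after step 3 is to poll the PV's phase for a while and see whether it ever leaves 'Released'. The following is only a sketch: the function name wait_pv_leaves_released is hypothetical, and it assumes that the installed oc supports `-o jsonpath` and that the session is already logged in to the cluster.

```shell
# Hypothetical helper: poll a PV's phase until it leaves "Released"
# or the attempts run out.
# Arguments: PV name, number of checks (default 30), delay in seconds (default 2).
# Returns 0 if the PV left Released, 1 if it is still Released, 2 if oc failed.
wait_pv_leaves_released() {
  pv=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # .status.phase holds values such as Available, Bound, Released, Failed
    phase=$(oc get pv "$pv" -o jsonpath='{.status.phase}') || return 2
    if [ "$phase" != "Released" ]; then
      echo "$pv is now $phase"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "$pv still Released after $attempts checks"
  return 1
}
```

For example, `wait_pv_leaves_released vol02 60 5` would check the PV named vol02 every 5 seconds for up to 5 minutes. With the behaviour described in this report, it would be expected to end with the "still Released" message.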

Comment 1 Avesh Agarwal 2016-04-04 19:12:17 UTC
I don't think I am able to reproduce it on the latest kube. Here are my steps; it seems to be working as expected. I think when you say recycler image, you mean the busybox image used by upstream kube for recycling.

Here are the details of my setup:

1. PV:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv0001
  labels:
    type: local
spec:
  capacity:
    storage: 1.5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  hostPath:
    path: "/tmp/data01"

2. PVC:
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: myclaim-1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

3. Pod:

kind: Pod
apiVersion: v1
metadata:
  name: mypod
  labels:
    name: frontendhttp
spec:
  containers:
    - name: myfrontend
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
      - mountPath: "/usr/share/nginx/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: myclaim-1

After I created the above PV, PVC, and Pod, I made sure that the busybox image did not exist on the node and then deleted the PVC. I noticed that the Pod was recreated and the PV was in the Available state.

Comment 2 Avesh Agarwal 2016-04-04 19:14:57 UTC
I haven't checked whether https://github.com/kubernetes/kubernetes/pull/23548 fixed this too. Or maybe I am doing something wrong in my steps above.

Comment 3 Avesh Agarwal 2016-04-05 14:58:37 UTC
I tried it with the latest origin (a few times). Here are the steps I did:

1. /tmp/data01/ has data: 354MB of the origin repo
2. created pv
3. created pvc
4. created pod
(Note: All files used for creating the pv, pvc, and pod are the same as in comment 1, and all are running fine now.)
5. made sure that the origin-recycler image does not exist locally
6. deleted pvc and pod as follows:
oc delete -f ~/data-json-yaml-files/claim-01.yaml; oc delete -f ~/data-json-yaml-files/pod-pvc.yaml (the fastest I could run these 2 commands).

After step 6, I noticed that the PV is in the Available state, there is no pod, and the recycler has cleared the data (origin repo) in /tmp/data01.

It seems to be working as expected, so I am leaving it here for the time being unless I get more feedback on this.

Comment 4 Bradley Childs 2016-04-06 15:03:01 UTC
@tschloss do you have an environment we can use to reproduce this?

Comment 5 Tomas Schlosser 2016-04-08 08:56:04 UTC
I have created a reproducer that is easy to use:

Open a few terminals. In one, run the following command (it deletes the recycler pods as soon as they appear):
while true; do oc get po -n openshift-infra | grep recycler | awk '{print $1}'| xargs --no-run-if-empty -I '{}' oc -n openshift-infra delete po '{}' --grace-period=0; done

In another, watch the PVs: watch oc get pv

In a third, create an application with a PV:
oc new-app --template=postgresql-persistent
(wait a moment to bind the PVC)
oc delete svc,pvc,dc --all --grace-period=0

Now if I check the terminal with the killer script, I see:
pod "pv-recycler-nfs-4zbh2" deleted

And in the one that watches PVs:
vol02     1Gi        RWO,RWX	   Released    tschloss/postgresql                           8d

The PVC is deleted, and unless I restart the master (which deploys the recycler again), the volume stays in the 'Released' state.
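
To pick out stuck volumes without watching the whole listing, the `oc get pv` output can be filtered on its status column. This is a minimal sketch: the function name released_pvs is hypothetical, and it assumes the column order seen in the listing above (name, capacity, access modes, status, claim, age), which may differ in other oc versions.

```shell
# Hypothetical helper: read `oc get pv` output on stdin and print the
# names of PVs whose fourth column (assumed to be STATUS) is "Released".
released_pvs() {
  awk '$4 == "Released" { print $1 }'
}

# Usage against a live cluster (requires oc and a logged-in session):
#   oc get pv | released_pvs
```

Any name that keeps appearing in this output long after its PVC was deleted is a candidate for the problem described here.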

# oc version
oc v3.2.0.4
kubernetes v1.2.0-origin-41-g91d3e75

# rpm -qa | grep atomic-openshift
atomic-openshift-master-3.2.0.4-1.git.0.4717e23.el7.x86_64
tuned-profiles-atomic-openshift-node-3.2.0.4-1.git.0.4717e23.el7.x86_64
atomic-openshift-sdn-ovs-3.2.0.4-1.git.0.4717e23.el7.x86_64
atomic-openshift-utils-3.0.59-1.git.0.917a1bf.el7.noarch
atomic-openshift-3.2.0.4-1.git.0.4717e23.el7.x86_64
atomic-openshift-node-3.2.0.4-1.git.0.4717e23.el7.x86_64
atomic-openshift-clients-3.2.0.4-1.git.0.4717e23.el7.x86_64

This is the latest version I was able to get. The setup is running in our private environment. Send me a mail if you can't reproduce it and I'll give you access to our OSE instance.

Comment 6 Sami Wagiaalla 2016-05-16 19:36:41 UTC

*** This bug has been marked as a duplicate of bug 1310587 ***