Description of problem:
Cannot start a pod with a persistent volume, and persistent volume recycling fails.

Version-Release number of selected component (if applicable):
1.0.5

How reproducible:
Tried two upgrades from 1.0.4 to 1.0.5, same error.

Steps to Reproduce:
1. Upgrade from version 1.0.4 to version 1.0.5
2. Set up a persistent volume with NFS
3. Try to start a pod that uses the persistent volume
4. Set persistentVolumeReclaimPolicy to Recycle
5. Delete the project

Actual results:
Pod never starts.
Volume does not get recycled.

Expected results:
Pod starts.
Volume gets recycled.

Additional info:
Recycling error:
Unexpected error creating a pod to scrub volume : Pod "pv-scrubber-nfs-kgp5y" is invalid: [spec.volumes[0].name: required value, spec.containers[0].volumeMounts[0].name: not found 'vol']
phase: Failed
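For reference, the kind of PV definition involved here would look roughly like this (a minimal sketch; the server address and export path are placeholders, not the actual values from this setup):

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: nfs.example.com
    path: /exports/pv0001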
This was fixed in Origin HEAD several days ago. It is not in 1.0.5.

https://github.com/openshift/origin/pull/4384
(In reply to Mark Turansky from comment #1)
> This was fixed in Origin HEAD several days ago. It is not in 1.0.5.
>
> https://github.com/openshift/origin/pull/4384

I tried to build a version of HEAD to test this; either I'm doing something wrong, or it's not quite fixed. I tried this with the wordpress example (for the test I deliberately skipped doing anything other than creating the mysql pod). The empty "claim-wp" became Available again with no problems, but "claim-mysql", which contained data, failed.

Error message from cleanup:
status:
  message: 'Recycling error: Pod failed, pod.Status.Message unknown.'
  phase: Failed
Moving back based on comment #2.
Please look at the container logs to see the error.

If the error says that the volume isn't empty, it's because dotfiles are not scrubbed in the recycler. This PR fixes this issue:
https://github.com/openshift/origin/pull/3657
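One way to get at the scrubber output (assuming you are on the node where the recycler pod ran; the container ID below is just a placeholder) is:

docker ps -a | grep scrub        # find the exited recycler container
docker logs <container-id>       # print the scrubber output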
(In reply to Mark Turansky from comment #4)
> Please look at the container logs to see the error.
>
> If the error says that the volume isn't empty, it's because dotfiles are not
> scrubbed in the recycler. This PR fixes this issue:
> https://github.com/openshift/origin/pull/3657

No, it looks like it's a permission error.

docker logs da650e86b71c
removed '/scrub/ib_logfile0'
removed '/scrub/ib_logfile1'
removed '/scrub/ibdata1'
rm: cannot remove '/scrub/mysql': Permission denied
scrub directory /scrub is not empty
rm: cannot remove '/scrub/performance_schema': Permission denied
rm: cannot remove '/scrub/replication': Permission denied
rm: cannot remove '/scrub/wp_db': Permission denied

File rights:
pv0002]$ ls -lah
total 24K
drwxrwxrwx. 6 nfsnobody nfsnobody 4.0K Sep 14 08:28 .
drwxrwxrwx. 4 nfsnobody nfsnobody 4.0K Sep 14 08:23 ..
drwx------. 2 1000030000 root 4.0K Sep 14 08:27 mysql
drwx------. 2 1000030000 root 4.0K Sep 14 08:27 performance_schema
drwx------. 2 1000030000 root 4.0K Sep 14 08:27 replication
drwx------. 2 1000030000 root 4.0K Sep 14 08:27 wp_db

I tried again, this time running "chmod -R 777 *" on the entire directory before setting the PV to recycle. When I did that, the recycler ran successfully.

docker logs 8d140bb2efd8
removed directory: '/scrub/replication'
removed '/scrub/wp_db/db.opt'
removed directory: '/scrub/wp_db'
scrub directory /scrub is empty
Scrub OK
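For others hitting the same thing, the workaround that worked for me was simply opening up permissions on the export before releasing the claim (the path below is a placeholder for the actual export directory):

# on the NFS server, before deleting the claim/project
chmod -R 777 /path/to/export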
When persistentVolumeReclaimPolicy is set to Recycle, the PVC can't bind to the PV; the PVC stays Pending.

$ openshift version
openshift v1.0.6-2-g1e58d08
kubernetes v1.1.0-alpha.0-1605-g44c91b1

[fedora@ip-172-18-6-78 db-templates]$ cat pv.json
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv0001
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Recycle
  nfs:
    server: 172.18.6.78
    path: /myshare

logs: http://fpaste.org/266883/27918144/
Setting a reclamation policy should have no effect on binding. If the PV/PVC you are using bound before, it should bind again.

Recursively open permissions throughout the export are, at this time, required.

Is this still a bug if you've run "chmod -R 777 *" and the recycler works as expected?
Re: the binding issue, I see this in the logs:

NAME      LABELS    CAPACITY     ACCESSMODES   STATUS      CLAIM     REASON    AGE
pv0001    <none>    1073741824   RWX           Available                       3s

The PV's access modes should be accurate (i.e., RWO+ROX+RWX). The PVC can use just "RWX" for binding when this PR is rebased into OS:

https://github.com/kubernetes/kubernetes/pull/10833
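As an illustration (the claim name and size here are made up, not from this report), a claim requesting just RWX would look like:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim1
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi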
(In reply to Mark Turansky from comment #7)
> Setting a reclamation policy should have no effect on binding. If the
> PV/PVC you are using bound before, it should bind again.
>
> Recursively open permissions throughout the export are, at this time,
> required.
>
> Is this still a bug if you've run "chmod -R 777 *" and the recycler works as
> expected?

I'm not sure; shouldn't the recycler be able to recycle the volume regardless of the rights on the claim that used it? Maybe this is something that should be solved outside the recycler? Either that, or grant the recycler more rights?
Just to clarify: if recursively open permissions on the export are required for the recycler to work, then I don't consider it a bug that the recycler runs fine after I first run "chmod -R 777 *".
The bug is fixed.

Version:
openshift v1.0.6-12-gb0c065c
kubernetes v1.1.0-alpha.0-1605-g44c91b1

The steps to verify this bug:

1. Create a PV with "persistentVolumeReclaimPolicy" set to "Recycle"
[fedora@ip-172-18-5-61 db-templates]$ cat /etc/exports
/myshare *(rw,all_squash)
[fedora@ip-172-18-5-61 db-templates]$ oc get pv
NAME      LABELS    CAPACITY     ACCESSMODES   STATUS      CLAIM     REASON    AGE
nfs       <none>    1073741824   RWX           Available                       12m

2. Create mysql to use this PV
[fedora@ip-172-18-5-61 db-templates]$ oc new-app mysql-persistent-template.json -n dma1
services/mysql
persistentvolumeclaims/mysql
deploymentconfigs/mysql
Service "mysql" created at 172.30.14.182 with port mappings 3306.
Run 'oc status' to view your app.

3. Check the PVC and mysql pod
[fedora@ip-172-18-5-61 db-templates]$ oc get pvc -n dma1
NAME      LABELS                                    STATUS    VOLUME    AGE
mysql     map[template:mysql-persistent-template]   Bound     nfs       8s
[fedora@ip-172-18-5-61 db-templates]$ oc get pod -n dma1
NAME            READY     STATUS    RESTARTS   AGE
mysql-1-4xrx9   1/1       Running   0          11s

4. Check the data in the NFS shared directory
[fedora@ip-172-18-5-61 db-templates]$ ls /myshare/
ibdata1  ib_logfile0  ib_logfile1  mysql  mysql-1-4xrx9.pid  performance_schema  replication  sampledb

5. Delete the project
[fedora@ip-172-18-5-61 db-templates]$ oc delete project dma1
project "dma1" deleted

6. Check that the PV is Available and the shared directory is empty
[fedora@ip-172-18-5-61 db-templates]$ oc get pv
NAME      LABELS    CAPACITY     ACCESSMODES   STATUS      CLAIM     REASON    AGE
nfs       <none>    1073741824   RWX           Available                       14m
[fedora@ip-172-18-5-61 db-templates]$ ls /myshare/

Actual results:
6. PV recycled again.

Expected results:
6. PV recycled again.
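For anyone reproducing this verification, the NFS export above can be prepared roughly like this on the server (only the /etc/exports line is taken from the steps above; the other commands are my own sketch of the setup):

# on the NFS server (assumed setup)
mkdir -p /myshare
chmod 777 /myshare                              # recycler needs open permissions on the export
echo '/myshare *(rw,all_squash)' >> /etc/exports
exportfs -r                                     # re-export the share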