Description of problem:
After a PVC is bound to a PV but no pod is using the PVC, deleting the PVC causes PV recycling to fail with the error: Pod failed, pod.Status.Message unknown

Version-Release number of selected component (if applicable):
openshift v1.0.8-4-gabfc3c4
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2

How reproducible:
Always

Steps to Reproduce:
1. Prepare an NFS server on an All-In-One env:
   bash -x nfs-provisioning-localhost.sh
2. Create 5 PVs:
   oc new-app -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/nfs/template-pv.json
3. Create 10 PVCs:
   oc new-app -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/nfs/template-pvc.json
4. Delete a PVC which is bound to a PV:
   oc get pvc
   oc delete pvc template-pvc-1
5. Check the PV status:
   oc get pv
   oc describe pv template-pv-5

Actual results:
# oc describe pv template-pv-5
Name:           template-pv-5
Labels:         template=PVs
Status:         Failed
Claim:          default/template-pvc-1
Reclaim Policy: Recycle
Access Modes:   RWO
Capacity:       5Gi
Message:        Recycling error: Pod failed, pod.Status.Message unknown.
Source:
    Type:     NFS (an NFS mount that lasts the lifetime of a pod)
    Server:   localhost
    Path:     /home/data/pv05
    ReadOnly: false

Expected results:
The PV should become Available again.

Additional info:
cat nfs-provisioning-localhost.sh

#!/bin/bash
if ! ( rpm -qa | grep -q nfs-utils )
then
    yum install -y nfs-utils
fi
mkdir -p /home/data/pv{01..10}
chmod -R 700 /home/data/pv{01..09..2}
chmod -R 770 /home/data/pv{02..10..2}
for PV in pv{01..10}
do
    if ! ( grep -q "/home/data/$PV" /etc/exports )
    then
        echo "/home/data/$PV *(rw,sync)" >> /etc/exports
    fi
done
systemctl start rpcbind
systemctl start nfs-server
exportfs -a
if ( getsebool virt_use_nfs | grep -q off )
then
    setsebool -P virt_use_nfs 1
fi
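Debugging note: the recycler failure surfaces only as a terse Message on the PV. To capture the underlying failure, it can help to watch the short-lived recycler pod that OpenShift spawns while the volume is scrubbed. A minimal sketch, assuming the recycler pod shows up in the pod list while the recycle is in flight:

# In one terminal, watch pods across all namespaces while deleting the PVC;
# the recycler pod is short-lived, so start watching before step 4
oc get pods --all-namespaces -w

# After the recycle attempt fails, the event stream may carry more detail
# than the PV's Message field
oc get events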
The new security functionality caused errors in the recycler, which previously accessed NFS volumes without any security restrictions. The recycler needs to work with the volume's UID:GID. This BZ should be fixed by https://github.com/openshift/origin/pull/5792
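For context, the recycler reclaims an NFS PV by launching a helper pod that mounts the export and deletes everything in it. A minimal sketch of the cleanup the recycler pod effectively performs (the /scrub mount point and the exact command are illustrative, not the shipped pod template):

# Illustrative only: remove all contents of the mounted export, including
# dotfiles, then verify the directory is empty
rm -rf /scrub/..?* /scrub/.[!.]* /scrub/*
test -z "$(ls -A /scrub)" || exit 1

With the export directories at 700/770, a recycler pod running under the new security restrictions as a different, unprivileged UID cannot perform this deletion, hence the failure.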
*** This bug has been marked as a duplicate of bug 1279335 ***
https://github.com/openshift/origin/pull/5792 is superseded by https://github.com/openshift/origin/pull/5847

Fork AMI available at https://ci.openshift.redhat.com/jenkins/job/fork_ami/132/

Reopening, since this bug is related to NFS permissions; the other bug was related to the hostmount SCC.
https://github.com/openshift/origin/pull/5847 is in the merge queue
Checked again on devenv-rhel7_2695, following exactly the same steps as in comment 0; the PV still cannot be recycled.

# oc describe pv template-pv-5
Name:           template-pv-5
Labels:         template=PVs
Status:         Failed
Claim:          lxiap001/template-pvc-1
Reclaim Policy: Recycle
Access Modes:   RWO
Capacity:       5Gi
Message:        Recycling error: Pod was active on the node longer than specified deadline
Source:
    Type:     NFS (an NFS mount that lasts the lifetime of a pod)
    Server:   localhost
    Path:     /home/data/pv05
    ReadOnly: false

The error message is a little confusing, since it says "Pod was active on the node longer than specified deadline", but there are actually no pods on this environment.

# openshift version
openshift v1.0.8-40-g42ad235
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2
https://bugzilla.redhat.com/show_bug.cgi?id=1281726 contains the same error ("Pod was active on the node longer than specified"). Are these two dupes?
I attempted a MySQL pod with NFS exports using 700 and 770 permissions (as indicated above). Only 777 worked: 700 produced an error when mounting, and the others failed when writing. Please try again with 777, as sketched below.
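For the retest, a minimal sketch of loosening the existing exports to 777, based on the provisioning script from comment 0:

# Loosen permissions on the existing export directories and re-export them
chmod -R 777 /home/data/pv{01..10}
exportfs -ra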
Tried again on devenv-rhel7_2712 with:

openshift v1.1-25-g0c0e452
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.2.5

The PV can be recycled when NFS is exported with 777, and the PV fails to recycle when NFS is exported with 700/770.
Hi Mark,

Since exporting NFS with 777 is not good practice, could you confirm that exporting with 777 is actually required?

Thanks,
Liang
Assigning back to get confirmation.
There is a feature request for the automatic addition of a GID to pods running shared storage volumes (NFS, Gluster). The recycler would run its pod using the same GID that is stored on the PV, which allows permissions more restrictive than 777. Reassigning to Sami, who I believe is handling that feature. Otherwise, Sami, please reassign to the feature owner.
PR opened upstream to support a GID annotation which indicates the GID with which to access the volume. The recycler pod will use the same feature. https://github.com/kubernetes/kubernetes/pull/20490
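For illustration, once that annotation is available, wiring a GID to a PV could look something like the following. The annotation key pv.beta.kubernetes.io/gid and the example GID value are assumptions based on the upstream proposal, so treat the exact spelling as subject to change until the PR merges:

# Hypothetical usage: record the GID that owns the NFS export on the PV, so
# pods consuming it (and the recycler pod) run with that supplemental group
oc annotate pv template-pv-5 pv.beta.kubernetes.io/gid=2000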
Checked on version:

openshift v3.1.1.904
kubernetes v1.2.0-alpha.7-703-gbc4550d
etcd 2.2.5

The PV can be recycled now. Once the bug is moved to ON_QA, we can move it to VERIFIED.
On closer look at this bug, it seems the UID/GID setting is not the issue. The recycler script merged in https://github.com/openshift/origin/pull/5847 (referenced above) has a 'becomeUser' method which switches the UID to that of the file which requires deletion.

Liang, this is working for you now, then? I think what happened is that only your most recent test contained the patch referenced above. Moving to ON_QA. Please reopen if you encounter this issue again.
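To illustrate the 'becomeUser' idea (a shell paraphrase only; the actual implementation in the PR is Go code switching the UID, not this script):

# Illustration: discover the owner of the directory to be scrubbed and drop
# to that UID/GID before deleting, so modes like 700 still allow cleanup
uid=$(stat -c '%u' /scrub)
gid=$(stat -c '%g' /scrub)
setpriv --reuid="$uid" --regid="$gid" --clear-groups -- rm -rf /scrub/*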
Moving to VERIFIED.