1324418 – Unable to bound recycled nfs pv after release

Summary: Unable to bound recycled nfs pv after release

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Storage
Sub Component:
Version:	3.2.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Sami Wagiaalla
QA Contact:	Wenqi He
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-04-06 09:44 UTC by Wenqi He
Modified:	2016-06-07 22:46 UTC
CC List:	10 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2016-05-12 16:35:45 UTC
Target Upstream Version:
Embargoed:
Flags:	wehe: needinfo-

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2016:1064	0	normal	SHIPPED_LIVE	Important: Red Hat OpenShift Enterprise 3.2 security, bug fix, and enhancement update	2016-05-12 20:19:17 UTC

Description Wenqi He 2016-04-06 09:44:30 UTC

Description of problem:
An NFS PV with the reclaim policy "Recycle" cannot be bound again after it has been released.

Version-Release number of selected component (if applicable):
openshift v1.1.6-7-gb022375
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

How reproducible:
Always

Steps to Reproduce:
1. docker pull openshift/origin
2. Edit /etc/hosts to add your IP and hostname
3. sudo docker run -d --name "origin" --privileged --pid=host --net=host -v /:/rootfs:ro -v /var/run:/var/run:rw -v /sys:/sys -v /var/lib/docker:/var/lib/docker:rw -v /var/lib/origin/openshift.local.volumes:/var/lib/origin/openshift.local.volumes openshift/origin start --loglevel=5
4. sudo docker exec -it origin bash
5. Create an NFS PV with reclaim policy "Recycle"
6. Create a PVC
7. Create a pod
8. Delete the pod and the PVC
9. Create the PVC again as in step 6
10. Wait 1 hour, then check the PV status
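
A minimal sketch of the objects used in steps 5 and 6, written as a small Python script that prints JSON manifests. The PV fields mirror the `oc get pv nfs -o yaml` output quoted later in this report; the claim's requested size and the helper names `make_pv` and `make_pvc` are assumptions made for illustration, and the server address is a placeholder.

```python
import json

# Placeholder copied from the report; replace with a real NFS server address.
NFS_SERVER = "<ip addr of nfs server>"


def make_pv(name="nfs", path="/mnt/nfs", size="5Gi"):
    """Build an NFS PersistentVolume manifest with the Recycle reclaim policy."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": name},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteOnce"],
            "nfs": {"server": NFS_SERVER, "path": path},
            "persistentVolumeReclaimPolicy": "Recycle",
        },
    }


def make_pvc(name="nfsc", size="5Gi"):
    """Build a claim the PV above can satisfy (the requested size is an assumption)."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Print each manifest as its own JSON document.
    for manifest in (make_pv(), make_pvc()):
        print(json.dumps(manifest, indent=2))
```

Each printed document could be saved to a file and created with `oc create -f <file>`.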

Actual results:
oc describe pv nfs

Name: nfs
Labels: <none>
Status: Failed
Claim: default/nfsc
Reclaim Policy: Recycle
Access Modes: RWO
Capacity: 5Gi
Message: Recycling error: Unexpected error creating recycler pod: pods "pv-recycler-nfs-" is forbidden: unable to validate against any security context constraint: [securityContext.runAsUser: Invalid value: 0: UID on container pv-recycler does not match required range. Found 0, required min: 1000010000 max: 1000019999 spec.containers[0].securityContext.volumes[0]: Invalid value: "nfs": nfs volumes are not allowed to be used spec.containers[0].securityContext.volumes[0]: Invalid value: "nfs": nfs volumes are not allowed to be used]

Expected results:
No error and the pv and pvc can be bound quickly

Additional info:
Bug#1293805
I have also tested this on Origin in an AWS instance. It also reproduces on devenv-rhel7_3916 with the following OpenShift version:
openshift v1.1.6-12-ge74de69
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

Comment 1 Paul Morie 2016-04-06 21:08:28 UTC

Could you supply the output of `oc get pod <name of recycler pod> -o yaml`?  It looks like the recycler pod was matched against an SCC that didn't allow use of uid 0 or the nfs volume plugin.

Comment 2 Wenqi He 2016-04-07 02:05:02 UTC

This error is reported on the PV after the pod and PVC are deleted; you need to wait 1 hour after re-creating the PVC to see it.
The pod I created is shown below:

`oc get pod nfs -o yaml`

apiVersion: v1
kind: Pod
metadata:
  annotations:
    openshift.io/scc: anyuid
  creationTimestamp: 2016-04-07T01:58:09Z
  labels:
    name: frontendhttp
  name: nfs
  namespace: default
  resourceVersion: "8500"
  selfLink: /api/v1/namespaces/default/pods/nfs
  uid: 2d42ac10-fc64-11e5-a73f-6c0b849ad3e9
spec:
  containers:
  - image: aosqe/hello-openshift
    imagePullPolicy: IfNotPresent
    name: myfrontend
    ports:
    - containerPort: 80
      name: http-server
      protocol: TCP
    resources: {}
    securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SYS_CHROOT
      privileged: false
      seLinuxOptions:
        level: s0:c5,c0
    terminationMessagePath: /dev/termination-log
    volumeMounts:
    - mountPath: /mnt/nfs
      name: pvol
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: default-token-esl5w
      readOnly: true
  dnsPolicy: ClusterFirst
  host: dhcp-137-221.nay.redhat.com
  imagePullSecrets:
  - name: default-dockercfg-1vrdk
  nodeName: dhcp-137-221.nay.redhat.com
  restartPolicy: Always
  securityContext:
    seLinuxOptions:
      level: s0:c5,c0
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  volumes:
  - name: pvol
    persistentVolumeClaim:
      claimName: nfsc
  - name: default-token-esl5w
    secret:
      secretName: default-token-esl5w
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: 2016-04-07T01:58:13Z
    status: "True"
    type: Ready
  containerStatuses:
  - containerID: docker://802783f457dd085d434472281b525436efa180932d1acb3d01aa5eddc3fcb56f
    image: aosqe/hello-openshift
    imageID: docker://cddcd4ab363acd31256ed7880d4b669fa45227e49eec41429f80a4f252dfb0da
    lastState: {}
    name: myfrontend
    ready: true
    restartCount: 0
    state:
      running:
        startedAt: 2016-04-07T01:58:13Z
  hostIP: 10.66.136.120
  phase: Running
  podIP: 172.17.0.2
  startTime: 2016-04-07T01:58:09Z

`oc get scc`

anyuid             false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    10         false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
hostaccess         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath persistentVolumeClaim secret]
hostmount-anyuid   false     []        MustRunAs   RunAsAny           RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir hostPath persistentVolumeClaim secret]
hostnetwork        false     []        MustRunAs   MustRunAsRange     MustRunAs   MustRunAs   <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
nonroot            false     []        MustRunAs   MustRunAsNonRoot   RunAsAny    RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]
privileged         true      []        RunAsAny    RunAsAny           RunAsAny    RunAsAny    <none>     false            [*]
restricted         false     []        MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <none>     false            [configMap downwardAPI emptyDir persistentVolumeClaim secret]

`oc get pv nfs -o yaml`

apiVersion: v1
kind: PersistentVolume
metadata:
  creationTimestamp: 2016-04-07T01:57:53Z
  name: nfs
  resourceVersion: "8490"
  selfLink: /api/v1/persistentvolumes/nfs
  uid: 24100fc0-fc64-11e5-a73f-6c0b849ad3e9
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: nfsc
    namespace: default
    resourceVersion: "8486"
    uid: 280074db-fc64-11e5-a73f-6c0b849ad3e9
  nfs:
    path: /mnt/nfs
    server: <ip addr of nfs server>
  persistentVolumeReclaimPolicy: Recycle
status:
  phase: Bound

Comment 3 Jianwei Hou 2016-04-07 02:53:14 UTC

(In reply to Paul Morie from comment #1)
> Could you supply the output of `oc get pod <name of recycler pod> -o yaml`? 
> It looks like the recycler pod was matched against an SCC that didn't allow
> use of uid 0 or the nfs volume plugin.

Yes, I think so. The PV recycler pod wasn't created because the SCC rejected it. I worked around the issue by updating the restricted SCC, after which the issue disappeared. My updated SCC was:

```
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
supplementalGroups:
  type: RunAsAny
volumes:
- configMap
- downwardAPI
- emptyDir
- hostPath
- nfs
- persistentVolumeClaim
- secret
```

Comment 4 Sami Wagiaalla 2016-04-07 15:02:18 UTC

> Yes, I think so, the pv recycler pod wasn't created because SCC constrained
> it. I tried to work around this issue by updating restricted SCC, then the
> issue disappeared. 
 
Can this bug be closed then, or is there still an outstanding issue?

Comment 5 Paul Morie 2016-04-07 17:11:49 UTC

I'm wondering which serviceaccount the recycler pod was being run under.

Can you repeat this test with --loglevel=4 and post the API server log?

We are looking for a message like 'getting security context constraints for pod'.

Comment 6 Wenqi He 2016-04-08 03:18:33 UTC

Here is what I found in the docker logs:
I0408 02:31:12.495997    2597 admission.go:109] getting security context constraints for pod  (generate: pv-recycler-nfs-) in namespace openshift-infra with user info &{system:serviceaccount:openshift-infra:pv-recycler-controller 5e78dce3-fd2c-11e5-ae17-6c0b849ad3e9 [system:serviceaccounts system:serviceaccounts:openshift-infra system:authenticated]}
I0408 02:31:12.496020    2597 admission.go:118] getting security context constraints for pod  (generate: pv-recycler-nfs-) with service account info &{system:serviceaccount:openshift-infra:default  [system:serviceaccounts system:serviceaccounts:openshift-infra]}

I re-pulled the latest image and tested again; the issue still reproduces.
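
For reference, a small sketch of how the 'getting security context constraints for pod' lines could be filtered out of a larger log. The regular expression and the function name `scc_lookups` are assumptions written against the two lines quoted above, not part of any OpenShift tooling.

```python
import re

# Matches the admission message and captures the identity that follows it.
PATTERN = re.compile(
    r"getting security context constraints for pod\s+\(generate: (?P<pod>[^)]*)\)"
    r".*?with (?P<kind>user|service account) info &\{(?P<identity>\S+)"
)

# The two log lines from this comment, copied verbatim.
SAMPLE = [
    "I0408 02:31:12.495997    2597 admission.go:109] getting security context constraints for pod  (generate: pv-recycler-nfs-) in namespace openshift-infra with user info &{system:serviceaccount:openshift-infra:pv-recycler-controller 5e78dce3-fd2c-11e5-ae17-6c0b849ad3e9 [system:serviceaccounts system:serviceaccounts:openshift-infra system:authenticated]}",
    "I0408 02:31:12.496020    2597 admission.go:118] getting security context constraints for pod  (generate: pv-recycler-nfs-) with service account info &{system:serviceaccount:openshift-infra:default  [system:serviceaccounts system:serviceaccounts:openshift-infra]}",
]


def scc_lookups(lines):
    """Yield (pod name prefix, identity kind, identity) for each matching line."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            yield match.group("pod"), match.group("kind"), match.group("identity")


if __name__ == "__main__":
    for pod, kind, identity in scc_lookups(SAMPLE):
        print(f"{pod}: {kind} = {identity}")
```

On the sample, this separates the requesting user (the `pv-recycler-controller` service account) from the service account the pod itself runs as (`default`).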

Comment 7 Paul Morie 2016-04-08 18:46:09 UTC

The problem is that the hostmount-anyuid SCC, which the PV recycler controller service account is supposed to use, doesn't have access to the nfs plugin:

From the controller manager log:

I0408 13:53:56.948588   31157 admission.go:109] getting security context constraints for pod  (generate: pv-recycler-nfs-) in namespace openshift-infra with user info &{system:serviceaccount:openshift-infra:pv-recycler-controller d1efb715-fdb2-11e5-8f80-68f728db1985 [system:serviceaccounts system:serviceaccounts:openshift-infra system:authenticated]}
I0408 13:53:56.948642   31157 admission.go:118] getting security context constraints for pod  (generate: pv-recycler-nfs-) with service account info &{system:serviceaccount:openshift-infra:default  [system:serviceaccounts system:serviceaccounts:openshift-infra]}
I0408 13:53:56.949928   31157 admission.go:368] got preallocated values for min: 1000020000, max: 1000029999 for uid range in namespace openshift-infra
I0408 13:53:56.950211   31157 admission.go:381] got preallocated value for level: s0:c5,c0 for selinux options in namespace openshift-infra
I0408 13:53:56.950227   31157 admission.go:411] got preallocated value for groups: 1000020000/10000 in namespace openshift-infra
I0408 13:53:56.950268   31157 admission.go:381] got preallocated value for level: s0:c5,c0 for selinux options in namespace openshift-infra
I0408 13:53:56.950286   31157 admission.go:525] validating pod  (generate: pv-recycler-nfs-) against providers restricted,hostmount-anyuid
I0408 13:53:56.950355   31157 admission.go:154] unable to validate pod  (generate: pv-recycler-nfs-) against any security context constraint: [securityContext.runAsUser: Invalid value: 0: UID on container pv-recycler does not match required range.  Found 0, required min: 1000020000 max: 1000029999 spec.containers[0].securityContext.volumes[0]: Invalid value: "nfs": nfs volumes are not allowed to be used spec.containers[0].securityContext.volumes[0]: Invalid value: "nfs": nfs volumes are not allowed to be used]
E0408 13:53:56.979235   31157 persistentvolume_recycler_controller.go:245] PersistentVolume[test-pv] failed recycling: Unexpected error creating recycler pod:  pods "pv-recycler-nfs-" is forbidden: unable to validate against any security context constraint: [securityContext.runAsUser: Invalid value: 0: UID on container pv-recycler does not match required range.  Found 0, required min: 1000020000 max: 1000029999 spec.containers[0].securityContext.volumes[0]: Invalid value: "nfs": nfs volumes are not allowed to be used spec.containers[0].securityContext.volumes[0]: Invalid value: "nfs": nfs volumes are not allowed to be used]

The problem is that the `hostmount-anyuid` SCC that the PV recycler is supposed to use doesn't have permission to use the NFS plugin:

$ oc get scc hostmount-anyuid -o yaml
allowHostDirVolumePlugin: true
allowHostIPC: false
allowHostNetwork: false
allowHostPID: false
allowHostPorts: false
allowPrivilegedContainer: false
allowedCapabilities: null
apiVersion: v1
defaultAddCapabilities: null
fsGroup:
  type: RunAsAny
kind: SecurityContextConstraints
metadata:
  annotations:
    kubernetes.io/description: 'hostmount-anyuid provides all the features of the
      restricted SCC but allows host mounts and any UID by a pod.  This is primarily
      used by the persistent volume recycler. WARNING: this SCC allows host file system
      access as any UID, including UID 0.  Grant with caution.'
  creationTimestamp: 2016-04-08T17:44:09Z
  name: hostmount-anyuid
  resourceVersion: "75"
  selfLink: /api/v1/securitycontextconstraints/hostmount-anyuid
  uid: 7f85172c-fdb1-11e5-8b50-68f728db1985
priority: null
readOnlyRootFilesystem: false
requiredDropCapabilities: null
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
users:
- system:serviceaccount:openshift-infra:pv-recycler-controller
volumes:
- configMap
- downwardAPI
- emptyDir
- hostPath
- persistentVolumeClaim
- secret
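
The rejection can be illustrated with a deliberately simplified model of the check. This is only a sketch: it considers just the `runAsUser` strategy and the allowed volume list, the class and function names are invented for illustration, and it assumes the remedy is to allow the `nfs` volume type in `hostmount-anyuid` (the contents of the fix are not shown in this report). The volume lists and UID range are taken from the output and log quoted in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SCC:
    name: str
    run_as_any: bool             # True models runAsUser type RunAsAny
    volumes: frozenset           # allowed volume plugin names
    uid_range: tuple = (0, 0)    # consulted only when run_as_any is False


def violations(pod_uid, pod_volumes, scc):
    """Return the reasons this simplified model would reject the pod."""
    errs = []
    lo, hi = scc.uid_range
    if not scc.run_as_any and not lo <= pod_uid <= hi:
        errs.append(f"UID {pod_uid} not in range {lo}-{hi}")
    for vol in sorted(set(pod_volumes) - scc.volumes):
        errs.append(f"{vol} volumes are not allowed")
    return errs


def admitted_by(pod_uid, pod_volumes, sccs):
    """Name of the first SCC that accepts the pod, or None."""
    for scc in sccs:
        if not violations(pod_uid, pod_volumes, scc):
            return scc.name
    return None


BASE = frozenset({"configMap", "downwardAPI", "emptyDir", "persistentVolumeClaim", "secret"})
restricted = SCC("restricted", False, BASE, (1000020000, 1000029999))
hostmount_anyuid = SCC("hostmount-anyuid", True, BASE | {"hostPath"})
# Hypothetical patched SCC: identical, plus the nfs plugin.
patched = SCC("hostmount-anyuid", True, hostmount_anyuid.volumes | {"nfs"})

# The recycler pod runs as UID 0 and mounts an nfs volume.
print(admitted_by(0, ["nfs"], [restricted, hostmount_anyuid]))  # None
print(admitted_by(0, ["nfs"], [restricted, patched]))           # hostmount-anyuid
```

In this model `restricted` rejects the pod for two reasons (UID and volume type) and `hostmount-anyuid` for one (volume type), which appears consistent with the three entries in the error message above.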

Comment 8 Paul Morie 2016-04-08 19:49:16 UTC

PR for the fix: https://github.com/openshift/origin/pull/8432

Comment 9 Wenqi He 2016-04-15 09:08:08 UTC

I have verified this on OSE; it is fixed in the following version:
openshift v3.2.0.15
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5

Comment 10 Wenqi He 2016-04-19 01:55:39 UTC

Already verified; see the previous comment.

Comment 12 errata-xmlrpc 2016-05-12 16:35:45 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064
