Description of problem:
A pod cannot mount a dynamically provisioned volume and fails with:

MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32

Version-Release number of selected component (if applicable):
openshift v3.4.0.3
kubernetes v1.4.0-beta.3+d19513f
etcd 3.0.9

How reproducible:
Always

Steps to Reproduce:
1. Create a StorageClass.
2. Create a PVC referencing the StorageClass above.
3. Create a pod using the PVC above.
4. Check the pod status.

Actual results:
The pod fails to mount the volume.

Expected results:
The pod is running with the volume mounted.

Additional info:

# oc describe pods pod-ehlt0 -n ehlt0
Name:            pod-ehlt0
Namespace:       ehlt0
Security Policy: restricted
Node:            qe-lxia-ocp34-node-registry-router-1/10.240.0.11
Start Time:      Tue, 27 Sep 2016 03:41:20 -0400
Labels:          <none>
Status:          Pending
IP:
Controllers:     <none>
Containers:
  dynamic:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    Port:           80/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Volume Mounts:
      /mnt/iaas from dynamic (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-m3xpy (ro)
    Environment Variables: <none>
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  dynamic:
    Type:      PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: pvc-ehlt0
    ReadOnly:  false
  default-token-m3xpy:
    Type:       Secret (a volume populated by a Secret)
    SecretName: default-token-m3xpy
QoS Class:   BestEffort
Tolerations: <none>
Events:
  FirstSeen  LastSeen  Count  From                                            Type     Reason       Message
  ---------  --------  -----  ----                                            ----     ------       -------
  19m        19m       1      {default-scheduler }                            Normal   Scheduled    Successfully assigned pod-ehlt0 to qe-lxia-ocp34-node-registry-router-1
  17m        1m        12     {kubelet qe-lxia-ocp34-node-registry-router-1}  Warning  FailedMount  MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32
                              Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
                              Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
  17m        1m        8      {kubelet qe-lxia-ocp34-node-registry-router-1}  Warning  FailedMount  Unable to mount volumes for pod "pod-ehlt0_ehlt0(c7fb9208-8485-11e6-9b14-42010af0000a)": timeout expired waiting for volumes to attach/mount for pod "pod-ehlt0"/"ehlt0". list of unattached/unmounted volumes=[dynamic]
  17m        1m        8      {kubelet qe-lxia-ocp34-node-registry-router-1}  Warning  FailedSync   Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "pod-ehlt0"/"ehlt0". list of unattached/unmounted volumes=[dynamic]

The plugins directory on the node is empty:

[root@qe-lxia-ocp34-node-registry-router-1 ~]# ls -a /var/lib/origin/openshift.local.volumes/plugins/
.  ..

Logs on the node:

Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.406827 17523 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") to pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a").
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407011 17523 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/c7fb9208-8485-11e6-9b14-42010af0000a-default-token-m3xpy" (spec.Name: "default-token-m3xpy") to pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a").
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407116 17523 secret.go:164] Setting up volume default-token-m3xpy for pod c7fb9208-8485-11e6-9b14-42010af0000a at /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407446 17523 empty_dir_linux.go:39] Determining mount medium of /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407474 17523 empty_dir_linux.go:49] Statfs_t of /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy: {Type:1481003842 Bsize:4096 Blocks:2618880 Bfree:2101542 Bavail:2101542 Files:10485760 Ffree:10448730 Fsid:{X__val:[64768 0]} Namelen:255 Frsize:4096 Flags:4128 Spare:[0 0 0 0]}
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407588 17523 empty_dir.go:258] pod c7fb9208-8485-11e6-9b14-42010af0000a: mounting tmpfs for volume wrapped_default-token-m3xpy with opts [rootcontext="system_u:object_r:svirt_sandbox_file_t:s0"]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407603 17523 mount_linux.go:103] Mounting tmpfs /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy tmpfs [rootcontext="system_u:object_r:svirt_sandbox_file_t:s0"]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415215 17523 gce_pd.go:253] PersistentDisk set up: /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a false stat /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a: no such file or directory, pd name kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a readOnly false
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415446 17523 gce_pd.go:274] attempting to mount /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415466 17523 mount_linux.go:103] Mounting /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420747 17523 mount_linux.go:108] Mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420850 17523 gce_pd.go:300] Mount of disk /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a failed: mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420942 17523 nestedpendingoperations.go:253] Operation for "\"kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a\" (\"c7fb9208-8485-11e6-9b14-42010af0000a\")" failed. No retries permitted until 2016-09-27 03:43:00.920911462 -0400 EDT (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
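For reference, the reproduction steps above can be sketched with manifests like the following. These are illustrative, not the originals from QE: the class name "slow", the 1Gi size, and the pd-standard type are assumptions; the PVC/pod/container names and image are taken from the `oc describe` output above. The beta API group and the storage-class annotation are the ones available in kubernetes 1.4.

```yaml
# Hypothetical StorageClass (name and parameters are assumptions)
apiVersion: storage.k8s.io/v1beta1
kind: StorageClass
metadata:
  name: slow
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
---
# PVC referencing the class via the beta annotation (pre-1.6 style)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-ehlt0
  annotations:
    volume.beta.kubernetes.io/storage-class: slow
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Pod mounting the PVC at /mnt/iaas, as in the report
apiVersion: v1
kind: Pod
metadata:
  name: pod-ehlt0
spec:
  containers:
  - name: dynamic
    image: aosqe/hello-openshift
    volumeMounts:
    - name: dynamic
      mountPath: /mnt/iaas
  volumes:
  - name: dynamic
    persistentVolumeClaim:
      claimName: pvc-ehlt0
```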
It seems to me that the kubelet + mount manager tries to mount something that has not yet been attached by the attach/detach controller. Can you please start a fresh cluster and attach controller-manager (OpenShift master) and kubelet (node) logs at loglevel 10 when it happens again? Or ping me when I am online (UTC+2; I have some overlap with China business hours).
On a GCE machine provided by Liang:

$ oc version
oc v3.4.0.8
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Node logs:

volume_manager.go:324] Waiting for volumes to attach and mount for pod "dynamic_default(3453165b-8eba-11e6-ba91-42010af00005)"
gce_util.go:188] Failed to get GCE Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead

It seems that the GCE PD volume plugin on the node does not have a valid reference to the GCE cloud provider and therefore cannot check that a volume has been attached. The cloud provider *is* configured in /etc/origin/node/node-config.yaml:

...
kubeletArguments:
  cloud-provider:
  - gce
...

* It must be something in OpenShift that prevents the cloud provider from being passed to the volume plugin.
* As a separate bug, the error about the missing cloud provider should find its way into `oc describe pod`. "timeout expired waiting for volumes to attach/mount" does not say anything useful.
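When triaging a node like this, a first step is to rule out a missing flag in the config file itself before suspecting the process. A minimal sketch, using the config path from this report (adjust for other installs):

```shell
# Check whether the node config requests a cloud provider at all.
# In this bug the flag WAS present, so a "set" result here does not
# guarantee the kubelet actually passed the cloud to its volume plugins.
CONFIG=/etc/origin/node/node-config.yaml
if [ -r "$CONFIG" ] && grep -q 'cloud-provider' "$CONFIG"; then
    msg="cloud-provider is set in $CONFIG"
else
    msg="cloud-provider not found (or $CONFIG unreadable)"
fi
echo "$msg"
```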
Digging deeper into this: the whole kubelet has cloud == nil; it is never initialized. In 1.3, the kubelet got a KubeletConfig with "--cloud-provider=gce --cloud-config=xyz" and would initialize its cloud interface by itself. In 1.4, it expects the cloud to be already initialized in KubeletDeps.

Compare RunKubelet in 1.3:
https://github.com/openshift/origin/blob/07c01a63a1cc783446494323ddd7e4b8a6b49e57/pkg/cmd/server/kubernetes/node.go#L315

and in 1.4:
https://github.com/openshift/origin/blob/3de3eec624b410bcf7b6705133919ef98331f3f4/pkg/cmd/server/kubernetes/node.go#L322

@ccoleman, you reworked RunKubelet(); can you please add a cloud provider to KubeletDeps?
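As a toy illustration of the contract change (these are NOT the real Kubernetes types, just a minimal model): in 1.4 the caller owns cloud initialization, so building the deps struct without a cloud leaves every volume plugin seeing nil, which matches the "GetCloudProvider returned <nil>" line in the node logs.

```go
package main

import "fmt"

// Toy stand-ins for the real k8s types; illustration only.
type CloudProvider interface{ ProviderName() string }

type gceCloud struct{}

func (gceCloud) ProviderName() string { return "gce" }

// In 1.4 the kubelet no longer builds the cloud from flags; the caller
// (here OpenShift's RunKubelet) must populate this field.
type KubeletDeps struct{ Cloud CloudProvider }

// Models what a volume plugin sees when it asks the host for the cloud.
func volumePluginCloud(deps *KubeletDeps) string {
	if deps.Cloud == nil {
		return "<nil>" // the failure mode in this bug
	}
	return deps.Cloud.ProviderName()
}

func main() {
	// 1.3-style caller that never sets Cloud: plugins see nil.
	fmt.Println(volumePluginCloud(&KubeletDeps{}))
	// The fix: initialize the cloud before constructing the kubelet.
	fmt.Println(volumePluginCloud(&KubeletDeps{Cloud: gceCloud{}}))
}
```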
This is blocking OCP 3.4 storage testing.
Filed https://github.com/openshift/origin/pull/11620
This hasn't merged to origin yet.
Now it's merged :-)
This has been merged into ose and is in OSE v3.4.0.19 or newer.
Retried the same steps as in comment #0 on GCE/AWS/OpenStack/Azure with the version below; the pod runs with the specified volumes.

# openshift version
openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Moving the bug to VERIFIED.
*** Bug 1391285 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2017:0066