Bug 1379600 - MountVolume.SetUp failed
Summary: MountVolume.SetUp failed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.4.0
Hardware: Unspecified
OS: Unspecified
urgent
high
Target Milestone: ---
: ---
Assignee: Jan Safranek
QA Contact: Liang Xia
URL:
Whiteboard:
: 1391285 (view as bug list)
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-09-27 08:16 UTC by Liang Xia
Modified: 2017-03-08 18:43 UTC (History)
14 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
No documentation needed, OpenShift with this bug has never been released to customer.
Clone Of:
Environment:
Last Closed: 2017-01-18 12:41:35 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2017:0066 0 normal SHIPPED_LIVE Red Hat OpenShift Container Platform 3.4 RPM Release Advisory 2017-01-18 17:23:26 UTC

Description Liang Xia 2016-09-27 08:16:44 UTC
Description of problem:
Pod can not mount dynamic provisoned volume with below error:
MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32

Version-Release number of selected component (if applicable):
openshift v3.4.0.3
kubernetes v1.4.0-beta.3+d19513f
etcd 3.0.9

How reproducible:
Always

Steps to Reproduce:
1.Create a StorageClass
2.Create a PVC reference above StorageClass
3.Create a pod using above PVC.
4.Check pod status.

Actual results:
Pod failed to mount the volume.

Expected results:
Pod is running with the volume.

Additional info:
# oc describe pods pod-ehlt0 -n ehlt0
Name:			pod-ehlt0
Namespace:		ehlt0
Security Policy:	restricted
Node:			qe-lxia-ocp34-node-registry-router-1/10.240.0.11
Start Time:		Tue, 27 Sep 2016 03:41:20 -0400
Labels:			<none>
Status:			Pending
IP:			
Controllers:		<none>
Containers:
  dynamic:
    Container ID:	
    Image:		aosqe/hello-openshift
    Image ID:		
    Port:		80/TCP
    State:		Waiting
      Reason:		ContainerCreating
    Ready:		False
    Restart Count:	0
    Volume Mounts:
      /mnt/iaas from dynamic (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-m3xpy (ro)
    Environment Variables:	<none>
Conditions:
  Type		Status
  Initialized 	True 
  Ready 	False 
  PodScheduled 	True 
Volumes:
  dynamic:
    Type:	PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:	pvc-ehlt0
    ReadOnly:	false
  default-token-m3xpy:
    Type:	Secret (a volume populated by a Secret)
    SecretName:	default-token-m3xpy
QoS Class:	BestEffort
Tolerations:	<none>
Events:
  FirstSeen	LastSeen	Count	From						SubobjectPath	Type		Reason		Message
  ---------	--------	-----	----						-------------	--------	------		-------
  19m		19m		1	{default-scheduler }						Normal		Scheduled	Successfully assigned pod-ehlt0 to qe-lxia-ocp34-node-registry-router-1
  17m		1m		12	{kubelet qe-lxia-ocp34-node-registry-router-1}			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32
Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a  [bind]
Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist


  17m	1m	8	{kubelet qe-lxia-ocp34-node-registry-router-1}		Warning	FailedMount	Unable to mount volumes for pod "pod-ehlt0_ehlt0(c7fb9208-8485-11e6-9b14-42010af0000a)": timeout expired waiting for volumes to attach/mount for pod "pod-ehlt0"/"ehlt0". list of unattached/unmounted volumes=[dynamic]
  17m	1m	8	{kubelet qe-lxia-ocp34-node-registry-router-1}		Warning	FailedSync	Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "pod-ehlt0"/"ehlt0". list of unattached/unmounted volumes=[dynamic]


[root@qe-lxia-ocp34-node-registry-router-1 ~]# ls -a /var/lib/origin/openshift.local.volumes/plugins/
.  ..

Logs in node:
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.406827   17523 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") to pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a").
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407011   17523 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/c7fb9208-8485-11e6-9b14-42010af0000a-default-token-m3xpy" (spec.Name: "default-token-m3xpy") to pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a").
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407116   17523 secret.go:164] Setting up volume default-token-m3xpy for pod c7fb9208-8485-11e6-9b14-42010af0000a at /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407446   17523 empty_dir_linux.go:39] Determining mount medium of /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407474   17523 empty_dir_linux.go:49] Statfs_t of /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy: {Type:1481003842 Bsize:4096 Blocks:2618880 Bfree:2101542 Bavail:2101542 Files:10485760 Ffree:10448730 Fsid:{X__val:[64768 0]} Namelen:255 Frsize:4096 Flags:4128 Spare:[0 0 0 0]}
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407588   17523 empty_dir.go:258] pod c7fb9208-8485-11e6-9b14-42010af0000a: mounting tmpfs for volume wrapped_default-token-m3xpy with opts [rootcontext="system_u:object_r:svirt_sandbox_file_t:s0"]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407603   17523 mount_linux.go:103] Mounting tmpfs /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy tmpfs [rootcontext="system_u:object_r:svirt_sandbox_file_t:s0"]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415215   17523 gce_pd.go:253] PersistentDisk set up: /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a false stat /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a: no such file or directory, pd name kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a readOnly false
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415446   17523 gce_pd.go:274] attempting to mount /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415466   17523 mount_linux.go:103] Mounting /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a  [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420747   17523 mount_linux.go:108] Mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a  [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420850   17523 gce_pd.go:300] Mount of disk /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a failed: mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a  [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420942   17523 nestedpendingoperations.go:253] Operation for "\"kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a\" (\"c7fb9208-8485-11e6-9b14-42010af0000a\")" failed. No retries permitted until 2016-09-27 03:43:00.920911462 -0400 EDT (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a  [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist

Comment 1 Jan Safranek 2016-10-07 12:41:44 UTC
It seems to me that kubelet + mount manager tries to mount something that has not yet been attached by attach/detach controller.

Can you please start a fresh cluster and attach controller-manager (openshift master) and kubelet (node) logs with loglevel 10 when it happens again? Or ping me when I am online (UTC+2, I have some overlap with China business hours)

Comment 2 Jan Safranek 2016-10-10 08:35:25 UTC
On a GCE machine provided by Liang:

$ oc version
oc v3.4.0.8
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Node logs:

volume_manager.go:324] Waiting for volumes to attach and mount for pod "dynamic_default(3453165b-8eba-11e6-ba91-42010af00005)"
gce_util.go:188] Failed to get GCE Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead


It seems that GCE PD volume plugin on the node does not have a valid reference to GCE volume provider and therefore it cannot check that a volume has been attached. Cloud provider *is* configured in /etc/origin/node/node-config.yaml:

...
kubeletArguments: 
  cloud-provider:
  - gce
...

* It must be something in OpenShift that prevents the cloud provider to be passed to volume plugin.

* As a separate bug, the error about missing cloud provider should find its way into `oc describe pod`. "timeout expired waiting for volumes to attach/mount"  does not say anything useful.

Comment 3 Jan Safranek 2016-10-10 15:04:50 UTC
Digging deeper into this, whole kubelet has cloud == nil. It is never initialized.

It seems that in 1.3 kubelet got KubeletConfig with "--cloud-provider=gce --cloud-config=xyz" and it would initialized its cloud interface by itself. In 1.4 it expects the cloud already initialized in KubeletDeps. 

Compare RunKubelet in 1.3: https://github.com/openshift/origin/blob/07c01a63a1cc783446494323ddd7e4b8a6b49e57/pkg/cmd/server/kubernetes/node.go#L315

and 1.4: https://github.com/openshift/origin/blob/3de3eec624b410bcf7b6705133919ef98331f3f4/pkg/cmd/server/kubernetes/node.go#L322

@ccoleman, you reworked RunKubelet(), can you please add a cloud provider to KubeletDeps?

Comment 4 Liang Xia 2016-10-25 03:14:12 UTC
This is blocking OCP 3.4 storage testing.

Comment 5 Jan Safranek 2016-10-27 15:08:04 UTC
Filled https://github.com/openshift/origin/pull/11620

Comment 6 Andy Goldstein 2016-10-31 14:10:59 UTC
This hasn't merged to origin yet.

Comment 7 Andy Goldstein 2016-10-31 16:18:19 UTC
Now it's merged :-)

Comment 8 Troy Dawson 2016-11-02 17:46:23 UTC
This has been merged into ose and is in OSE v3.4.0.19 or newer.

Comment 10 Liang Xia 2016-11-03 07:00:58 UTC
Retried the same steps as in #comment 0 on GCE/AWS/OpenStack/Azure with below version, pod can be running with specified volumes.

# openshift version
openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Move bug to verified.

Comment 11 Matthew Wong 2016-11-03 20:47:46 UTC
*** Bug 1391285 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2017-01-18 12:41:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066


Note You need to log in before you can comment on or make changes to this bug.