| Summary: | MountVolume.SetUp failed | ||
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Liang Xia <lxia> |
| Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
| Status: | CLOSED ERRATA | QA Contact: | Liang Xia <lxia> |
| Severity: | high | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 3.4.0 | CC: | agoldste, aos-bugs, bingli, ccoleman, dakini, decarr, dma, haowang, jokerman, jsafrane, lxia, mmccomas, tdawson, xtian |
| Target Milestone: | --- | Keywords: | TestBlocker |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | No Doc Update | |
| Doc Text: | No documentation needed; OpenShift with this bug has never been released to customers. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-01-18 12:41:35 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
It seems to me that the kubelet and its mount manager are trying to mount something that has not yet been attached by the attach/detach controller. Can you please start a fresh cluster and attach the controller-manager (OpenShift master) and kubelet (node) logs with loglevel 10 when it happens again? Or ping me when I am online (UTC+2; I have some overlap with China business hours).

On a GCE machine provided by Liang:

$ oc version
oc v3.4.0.8
kubernetes v1.4.0+776c994
features: Basic-Auth GSSAPI Kerberos SPNEGO

Node logs:

volume_manager.go:324] Waiting for volumes to attach and mount for pod "dynamic_default(3453165b-8eba-11e6-ba91-42010af00005)"
gce_util.go:188] Failed to get GCE Cloud Provider. plugin.host.GetCloudProvider returned <nil> instead

It seems that the GCE PD volume plugin on the node does not have a valid reference to GCE volume provider and therefore cannot check that a volume has been attached.

Cloud provider *is* configured in /etc/origin/node/node-config.yaml:

...
kubeletArguments:
  cloud-provider:
  - gce
...

* It must be something in OpenShift that prevents the cloud provider from being passed to the volume plugin.
* As a separate bug, the error about the missing cloud provider should find its way into `oc describe pod`. "timeout expired waiting for volumes to attach/mount" does not say anything useful.

Digging deeper into this, the whole kubelet has cloud == nil. It is never initialized.

It seems that in 1.3 the kubelet got KubeletConfig with "--cloud-provider=gce --cloud-config=xyz" and initialized its cloud interface by itself. In 1.4 it expects the cloud to be already initialized in KubeletDeps.

Compare RunKubelet in 1.3:
https://github.com/openshift/origin/blob/07c01a63a1cc783446494323ddd7e4b8a6b49e57/pkg/cmd/server/kubernetes/node.go#L315
and 1.4:
https://github.com/openshift/origin/blob/3de3eec624b410bcf7b6705133919ef98331f3f4/pkg/cmd/server/kubernetes/node.go#L322

@ccoleman, you reworked RunKubelet(), can you please add a cloud provider to KubeletDeps?
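The difference described above, between a kubelet that builds its own cloud interface from flags and one that expects it to be injected, can be illustrated with a minimal Go sketch. All names here (`CloudProvider`, `Config`, `Deps`, `Kubelet`, `initCloud`, `newSelfInitKubelet`, `newInjectedKubelet`, `verifyAttached`) are hypothetical stand-ins and are not the real Kubernetes or OpenShift types; the sketch only shows how a forgotten dependency can leave the cloud field nil even though the provider is configured.

```go
package main

import (
	"errors"
	"fmt"
)

// CloudProvider is a hypothetical stand-in for the kubelet's cloud interface.
type CloudProvider interface {
	Name() string
}

// fakeGCE is a hypothetical provider used only for this illustration.
type fakeGCE struct{}

func (fakeGCE) Name() string { return "gce" }

// Config stands in for flag-derived settings such as --cloud-provider.
type Config struct {
	CloudProviderName string
}

// Deps stands in for pre-built dependencies handed to the kubelet by its caller.
type Deps struct {
	Cloud CloudProvider
}

// Kubelet holds the cloud interface that volume plugins would ask for.
type Kubelet struct {
	cloud CloudProvider
}

var errNoCloud = errors.New("cloud provider is nil; cannot verify attachment")

// initCloud resolves a provider name to an implementation (hypothetical lookup).
func initCloud(name string) (CloudProvider, error) {
	switch name {
	case "":
		return nil, nil
	case "gce":
		return fakeGCE{}, nil
	default:
		return nil, fmt.Errorf("unknown cloud provider %q", name)
	}
}

// newSelfInitKubelet models the older behavior: the kubelet initializes
// the cloud interface itself from its configuration.
func newSelfInitKubelet(cfg Config) (*Kubelet, error) {
	cloud, err := initCloud(cfg.CloudProviderName)
	if err != nil {
		return nil, err
	}
	return &Kubelet{cloud: cloud}, nil
}

// newInjectedKubelet models the newer behavior: the kubelet trusts the
// caller to have initialized the cloud and ignores the provider name.
func newInjectedKubelet(cfg Config, deps Deps) *Kubelet {
	return &Kubelet{cloud: deps.Cloud}
}

// verifyAttached fails when no cloud interface is available.
func (k *Kubelet) verifyAttached(volume string) error {
	if k.cloud == nil {
		return errNoCloud
	}
	return nil
}

func main() {
	cfg := Config{CloudProviderName: "gce"}

	// Caller forgets to populate Deps.Cloud: the configured provider is never used.
	broken := newInjectedKubelet(cfg, Deps{})
	fmt.Println("without injection:", broken.verifyAttached("pd-1"))

	// Caller initializes the cloud first and injects it.
	cloud, err := initCloud(cfg.CloudProviderName)
	if err != nil {
		panic(err)
	}
	fixed := newInjectedKubelet(cfg, Deps{Cloud: cloud})
	fmt.Println("with injection:", fixed.verifyAttached("pd-1"))
}
```

In this sketch the fix is entirely on the caller side: whoever constructs the dependencies must initialize the cloud and pass it in, which matches the request to add a cloud provider to KubeletDeps.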
This is blocking OCP 3.4 storage testing.

This hasn't merged to origin yet.

Now it's merged :-)

This has been merged into ose and is in OSE v3.4.0.19 or newer.

Retried the same steps as in comment 0 on GCE/AWS/OpenStack/Azure with the version below; the pod runs with the specified volumes.

# openshift version
openshift v3.4.0.19+346a31d
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0

Moving the bug to verified.

*** Bug 1391285 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2017:0066 |
Description of problem:
A pod cannot mount a dynamically provisioned volume and fails with the error below:
MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32

Version-Release number of selected component (if applicable):
openshift v3.4.0.3
kubernetes v1.4.0-beta.3+d19513f
etcd 3.0.9

How reproducible:
Always

Steps to Reproduce:
1. Create a StorageClass.
2. Create a PVC that references the above StorageClass.
3. Create a pod using the above PVC.
4. Check the pod status.

Actual results:
Pod failed to mount the volume.

Expected results:
Pod is running with the volume.

Additional info:
# oc describe pods pod-ehlt0 -n ehlt0
Name: pod-ehlt0
Namespace: ehlt0
Security Policy: restricted
Node: qe-lxia-ocp34-node-registry-router-1/10.240.0.11
Start Time: Tue, 27 Sep 2016 03:41:20 -0400
Labels: <none>
Status: Pending
IP:
Controllers: <none>
Containers:
  dynamic:
    Container ID:
    Image: aosqe/hello-openshift
    Image ID:
    Port: 80/TCP
    State: Waiting
      Reason: ContainerCreating
    Ready: False
    Restart Count: 0
    Volume Mounts:
      /mnt/iaas from dynamic (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-m3xpy (ro)
    Environment Variables: <none>
Conditions:
  Type Status
  Initialized True
  Ready False
  PodScheduled True
Volumes:
  dynamic:
    Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: pvc-ehlt0
    ReadOnly: false
  default-token-m3xpy:
    Type: Secret (a volume populated by a Secret)
    SecretName: default-token-m3xpy
QoS Class: BestEffort
Tolerations: <none>
Events:
  FirstSeen LastSeen Count From SubobjectPath Type Reason Message
  --------- -------- ----- ---- ------------- -------- ------ -------
  19m 19m 1 {default-scheduler } Normal Scheduled Successfully assigned pod-ehlt0 to qe-lxia-ocp34-node-registry-router-1
  17m 1m 12 {kubelet qe-lxia-ocp34-node-registry-router-1} Warning FailedMount MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32
    Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
    Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
  17m 1m 8 {kubelet qe-lxia-ocp34-node-registry-router-1} Warning FailedMount Unable to mount volumes for pod "pod-ehlt0_ehlt0(c7fb9208-8485-11e6-9b14-42010af0000a)": timeout expired waiting for volumes to attach/mount for pod "pod-ehlt0"/"ehlt0". list of unattached/unmounted volumes=[dynamic]
  17m 1m 8 {kubelet qe-lxia-ocp34-node-registry-router-1} Warning FailedSync Error syncing pod, skipping: timeout expired waiting for volumes to attach/mount for pod "pod-ehlt0"/"ehlt0". list of unattached/unmounted volumes=[dynamic]

[root@qe-lxia-ocp34-node-registry-router-1 ~]# ls -a /var/lib/origin/openshift.local.volumes/plugins/
. ..

Logs in node:
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.406827 17523 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") to pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a").
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407011 17523 reconciler.go:299] MountVolume operation started for volume "kubernetes.io/secret/c7fb9208-8485-11e6-9b14-42010af0000a-default-token-m3xpy" (spec.Name: "default-token-m3xpy") to pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a").
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407116 17523 secret.go:164] Setting up volume default-token-m3xpy for pod c7fb9208-8485-11e6-9b14-42010af0000a at /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407446 17523 empty_dir_linux.go:39] Determining mount medium of /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407474 17523 empty_dir_linux.go:49] Statfs_t of /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy: {Type:1481003842 Bsize:4096 Blocks:2618880 Bfree:2101542 Bavail:2101542 Files:10485760 Ffree:10448730 Fsid:{X__val:[64768 0]} Namelen:255 Frsize:4096 Flags:4128 Spare:[0 0 0 0]}
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407588 17523 empty_dir.go:258] pod c7fb9208-8485-11e6-9b14-42010af0000a: mounting tmpfs for volume wrapped_default-token-m3xpy with opts [rootcontext="system_u:object_r:svirt_sandbox_file_t:s0"]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.407603 17523 mount_linux.go:103] Mounting tmpfs /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~secret/default-token-m3xpy tmpfs [rootcontext="system_u:object_r:svirt_sandbox_file_t:s0"]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415215 17523 gce_pd.go:253] PersistentDisk set up: /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a false stat /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a: no such file or directory, pd name kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a readOnly false
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415446 17523 gce_pd.go:274] attempting to mount /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: I0927 03:43:00.415466 17523 mount_linux.go:103] Mounting /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420747 17523 mount_linux.go:108] Mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420850 17523 gce_pd.go:300] Mount of disk /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a failed: mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: E0927 03:43:00.420942 17523 nestedpendingoperations.go:253] Operation for "\"kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a\" (\"c7fb9208-8485-11e6-9b14-42010af0000a\")" failed. No retries permitted until 2016-09-27 03:43:00.920911462 -0400 EDT (durationBeforeRetry 500ms). Error: MountVolume.SetUp failed for volume "kubernetes.io/gce-pd/c7fb9208-8485-11e6-9b14-42010af0000a-pvc-c0d36772-8485-11e6-9b14-42010af0000a" (spec.Name: "pvc-c0d36772-8485-11e6-9b14-42010af0000a") pod "c7fb9208-8485-11e6-9b14-42010af0000a" (UID: "c7fb9208-8485-11e6-9b14-42010af0000a") with: mount failed: exit status 32
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Mounting arguments: /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a /var/lib/origin/openshift.local.volumes/pods/c7fb9208-8485-11e6-9b14-42010af0000a/volumes/kubernetes.io~gce-pd/pvc-c0d36772-8485-11e6-9b14-42010af0000a [bind]
Sep 27 03:43:00 qe-lxia-ocp34-node-registry-router-1 atomic-openshift-node: Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist
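
When triaging logs like the ones above, it can help to pull the missing bind-mount source out of the mount output and check whether it is a per-node gce-pd mount directory, since a missing directory there suggests the disk was never attached and mounted on the node. The following Go sketch does that with a regular expression; the helper names `missingBindSource` and `isGlobalPDMount` are hypothetical, and the message pattern is taken only from the "special device ... does not exist" lines in this report, so other mount versions may word it differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// missingDeviceRe matches the mount error text seen in this report.
var missingDeviceRe = regexp.MustCompile(`mount: special device (\S+) does not exist`)

// missingBindSource returns the path that mount reported as missing, if any.
func missingBindSource(output string) (string, bool) {
	m := missingDeviceRe.FindStringSubmatch(output)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// isGlobalPDMount reports whether path looks like a per-node gce-pd
// mount directory under the volume plugin directory.
func isGlobalPDMount(path string) bool {
	return strings.Contains(path, "/plugins/kubernetes.io/gce-pd/mounts/")
}

func main() {
	// Sample line copied from the node log in this report.
	out := "Output: mount: special device /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/gce-pd/mounts/kubernetes-dynamic-pvc-c0d36772-8485-11e6-9b14-42010af0000a does not exist"

	if p, ok := missingBindSource(out); ok && isGlobalPDMount(p) {
		fmt.Println("bind source missing; disk likely not attached and mounted on the node:", p)
	}
}
```

This is a log-reading aid only; it does not confirm the root cause, which in this report turned out to be the uninitialized cloud provider in the kubelet.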