Bug 1657668
| Summary: | Persistent volume HostPath type check failed for character device | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Liang Xia <lxia> |
| Component: | Storage | Assignee: | Jan Safranek <jsafrane> |
| Status: | CLOSED ERRATA | QA Contact: | Liang Xia <lxia> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.9.0 | CC: | aos-bugs, aos-storage-staff, jsafrane, piqin |
| Target Milestone: | --- | | |
| Target Release: | 3.9.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2019-04-09 14:20:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Liang Xia
2018-12-10 09:07:59 UTC
Indeed, the code is wrong for character devices; Kubernetes detects them as block devices: https://github.com/openshift/ose/blob/745e58e4cfa2e376ea638068b9c9d6b6c4aeaf45/vendor/k8s.io/kubernetes/pkg/util/mount/mount_linux.go#L441-L442

It's trivial to fix; however, does it make sense in 3.9? No customer is complaining and the bug is fixed in 3.10 and 3.11.

Kubernetes 1.9 does not have this issue:
https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/util/mount/mount_linux.go#L424
https://github.com/kubernetes/kubernetes/blob/release-1.9/pkg/util/mount/mount.go#L331

OCP 3.9 does have the issue:
https://github.com/openshift/ose/blob/enterprise-3.9/vendor/k8s.io/kubernetes/pkg/util/mount/mount_linux.go#L424

So QE would like the issue to be fixed.

This way you could require a backport of any fix in Kubernetes 1.9.x that has not been fixed in 3.9.z, and there are plenty of them ;-)

For this time: 3.9 PR: https://github.com/openshift/ose/pull/1483

PR https://github.com/openshift/ose/pull/1483 merged (merge date is 2019-01-11), but no build contains the code change yet. (The latest build is atomic-openshift-3.9.64-1.git.0.13cd345.el7, which was built on 2019-01-05 09:42:35.) QE will check again when a new build is available.

Moving back to MODIFIED since no build contains the fix yet.

First, QE verified that the fix is in the build.

Changelog
* Mon Jan 21 2019 AOS Automation Release Team <***@redhat.com> 3.9.65-1
- UPSTREAM: 60510: fix bug where character devices are not recognized (jsafrane)
- UPSTREAM: 62304: Remove isNotDir error check (jsafrane)

Then, QE set up two clusters with the same version: one containerized, one RPM.
    # oc version
    oc v3.9.65
    kubernetes v1.9.1+a0ce1bc657
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://qe-lxia-39-container-master-etcd-1:8443
    openshift v3.9.65
    kubernetes v1.9.1+a0ce1bc657

    # oc version
    oc v3.9.65
    kubernetes v1.9.1+a0ce1bc657
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://qe-lxia-39-rpm-master-etcd-1:8443
    openshift v3.9.65
    kubernetes v1.9.1+a0ce1bc657

QE then verified that the path /dev/zero exists on both cluster nodes.

    [root@qe-lxia-39-container-master-etcd-1 ~]# ls -l /dev/zero
    crw-rw-rw-. 1 root root 1, 5 Jan 23 02:37 /dev/zero

    [root@qe-lxia-39-rpm-master-etcd-1 ~]# ls -l /dev/zero
    crw-rw-rw-. 1 root root 1, 5 Jan 22 23:24 /dev/zero

Then, with the same PV/PVC/pod definitions, the pod on the RPM cluster is up and running, but the pod on the containerized cluster failed with the error:

    hostPath type check failed: /dev/zero is not a character device

    # oc describe pod mypod
    Name:         mypod
    Namespace:    default
    Node:         qe-lxia-39-container-master-etcd-1/172.16.122.23
    Start Time:   Wed, 23 Jan 2019 05:10:50 +0000
    Labels:       <none>
    Annotations:  openshift.io/scc=privileged
    Status:       Pending
    IP:
    Containers:
      mycontainer:
        Container ID:
        Image:          aosqe/hello-openshift
        Image ID:
        Port:           <none>
        State:          Waiting
          Reason:       ContainerCreating
        Ready:          False
        Restart Count:  0
        Environment:    <none>
        Mounts:
          /mnt/ocp from my-volume (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-gvcjv (ro)
    Conditions:
      Type           Status
      Initialized    True
      Ready          False
      PodScheduled   True
    Volumes:
      my-volume:
        Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
        ClaimName:  pvc0001
        ReadOnly:   false
      default-token-gvcjv:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-gvcjv
        Optional:    false
    QoS Class:       BestEffort
    Node-Selectors:  <none>
    Tolerations:     <none>
    Events:
      Type     Reason                 Age               From                                         Message
      ----     ------                 ----              ----                                         -------
      Normal   Scheduled              5m                default-scheduler                            Successfully assigned mypod to qe-lxia-39-container-master-etcd-1
      Normal   SuccessfulMountVolume  5m                kubelet, qe-lxia-39-container-master-etcd-1  MountVolume.SetUp succeeded for volume "default-token-gvcjv"
      Warning  FailedMount            1m (x10 over 5m)  kubelet, qe-lxia-39-container-master-etcd-1  MountVolume.SetUp failed for volume "pv-qexoq" : hostPath type check failed: /dev/zero is not a character device
      Warning  FailedMount            1m (x2 over 3m)   kubelet, qe-lxia-39-container-master-etcd-1  Unable to mount volumes for pod "mypod_default(406a38d5-1ecd-11e9-91e3-fa163e7184ee)": timeout expired waiting for volumes to attach/mount for pod "default"/"mypod". list of unattached/unmounted volumes=[my-volume]

Logs from the node:

    Jan 23 05:19:04 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:04.880174 55543 volume_host.go:218] using default mounter/exec for kubernetes.io/host-path
    Jan 23 05:19:04 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:04.980361 55543 operation_executor.go:895] Starting operationExecutor.MountVolume for volume "pv-qexoq" (UniqueName: "kubernetes.io/host-path/406a38d5-1ecd-11e9-91e3-fa163e7184ee-pv-qexoq") pod "mypod" (UID: "406a38d5-1ecd-11e9-91e3-fa163e7184ee")
    Jan 23 05:19:04 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:04.980414 55543 volume_host.go:218] using default mounter/exec for kubernetes.io/host-path
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.080632 55543 operation_executor.go:895] Starting operationExecutor.MountVolume for volume "pv-qexoq" (UniqueName: "kubernetes.io/host-path/406a38d5-1ecd-11e9-91e3-fa163e7184ee-pv-qexoq") pod "mypod" (UID: "406a38d5-1ecd-11e9-91e3-fa163e7184ee")
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.080678 55543 volume_host.go:218] using default mounter/exec for kubernetes.io/host-path
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.177422 55543 kubelet.go:2123] Container runtime status: Runtime Conditions: RuntimeReady=true reason: message:, NetworkReady=true reason: message:
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.180849 55543 operation_executor.go:895] Starting operationExecutor.MountVolume for volume "pv-qexoq" (UniqueName: "kubernetes.io/host-path/406a38d5-1ecd-11e9-91e3-fa163e7184ee-pv-qexoq") pod "mypod" (UID: "406a38d5-1ecd-11e9-91e3-fa163e7184ee")
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.180903 55543 volume_host.go:218] using default mounter/exec for kubernetes.io/host-path
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.180946 55543 reconciler.go:262] operationExecutor.MountVolume started for volume "pv-qexoq" (UniqueName: "kubernetes.io/host-path/406a38d5-1ecd-11e9-91e3-fa163e7184ee-pv-qexoq") pod "mypod" (UID: "406a38d5-1ecd-11e9-91e3-fa163e7184ee")
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.181001 55543 nsenter.go:107] Running nsenter command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/stat -L --printf "%F" /dev/zero]
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: E0123 05:19:05.183314 55543 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/host-path/406a38d5-1ecd-11e9-91e3-fa163e7184ee-pv-qexoq\" (\"406a38d5-1ecd-11e9-91e3-fa163e7184ee\")" failed. No retries permitted until 2019-01-23 05:21:07.183283219 +0000 UTC m=+5819.451739875 (durationBeforeRetry 2m2s). Error: "MountVolume.SetUp failed for volume \"pv-qexoq\" (UniqueName: \"kubernetes.io/host-path/406a38d5-1ecd-11e9-91e3-fa163e7184ee-pv-qexoq\") pod \"mypod\" (UID: \"406a38d5-1ecd-11e9-91e3-fa163e7184ee\") : hostPath type check failed: /dev/zero is not a character device"
    Jan 23 05:19:05 qe-lxia-39-container-master-etcd-1 atomic-openshift-node[55517]: I0123 05:19:05.183690 55543 server.go:290] Event(v1.ObjectReference{Kind:"Pod", Namespace:"default", Name:"mypod", UID:"406a38d5-1ecd-11e9-91e3-fa163e7184ee", APIVersion:"v1", ResourceVersion:"23724", FieldPath:""}): type: 'Warning' reason: 'FailedMount' MountVolume.SetUp failed for volume "pv-qexoq" : hostPath type check failed: /dev/zero is not a character device

Hi Jan,

Can this issue be dropped from 3.9.z errata?

    # oc version
    oc v3.9.72
    kubernetes v1.9.1+a0ce1bc657
    features: Basic-Auth GSSAPI Kerberos SPNEGO

    Server https://qe-lxia-39-master-etcd-1:8443
    openshift v3.9.72
    kubernetes v1.9.1+a0ce1bc657
    =========================================================
    # ls -l /dev/zero
    crw-rw-rw-. 1 root root 1, 5 Mar 13 07:33 /dev/zero
    =========================================================
    # oc get pods
    NAME      READY     STATUS    RESTARTS   AGE
    dynamic   1/1       Running   0          2m
    =========================================================
    # oc rsh dynamic
    / # ls -lh /mnt/ocp_pv
    crw-rw-rw- 1 root root 1, 5 Mar 13 07:33 /mnt/ocp_pv
    =========================================================
    # cat installation_matrix
    atomic-openshift version: v3.9.72
    Operation System: Red Hat Enterprise Linux Atomic Host release 7.5
    Cluster Install Method: docker container
    Docker Version: docker-1.13.1-58.git87f2fab.el7.x86_64
    Docker Storage Driver: overlay2
    OpenvSwitch Version: openvswitch-2.9.0-97.el7fdp.x86_64
    etcd Version: etcd-3.2.22-1.el7.x86_64
    Network Plugin: redhat/openshift-ovs-subnet
    Auth Method: allowall
    Registry Deployment Method: deploymentconfig
    Secure Registry: True
    Registry Backend Storage: cinder
    Load Balancer: None
    Docker System Container: False
    CRI-O Enable: False
    Firewall Service: iptables

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0619
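
Illustrative note on the misclassification discussed at the top of this thread (character devices detected as block devices). The following is a minimal Go sketch, not the code from `mount_linux.go`: the `FileType` type, its constants, `classify`, `classifyWrongOrder`, `charDevMode` and `blockDevMode` are all hypothetical names introduced here. It relies only on a documented property of Go's `os.FileMode`: `os.ModeCharDevice` is set together with `os.ModeDevice`, so a character device carries both bits. A check for `os.ModeDevice` placed before the check for `os.ModeCharDevice` would therefore also match character devices, which is one plausible way such a bug can arise.

```go
package main

import (
	"fmt"
	"os"
)

// FileType is a hypothetical label used only in this sketch.
type FileType string

const (
	FileTypeDirectory FileType = "Directory"
	FileTypeFile      FileType = "File"
	FileTypeSocket    FileType = "Socket"
	FileTypeBlockDev  FileType = "BlockDevice"
	FileTypeCharDev   FileType = "CharDevice"
	FileTypeUnknown   FileType = "Unknown"
)

// Synthetic modes for demonstration: a character device has both
// ModeDevice and ModeCharDevice set; a block device has only ModeDevice.
var (
	charDevMode  = os.ModeDevice | os.ModeCharDevice
	blockDevMode = os.ModeDevice
)

// classify tests ModeCharDevice before ModeDevice, so character devices
// are not swallowed by the more general device check.
func classify(mode os.FileMode) FileType {
	switch {
	case mode.IsDir():
		return FileTypeDirectory
	case mode&os.ModeSocket != 0:
		return FileTypeSocket
	case mode&os.ModeCharDevice != 0:
		return FileTypeCharDev
	case mode&os.ModeDevice != 0:
		return FileTypeBlockDev
	case mode.IsRegular():
		return FileTypeFile
	}
	return FileTypeUnknown
}

// classifyWrongOrder is an assumed reconstruction of the kind of ordering
// mistake described in the thread: ModeDevice is tested first, so the
// ModeCharDevice case can never be reached for a real character device.
func classifyWrongOrder(mode os.FileMode) FileType {
	switch {
	case mode&os.ModeDevice != 0:
		return FileTypeBlockDev
	case mode&os.ModeCharDevice != 0:
		return FileTypeCharDev
	}
	return FileTypeUnknown
}

func main() {
	fmt.Println("correct order:", classify(charDevMode), classify(blockDevMode))
	fmt.Println("wrong order:", classifyWrongOrder(charDevMode))

	// On a Linux host, /dev/zero is a character device (see the ls output above).
	if info, err := os.Stat("/dev/zero"); err == nil {
		fmt.Println("/dev/zero:", classify(info.Mode()))
	}
}
```

This sketch covers only the in-process check. The node log above shows the containerized node taking a different route (`nsenter ... /bin/stat -L --printf "%F" /dev/zero`), which is not modelled here.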