Bug 1289921
| Field | Value | Field | Value |
|---|---|---|---|
| Summary | Containerized OpenShift needs to have necessary packages installed for various volume plugins | | |
| Product | OpenShift Container Platform | Reporter | Liang Xia <lxia> |
| Component | Storage | Assignee | Bradley Childs <bchilds> |
| Status | CLOSED CURRENTRELEASE | QA Contact | Jianwei Hou <jhou> |
| Severity | medium | Docs Contact | |
| Priority | medium | | |
| Version | 3.1.0 | CC | abutcher, aos-bugs, bchilds, bleanhar, eparis, erich, jhou, jokerman, jsafrane, mmccomas, mturansk, pmorie, sdodson, swagiaal, tdawson, xtian |
| Target Milestone | --- | Target Release | --- |
| Hardware | Unspecified | OS | Unspecified |
| Whiteboard | | | |
| Fixed In Version | openshift-ansible-3.0.40-1.git.1.4385281.el7aos | Doc Type | Bug Fix |
| Doc Text | | Story Points | --- |
| Clone Of | | Environment | |
| Last Closed | 2016-05-06 13:57:22 UTC | Type | Bug |
| Regression | --- | Mount Type | --- |
| Documentation | --- | CRM | |
| Verified Versions | | Category | --- |
| oVirt Team | --- | RHEL 7.3 requirements from Atomic Host | |
| Cloudforms Team | --- | Target Upstream Version | |
| Embargoed | | Attachments | attachment 1114639 (atomic-openshift-node.log) |
Description (Liang Xia, 2015-12-09 10:59:59 UTC)
This could also affect Ceph RBD, because ceph-common is not installed:

```
-bash-4.2# docker exec -t 01345041730a rpm -q ceph-common
package ceph-common is not installed
```

NFS, Ceph RBD, GlusterFS, and iSCSI all need their corresponding client packages installed.

Reassigned to Brad, as this is the "containerized mounting" feature request that removes the need for the various binaries to be present on all hosts.

Taking this one since we've addressed this in the installer. We still need to verify that the nsenter mounter is used for these plugins when running containerized, if anyone can comment on that.

https://github.com/openshift/openshift-ansible/pull/1124
https://github.com/openshift/openshift-ansible/pull/1130

Paul confirmed that the nsenter mounter is used. Moving to ON_QA for the advanced installation method. Pull requests:

https://github.com/openshift/openshift-ansible/pull/1124
https://github.com/openshift/openshift-ansible/pull/1130

Tested on:

```
aep3_beta/node:a530e7a3c9d4
openshift v3.1.1.1
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2
```

We have tried again with NFS/Ceph/Gluster and still cannot mount. From the client side:

```
$ oc get event
FIRSTSEEN  LASTSEEN  COUNT  NAME  KIND  SUBOBJECT  REASON       SOURCE                                           MESSAGE
25m        25m       1      nfs   Pod              Scheduled    {scheduler }                                     Successfully assigned nfs to openshift-126.lab.eng.nay.redhat.com
25m        9s        155    nfs   Pod              FailedMount  {kubelet openshift-126.lab.eng.nay.redhat.com}   Unable to mount volumes for pod "nfs_lxiap": exit status 32
25m        9s        155    nfs   Pod              FailedSync   {kubelet openshift-126.lab.eng.nay.redhat.com}   Error syncing pod, skipping: exit status 32

$ oc describe pods
Name:           nfs
Namespace:      lxiap
Image(s):       aosqe/hello-openshift
Node:           openshift-126.lab.eng.nay.redhat.com/10.66.79.126
Start Time:     Wed, 13 Jan 2016 13:41:18 +0800
Labels:         name=frontendhttp
Status:         Pending
Reason:
Message:
IP:
Replication Controllers:  <none>
Containers:
  myfrontend:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    QoS Tier:
      memory:       BestEffort
      cpu:          BestEffort
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment Variables:
Conditions:
  Type   Status
  Ready  False
Volumes:
  pvol:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  nfsc
    ReadOnly:   false
  default-token-yk1c5:
    Type:        Secret (a secret that should populate this volume)
    SecretName:  default-token-yk1c5
Events:
  FirstSeen  LastSeen  Count  From                                            SubobjectPath  Reason       Message
  ─────────  ────────  ─────  ────                                            ─────────────  ──────       ───────
  4m         4m        1      {scheduler }                                                   Scheduled    Successfully assigned nfs to openshift-126.lab.eng.nay.redhat.com
  4m         10s       29     {kubelet openshift-126.lab.eng.nay.redhat.com}                 FailedMount  Unable to mount volumes for pod "nfs_lxiap": exit status 32
  4m         10s       29     {kubelet openshift-126.lab.eng.nay.redhat.com}                 FailedSync   Error syncing pod, skipping: exit status 32
```

From the node:

```
E0113 13:23:39.592957 2944 vnids.go:255] Error fetching Net ID for namespace: chao, skipped netNsEvent: &{ADDED chao 13}
I0113 13:30:03.998239 2944 proxier.go:352] Setting endpoints for "chao/glusterfs-cluster:" to [10.66.79.108:1 10.66.79.134:1]
W0113 13:32:50.481555 2944 reflector.go:224] pkg/kubelet/config/apiserver.go:43: watch of *api.Pod ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [1880/1774]) [2879]
W0113 13:32:51.086720 2944 reflector.go:224] pkg/kubelet/kubelet.go:223: watch of *api.Service ended with: 401: The event in requested index is outdated and cleared (the requested history has been cleared [1880/1629]) [2879]
E0113 13:34:47.728448 2944 vnids.go:255] Error fetching Net ID for namespace: lxiap, skipped netNsEvent: &{ADDED lxiap 14}
I0113 13:41:17.972337 2944 kubelet.go:2109] SyncLoop (ADD, "api"): "nfs_lxiap"
E0113 13:41:18.308289 2944 kubelet.go:1461] Unable to mount volumes for pod "nfs_lxiap": exit status 32; skipping pod
E0113 13:41:18.312970 2944 pod_workers.go:113] Error syncing pod 447c35eb-b9b8-11e5-9533-fa163e36bdd5, skipping: exit status 32
E0113 13:41:22.373366 2944 kubelet.go:1461] Unable to mount volumes for pod "nfs_lxiap": exit status 32; skipping pod
E0113 13:41:22.379877 2944 pod_workers.go:113] Error syncing pod 447c35eb-b9b8-11e5-9533-fa163e36bdd5, skipping: exit status 32
E0113 13:41:32.419402 2944 kubelet.go:1461] Unable to mount volumes for pod "nfs_lxiap": exit status 32; skipping pod
E0113 13:41:32.423448 2944 pod_workers.go:113] Error syncing pod 447c35eb-b9b8-11e5-9533-fa163e36bdd5, skipping: exit status 32
E0113 13:41:42.340943 2944 kubelet.go:1461] Unable to mount volumes for pod "nfs_lxiap": exit status 32; skipping pod
```

Thanks Liang, could you provide the output above at log level 5? The storage folks want to verify that nsenter_mount.go appears in the log and that the error code comes from mount.nfs.
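For reference, one way to raise the node log level on a containerized install is sketched below. This is a hedged example, not a command from this report: the config file path and the --loglevel flag are assumptions based on a standard OSE 3.x node setup.

```bash
# On the node host: bump kubelet verbosity to 5 (assumes OPTIONS in this
# sysconfig file already carries a --loglevel flag), then restart the
# containerized node service so the new flag takes effect.
sed -i 's/--loglevel=[0-9]*/--loglevel=5/' /etc/sysconfig/atomic-openshift-node
systemctl restart atomic-openshift-node

# Watch for nsenter_mount.go entries and the raw mount.nfs error.
docker logs -f atomic-openshift-node
```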
Created attachment 1114639 [details]
Logs collected via `docker logs atomic-openshift-node &> atomic-openshift-node.log`:
```
I0114 10:35:37.187588 10746 nfs.go:161] NFS mount set up: /var/lib/origin/openshift.local.volumes/pods/1fd9b004-ba66-11e5-8ec8-fa163ed301db/volumes/kubernetes.io~nfs/nfs false file does not exist
I0114 10:35:37.187814 10746 nsenter_mount.go:114] nsenter Mounting 10.66.79.133:/jhou /var/lib/origin/openshift.local.volumes/pods/1fd9b004-ba66-11e5-8ec8-fa163ed301db/volumes/kubernetes.io~nfs/nfs nfs []
I0114 10:35:37.187836 10746 nsenter_mount.go:117] Mount command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/mount -t nfs 10.66.79.133:/jhou /var/lib/origin/openshift.local.volumes/pods/1fd9b004-ba66-11e5-8ec8-fa163ed301db/volumes/kubernetes.io~nfs/nfs]
I0114 10:35:37.190832 10746 manager.go:359] Container inspect result: {ID:d0285d3609c2e237ecfbda2b9928bcd082ac0d619cd77439b2326166aa16c5ed Created:2016-01-14 01:33:37.697941901 +0000 UTC Path:/usr/bin/openshift-router Args:[] Config:0xc20a7da820 State:{Running:true Paused:false Restarting:false OOMKilled:false Pid:4396 ExitCode:0 Error: StartedAt:2016-01-14 01:33:38.866863858 +0000 UTC FinishedAt:0001-01-01 00:00:00 +0000 UTC} Image:6be7abd829e1efd90fb2785d80a3232dccce8fd95fe91711c9bb3d738200a39b Node:<nil> NetworkSettings:0xc209d27700 SysInitPath: ResolvConfPath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/resolv.conf HostnamePath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/hostname HostsPath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/hosts LogPath:/var/lib/docker/containers/d0285d3609c2e237ecfbda2b9928bcd082ac0d619cd77439b2326166aa16c5ed/d0285d3609c2e237ecfbda2b9928bcd082ac0d619cd77439b2326166aa16c5ed-json.log Name:/k8s_router.5555da39_router-1-2sfrs_default_d432ff9c-ba5e-11e5-8ec8-fa163ed301db_161f320f Driver:devicemapper Mounts:[{Source:/var/lib/origin/openshift.local.volumes/pods/d432ff9c-ba5e-11e5-8ec8-fa163ed301db/containers/router/d0285d3609c2e237ecfbda2b9928bcd082ac0d619cd77439b2326166aa16c5ed Destination:/dev/termination-log Mode: RW:true} {Source:/var/lib/origin/openshift.local.volumes/pods/d432ff9c-ba5e-11e5-8ec8-fa163ed301db/volumes/kubernetes.io~secret/router-token-lg5i1 Destination:/var/run/secrets/kubernetes.io/serviceaccount Mode:ro RW:false}] Volumes:map[] VolumesRW:map[] HostConfig:0xc209fd7900 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0114 10:35:37.196268 10746 manager.go:359] Container inspect result: {ID:4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822 Created:2016-01-14 01:33:36.320447027 +0000 UTC Path:/pod Args:[] Config:0xc20a7dad00 State:{Running:true Paused:false Restarting:false OOMKilled:false Pid:4347 ExitCode:0 Error: StartedAt:2016-01-14 01:33:37.282952706 +0000 UTC FinishedAt:0001-01-01 00:00:00 +0000 UTC} Image:b0a3a6b46ab34132726620fe1dd93d2f64f1e806e16811b9212929b975fd347b Node:<nil> NetworkSettings:0xc209d27900 SysInitPath: ResolvConfPath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/resolv.conf HostnamePath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/hostname HostsPath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/hosts LogPath:/var/lib/docker/containers/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822/4144dc34d30d9043ed56900b4da2c71c415bfaa0480d6bee1adf67bbe800b822-json.log Name:/k8s_POD.8325e3a2_router-1-2sfrs_default_d432ff9c-ba5e-11e5-8ec8-fa163ed301db_c4d09ded Driver:devicemapper Mounts:[] Volumes:map[] VolumesRW:map[] HostConfig:0xc209fd7b80 ExecIDs:[] RestartCount:0 AppArmorProfile:}
I0114 10:35:37.196720 10746 manager.go:231] Ignoring same status for pod "router-1-2sfrs_default", status: {Phase:Running Conditions:[{Type:Ready Status:True LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-01-14 10:34:37.247178361 +0800 CST Reason: Message:}] Message: Reason: HostIP:10.14.6.135 PodIP:10.14.6.135 StartTime:2016-01-14 10:34:27.15222181 +0800 CST ContainerStatuses:[{Name:router State:{Waiting:<nil> Running:0xc20829ec00 Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:true RestartCount:0 Image:registry.access.redhat.com/aep3/aep-haproxy-router:v3.1.1.2 ImageID:docker://6be7abd829e1efd90fb2785d80a3232dccce8fd95fe91711c9bb3d738200a39b ContainerID:docker://d0285d3609c2e237ecfbda2b9928bcd082ac0d619cd77439b2326166aa16c5ed}]}
I0114 10:35:37.212183 10746 nsenter_mount.go:121] Output from mount command: mount: wrong fs type, bad option, bad superblock on 10.66.79.133:/jhou, missing codepage or helper program, or other error (for several filesystems (e.g. nfs, cifs) you might need a /sbin/mount.<type> helper program) In some cases useful info is found in syslog - try dmesg | tail or so.
I0114 10:35:37.212259 10746 nsenter_mount.go:174] findmnt command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/findmnt -o target --noheadings --target /var/lib/origin/openshift.local.volumes/pods/1fd9b004-ba66-11e5-8ec8-fa163ed301db/volumes/kubernetes.io~nfs/nfs]
I0114 10:35:37.229423 10746 nsenter_mount.go:185] IsLikelyNotMountPoint findmnt output: /
E0114 10:35:37.229632 10746 kubelet.go:1461] Unable to mount volumes for pod "nfs_lxiap": exit status 32; skipping pod
I0114 10:35:37.229665 10746 kubelet.go:2772] Generating status for "nfs_lxiap"
I0114 10:35:37.229734 10746 server.go:734] Event(api.ObjectReference{Kind:"Pod", Namespace:"lxiap", Name:"nfs", UID:"1fd9b004-ba66-11e5-8ec8-fa163ed301db", APIVersion:"v1", ResourceVersion:"4527", FieldPath:""}): reason: 'FailedMount' Unable to mount volumes for pod "nfs_lxiap": exit status 32
I0114 10:35:37.233746 10746 kubelet.go:2683] pod waiting > 0, pending
I0114 10:35:37.233900 10746 manager.go:231] Ignoring same status for pod "nfs_lxiap", status: {Phase:Pending Conditions:[{Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-01-14 10:34:27.423556655 +0800 CST Reason:ContainersNotReady Message:containers with unready status: [myfrontend]}] Message: Reason: HostIP:10.14.6.135 PodIP: StartTime:2016-01-14 10:25:49 +0800 CST ContainerStatuses:[{Name:myfrontend State:{Waiting:0xc20a81a040 Running:<nil> Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:false RestartCount:0 Image:aosqe/hello-openshift ImageID: ContainerID:}]}
E0114 10:35:37.234077 10746 pod_workers.go:113] Error syncing pod 1fd9b004-ba66-11e5-8ec8-fa163ed301db, skipping: exit status 32
I0114 10:35:37.234161 10746 server.go:734] Event(api.ObjectReference{Kind:"Pod", Namespace:"lxiap", Name:"nfs", UID:"1fd9b004-ba66-11e5-8ec8-fa163ed301db", APIVersion:"v1", ResourceVersion:"4527", FieldPath:""}): reason: 'FailedSync' Error syncing pod, skipping: exit status 32
```
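A quick way to separate "mount helper missing" from other mount failures is to replay the kubelet's exact nsenter mount by hand from inside the node container. This is a diagnostic sketch, not a command from this report: the container ID and the /tmp target directory are placeholders, while the nsenter invocation and NFS export mirror the log above.

```bash
# Replay the kubelet's mount in the host's mount namespace; a missing
# mount.nfs helper on the host reproduces the "wrong fs type" failure,
# and mount(8) reports it as exit status 32.
docker exec -t <NODE_CONTAINER_ID> nsenter --mount=/rootfs/proc/1/ns/mnt -- \
    sh -c 'mkdir -p /tmp/nfs-test && /bin/mount -t nfs 10.66.79.133:/jhou /tmp/nfs-test; echo "mount exited $?"'

# Confirm whether the NFS helper is visible in the host namespace at all.
docker exec -t <NODE_CONTAINER_ID> nsenter --mount=/rootfs/proc/1/ns/mnt -- \
    sh -c 'command -v mount.nfs || echo "mount.nfs helper not found on host"'
```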
Also, the nfs-utils package is not installed on the node itself (I mean the host, not the container):

```
# rpm -qa | grep nfs
# which mount.nfs
/usr/bin/which: no mount.nfs in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
# which mount
/usr/bin/mount
```

The same requirements exist for packages on the host whether the container is or is not containerized. The packages need to be installed on the host. It's not clear to me right now what the error behavior of this bug is -- the packages have to be installed on the host either way. Can the poster elaborate?

Sorry, I meant whether the _kubelet_ is or is not containerized.

We were not ensuring that nfs-utils was installed on non-atomic node hosts. This has been addressed in https://github.com/openshift/openshift-ansible/pull/1210.
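On a non-atomic RHEL node, the equivalent manual step is a simple host-side install; a minimal sketch, assuming RHEL 7 package names (the installer's authoritative package list may differ, and on Atomic Host these ship with the image):

```bash
# Install the storage client/helper packages on the node host, not in the
# node container, so the nsenter-based mounter can find them.
yum -y install nfs-utils glusterfs-fuse ceph-common iscsi-initiator-utils

# Confirm the mount helpers the kubelet will invoke now resolve.
command -v mount.nfs mount.glusterfs
```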
Checked on the following version:

```
openshift v3.1.1.6-16-g5327e56
kubernetes v1.1.0-origin-1107-g4c8e6f4
etcd 2.1.2
```

NFS storage is now mounted and working, but GlusterFS still fails to mount with the error "unsupported volume type":

```
$ oc describe pods gluster
Name:           gluster
Namespace:      lxiap
Image(s):       aosqe/hello-openshift
Node:           openshift-138.lab.eng.nay.redhat.com/10.66.79.138
Start Time:     Sun, 14 Feb 2016 13:56:25 +0800
Labels:         name=gluster
Status:         Pending
Reason:
Message:
IP:
Replication Controllers:  <none>
Containers:
  gluster:
    Container ID:
    Image:          aosqe/hello-openshift
    Image ID:
    QoS Tier:
      cpu:          BestEffort
      memory:       BestEffort
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment Variables:
Conditions:
  Type   Status
  Ready  False
Volumes:
  gluster:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  glusterc
    ReadOnly:   false
  default-token-zkgmy:
    Type:        Secret (a secret that should populate this volume)
    SecretName:  default-token-zkgmy
Events:
  FirstSeen  LastSeen  Count  From                                            SubobjectPath  Reason       Message
  ─────────  ────────  ─────  ────                                            ─────────────  ──────       ───────
  1m         1m        1      {scheduler }                                                   Scheduled    Successfully assigned gluster to openshift-138.lab.eng.nay.redhat.com
  1m         5s        12     {kubelet openshift-138.lab.eng.nay.redhat.com}                 FailedMount  Unable to mount volumes for pod "gluster_lxiap": unsupported volume type
  1m         5s        12     {kubelet openshift-138.lab.eng.nay.redhat.com}                 FailedSync   Error syncing pod, skipping: unsupported volume type
```

Logs on the node:

```
I0214 14:01:18.289025 15097 nsenter_mount.go:174] findmnt command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/findmnt -o target --noheadings --target /var/lib/origin/openshift.local.volumes/pods/4f8f6878-d2c8-11e5-9b3c-fa163e266f4e/volumes/kubernetes.io~secret/router-token-v6231]
I0214 14:01:18.295617 15097 docker.go:368] Docker Container: /atomic-openshift-node is not managed by kubelet.
I0214 14:01:18.295646 15097 docker.go:368] Docker Container: /openvswitch is not managed by kubelet.
I0214 14:01:18.295852 15097 volumes.go:205] Making a volume.Cleaner for volume kubernetes.io~empty-dir/registry-storage of pod 4e424648-d2c8-11e5-9b3c-fa163e266f4e
I0214 14:01:18.295880 15097 volumes.go:241] Used volume plugin "kubernetes.io/empty-dir" for 4e424648-d2c8-11e5-9b3c-fa163e266f4e/kubernetes.io~empty-dir
I0214 14:01:18.295894 15097 volumes.go:205] Making a volume.Cleaner for volume kubernetes.io~secret/default-token-rxa59 of pod 4e424648-d2c8-11e5-9b3c-fa163e266f4e
I0214 14:01:18.295905 15097 volumes.go:241] Used volume plugin "kubernetes.io/secret" for 4e424648-d2c8-11e5-9b3c-fa163e266f4e/kubernetes.io~secret
I0214 14:01:18.295965 15097 volumes.go:205] Making a volume.Cleaner for volume kubernetes.io~secret/router-token-v6231 of pod 4f8f6878-d2c8-11e5-9b3c-fa163e266f4e
I0214 14:01:18.295977 15097 volumes.go:241] Used volume plugin "kubernetes.io/secret" for 4f8f6878-d2c8-11e5-9b3c-fa163e266f4e/kubernetes.io~secret
I0214 14:01:18.300293 15097 volumes.go:109] Used volume plugin "kubernetes.io/persistent-claim" for gluster
E0214 14:01:18.300337 15097 kubelet.go:1521] Unable to mount volumes for pod "gluster_lxiap": unsupported volume type; skipping pod
I0214 14:01:18.300351 15097 kubelet.go:2836] Generating status for "gluster_lxiap"
I0214 14:01:18.300527 15097 server.go:736] Event(api.ObjectReference{Kind:"Pod", Namespace:"lxiap", Name:"gluster", UID:"ae71ab5b-d2df-11e5-9b3c-fa163e266f4e", APIVersion:"v1", ResourceVersion:"3367", FieldPath:""}): reason: 'FailedMount' Unable to mount volumes for pod "gluster_lxiap": unsupported volume type
I0214 14:01:18.303951 15097 nsenter_mount.go:185] IsLikelyNotMountPoint findmnt output: /
I0214 14:01:18.304028 15097 kubelet.go:2747] pod waiting > 0, pending
I0214 14:01:18.304059 15097 volumes.go:109] Used volume plugin "kubernetes.io/secret" for default-token-rxa59
I0214 14:01:18.304091 15097 nsenter_mount.go:174] findmnt command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/findmnt -o target --noheadings --target /var/lib/origin/openshift.local.volumes/pods/4e424648-d2c8-11e5-9b3c-fa163e266f4e/volumes/kubernetes.io~secret/default-token-rxa59]
I0214 14:01:18.304146 15097 manager.go:231] Ignoring same status for pod "gluster_lxiap", status: {Phase:Pending Conditions:[{Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-02-14 13:56:25.207245215 +0800 CST Reason:ContainersNotReady Message:containers with unready status: [gluster]}] Message: Reason: HostIP:10.66.79.138 PodIP: StartTime:2016-02-14 13:56:25.207245952 +0800 CST ContainerStatuses:[{Name:gluster State:{Waiting:0xc20b13c200 Running:<nil> Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:false RestartCount:0 Image:aosqe/hello-openshift ImageID: ContainerID:}]}
E0214 14:01:18.306384 15097 pod_workers.go:113] Error syncing pod ae71ab5b-d2df-11e5-9b3c-fa163e266f4e, skipping: unsupported volume type
I0214 14:01:18.306586 15097 server.go:736] Event(api.ObjectReference{Kind:"Pod", Namespace:"lxiap", Name:"gluster", UID:"ae71ab5b-d2df-11e5-9b3c-fa163e266f4e", APIVersion:"v1", ResourceVersion:"3367", FieldPath:""}): reason: 'FailedSync' Error syncing pod, skipping: unsupported volume type
```

Talking to Paul Morie, it sounds like this bug is related/similar/the same as https://bugzilla.redhat.com/show_bug.cgi?id=1287016.

After talking with Paul, this bug isn't a dupe of the above bug after all. It's related to the binary check on the host used to test for Gluster support.
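The failure mode implied here is that a probe for the Gluster client binary looks in the node container's filesystem, while the mount helper actually lives on the host. An illustrative contrast, with the container ID as a placeholder (these are not commands from this report):

```bash
# Inside the node container the helper is absent, so a naive probe fails...
docker exec -t <NODE_CONTAINER_ID> \
    sh -c 'command -v mount.glusterfs || echo "not in the container image"'

# ...but in the host mount namespace, where nsenter actually runs the
# mount, the same helper resolves fine.
docker exec -t <NODE_CONTAINER_ID> nsenter --mount=/rootfs/proc/1/ns/mnt -- \
    sh -c 'command -v mount.glusterfs && echo "present on the host"'
```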
I removed the check upstream (https://github.com/kubernetes/kubernetes/pull/21758) and re-assigned this to myself.

ext4 also seems to be affected; see Bug 1316233.

This fix is merged into OpenShift, moving to ON_QA.

```
# openshift version
openshift v3.2.0.5
kubernetes v1.2.0-36-g4a3f9c5
etcd 2.2.5
# uname -a
Linux openshift-125.lab.sjc.redhat.com 3.10.0-327.13.1.el7.x86_64 #1 SMP Mon Feb 29 13:22:02 EST 2016 x86_64 x86_64 x86_64 GNU/Linux
# cat /etc/redhat-release
Red Hat Enterprise Linux Atomic Host release 7.2
```

Logs on the node:

```
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.793009 5814 kubelet.go:2444] SyncLoop (SYNC): 2 pods; gluster_lxiap(75b13c7d-ef32-11e5-bcc1-fa163efe3ad5), hooks-2-deploy_zhouy(7f75289b-ef2e-11e5-bcc1-fa163efe3ad5)
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.793128 5814 kubelet.go:3270] Generating status for "gluster_lxiap(75b13c7d-ef32-11e5-bcc1-fa163efe3ad5)"
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.793178 5814 kubelet.go:3237] pod waiting > 0, pending
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.793303 5814 manager.go:277] Ignoring same status for pod "gluster_lxiap(75b13c7d-ef32-11e5-bcc1-fa163efe3ad5)", status: {Phase:Pending Conditions:[{Type:Ready Status:False LastProbeTime:0001-01-01 00:00:00 +0000 UTC LastTransitionTime:2016-03-21 06:59:34 +0000 UTC Reason:ContainersNotReady Message:containers with unready status: [gluster]}] Message: Reason: HostIP:10.14.6.133 PodIP: StartTime:2016-03-21 06:59:34 +0000 UTC ContainerStatuses:[{Name:gluster State:{Waiting:0xc20a32b6c0 Running:<nil> Terminated:<nil>} LastTerminationState:{Waiting:<nil> Running:<nil> Terminated:<nil>} Ready:false RestartCount:0 Image:aosqe/hello-openshift ImageID: ContainerID:}]}
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.802867 5814 glusterfs.go:86] glusterfs: endpoints &{{ } {glusterfs-cluster lxiap /api/v1/namespaces/lxiap/endpoints/glusterfs-cluster 9fc14090-ef32-11e5-bcc1-fa163efe3ad5 46590 0 2016-03-21 07:00:41 +0000 UTC <nil> <nil> map[] map[]} [{[{10.66.79.108 <nil>} {10.66.79.134 <nil>}] [] [{ 1 TCP}]}]}
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.803019 5814 nsenter_mount.go:174] findmnt command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/findmnt -o target --noheadings --target /var/lib/origin/openshift.local.volumes/pods/75b13c7d-ef32-11e5-bcc1-fa163efe3ad5/volumes/kubernetes.io~glusterfs/gluster]
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.810278 5814 nsenter_mount.go:179] Failed findmnt command: exit status 1
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.810331 5814 glusterfs.go:167] glusterfs: mount set up: /var/lib/origin/openshift.local.volumes/pods/75b13c7d-ef32-11e5-bcc1-fa163efe3ad5/volumes/kubernetes.io~glusterfs/gluster false <nil>
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.810529 5814 nsenter_mount.go:114] nsenter Mounting 10.66.79.108:testvol /var/lib/origin/openshift.local.volumes/pods/75b13c7d-ef32-11e5-bcc1-fa163efe3ad5/volumes/kubernetes.io~glusterfs/gluster glusterfs [log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/gluster/glusterfs.log]
Mar 21 07:01:04 openshift-133.lab.sjc.redhat.com atomic-openshift-node[5764]: I0321 07:01:04.810562 5814 nsenter_mount.go:117] Mount command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/mount -t glusterfs -o log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/gluster/glusterfs.log 10.66.79.108:testvol /var/lib/origin/openshift.local.volumes/pods/75b13c7d-ef32-11e5-bcc1-fa163efe3ad5/volumes/kubernetes.io~glusterfs/gluster]
```

Liang, does installing the missing packages in the container fix the issue? You can probably just:

```
docker exec -t <NODE_CONTAINER_ID> yum -y install ceph-common gluster
```

(In reply to Sami Wagiaalla from comment #26)
> Does installing the missing packages in the container fix the issue ?
> You can probably just:
> docker exec -t <NODE_CONTAINER_ID> yum -y install ceph-common gluster

Scratch that... after looking at the code, the packages need to be installed on the host. The FS host packages are determined by the installation script based on the user's input. Can you verify the host environment, and that you specified during the config script that the Gluster packages should be installed?

The node (the host) has the necessary packages installed:

```
$ ssh root.sjc.redhat.com rpm -qa | grep gluster
glusterfs-3.7.1-17.atomic.1.el7.x86_64
glusterfs-fuse-3.7.1-17.atomic.1.el7.x86_64
glusterfs-libs-3.7.1-17.atomic.1.el7.x86_64
glusterfs-client-xlators-3.7.1-17.atomic.1.el7.x86_64
```

I'm not seeing a problem/failure in the logs. The only error-like lines were:

```
nsenter_mount.go:174] findmnt command: nsenter [--mount=/rootfs/proc/1/ns/mnt -- /bin/findmnt -o target --noheadings --target /var/lib/origin/openshift.local.volumes/pods/75b13c7d-ef32-11e5-bcc1-fa163efe3ad5/volumes/kubernetes.io~glusterfs/gluster]
nsenter_mount.go:179] Failed findmnt command: exit status 1
```

What version of util-linux do you have installed (on the host)? The 'error' is normal on some versions.

Why did this get kicked back to ASSIGNED? The logs don't show a failure. Maybe there is something useful/informative in /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/gluster/glusterfs.log?

Sorry, the "exit status 1" is caused by:

```
[2016-03-23 08:47:21.141106] E [name.c:242:af_inet_client_get_remote_sockaddr] 0-testvol-client-1: DNS resolution failed on host glusterfs-node02
```

After fixing the DNS issue, GlusterFS is working fine. Also tested NFS, which is working fine. So this bug is fixed.

Liang, if everything is fixed now, can you please mark this bug as VERIFIED?

Moving the bug to verified.

I still consider it a bug, because this is only fixed on Atomic Host as of comment 29. If OSE or AEP is set up containerized, Ansible skips installation of the necessary storage packages, so this issue still reproduces when a user tries to create a pod (e.g. with GlusterFS). The solution is to install these packages on the node if they are not there, but end users may not know this. Either the Ansible installation needs to handle that, or our documentation should tell users how to deal with such containerized installations. I'm changing this bug to ASSIGNED.

There seems to be some misunderstanding. Mount utilities need to be on the host (e.g. mount.nfs, mount.glusterfs). All other utilities need to be in the container (mkfs.xfs, mkfs.ext4, iscsiadm, rbd, ...). I think that's where we are now with openshift-node 3.2.0.7 and Atomic Host: all utilities are in the right places. Some volume plugins don't work because of bug #1313210 - I'm working on that.
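That division of labor can be spot-checked from the node; a rough verification sketch in which the container ID and the exact binary lists are illustrative, not taken from this report:

```bash
# Mount helpers belong on the host, so check them via the host mount namespace.
docker exec -t <NODE_CONTAINER_ID> nsenter --mount=/rootfs/proc/1/ns/mnt -- \
    sh -c 'for b in mount.nfs mount.glusterfs; do command -v "$b" || echo "$b missing on host"; done'

# Filesystem/attach utilities belong inside the node container image.
docker exec -t <NODE_CONTAINER_ID> \
    sh -c 'for b in mkfs.xfs mkfs.ext4 iscsiadm rbd; do command -v "$b" || echo "$b missing in container"; done'
```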
(In reply to Hou Jianwei from comment #34)
> I still consider it a bug because this is only fixed on AH as comment 29. If
> OSE or AEP is setup as containerized, ansible skips installation of
> necessary storage packages, therefore this issue is still reproduced if user
> is trying to create a pod (eg, glusterfs). The solution is to install these
> packages on the node if they are not there, however the end users may not
> know this. I think either ansible installation need to handle that, or our
> documentation should tell users how to deal with such container
> installations. I'm changing this bug as assigned.

This should be the case if you've used the latest installer. Can you confirm that you're using either the latest atomic-openshift-utils or the latest checkout of openshift-ansible?

(In reply to Scott Dodson from comment #37)
> This should be the case if you've used the latest installer, can you confirm
> that you're using either the latest atomic-openshift-utils or latest checkout
> of openshift-ansible?

@sdodson you are right. I used the latest installer today and now the RPM packages are available. The issue for containerized installation is resolved, thank you! Now I think this bug is good to verify and close.

Moving the bug to verified based on comment 31 and comment 38.

This bug was fixed in a previous release of the installer.