1257037 – [AEP] Unable to mount Gluster

Summary: [AEP] Unable to mount Gluster

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	OKD
Classification:	Red Hat
Component:	Storage
Sub Component:
Version:	3.x
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	---
Target Release:	---
Assignee:	Mark Turansky
QA Contact:	Liang Xia
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-08-26 06:47 UTC by Jianwei Hou
Modified:	2015-09-25 08:28 UTC
CC List:	5 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2015-09-25 08:28:12 UTC
Target Upstream Version:
Embargoed:

Description Jianwei Hou 2015-08-26 06:47:39 UTC

Description of problem:
When creating a pod that uses a PVC as a volume, the pod stayed pending with the status 'xxx is ready, container is creating'. After digging into the logs, I found that race conditions occurred frequently.

Version-Release number of selected component (if applicable):
atomic-enterprise v3.0.1.100-97-g4539b18
kubernetes v1.1.0-alpha.0-1605-g44c91b1

How reproducible:
80% ~ 90%

Steps to Reproduce:
1. Create glusterfs endpoint, PV, PVC
oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/gluster/endpoints.json
oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/gluster/glusterfs.json
oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/gluster/claim.json

2. Create the pod
oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/persistent-volumes/gluster/pod.json

3. oc get pods

Actual results:
After step 3:
NAME        READY     STATUS                                                        RESTARTS   AGE
gluster     0/1       Image: jhou/hello-openshift is ready, container is creating   0          54m

`oc get events` only showed that the pod was successfully scheduled; there were no further events for the pod.

On the master, tracing /var/log/messages showed that race condition errors occurred frequently:

Aug 26 13:43:32 openshift-v3 atomic-openshift-master: E0826 13:43:32.411756    5519 persistentvolume_claim_binder_controller.go:142] PVClaimBinder could not update claim glusterc: Error updating volume: persistentvolumeclaims "glusterc" cannot be updated: the object has been modified; please apply your changes to the latest version and try again

Aug 26 12:54:25 openshift-v3 atomic-openshift-master: E0826 12:54:25.687738    5519 persistentvolume_claim_binder_controller.go:142] PVClaimBinder could not update claim nfsc-chao-project: Error updating volume: persistentvolumeclaims "nfsc-chao-project" cannot be updated: the object has been modified; please apply your changes to the latest version and try again

Aug 26 10:16:15 openshift-v3 atomic-openshift-master: E0826 10:16:15.787541    5519 persistentvolume_claim_binder_controller.go:142] PVClaimBinder could not update claim myclaim-1: Error updating volume: persistentvolumeclaims "myclaim-1" cannot be updated: the object has been modified; please apply your changes to the latest version and try again



Expected results:
Pod should be created successfully.

Additional info:

Comment 1 Mark Turansky 2015-09-10 14:59:34 UTC

The PVClaimBinder has nothing to do with mounting volumes for a pod.

What do you see in 'oc describe pod' ?

What do you see in that container's logs?

By the time the pod is created, the PVClaim should already be bound. That is the end of PVClaimBinder's role in this story.
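
One way to confirm that the claim is already bound before creating the pod is to look for the "Bound" phase in the `oc get pvc` listing. The following is a minimal sketch, not part of the original report: the helper name `claim_is_bound` is hypothetical, the claim name `glusterc` is taken from the master log above, and the column layout of `oc get pvc` is assumed to put the claim name in the first column.

```shell
#!/bin/sh
# Sketch: succeed if the named claim appears with status "Bound" in an
# `oc get pvc`-style table read from stdin.
# Assumption: the claim name is the first column of each row.
claim_is_bound() {
    awk -v name="$1" '
        $1 == name {
            for (i = 2; i <= NF; i++)
                if ($i == "Bound") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example usage (requires a logged-in oc client):
#   oc get pvc | claim_is_bound glusterc && echo "claim is bound"
```

If the claim is reported as bound and the pod still stays in "container is creating", the binder has done its part and the problem is further down, on the node.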

Comment 3 Liang Xia 2015-09-11 04:51:53 UTC

The root cause:
Sep 11 12:34:02 openshift-v3 docker: time="2015-09-11T12:34:02.629056859+08:00" level=info msg="GET /containers/json?all=1"
Sep 11 12:34:02 openshift-v3 atomic-openshift-node: E0911 12:34:02.628222    3988 mount_linux.go:103] Mount failed: exit status 32
Sep 11 12:34:02 openshift-v3 atomic-openshift-node: Mounting arguments: 10.66.79.108:testvol /var/lib/origin/openshift.local.volumes/pods/538d7ff4-583e-11e5-b094-fa163e4dc0dd/volumes/kubernetes.io~glusterfs/gluster glusterfs []
Sep 11 12:34:02 openshift-v3 atomic-openshift-node: Output: mount: unknown filesystem type 'glusterfs'
Sep 11 12:34:02 openshift-v3 atomic-openshift-node: E0911 12:34:02.628264    3988 glusterfs.go:235] Glusterfs: mount failed: exit status 32
Sep 11 12:34:02 openshift-v3 atomic-openshift-node: E0911 12:34:02.628354    3988 kubelet.go:1206] Unable to mount volumes for pod "gluster_lxiap": exit status 32; skipping pod
Sep 11 12:34:02 openshift-v3 docker: time="2015-09-11T12:34:02.633546226+08:00" level=info msg="GET /images/jhou/hello-openshift/json"
Sep 11 12:34:02 openshift-v3 atomic-openshift-node: E0911 12:34:02.635055    3988 pod_workers.go:111] Error syncing pod 538d7ff4-583e-11e5-b094-fa163e4dc0dd, skipping: exit status 32
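
For anyone hitting the same symptom, the relevant lines can be filtered out of a busy node log with a small helper. This is only a sketch: the function name `mount_failures` is hypothetical, the patterns are copied from the excerpt above and may be worded differently in other releases, and the log location in the usage comment is an assumption.

```shell
#!/bin/sh
# Sketch: keep only the volume-mount failure lines from node log text on stdin.
# Patterns come from the log excerpt in this comment.
mount_failures() {
    grep -E 'Mount failed|unknown filesystem type|Unable to mount volumes'
}

# Example usage (assumes the node logs to /var/log/messages):
#   mount_failures < /var/log/messages
```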

Comment 4 Mark Turansky 2015-09-11 17:13:38 UTC

"failedMount: Unable to mount volumes for pod "gluster_lxiap": exit status 32"

The pod failed to mount Gluster.

This is not a race condition in the PV binder.

Comment 5 Mark Turansky 2015-09-11 17:24:12 UTC

Huamin Chen (hchen) can assist further.  He is the Gluster plugin author.

Comment 6 Mark Turansky 2015-09-14 15:16:32 UTC

I changed the title of this issue to reflect the actual problem. This is not a PV binding problem but a Gluster mounting problem.

Comment 7 Jan Safranek 2015-09-22 11:15:10 UTC

(In reply to Liang Xia from comment #3)
> Sep 11 12:34:02 openshift-v3 atomic-openshift-node: Output: mount: unknown
> filesystem type 'glusterfs'

You probably don't have the glusterfs client tools installed. Please check that the glusterfs-fuse package is installed.
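
A quick way to check this on the node is to look for the mount helper itself, since mount(8) looks for an external program named mount.<type> when asked to mount a filesystem type such as glusterfs. The sketch below is an illustration only: the function name `has_mount_helper` is hypothetical, the directories searched are an assumption about where the helper would be installed, and the yum hint assumes an RPM-based host.

```shell
#!/bin/sh
# Sketch: report whether a mount helper for the given filesystem type exists.
# Assumption: helpers live in one of the directories listed below.
has_mount_helper() {
    fstype=$1
    for dir in /sbin /usr/sbin; do
        if [ -x "$dir/mount.$fstype" ]; then
            return 0
        fi
    done
    return 1
}

if has_mount_helper glusterfs; then
    echo "mount.glusterfs found"
else
    echo "mount.glusterfs missing; on RPM-based hosts, try: yum install glusterfs-fuse"
fi
```

On RPM-based hosts, `rpm -q glusterfs-fuse` answers the same question at the package level.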
