Bug 1466249 - [glusterfs] Improve error messaging for failed volume mount
Summary: [glusterfs] Improve error messaging for failed volume mount
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Storage
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Bradley Childs
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks: 1473587
 
Reported: 2017-06-29 10:52 UTC by Jianwei Hou
Modified: 2017-11-28 21:59 UTC (History)
7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1473587
Environment:
Last Closed: 2017-11-28 21:59:33 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2017:3188 0 normal SHIPPED_LIVE Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update 2017-11-29 02:34:54 UTC

Description Jianwei Hou 2017-06-29 10:52:45 UTC
Description of problem:
Create a glusterfs PV with invalid endpoints. When the Pod fails to mount, there are no useful messages describing how or why it failed.

Version-Release number of selected component (if applicable):
openshift v3.6.126.1
kubernetes v1.6.1+5115d708d7

How reproducible:
Always


Steps to Reproduce:
1. Create an invalid Endpoints object, then create a glusterfs PV that uses this endpoints
2. Create a PVC and a Pod
3. The Pod fails to mount; run 'oc describe pod'

Actual results:
No useful message except:

```
  2m            2m              1       default-scheduler                                       Normal          Scheduled       Successfully assigned gluster to ip-172-18-7-39.ec2.internal
  8s            8s              1       kubelet, ip-172-18-7-39.ec2.internal                    Warning         FailedMount     Unable to mount volumes for pod "gluster_bndck(11cab1ac-5cb8-11e7-af12-0e8aebc0d9ea)": timeout expired waiting for volumes to attach/mount for pod "bndck"/"gluster". list of unattached/unmounted volumes=[gluster]
  8s            8s              1       kubelet, ip-172-18-7-39.ec2.internal                    Warning         FailedSync      Error syncing pod
```

Expected results:
In earlier versions we could expect "glusterfs: mount failed" from 'oc describe pod' and then find out how and why the mount failed.

Master Log:

Node Log (of failed PODs):

Endpoints Dump:
apiVersion: v1
items:
- apiVersion: v1
  kind: Endpoints
  metadata:
    creationTimestamp: null
    name: glusterfs-cluster
  subsets:
  - addresses:
    - ip: 11.11.11.114
    - ip: 11.11.11.118
    ports:
    - port: 1
      protocol: TCP
kind: List
metadata: {}

PV Dump:
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: null
  name: gluster-bndck
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: glusterc
    namespace: bndck
    resourceVersion: "14597"
    uid: 0f51dfe7-5cb8-11e7-af12-0e8aebc0d9ea
  glusterfs:
    endpoints: glusterfs-cluster
    path: testvol
  persistentVolumeReclaimPolicy: Retain
status: {}


PVC Dump:

StorageClass Dump (if StorageClass used by PV/PVC):

Additional info:

Comment 1 Jan Safranek 2017-07-04 13:49:15 UTC
Jianwei, how exactly is the endpoint wrong? I created an endpoint with an invalid IP and an otherwise correct PV + PVC + Pod, and "oc describe pod" is very verbose about the bad IP address:



  FirstSeen	LastSeen	Count	From								SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----								-------------	--------	------		-------
  8s		8s		1	default-scheduler								Normal		Scheduled	Successfully assigned testpod to host-8-174-8.host.centralci.eng.rdu2.redhat.com
  2s		2s		1	kubelet, host-8-174-8.host.centralci.eng.rdu2.redhat.com			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/glusterfs/f81b5ab0-60be-11e7-842b-fa163ef34b22-gluster-pv" (spec.Name: "gluster-pv") pod "f81b5ab0-60be-11e7-842b-fa163ef34b22" (UID: "f81b5ab0-60be-11e7-842b-fa163ef34b22") with: glusterfs: mount failed: mount failed: exit status 1
Mounting command: mount
Mounting arguments: 172.17.0.3:test_vol /var/lib/origin/openshift.local.volumes/pods/f81b5ab0-60be-11e7-842b-fa163ef34b22/volumes/kubernetes.io~glusterfs/gluster-pv glusterfs [log-level=ERROR log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/gluster-pv/testpod-glusterfs.log]
Output: Mount failed. Please check the log file for more details.

 the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2017-07-04 13:45:04.424365] E [socket.c:2395:socket_connect_finish] 0-glusterfs: connection to 172.17.0.3:24007 failed (No route to host)
[2017-07-04 13:45:04.424468] E [glusterfsd-mgmt.c:1907:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 172.17.0.3 (Transport endpoint is not connected)



It's probably too much information, but at least there is "connection to 172.17.0.3:24007 failed (No route to host)" somewhere.

oc v3.6.133
kubernetes v1.6.1+5115d708d7

Comment 2 Jianwei Hou 2017-07-05 08:34:51 UTC
@jsafrane
I've tried it again, and when a single invalid IP is given in the endpoint, I do get the logs:

Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  8m		8m		1	default-scheduler				Normal		Scheduled	Successfully assigned gluster to node1.vsphere.local
  4m		4m		1	kubelet, node1.vsphere.local			Warning		FailedMount	MountVolume.SetUp failed for volume "kubernetes.io/glusterfs/ff912ad5-6159-11e7-99b2-0050569f4dcf-gluster" (spec.Name: "gluster") pod "ff912ad5-6159-11e7-99b2-0050569f4dcf" (UID: "ff912ad5-6159-11e7-99b2-0050569f4dcf") with: glusterfs: mount failed: mount failed: exit status 1
Mounting command: mount
Mounting arguments: 11.66.147.137:testvol /var/lib/origin/openshift.local.volumes/pods/ff912ad5-6159-11e7-99b2-0050569f4dcf/volumes/kubernetes.io~glusterfs/gluster glusterfs [log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/gluster/gluster-glusterfs.log log-level=ERROR]
Output: Mount failed. Please check the log file for more details.

 the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2017-07-05 08:18:57.460446] E [socket.c:2395:socket_connect_finish] 0-glusterfs: connection to 11.66.147.137:24007 failed (Connection timed out)
[2017-07-05 08:18:57.460625] E [glusterfsd-mgmt.c:1907:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 11.66.147.137 (Transport endpoint is not connected)

  6m	1m	3	kubelet, node1.vsphere.local		Warning	FailedMount	Unable to mount volumes for pod "gluster_jhou(ff912ad5-6159-11e7-99b2-0050569f4dcf)": timeout expired waiting for volumes to attach/mount for pod "jhou"/"gluster". list of unattached/unmounted volumes=[gluster]
  6m	1m	3	kubelet, node1.vsphere.local		Warning	FailedSync	Error syncing pod


But with this endpoint (two invalid IPs), the problem was reproduced.

oc export endpoints                         
apiVersion: v1
items:
- apiVersion: v1
  kind: Endpoints
  metadata:
    creationTimestamp: null
    name: glusterfs-cluster
  subsets:
  - addresses:
    - ip: 11.11.11.114
    - ip: 11.11.11.118
    ports:
    - port: 1
      protocol: TCP
kind: List
metadata: {}

Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  2m		2m		1	default-scheduler				Normal		Scheduled	Successfully assigned gluster to node1.vsphere.local
  45s		45s		1	kubelet, node1.vsphere.local			Warning		FailedMount	Unable to mount volumes for pod "gluster_jhou(ec7f7a87-615b-11e7-99b2-0050569f4dcf)": timeout expired waiting for volumes to attach/mount for pod "jhou"/"gluster". list of unattached/unmounted volumes=[gluster]
  45s		45s		1	kubelet, node1.vsphere.local			Warning		FailedSync	Error syncing pod

Comment 3 Jan Safranek 2017-07-10 12:31:39 UTC
I see. I can reproduce the bug with two IP addresses that drop IP packets and don't send "connection refused".

The reason is that kubelet tries to mount the filesystem several times:

- 11.11.11.114:/volume
- 11.11.11.114:/volume again, now without auto_unmount
- 11.11.11.118:/volume
- 11.11.11.118:/volume again, now without auto_unmount

Each of these attempts takes 2 minutes to time out, so together it takes 8 minutes for kubelet to realize that something is wrong. Indeed, after 8 minutes it sends the usual event:

  FirstSeen     LastSeen        Count   From                            SubObjectPath   Type            Reason          Message
  ---------     --------        -----   ----                            -------------   --------        ------          -------
  9m            9m              1       default-scheduler                               Normal          Scheduled       Successfully assigned glusterfs to localhost.localdomain
  7m            1m              4       kubelet, localhost.localdomain                  Warning         FailedMount     Unable to mount volumes for pod "glusterfs_default(4221407c-6568-11e7-862d-5254000ac840)": timeout expired waiting for volumes to attach/mount for pod "default"/"glusterfs". list of unattached/unmounted volumes=[glusterfsvol]
  7m            1m              4       kubelet, localhost.localdomain                  Warning         FailedSync      Error syncing pod
  1m            1m              1       kubelet, localhost.localdomain                  Warning         FailedMount     MountVolume.SetUp failed for volume "kubernetes.io/glusterfs/4221407c-6568-11e7-862d-5254000ac840-glusterfsvol" (spec.Name: "glusterfsvol") pod "4221407c-6568-11e7-862d-5254000ac840" (UID: "4221407c-6568-11e7-862d-5254000ac840") with: glusterfs: mount failed: mount failed: exit status 1

...

glusterfs: connection to 11.11.11.118:24007 failed (Connection timed out); disconnecting socket


As an interim fix, blindly retrying the same mount without auto_unmount is useless if we know that auto_unmount was not the culprit. Still, for two IP addresses it will take 2x2 minutes to time out, and for 5 IP addresses it will take 10 minutes... It does not scale well.

@Humble, is there a possibility to reduce the initial timeout to say 30 seconds? We would get feedback in one minute for two servers and 2.5 minutes for five servers - long, but acceptable.
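
For illustration, here is a hypothetical Go sketch of the per-endpoint retry pattern described above (names and structure are illustrative only, not the actual kubelet/glusterfs plugin code); it shows why the wall-clock time grows linearly with the number of endpoints:

```
// Hypothetical sketch of the mount retry behaviour described in this comment;
// not the real kubelet/glusterfs plugin code.
package main

import (
	"fmt"
	"os/exec"
	"time"
)

// tryMount runs one glusterfs mount attempt. Against a host that silently
// drops packets, each failed attempt blocks for roughly the TCP connect
// timeout (~2 minutes).
func tryMount(addr, volume, target string, autoUnmount bool) error {
	opts := "log-level=ERROR"
	if autoUnmount {
		opts += ",auto_unmount"
	}
	cmd := exec.Command("mount", "-t", "glusterfs", "-o", opts,
		fmt.Sprintf("%s:%s", addr, volume), target)
	return cmd.Run()
}

func mountGluster(addrs []string, volume, target string) error {
	start := time.Now()
	var lastErr error
	for _, addr := range addrs {
		// First attempt with auto_unmount, then a blind retry without it,
		// matching the four attempts listed above.
		for _, auto := range []bool{true, false} {
			if lastErr = tryMount(addr, volume, target, auto); lastErr == nil {
				return nil
			}
		}
	}
	// With two unreachable endpoints this only returns after ~4 x 2 minutes.
	return fmt.Errorf("all endpoints failed after %v: %v", time.Since(start), lastErr)
}

func main() {
	fmt.Println(mountGluster([]string{"11.11.11.114", "11.11.11.118"}, "testvol", "/mnt/gluster"))
}
```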

Comment 8 Humble Chirammal 2017-07-13 08:49:31 UTC
(In reply to Jan Safranek from comment #3)


> 
> @Humble, is there a possibility to reduce the initial timeout to say 30
> seconds? We would get feedback in one minute for two servers and 2.5 minutes
> for five servers - long, but acceptable.

Jan, Glusterfs RPC engineers are taking a look at this to confirm whether any extra delay is introduced from the gluster layer in this case. I am setting needinfo accordingly.

Comment 9 Milind Changire 2017-07-13 09:50:07 UTC
(In reply to Humble Chirammal from comment #8)
> (In reply to Jan Safranek from comment #3)
> 
> 
> > 
> > @Humble, is there a possibility to reduce the initial timeout to say 30
> > seconds? We would get feedback in one minute for two servers and 2.5 minutes
> > for five servers - long, but acceptable.
> 
> Jan, Glusterfs RPC engineers are taking a look at this to confirm whether
> any extra delay is introduced from the gluster layer in this case. I am
> setting needinfo accordingly.

With a test program connecting to an invalid address, the connect() system call times out after 2m 10s on my F25 laptop. Seems like this is the scenario encountered with gluster as well.

So, that is the default timeout when a client connects to gluster bricks. To reduce the timeout to the desired 30s, gluster would need to set the O_NONBLOCK flag on the socket fd and then wait in a select() syscall for at most 30s to get the desired effect. This needs work to be done at the gluster end.
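
As a rough illustration of capping the connect phase (gluster itself is C and would use O_NONBLOCK plus select() as described above; this hypothetical Go snippet only demonstrates the effect of a 30-second bound):

```
package main

import (
	"fmt"
	"net"
	"time"
)

func main() {
	// Bound the TCP connect at 30s instead of the kernel default,
	// which is roughly 2m10s against a peer that silently drops packets.
	conn, err := net.DialTimeout("tcp", "11.11.11.114:24007", 30*time.Second)
	if err != nil {
		fmt.Println("brick unreachable:", err) // reported after ~30s, not ~2m10s
		return
	}
	defer conn.Close()
	fmt.Println("connected to", conn.RemoteAddr())
}
```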

The timeout is also affected by the transport.socket.ignore-enoent gluster volume option. If it is set to "on", a connect failure is presumed to possibly mean that the bricks are not yet fully functional, and at most 5 successive connection attempts are made before a connection failure is finally declared.

Other than these factors, you could try tweaking the /proc/sys/net/ipv4/* system-wide TCP tunables for the desired effect. Documentation/networking/ip-sysctl.txt in the kernel source tree describes these tunables.

Comment 10 Jan Safranek 2017-07-14 08:31:54 UTC
I posted https://github.com/openshift/origin/pull/15199, which removes the retry without auto_unmount when it is not necessary.

Event "glusterfsd-mgmt: failed to connect with remote-host: 11.66.147." will appear in half of the time. That would remove "Regression" from this bug, the timeout would be the same as in previous OpenShift releases.

Faster gluster timeouts must be fixed on the Gluster side; I filed #1470986 there.

Comment 11 Humble Chirammal 2017-07-21 07:03:05 UTC
(In reply to Jan Safranek from comment #10)
> I posted https://github.com/openshift/origin/pull/15199, which removes the
> retry without auto_unmount when it is not necessary.

Thanks Jan! This will be pulled into the OCP 3.6 tree, won't it?

> 
> Event "glusterfsd-mgmt: failed to connect with remote-host: 11.66.147." will
> appear in half of the time. That would remove "Regression" from this bug,
> the timeout would be the same as in previous OpenShift releases.
> 
> Faster gluster timeouts must be fixed on Gluster side, I filled #1470986
> there.


Thanks. I will follow up with Gluster RPC engineers.

Comment 12 Jan Safranek 2017-07-21 08:16:37 UTC
It would be very useful to also pull https://github.com/kubernetes/kubernetes/pull/49127, as it reduces the timeout from 2 minutes * number of endpoints to 2 minutes.

And yes, it would be nice to have in 3.6.x. PRs will follow shortly.
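
For reference, the v3.7 mount arguments later shown in comment 16 pass the endpoints to a single mount invocation via backup-volfile-servers, which lets glusterfs fail over internally instead of kubelet retrying each address. A hypothetical Go sketch of assembling such arguments (illustrative only, not the actual plugin code):

```
package main

import (
	"fmt"
	"strings"
)

// buildMountArgs builds one glusterfs mount command; extra endpoints become
// backup-volfile-servers, so the worst case is one connect timeout rather
// than one timeout per endpoint.
func buildMountArgs(addrs []string, volume, target string) []string {
	opts := []string{"log-level=ERROR"}
	if len(addrs) > 1 {
		opts = append(opts, "backup-volfile-servers="+strings.Join(addrs[1:], ":"))
	}
	return []string{
		"-t", "glusterfs",
		"-o", strings.Join(opts, ","),
		fmt.Sprintf("%s:%s", addrs[0], volume),
		target,
	}
}

func main() {
	args := buildMountArgs([]string{"11.11.11.114", "11.11.11.118"}, "testvol", "/mnt/gluster")
	fmt.Println("mount", strings.Join(args, " "))
}
```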

Comment 13 Jan Safranek 2017-07-21 08:17:50 UTC
Bumping priority - it's a regression after all.

Comment 14 Jan Safranek 2017-07-21 08:32:29 UTC
Cloned into 3.6.x as bug #1473587. From now on, this bug tracks the fix in 3.7.

Comment 16 Jianwei Hou 2017-09-25 06:51:36 UTC
Tested on v3.7.0-0.127.0

When the mount fails with two invalid IPs, the following messages are displayed. They appeared 6 minutes after the previous event. This does not match the description in comment 12; @jsafrane, could you take a look?

Events:
  FirstSeen	LastSeen	Count	From					SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----					-------------	--------	------			-------
  9m		9m		1	default-scheduler					Normal		Scheduled		Successfully assigned gluster to ocp37.lb.node1.vsphere.local
  9m		9m		1	kubelet, ocp37.lb.node1.vsphere.local			Normal		SuccessfulMountVolume	MountVolume.SetUp succeeded for volume "default-token-x5xwm" 
  3m		3m		1	kubelet, ocp37.lb.node1.vsphere.local			Warning		FailedMount		MountVolume.SetUp failed for volume "gluster" : glusterfs: mount failed: mount failed: exit status 1
Mounting command: systemd-run
Mounting arguments: --description=Kubernetes transient mount for /var/lib/origin/openshift.local.volumes/pods/6e623079-a1b7-11e7-a970-0050569f68e7/volumes/kubernetes.io~glusterfs/gluster --scope -- mount -t glusterfs -o backup-volfile-servers=11.11.11.114:11.11.11.118,log-level=ERROR,log-file=/var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/gluster/gluster-glusterfs.log 11.11.11.114:testvol /var/lib/origin/openshift.local.volumes/pods/6e623079-a1b7-11e7-a970-0050569f68e7/volumes/kubernetes.io~glusterfs/gluster
Output: Running scope as unit run-14258.scope.
Mount failed. Please check the log file for more details.

 the following error information was pulled from the glusterfs log to help diagnose this issue: 
[2017-09-25 06:09:04.071411] E [glusterfsd-mgmt.c:1907:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 11.11.11.114 (Transport endpoint is not connected)
[2017-09-25 06:11:13.095419] E [glusterfsd-mgmt.c:1907:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: 11.11.11.118 (Transport endpoint is not connected)

  7m	51s	4	kubelet, ocp37.lb.node1.vsphere.local		Warning	FailedMount	Unable to mount volumes for pod "gluster_jhou(6e623079-a1b7-11e7-a970-0050569f68e7)": timeout expired waiting for volumes to attach/mount for pod "jhou"/"gluster". list of unattached/unmounted volumes=[gluster]
  7m	51s	4	kubelet, ocp37.lb.node1.vsphere.local		Warning	FailedSync	Error syncing pod

Comment 17 Jan Safranek 2017-09-25 07:18:45 UTC
The main message here is "Transport endpoint is not connected", which is indeed different from "Connection timed out", but that's a change on the Gluster side. Kubernetes/OpenShift is just printing whatever Gluster provides.


So on the OpenShift side it's fixed. If you want, you may open a new bug on Gluster to log something better than "Transport endpoint is not connected".

Comment 18 Jianwei Hou 2017-09-25 09:31:42 UTC
Thank you!

This bug is verified according to comment 16.

Comment 21 errata-xmlrpc 2017-11-28 21:59:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

