Bug 1262262

Summary:	Command on mount hung for more than ping-timeout interval
Product:	[Red Hat Storage] Red Hat Gluster Storage	Reporter:	Shruti Sampat <ssampat>
Component:	replicate	Assignee:	Bug Updates Notification Mailing List <rhs-bugs>
Status:	CLOSED WORKSFORME	QA Contact:	Anoop <annair>
Severity:	high	Docs Contact:
Priority:	medium
Version:	unspecified	CC:	atumball, ravishankar, rhs-bugs, sasundar, storage-qa-internal
Target Milestone:	---	Keywords:	ZStream
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2018-01-30 07:44:33 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Shruti Sampat 2015-09-11 09:59:15 UTC

Description of problem:
-----------------------

A getxattr command on the FUSE mount of a volume hung for a very long time. Inspecting the mount process with gdb showed the following:

(gdb) p clnt->conn
$6 = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, 
  trans = 0x7faccc0cd7f0, config = {rpc_timeout = 0, remote_port = 0, remote_host = 0x0, ping_timeout = 0}, reconnect = 0x0, timer = 0x7facac000ad0, ping_timer = 0x0, rpc_clnt = 0x7faccc0bdc60, 
  connected = 1 '\001', saved_frames = 0x7faccc0cf6a0, frame_timeout = 1800, last_sent = {tv_sec = 1441955034, tv_usec = 568863}, last_received = {tv_sec = 1441961528, tv_usec = 393618}, ping_started = 0, 
  name = 0x7faccc0a7ca0 "test-client-0", ping_timeout = 42, pingcnt = 10, msgcnt = 136}

(gdb) p clnt->conn.ping_timer
$7 = (gf_timer_t *) 0x0
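
The timestamps in the dump above can be compared against the two timeouts with simple arithmetic. The following is only a sketch for reading the dump: the values are copied from the gdb output, the variable names are illustrative, and it makes no claim about why ping_timer is NULL.

```shell
#!/usr/bin/env bash
# Values copied from the gdb dump of clnt->conn above.
last_sent=1441955034       # last_sent.tv_sec
last_received=1441961528   # last_received.tv_sec
ping_timeout=42            # seconds
frame_timeout=1800         # seconds

# Seconds between the last send and the last receive on this connection.
gap=$(( last_received - last_sent ))

echo "gap between last_sent and last_received: ${gap}s"
# Shell arithmetic comparisons evaluate to 1 (true) or 0 (false).
echo "exceeds ping_timeout:  $(( gap > ping_timeout ))"
echo "exceeds frame_timeout: $(( gap > frame_timeout ))"
```

The gap is 6494 seconds (roughly 108 minutes), which is larger than both ping_timeout (42) and frame_timeout (1800).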

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.7.1-14.el7rhgs.x86_64

How reproducible:
-----------------
Frequently

Steps to Reproduce:
-------------------
1. Create a 1x2 volume and mount it on a client via fuse.
2. Run the following command on the mount to get the split-brain status of a file:
   # getfattr -n replica.split-brain-status file
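
When reproducing step 2, it can help to bound the command with a wall-clock deadline so that a hang is reported instead of blocking the shell. The sketch below uses coreutils `timeout`; the helper name run_with_deadline, the 60-second deadline, and the /mnt/fuse/file path are placeholders, and it assumes the hung process can still be terminated by SIGTERM.

```shell
#!/usr/bin/env bash
# Run a command with a wall-clock deadline using coreutils `timeout`.
# Returns 124 if the deadline expired, otherwise the command's own status.
run_with_deadline() {
    local seconds=$1
    shift
    timeout "$seconds" "$@"
}

# Hypothetical usage on the client (mount point and file name are placeholders):
#   run_with_deadline 60 getfattr -n replica.split-brain-status /mnt/fuse/file
#   [ $? -eq 124 ] && echo "getfattr did not return within 60s"
```

An exit status of 124 indicates that the command was still running when the deadline expired, which is the symptom described in this bug.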

Actual results:
---------------
The mount hung for a long time, well beyond the 42-second ping-timeout shown in the gdb output (ping_timeout = 42).

Expected results:
-----------------
The command should time out after the ping-timeout interval, and the mount should not hang.

Comment 2 Amar Tumballi 2018-01-30 07:44:33 UTC

The issue is not seen in recent releases:

[root@local glusterfs]# mount -t glusterfs local:/demo /mnt/fuse
[root@local glusterfs]# echo hello > /mnt/fuse/new-file
[root@local glusterfs]# getfattr -n replica.split-brain-status /mnt/fuse/new-file
getfattr: Removing leading '/' from absolute path names
# file: mnt/fuse/new-file
replica.split-brain-status="The file is not under data or metadata split-brain"
[root@local glusterfs]#