Bug 1262262

Summary: Command on mount hung for more than ping-timeout interval
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Shruti Sampat <ssampat>
Component: replicate Assignee: Bug Updates Notification Mailing List <rhs-bugs>
Status: CLOSED WORKSFORME QA Contact: Anoop <annair>
Severity: high Docs Contact:
Priority: medium    
Version: unspecified CC: atumball, ravishankar, rhs-bugs, sasundar, storage-qa-internal
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-01-30 07:44:33 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Shruti Sampat 2015-09-11 09:59:15 UTC
Description of problem:
-----------------------

A getxattr command on the fuse mount of a volume hung for a very long time. Inspecting the mount process with gdb showed the following:

(gdb) p clnt->conn
$6 = {lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, 
  trans = 0x7faccc0cd7f0, config = {rpc_timeout = 0, remote_port = 0, remote_host = 0x0, ping_timeout = 0}, reconnect = 0x0, timer = 0x7facac000ad0, ping_timer = 0x0, rpc_clnt = 0x7faccc0bdc60, 
  connected = 1 '\001', saved_frames = 0x7faccc0cf6a0, frame_timeout = 1800, last_sent = {tv_sec = 1441955034, tv_usec = 568863}, last_received = {tv_sec = 1441961528, tv_usec = 393618}, ping_started = 0, 
  name = 0x7faccc0a7ca0 "test-client-0", ping_timeout = 42, pingcnt = 10, msgcnt = 136}

(gdb) p clnt->conn.ping_timer
$7 = (gf_timer_t *) 0x0
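
To put the timestamps in the gdb output in perspective, the short Python sketch below converts last_sent and last_received to UTC and compares the gap between them with the two timeouts shown in the same structure. The numbers are copied from the output above; the variable and function names are illustrative only, and the sketch does not establish which of these fields explains the hang.

```python
from datetime import datetime, timezone

# Values copied from the gdb dump of clnt->conn in this report.
last_sent_sec = 1441955034      # last_sent.tv_sec
last_received_sec = 1441961528  # last_received.tv_sec
ping_timeout_sec = 42           # ping_timeout
frame_timeout_sec = 1800        # frame_timeout


def to_utc(epoch_sec):
    """Render an epoch timestamp as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(epoch_sec, tz=timezone.utc).isoformat()


# Difference between the last received and the last sent timestamps.
gap_sec = last_received_sec - last_sent_sec

print("last_sent:    ", to_utc(last_sent_sec))
print("last_received:", to_utc(last_received_sec))
print("gap (seconds):", gap_sec)
print("gap exceeds ping_timeout: ", gap_sec > ping_timeout_sec)
print("gap exceeds frame_timeout:", gap_sec > frame_timeout_sec)
```

The gap works out to 6494 seconds (about 108 minutes), which is larger than both the 42-second ping_timeout and the 1800-second frame_timeout.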

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
glusterfs-3.7.1-14.el7rhgs.x86_64

How reproducible:
-----------------
Frequently

Steps to Reproduce:
-------------------
1. Create a 1x2 volume and mount it on a client via fuse.
2. Run the following command on the mount to get the split-brain status of a file:
# getfattr -n replica.split-brain-status file

Actual results:
---------------
The command did not return and the mount hung for a long time.

Expected results:
-----------------
The command should time out after the ping-timeout interval and the mount should not hang.
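
One way to check this expectation from a script is to run the command under a deadline and report whether it returned in time. The Python sketch below is only an illustration, not part of the original reproduction: the helper name run_with_deadline, the path /mnt/fuse/file, and the choice of 42 seconds (taken from ping_timeout in the gdb output) are assumptions.

```python
import subprocess


def run_with_deadline(cmd, deadline_sec):
    """Run cmd and return (timed_out, output).

    timed_out is True if the command did not finish within deadline_sec.
    """
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=deadline_sec,
        )
        return False, result.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return True, ""


# Example call (hypothetical path; 42 s mirrors ping_timeout above):
# timed_out, output = run_with_deadline(
#     ["getfattr", "-n", "replica.split-brain-status", "/mnt/fuse/file"], 42)
# print("hung past deadline" if timed_out else output)
```

Note that on timeout subprocess.run kills the child and then waits for it to exit, so if the child is blocked in a way that cannot be interrupted, the helper itself may not return promptly.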

Comment 2 Amar Tumballi 2018-01-30 07:44:33 UTC
This is not seen in recent releases:

[root@local glusterfs]# mount -t glusterfs local:/demo /mnt/fuse
[root@local glusterfs]# echo hello > /mnt/fuse/new-file
[root@local glusterfs]# getfattr -n replica.split-brain-status /mnt/fuse/new-file
getfattr: Removing leading '/' from absolute path names
# file: mnt/fuse/new-file
replica.split-brain-status="The file is not under data or metadata split-brain"
[root@local glusterfs]#