Bug 853596

Summary: dd hung on fuse/nfs mount when new bricks were added to volume.
Product: [Red Hat Storage] Red Hat Gluster Storage
Component: glusterfs
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Vidya Sakar <vinaraya>
Assignee: Amar Tumballi <amarts>
QA Contact: spandura
CC: amarts, gluster-bugs, rfortier, rhs-bugs, sdharane, sgowda, shwetha.h.panduranga, vbellur, vraman
Type: Bug
Doc Type: Bug Fix
Clone Of: 818065
Bug Depends On: 818065
Last Closed: 2013-09-23 22:33:13 UTC

Description Vidya Sakar 2012-09-01 06:27:31 UTC
+++ This bug was initially created as a clone of Bug #818065 +++

Description of problem:
----------------------
When new bricks were added to a distribute-replicate volume (2x3), the dd operations on the fuse and nfs mounts hung and never received replies from the bricks.

Version-Release number of selected component (if applicable):
3.3.0qa39

How reproducible:


Steps to Reproduce:
---------------------
1. Create a distribute-replicate volume (2x3).
2. Create fuse and nfs mounts; run "gfsc1.sh" on the fuse mount and "nfsc1.sh" on the nfs mount.
3. Execute "gluster volume add-brick <vol_name> <new_brick1> <new_brick2> <new_brick3>" (a sketch of these steps follows below).
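
The gfsc1.sh and nfsc1.sh scripts are not attached to this bug, so the following is only a minimal sketch of the reproduction steps: it assumes the scripts simply run dd in a loop, and it reuses the hostnames and brick paths from the volume info output below. Mount options are typical for GlusterFS 3.3 (Gluster NFS serves NFSv3 over TCP).

# 1. Create and start a 2x3 distribute-replicate volume
gluster volume create dstore replica 3 \
    192.168.2.35:/export1/dstore1 192.168.2.36:/export1/dstore1 \
    192.168.2.35:/export1/dstore2 192.168.2.35:/export2/dstore1 \
    192.168.2.36:/export2/dstore1 192.168.2.36:/export1/dstore2
gluster volume start dstore

# 2. Mount via fuse and via Gluster NFS, then start dd load on both mounts
mount -t glusterfs 192.168.2.35:/dstore /mnt/fuse
mount -t nfs -o vers=3,proto=tcp,nolock 192.168.2.35:/dstore /mnt/nfs
while true; do dd if=/dev/zero of=/mnt/fuse/fuse_file bs=1M count=100; done &  # stand-in for gfsc1.sh
while true; do dd if=/dev/zero of=/mnt/nfs/nfs_file bs=1M count=100; done &    # stand-in for nfsc1.sh

# 3. Add three more bricks while dd is running (bricks 7-9 from the output below)
gluster volume add-brick dstore \
    192.168.2.35:/export2/dstore2 192.168.2.36:/export2/dstore2 192.168.2.35:/export10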

Actual results:
-------------
dd hung on both the fuse and nfs mounts. The client log reported the change to the volume file, but the client never received the new volume file.

Expected results:
------------------
dd should not hang. 

Additional info:
------------------
Initial Set-Up:-
~~~~~~~~~~~~~~~
[05/02/12 - 09:58:38 root@APP-SERVER1 ~]# gluster volume info
 
Volume Name: dstore
Type: Distributed-Replicate
Volume ID: bf26fea1-bc16-4064-980a-778f6f216d79
Status: Created
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 192.168.2.35:/export1/dstore1
Brick2: 192.168.2.36:/export1/dstore1
Brick3: 192.168.2.35:/export1/dstore2
Brick4: 192.168.2.35:/export2/dstore1
Brick5: 192.168.2.36:/export2/dstore1
Brick6: 192.168.2.36:/export1/dstore2

After add-brick:-
~~~~~~~~~~~~~~~~~
[05/02/12 - 10:45:53 root@APP-SERVER1 ~]# gluster volume info
 
Volume Name: dstore
Type: Distributed-Replicate
Volume ID: bf26fea1-bc16-4064-980a-778f6f216d79
Status: Started
Number of Bricks: 3 x 3 = 9
Transport-type: tcp
Bricks:
Brick1: 192.168.2.35:/export1/dstore1
Brick2: 192.168.2.36:/export1/dstore1
Brick3: 192.168.2.35:/export1/dstore2
Brick4: 192.168.2.35:/export2/dstore1
Brick5: 192.168.2.36:/export2/dstore1
Brick6: 192.168.2.36:/export1/dstore2
Brick7: 192.168.2.35:/export2/dstore2
Brick8: 192.168.2.36:/export2/dstore2
Brick9: 192.168.2.35:/export10

--- Additional comment from sgowda on 2012-05-02 03:05:11 EDT ---

Can you please attach the nfs/client/server logs from the time at which the
clients hung? Additionally, whenever there is a hang, a statedump of those
processes (client/nfs) would be beneficial.
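
For reference, a statedump can be taken with the gluster CLI (available from 3.3 onwards) or by sending SIGUSR1 to the hung process; where the dump file lands (/var/run/gluster or /tmp) depends on the build, so the commands below are only indicative.

# Dump the state of the brick processes; append "nfs" to dump the Gluster NFS server instead
gluster volume statedump dstore
gluster volume statedump dstore nfs

# For the fuse client, send SIGUSR1 to the glusterfs mount process
kill -USR1 $(pgrep -f 'glusterfs.*dstore')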

--- Additional comment from shwetha.h.panduranga on 2012-05-02 03:05:33 EDT ---

Created attachment 581528 [details]
client, brick logs

Comment 2 Amar Tumballi 2012-09-04 08:31:47 UTC
Asked for needinfo in the original bug too.

Comment 3 spandura 2012-12-11 05:45:06 UTC
This issue is no longer reproducible. Tested it on:

[11/09/12 - 06:38:37 root@king ~]# rpm -qa | grep gluster
glusterfs-fuse-3.3.0.5rhs-37.el6rhs.x86_64

[11/09/12 - 06:38:44 root@king ~]# gluster --version
glusterfs 3.3.0.5rhs built on Nov  8 2012 22:30:35

Comment 6 Scott Haines 2013-09-23 22:33:13 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. 

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-1262.html