Bug 1569336

Summary: Volume status inode is broken with brickmux
Product: [Community] GlusterFS
Reporter: hari gowtham <hgowtham>
Component: glusterd
Assignee: hari gowtham <hgowtham>
Status: CLOSED CURRENTRELEASE
Severity: high
Priority: unspecified
Version: 3.12
CC: amukherj, bugs, hgowtham, moagrawa, rhinduja, rhs-bugs, rmadaka, storage-qa-internal, vbellur
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.12.15
Clone Of: 1566067
Last Closed: 2018-10-23 14:21:35 UTC
Type: Bug
Bug Depends On: 1566067, 1569346
Bug Blocks: 1559452

Comment 1 hari gowtham 2018-04-19 05:22:39 UTC
Description of problem:

The 'gluster volume status <vol> inode' command fails on subsequently created volumes when brick multiplexing is enabled.


Version-Release number of selected component (if applicable):

3.8.4-54-2

How reproducible:

Every time

Steps to Reproduce:
1. Create a 3-node cluster (n1, n2, n3).
2. Create 2 replica-3 volumes (v1, v2).
3. Mount the 2 volumes on two different clients (c1, c2).
4. Start running I/O in parallel on the two mount points.
5. While the I/O is running, start executing 'gluster volume status v1 inode' and
'gluster volume status v1 fd' frequently, with some time gap.
6. In the same way, run the volume status inode command for v2 as well.
7. Create a new distributed-replicated volume v3.
8. Run "gluster volume status v3 inode" and "gluster volume status v3 fd" on node n1 (see the shell sketch after this list).
9. The 'gluster volume status inode' and 'gluster volume status fd' commands fail for the newly created volume.
10. The bricks of volume v3 on node n1 go offline.
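
A shell sketch of the reproduction steps, for quick reference. The hostnames (n1, n2, n3), brick paths, and volume layout are placeholders, not the reporter's exact setup, and brick multiplexing is assumed to be enabled cluster-wide as a precondition:

# enable brick multiplexing for the cluster (assumed precondition)
gluster volume set all cluster.brick-multiplex on

# two replica-3 volumes (placeholder brick paths)
gluster volume create v1 replica 3 n1:/bricks/brick1/v1 n2:/bricks/brick1/v1 n3:/bricks/brick1/v1
gluster volume create v2 replica 3 n1:/bricks/brick2/v2 n2:/bricks/brick2/v2 n3:/bricks/brick2/v2
gluster volume start v1
gluster volume start v2

# mount v1 on client c1 and v2 on client c2, start parallel I/O on both,
# then poll the per-volume status while the I/O is running:
gluster volume status v1 inode
gluster volume status v1 fd
gluster volume status v2 inode
gluster volume status v2 fd

# create a new distributed-replicated (2x3) volume and query it from n1:
gluster volume create v3 replica 3 \
    n1:/bricks/brick3/v3 n2:/bricks/brick3/v3 n3:/bricks/brick3/v3 \
    n1:/bricks/brick4/v3 n2:/bricks/brick4/v3 n3:/bricks/brick4/v3
gluster volume start v3
gluster volume status v3 inode   # times out on affected builds
gluster volume status v3 fd      # times out on affected builds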

Actual results:

[root@dhcp37-113 home]# gluster vol status rp1 fd
Error : Request timed out
[root@dhcp37-113 home]# gluster vol status drp1 inode
Error : Request timed out

gluster vol status drp1
Status of volume: drp1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.37.113:/bricks/brick1/drp1      N/A       N/A        N       N/A  
Brick 10.70.37.157:/bricks/brick1/drp1      49152     0          Y       2125 
Brick 10.70.37.174:/bricks/brick1/drp1      49152     0          Y       2306 
Brick 10.70.37.113:/bricks/brick2/drp1      N/A       N/A        N       N/A  
Brick 10.70.37.157:/bricks/brick2/drp1      49152     0          Y       2125 
Brick 10.70.37.174:/bricks/brick2/drp1      49152     0          Y       2306 
Self-heal Daemon on localhost               N/A       N/A        Y       4507 
Self-heal Daemon on 10.70.37.157            N/A       N/A        Y       4006 
Self-heal Daemon on 10.70.37.174            N/A       N/A        Y       4111 
 
Task Status of Volume drp1



Expected results:

Bricks should not go offline, and the 'gluster volume status <vol> inode' and 'gluster volume status <vol> fd' commands should execute successfully.
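
A hedged sketch of standard gluster CLI checks that can help confirm brick multiplexing is in play and bring the offline bricks back up; the volume name drp1 is taken from the output above, and this is only a workaround, not a fix for the underlying glusterd issue:

# check whether brick multiplexing is enabled cluster-wide
gluster volume get all cluster.brick-multiplex

# with multiplexing on, all bricks of a node run inside one glusterfsd
# process, so the Pid column shows the same PID for every brick on that node
gluster volume status drp1

# workaround: force-start the volume to bring the offline bricks back online
gluster volume start drp1 force
gluster volume status drp1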

Comment 2 Atin Mukherjee 2018-09-18 03:52:55 UTC
Hari - is this patch backported to 3.12?

Comment 3 hari gowtham 2018-09-18 04:38:02 UTC
Hi Atin,

No, it wasn't backported; it has some dependent patches missing.
I will have to edit it to send it to 3.12.
Shall I proceed with it?

Regards,
Hari.

Comment 4 hari gowtham 2018-09-18 04:51:40 UTC
A patch was sent back then to backport it, but it was later abandoned:
https://review.gluster.org/#/c/glusterfs/+/19903/

Comment 5 Atin Mukherjee 2018-09-18 05:50:39 UTC
Well, I haven't seen any users reporting any problems around brick multiplexing in the 3.12 series where there were plenty of them. This points to the fact that we probably don't have many users who are using this feature with 3.12. If the backport effort is huge, I'd not go for it.

Comment 6 Atin Mukherjee 2018-09-18 05:51:28 UTC
(In reply to Atin Mukherjee from comment #5)
> Well, I haven't seen any users reporting any problems around brick
> multiplexing in the 3.12 series where there were plenty of them. This
> points to the fact that we probably don't have many users who are using
> this feature with 3.12. If the backport effort is huge, I'd not go for it.

Especially after we have released 4.0, 4.1 & now branch 5 is in place.

Comment 7 hari gowtham 2018-09-18 06:56:47 UTC
(In reply to Atin Mukherjee from comment #6)
> (In reply to Atin Mukherjee from comment #5)
> > Well, I haven't seen any users reporting any problems around brick
> > multiplexing in the 3.12 series where there were plenty of them. This
> > points to the fact that we probably don't have many users who are using
> > this feature with 3.12. If the backport effort is huge, I'd not go for it.
> 
> Especially after we have released 4.0, 4.1 & now branch 5 is in place.

As the fix had already been backported once, the changes were minimal, so I have backported it again.

Comment 8 Atin Mukherjee 2018-10-05 02:28:36 UTC
Given 3.12 is going to EOL, we have decided not to accept this backport considering we don't have any users reporting issues against brick multiplexing at 3.12.

Comment 9 hari gowtham 2018-10-08 09:32:07 UTC
Hi Atin,

Do we have to backport this, considering that there is going to be another release on 3.12? Users might come across this in the future.

Regards,
Hari.

Comment 10 Atin Mukherjee 2018-10-08 12:57:24 UTC
I'll leave it up to you to decide, but if no one has used the 3.12 series so far for trying brick multiplexing, they wouldn't start doing so on the last update; rather, they should be encouraged to upgrade to the latest versions.

Comment 11 hari gowtham 2018-10-09 07:31:37 UTC
I thought of providing the fix so that it would be there to avoid the issue if it pops up when someone gives it a try. But since they aren't using brick mux with 3.12, it makes sense for them to upgrade to the latest release instead.

Comment 12 hari gowtham 2018-10-11 10:29:57 UTC
The patch was merged by Jiffin as it was available for review.

https://review.gluster.org/#/c/glusterfs/+/19903/

I'm changing the status.

Comment 13 Shyamsundar 2018-10-23 14:21:35 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.12.15, please open a new bug report.

glusterfs-3.12.15 has been announced on the Gluster mailing lists [1], and packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2018-October/000114.html
[2] https://www.gluster.org/pipermail/gluster-users/
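
After upgrading, a quick way to confirm that the fixed build is running; these are standard version queries, and the package name below applies to RPM-based distributions and may differ elsewhere:

gluster --version          # should report glusterfs 3.12.15 or later
rpm -q glusterfs-server    # RPM-based systems only; package naming may vary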