Bug 1188471

Summary:	Mounting a volume hangs when the volume is stopped or all of its bricks are down
Product:	[Community] GlusterFS
Component:	disperse
Version:	3.6.1
Status:	CLOSED CURRENTRELEASE
Severity:	unspecified
Priority:	unspecified
Hardware:	Unspecified
OS:	Unspecified
Reporter:	Pranith Kumar K <pkarampu>
Assignee:	Pranith Kumar K <pkarampu>
CC:	bugs, iesool, rabhat
Fixed In Version:	glusterfs-v3.6.3
Doc Type:	Bug Fix
Type:	Bug
Clone Of:	1179180
Bug Depends On:	1179180
Bug Blocks:	1184460
Last Closed:	2016-02-04 15:20:08 UTC

Description Pranith Kumar K 2015-02-03 01:33:47 UTC

+++ This bug was initially created as a clone of Bug #1179180 +++

Description of problem:
When all the bricks are down at the time of mounting the volume, the mount
command hangs. If only as many bricks as there are data fragments are up, the
mount takes 5 seconds to succeed.

root@pranithk-laptop - ~ 
17:12:37 :( ⚡ glusterd && gluster volume create ec2 disperse 3 redundancy 1 pranithk-laptop:/home/gfs/ec_{2,3,4} force
volume create: ec2: success: please start the volume to access data

root@pranithk-laptop - ~ 
17:12:42 :) ⚡ mount -t glusterfs pranithk-laptop:/ec2 /mnt/fuse1
^C

root@pranithk-laptop - ~ 
17:12:55 :( ⚡ ls /mnt/fuse1



^C^C^C

The command above hung; I had to kill the mount process to get the prompt back.
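
For readers unfamiliar with the "fragment number" of bricks mentioned above, here is a minimal sketch of the arithmetic for the `disperse 3 redundancy 1` volume created in the transcript. It assumes the number of data fragments equals the disperse count minus the redundancy count; the function names are hypothetical and are not part of GlusterFS.

```python
def data_fragments(disperse: int, redundancy: int) -> int:
    """Number of bricks that must be up to serve data.

    Assumption: fragments = disperse count - redundancy count.
    """
    if not 0 < redundancy < disperse:
        raise ValueError("redundancy must be between 1 and disperse - 1")
    return disperse - redundancy


def can_serve(bricks_up: int, disperse: int, redundancy: int) -> bool:
    """True when enough bricks are up for the volume to be usable."""
    return bricks_up >= data_fragments(disperse, redundancy)


if __name__ == "__main__":
    # Volume from the transcript: disperse 3, redundancy 1.
    print(data_fragments(3, 1))   # 2
    print(can_serve(0, 3, 1))     # False: all bricks down, the case that hung
    print(can_serve(2, 3, 1))     # True: exactly the fragment number is up
```

With all three bricks down, `can_serve` is false, which is the situation in which the mount above never returned.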

How reproducible:
always

--- Additional comment from Anand Avati on 2015-01-06 06:49:19 EST ---

REVIEW: http://review.gluster.org/9396 (cluster/ec: Handle CHILD UP/DOWN in all cases) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-08 12:20:38 EST ---

REVIEW: http://review.gluster.org/9396 (cluster/ec: Handle CHILD UP/DOWN in all cases) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-28 04:05:14 EST ---

REVIEW: http://review.gluster.org/9396 (cluster/ec: Handle CHILD UP/DOWN in all cases) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-28 08:06:45 EST ---

REVIEW: http://review.gluster.org/9396 (cluster/ec: Handle CHILD UP/DOWN in all cases) posted (#4) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-28 10:53:56 EST ---

REVIEW: http://review.gluster.org/9396 (cluster/ec: Handle CHILD UP/DOWN in all cases) posted (#5) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-01-28 22:49:59 EST ---

COMMIT: http://review.gluster.org/9396 committed in master by Pranith Kumar Karampuri (pkarampu) 
------
commit a48b18d6f661f863371e625084a88a01aaf989f0
Author: Pranith Kumar K <pkarampu>
Date:   Thu Jan 8 15:39:40 2015 +0530

    cluster/ec: Handle CHILD UP/DOWN in all cases
    
    Problem:
    When all the bricks are down at the time of mounting the volume, the mount
    command hangs.
    
    Fix:
    1. Ignore all CHILD_CONNECTING events coming from subvolumes.
    2. On timer expiration (without enough up or down children) send
       CHILD_DOWN.
    3. Once enough up or down subvolumes are detected, send the appropriate event.
       When the remaining subvolumes go up or down without changing the overall
       ec-up/ec-down state, send CHILD_MODIFIED to the parent subvolumes.
    
    Change-Id: Ie0194dbadef2dce36ab5eb7beece84a6bf3c631c
    BUG: 1179180
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9396
    Reviewed-by: Xavier Hernandez <xhernandez>
    Tested-by: Gluster Build System <jenkins.com>
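
The three steps in the commit message above can be sketched roughly as follows. This is an illustrative model only, not the code from the ec xlator: the class, its method names, and the thresholds ("enough up" meaning at least `nodes - redundancy` children up, "enough down" meaning more than `redundancy` children down) are assumptions made for the example.

```python
from enum import Enum


class Event(Enum):
    CHILD_UP = "CHILD_UP"
    CHILD_DOWN = "CHILD_DOWN"
    CHILD_CONNECTING = "CHILD_CONNECTING"
    CHILD_MODIFIED = "CHILD_MODIFIED"


class EcNotifySketch:
    """Hypothetical model of the event handling described in the commit."""

    def __init__(self, nodes: int, redundancy: int):
        self.fragments = nodes - redundancy  # assumed "enough up" threshold
        self.redundancy = redundancy
        self.child_up = {}    # child index -> True (up) / False (down)
        self.ec_up = None     # None until the parent has been told something
        self.sent = []        # events propagated to the parent, in order

    def _overall(self):
        up = sum(1 for v in self.child_up.values() if v)
        down = len(self.child_up) - up
        if up >= self.fragments:
            return True
        # Once the parent has been notified, anything short of
        # "enough up" counts as down.
        if down > self.redundancy or self.ec_up is not None:
            return False
        return None  # not enough information yet

    def child_event(self, child: int, event: Event) -> None:
        if event is Event.CHILD_CONNECTING:
            return  # step 1: ignore CHILD_CONNECTING
        if event not in (Event.CHILD_UP, Event.CHILD_DOWN):
            raise ValueError(f"unexpected child event: {event}")
        self.child_up[child] = event is Event.CHILD_UP
        overall = self._overall()
        if overall is None:
            return
        if overall != self.ec_up:
            # step 3: overall state decided or changed
            self.ec_up = overall
            self.sent.append(Event.CHILD_UP if overall else Event.CHILD_DOWN)
        else:
            # step 3: a child changed but the overall state did not
            self.sent.append(Event.CHILD_MODIFIED)

    def timer_expired(self) -> None:
        # step 2: no decision before the timer fired, so report CHILD_DOWN
        if self.ec_up is None:
            self.ec_up = False
            self.sent.append(Event.CHILD_DOWN)
```

In this model, a mount against a volume whose bricks never report anything still receives CHILD_DOWN when the timer fires, instead of waiting indefinitely.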

--- Additional comment from Anand Avati on 2015-02-01 11:27:22 EST ---

REVIEW: http://review.gluster.org/9523 (cluster/ec: Wait for all bricks to notify before notifying parent) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-02-02 07:29:09 EST ---

REVIEW: http://review.gluster.org/9523 (cluster/ec: Wait for all bricks to notify before notifying parent) posted (#2) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-02-02 07:36:41 EST ---

REVIEW: http://review.gluster.org/9523 (cluster/ec: Wait for all bricks to notify before notifying parent) posted (#3) for review on master by Pranith Kumar Karampuri (pkarampu)

--- Additional comment from Anand Avati on 2015-02-02 15:21:57 EST ---

REVIEW: http://review.gluster.org/9523 (cluster/ec: Wait for all bricks to notify before notifying parent) posted (#5) for review on master by Vijay Bellur (vbellur)

--- Additional comment from Anand Avati on 2015-02-02 15:22:15 EST ---

COMMIT: http://review.gluster.org/9523 committed in master by Vijay Bellur (vbellur) 
------
commit da1ff66255017501f54c50b3c40eeea11b5fc38f
Author: Pranith Kumar K <pkarampu>
Date:   Sun Feb 1 15:03:46 2015 +0530

    cluster/ec: Wait for all bricks to notify before notifying parent
    
    This prevents spurious self-heals that can be triggered when the parent is
    notified before every brick has reported its state.
    
    Change-Id: I0b27c1c1fc7a58e2683cb1ca135117a85efcc6c9
    BUG: 1179180
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9523
    Reviewed-by: Xavier Hernandez <xhernandez>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Vijay Bellur <vbellur>
    Tested-by: Vijay Bellur <vbellur>
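
The idea named in the commit title above, holding back the notification to the parent until every brick has reported, can be sketched as below. This is a simplified illustration and not the actual patch: the class name, the timer fallback, and the `fragments` threshold are assumptions made for the example.

```python
class WaitForAllBricksSketch:
    """Hypothetical model: defer the parent notification until all bricks report."""

    def __init__(self, nodes: int, fragments: int):
        self.nodes = nodes
        self.fragments = fragments   # assumed number of bricks needed to be "up"
        self.reported = {}           # brick index -> True (up) / False (down)
        self.timer_fired = False

    def _ready(self) -> bool:
        # Assumption: a timer still bounds the wait if some brick never reports.
        return self.timer_fired or len(self.reported) == self.nodes

    def _decision(self):
        if not self._ready():
            return None              # keep the parent waiting
        up = sum(1 for v in self.reported.values() if v)
        return "CHILD_UP" if up >= self.fragments else "CHILD_DOWN"

    def brick_event(self, brick: int, is_up: bool):
        self.reported[brick] = is_up
        return self._decision()

    def timer_expired(self):
        self.timer_fired = True
        return self._decision()
```

For a 3-brick volume with 2 fragments, the second brick coming up does not yet produce CHILD_UP in this model; the parent is told only after the third brick has reported (or the timer fires), so it does not start operations while a brick that is about to come up still looks down.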

Comment 1 Anand Avati 2015-02-03 04:26:11 UTC

REVIEW: http://review.gluster.org/9551 (cluster/ec: Handle CHILD UP/DOWN in all cases) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)

Comment 2 Anand Avati 2015-02-03 04:26:19 UTC

REVIEW: http://review.gluster.org/9552 (cluster/ec: Wait for all bricks to notify before notifying parent) posted (#1) for review on release-3.6 by Pranith Kumar Karampuri (pkarampu)

Comment 3 Anand Avati 2015-03-30 07:20:42 UTC

COMMIT: http://review.gluster.org/9551 committed in release-3.6 by Raghavendra Bhat (raghavendra) 
------
commit d1eb4f520b35c1057c7cb3427a51dd6ae75cc61f
Author: Pranith Kumar K <pkarampu>
Date:   Thu Jan 8 15:39:40 2015 +0530

    cluster/ec: Handle CHILD UP/DOWN in all cases
    
            Backport of http://review.gluster.org/9396
    
    Problem:
    When all the bricks are down at the time of mounting the volume, the mount
    command hangs.
    
    Fix:
    1. Ignore all CHILD_CONNECTING events coming from subvolumes.
    2. On timer expiration (without enough up or down children) send
       CHILD_DOWN.
    3. Once enough up or down subvolumes are detected, send the appropriate event.
       When the remaining subvolumes go up or down without changing the overall
       ec-up/ec-down state, send CHILD_MODIFIED to the parent subvolumes.
    
    BUG: 1188471
    Change-Id: If92bd84107d49495cd104deb34601afe7f9b155c
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9551
    Reviewed-by: Xavier Hernandez <xhernandez>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>

Comment 4 Anand Avati 2015-03-30 07:21:00 UTC

COMMIT: http://review.gluster.org/9552 committed in release-3.6 by Raghavendra Bhat (raghavendra) 
------
commit bd7f4451aef70c4c968d3ca4e5996ffc96cf64fa
Author: Pranith Kumar K <pkarampu>
Date:   Sun Feb 1 15:03:46 2015 +0530

    cluster/ec: Wait for all bricks to notify before notifying parent
    
            Backport of http://review.gluster.org/9523
    
    This prevents spurious self-heals that can be triggered when the parent is
    notified before every brick has reported its state.
    
    BUG: 1188471
    Change-Id: Iaea335d59431d8d85a236963a365f5c791fc7c49
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/9552
    Reviewed-by: Xavier Hernandez <xhernandez>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra Bhat <raghavendra>

Comment 5 Kaushal 2016-02-04 15:20:08 UTC

This bug is being closed because a release has been made available that should address the reported issue. If the problem is still not fixed with glusterfs-v3.6.3, please open a new bug report.

glusterfs-v3.6.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-users/2015-April/021669.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user