Bug 1257854 - [glusterD]: Brick status showing offline when glusterd is down on peer node and restarted glusterd on the other node in the two node cluster.
Status: CLOSED DEFERRED
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterd
Version: 3.1
Hardware: x86_64 Linux
Severity: high
Assigned To: Samikshan Bairagya
QA Contact: Byreddy
Keywords: ZStream
Reported: 2015-08-28 05:19 EDT by Byreddy
Modified: 2017-02-17 00:36 EST
CC: 7 users

Doc Type: Bug Fix
Last Closed: 2017-02-17 00:36:36 EST
Type: Bug


Attachments
sos report on Node1 (7.76 MB, application/x-xz), 2015-08-28 07:57 EDT, Byreddy
sos report on Node2 (peer node) (7.27 MB, application/x-xz), 2015-08-28 07:58 EDT, Byreddy
Description Byreddy 2015-08-28 05:19:32 EDT
Description of problem:
Brick status shows offline (N) in "gluster v status" output when glusterd is down on the peer node and glusterd is restarted on the other node in a two-node cluster.

Version-Release number of selected component (if applicable):
glusterfs-3.7.1-13

How reproducible:
Always

Steps to Reproduce:
1. Create a distributed volume using a cluster of two nodes (Node-1 & Node-2).
2. Check the volume status (gluster volume status <vol_name>).
3. Stop glusterd on Node-2.
4. Check the volume status on Node-1.
5. Restart glusterd on Node-1.
6. Check the volume status on Node-1 again.
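
For reference, a transcript of these steps as shell commands (the hostnames, brick paths and volume name are placeholders; the systemctl calls assume a systemd-managed host such as RHEL 7):

# On Node-1: create and start a two-node distributed volume
gluster volume create distvol node-1:/bricks/brick1 node-2:/bricks/brick1
gluster volume start distvol
gluster volume status distvol    # both bricks report Online "Y"

# On Node-2: stop glusterd
systemctl stop glusterd

# Back on Node-1: restart glusterd and re-check
gluster volume status distvol
systemctl restart glusterd
gluster volume status distvol    # Node-1 brick now reports "N"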

Actual results:
Brick status shows offline (N) even though the brick process is running.

Expected results:
Brick status should show online (Y) even after a glusterd restart.

Additional info:
After the glusterd restart, the brick processes are running but their status is not reported correctly.
Comment 4 Byreddy 2015-08-28 07:57:11 EDT
Created attachment 1067960
sos report on Node1
Comment 5 Byreddy 2015-08-28 07:58:56 EDT
Created attachment 1067961
sos report on Node2 (peer node)
Comment 7 Atin Mukherjee 2016-06-23 13:00:11 EDT
Samikshan,

Can you check this behaviour and see what's wrong here?

~Atin
Comment 8 Samikshan Bairagya 2016-06-23 23:10:53 EDT
(In reply to Atin Mukherjee from comment #7)
> Samikshan,
> 
> Can you check this behaviour and see what's wrong here?
> 

Will check this out.
Comment 9 Atin Mukherjee 2016-06-29 05:20:50 EDT
On a two-node setup, if the glusterd instance on the first node is down and glusterd on the second node is restarted, glusterd_restart_bricks () is not called until glusterd on the first node comes back up. This means that even though the brick process on node 2 is alive, glusterd will not be able to connect to it, since that handling is done by glusterd_brick_start (), which is called from glusterd_restart_bricks (). So in a nutshell this is a known issue. Some upstream users have complained about this and want an option for cases where they do not care about split brains but still want the brick processes brought up. With that in mind we need to decide whether this is worth fixing in GD 1.0 or 2.0; my vote would be for the latter.
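
To illustrate, here is a minimal sketch in C of the control flow described above. The names mirror the real glusterd functions, but the data structures and signatures are simplified stand-ins, not the actual glusterd prototypes:

#include <stdio.h>
#include <stdbool.h>

struct brick {
    const char *path;
    bool        process_alive;  /* glusterfsd process is running */
    bool        connected;      /* glusterd holds a connection to it */
};

/* Stand-in for glusterd_brick_start(): (re)connects glusterd to a
 * brick process, spawning the process first if it is not running. */
static void glusterd_brick_start(struct brick *b)
{
    if (!b->process_alive)
        b->process_alive = true;  /* spawn glusterfsd */
    b->connected = true;          /* connect; status now reports Y */
}

/* Stand-in for glusterd_restart_bricks(): walks the local bricks
 * and calls glusterd_brick_start() on each one. */
static void glusterd_restart_bricks(struct brick *bricks, int n)
{
    for (int i = 0; i < n; i++)
        glusterd_brick_start(&bricks[i]);
}

/* Simplified model of glusterd start-up on node 2: the brick
 * restart path only runs once the peer's glusterd is reachable. */
static void glusterd_startup(struct brick *bricks, int n, bool peer_up)
{
    if (peer_up)
        glusterd_restart_bricks(bricks, n);
    /* else: glusterd_restart_bricks() is deferred, glusterd never
     * re-attaches to the still-running brick, and
     * "gluster volume status" shows it offline (N). */
}

int main(void)
{
    /* The brick process survived the glusterd restart... */
    struct brick b = { "/bricks/brick1", true, false };

    /* ...but the peer (node 1) is down, so the reconnect is skipped. */
    glusterd_startup(&b, 1, false);

    printf("process alive: %c, status online: %c\n",
           b.process_alive ? 'Y' : 'N', b.connected ? 'Y' : 'N');
    return 0;
}

Running the sketch prints "process alive: Y, status online: N", which matches what the reporter sees in step 6.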

Byreddy,

Do you mind closing this bug and cloning it upstream with GlusterD2 as the component?

~Atin
Comment 10 Byreddy 2017-02-17 00:36:36 EST
Closing this BZ as DEFERRED to GD2
