Red Hat Bugzilla – Bug 1257854
[glusterD]: Brick status shows offline when glusterd is down on the peer node and glusterd is restarted on the other node in a two-node cluster.
Last modified: 2017-02-17 00:36:36 EST
Description of problem:
Brick status shows offline (N) in "gluster v status" output when glusterd is down on the peer node and glusterd is restarted on the other node in a two-node cluster.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Create a distributed volume using a cluster of two nodes (Node-1 & Node-2)
2. Check the volume status (gluster volume status <vol_name>)
3. Stop glusterd on Node-2
4. Check the volume status on Node-1
5. Restart glusterd on Node-1
6. Check the volume status on Node-1 again.
Actual results:
Brick status shows offline (N) even though the brick process is running.
Expected results:
Brick status should show online (Y) even after the glusterd restart.
Here, after the glusterd restart, the brick processes are running, but their status is not reported properly.
Created attachment 1067960 [details]
sos report on Node1
Created attachment 1067961 [details]
sos report on Node2 (peer node)
Can you check this behaviour and see what's wrong here?
(In reply to Atin Mukherjee from comment #7)
> Can you check this behaviour and see what's wrong here?
Will check this out.
On a two-node setup, if the glusterd instance on the first node is down and glusterd on the second node is restarted, glusterd_restart_bricks () is not called until glusterd on the first node comes back. This means that even though the brick process on node 2 is alive, glusterd will not be able to connect to it, since glusterd_brick_start (), which is called by glusterd_restart_bricks (), does that handling. So, in a nutshell, this is a known issue. We have had some upstream users complain about this: they want an option for cases where they don't care about split brains and still want to bring up the brick processes. With that in mind, we need to think about how to solve this. Whether it is worth fixing in GD 1.0 or 2.0 is a call we need to take; my vote would be for the latter.
Do you mind closing this bug and cloning it upstream with GlusterD2 as the component?
Closing this BZ as DEFERRED to GD2