Bug 1612098 - Brick not coming up on a volume after rebooting the node
Summary: Brick not coming up on a volume after rebooting the node
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Mohit Agrawal
QA Contact: Upasana
URL:
Whiteboard:
Depends On:
Blocks: 1503137 1577800 1612418 1619158
 
Reported: 2018-08-03 13:25 UTC by Upasana
Modified: 2018-10-02 15:42 UTC
CC List: 11 users

Fixed In Version: glusterfs-3.12.2-16
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1612418 (view as bug list)
Environment:
Last Closed: 2018-09-04 06:51:13 UTC
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2607 0 None None None 2018-09-04 06:52:51 UTC

Description Upasana 2018-08-03 13:25:59 UTC
Description of problem:
=======================
After rebooting a node that hosts a brick belonging to a volume, the brick does not come up.


Version-Release number of selected component (if applicable):
===========================================================
server-3.12.2-15.el7rhgs.x86_64


How reproducible:
=================
3/3


Steps to Reproduce:
==================
1. Create a volume.
2. Reboot a node that hosts a brick of that volume.
3. After the reboot, the brick does not come up (see the CLI sketch below).
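
For reference, an equivalent CLI sequence for the replica case in this report; the volume name, hosts, and brick paths are the ones shown in the status output under "Additional info" and may differ in other setups:

# create and start a 3-brick replica volume (hosts/paths as in this report)
gluster volume create test replica 3 \
    10.70.35.56:/gluster/brick1/r1 \
    10.70.35.228:/gluster/brick1/r1 \
    10.70.35.17:/gluster/brick3/r1
gluster volume start test

# reboot one of the brick nodes, e.g. 10.70.35.56
reboot

# once the node is back, its brick stays offline (Online = N)
gluster volume status test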

Actual results:
==============
The brick does not come up after the reboot.


Expected results:
=================
The brick should come up after the reboot.


Additional info:
================
On an EC (dispersed) volume:

[root@dhcp35-56 ~]# gluster v status
Status of volume: dispersed
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.56:/gluster/brick1/vol2      N/A       N/A        N       N/A  
Brick 10.70.35.228:/gluster/brick1/vol2     49152     0          Y       23860
Brick 10.70.35.17:/gluster/brick3/vol2      49152     0          Y       3190 
Brick 10.70.35.3:/gluster/brick1/vol2       49152     0          Y       22153
Brick 10.70.35.27:/gluster/brick1/vol2      49152     0          Y       13528
Brick 10.70.35.130:/gluster/brick1/vol2     N/A       N/A        N       N/A  
Self-heal Daemon on localhost               N/A       N/A        Y       2040 
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       22985
Self-heal Daemon on dhcp35-3.lab.eng.blr.re
dhat.com                                    N/A       N/A        Y       8224 
Self-heal Daemon on 10.70.35.228            N/A       N/A        Y       10154
Self-heal Daemon on 10.70.35.27             N/A       N/A        Y       32318
Self-heal Daemon on 10.70.35.17             N/A       N/A        Y       22964
 
Task Status of Volume dispersed
------------------------------------------------------------------------------
There are no active volume tasks


On a replica volume:

Status of volume: test
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.56:/gluster/brick1/r1        N/A       N/A        N       N/A  
Brick 10.70.35.228:/gluster/brick1/r1       49153     0          Y       10131
Brick 10.70.35.17:/gluster/brick3/r1        49153     0          Y       22941
Self-heal Daemon on localhost               N/A       N/A        Y       2040 
Self-heal Daemon on dhcp35-3.lab.eng.blr.re
dhat.com                                    N/A       N/A        Y       8224 
Self-heal Daemon on 10.70.35.130            N/A       N/A        Y       22985
Self-heal Daemon on 10.70.35.17             N/A       N/A        Y       22964
Self-heal Daemon on 10.70.35.27             N/A       N/A        Y       32318
Self-heal Daemon on 10.70.35.228            N/A       N/A        Y       10154
 
Task Status of Volume test
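
Why the bricks stay offline is recorded in glusterd's log on the rebooted node, excerpted below. Default log locations on each node (brick log file names are derived from the brick path):

# glusterd log on the rebooted node
grep fsid /var/log/glusterfs/glusterd.log

# per-brick logs
ls /var/log/glusterfs/bricks/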


Logs - 
====
[2018-08-03 12:46:56.304320] I [MSGID: 101190] [event-epoll.c:613:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2018-08-03 12:46:56.312524] I [MSGID: 106163] [glusterd-handshake.c:1319:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31302
[2018-08-03 12:46:56.766187] I [MSGID: 106163] [glusterd-handshake.c:1319:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31302
[2018-08-03 12:46:56.793152] I [MSGID: 106490] [glusterd-handler.c:2627:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 8725052e-568b-4123-ac75-21e9574c923e
[2018-08-03 12:46:56.810897] I [MSGID: 106493] [glusterd-handler.c:3890:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.35.228 (0), ret: 0, op_ret: 0
[2018-08-03 12:46:56.885420] E [glusterd-utils.c:6135:glusterd_brick_start] 0-management: fsid comparison is failed it means Brick root path /gluster/brick1/vol2 is not created by glusterd, start/attach will also fail
[2018-08-03 12:46:56.885976] E [glusterd-utils.c:6135:glusterd_brick_start] 0-management: fsid comparison is failed it means Brick root path /gluster/brick1/r1 is not created by glusterd, start/attach will also fail
[2018-08-03 12:46:57.082261] I [MSGID: 106163] [glusterd-handshake.c:1319:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31302
[2018-08-03 12:46:57.118469] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 8725052e-568b-4123-ac75-21e9574c923e, host: 10.70.35.228, port: 0
[2018-08-03 12:46:57.155205] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2018-08-03 12:46:57.155478] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: nfs already stopped
[2018-08-03 12:46:57.155529] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: nfs service is stopped
[2018-08-03 12:46:57.156890] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2018-08-03 12:46:57.159529] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: glustershd already stopped
[2018-08-03 12:46:57.159570] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: glustershd service is stopped
[2018-08-03 12:46:57.159732] I [MSGID: 106567] [glusterd-svc-mgmt.c:211:glusterd_svc_start] 0-management: Starting glustershd service
[2018-08-03 12:46:58.168512] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2018-08-03 12:46:58.169517] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: quotad already stopped
[2018-08-03 12:46:58.169584] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: quotad service is stopped
[2018-08-03 12:46:58.169687] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2018-08-03 12:46:58.170082] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: bitd already stopped
[2018-08-03 12:46:58.170121] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: bitd service is stopped
[2018-08-03 12:46:58.170189] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2018-08-03 12:46:58.170471] I [MSGID: 106132] [glusterd-proc-mgmt.c:84:glusterd_proc_stop] 0-management: scrub already stopped
[2018-08-03 12:46:58.170496] I [MSGID: 106568] [glusterd-svc-mgmt.c:243:glusterd_svc_stop] 0-management: scrub service is stopped
[2018-08-03 12:46:58.170609] E [glusterd-utils.c:6135:glusterd_brick_start] 0-management: fsid comparison is failed it means Brick root path /gluster/brick1/vol2 is not created by glusterd, start/attach will also fail
[2018-08-03 12:46:58.452086] E [glusterd-utils.c:6135:glusterd_brick_start] 0-management: fsid comparison is failed it means Brick root path /gluster/brick1/r1 is not created by glusterd, start/attach will also fail
[2018-08-03 12:46:58.804420] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2018-08-03 12:46:58.804871] I [rpc-clnt.c:1044:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2018-08-03 12:46:58.806954] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 9248003d-418e-41d5-acd8-de2808bc6191, host: 10.70.35.130, port: 0
[2018-08-03 12:46:58.849347] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: cf5cf97a-fa28-4107-8a01-1ead0965eab4, host: 10.70.35.17, port: 0
[2018-08-03 12:46:58.876069] I [MSGID: 106493] [glusterd-rpc-ops.c:486:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 8d650b81-ca20-409d-8721-621fe166f2bd, host: dhcp35-3.lab.eng.blr.redhat.com, port: 0
[2018-08-03 12:46:58.922170] I [MSGID: 106492] [glusterd-handler.c:2805:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: cf5cf97a-fa28-4107-8a01-1ead0965eab4
[2018-08-03 12:46:59.003237] I [MSGID: 106502] [glusterd-handler.c:2850:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2018-08-03 12:46:59.158677] I [MSGID: 106492] [glusterd-handler.c:2805:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 8d650b81-ca20-409d-8721-621fe166f2bd
[2018-08-03 12:46:59.158827] I [MSGID: 106502] [glusterd-handler.c:2850:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2018-08-03 12:46:59.327880] I [MSGID: 106492] [glusterd-handler.c:2805:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 9248003d-418e-41d5-acd8-de2808bc6191
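
The E-level "fsid comparison is failed" messages above are the actual failure: glusterd_brick_start compares the filesystem ID it has on record for each brick root against the ID reported by the live filesystem, and refuses to start or attach the brick on a mismatch (the guard is meant to catch a missing brick mount). Here the mismatch is reported on every reboot even though the bricks are intact, so the bricks never start; the upstream patch in comment 6 reworks this check.

A rough way to eyeball both values by hand. This is a sketch, not a documented interface: the brick-fsid key and the store layout under /var/lib/glusterd are glusterd internals that can vary by version, and the two tools may print the ID in different encodings:

# fsid glusterd recorded for the bricks of volume "dispersed"
grep fsid /var/lib/glusterd/vols/dispersed/bricks/*

# live filesystem ID of the brick root (printed in hex by GNU stat)
stat -f -c %i /gluster/brick1/vol2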

Comment 6 Atin Mukherjee 2018-08-06 03:39:45 UTC
upstream patch : https://review.gluster.org/20638

Comment 11 errata-xmlrpc 2018-09-04 06:51:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

