Bug 1505363 - Brick Multiplexing: stale brick processes getting created and volume status shows brick as down (pkill glusterfsd glusterfs, glusterd restart)
Summary: Brick Multiplexing: stale brick processes getting created and volume status shows brick as down (pkill glusterfsd glusterfs, glusterd restart)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: glusterd
Version: rhgs-3.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: RHGS 3.4.0
Assignee: Atin Mukherjee
QA Contact: Bala Konda Reddy M
URL:
Whiteboard: brick-multiplexing
Depends On: 1506513 1508283
Blocks: 1503134 1526368
 
Reported: 2017-10-23 12:14 UTC by Nag Pavan Chilakam
Modified: 2019-03-13 10:28 UTC
CC List: 7 users

Fixed In Version: glusterfs-3.12.2-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1506513 1526368
Environment:
Last Closed: 2018-09-04 06:38:02 UTC
Embargoed:




Links:
Red Hat Product Errata RHSA-2018:2607 - 2018-09-04 06:39:45 UTC

Description Nag Pavan Chilakam 2017-10-23 12:14:43 UTC
Description of problem:
=======================
Since brick multiplexing is now in its second release, an upgrade-path test was in order.
I killed glusterfsd and glusterfs, followed by a glusterd stop, i.e. the same steps we perform before upgrading a node's packages (though I did not actually update any packages).
After that I started glusterd again.
I observed that stale brick processes are getting created, while volume status shows some bricks as not up.
However, when I confirmed by running I/O, the bricks shown as down are actually up.



Version-Release number of selected component (if applicable):
===============================
3.8.4-50

How reproducible:
==============
4/4 (always)

Steps to Reproduce:
1. Enable brick multiplexing, with the maximum number of bricks per process set to 3.
2. Have about 12 volumes, roughly 10 of them 1x3 and 2 of them 2x2, i.e. 17 bricks per node in total.
3. Run pkill glusterfsd, pkill glusterfs and service glusterd stop on one node.
4. Run service glusterd start on that node (a command sketch follows below).
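
A minimal shell sketch of the above sequence on one node, assuming brick multiplexing is toggled via cluster.brick-multiplex and the per-process cap via cluster.max-bricks-per-process (the exact option name for the cap may differ in this build):

# enable brick multiplexing and cap the number of bricks per process (cap option name assumed)
gluster volume set all cluster.brick-multiplex on
gluster volume set all cluster.max-bricks-per-process 3

# simulate the pre-upgrade shutdown on one node
pkill glusterfsd
pkill glusterfs
service glusterd stop

# bring glusterd back up and let it restart/attach the bricks
service glusterd start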


Actual results:
==============
Found about 11-18 glusterfsd processes running (the count varied across tries), while only 7 were supposed to be created.

Volume status also shows some of the bricks as offline.

However, there is no I/O impact.

We would hit this in the upgrade path.
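
A hedged sketch of the check that exposes the mismatch on the affected node (the local IP 10.70.35.192 is taken from this setup):

# number of glusterfsd processes actually running
ps -C glusterfsd -o pid= | wc -l

# number of distinct brick PIDs glusterd reports as online for this node
gluster volume status | awk '/^Brick 10.70.35.192/ && $NF ~ /^[0-9]+$/ {print $NF}' | sort -u | wc -l

The two numbers should match; any process counted only by the first command is a stale brick process.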

Expected results:


Additional info:

Comment 2 Nag Pavan Chilakam 2017-10-23 12:16:13 UTC
After the glusterd restart, and after waiting for about 5 minutes (so that all volumes are up):
[root@dhcp35-192 ~]# ps -ef|grep glusterfsd
root      5251     1  0 17:39 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_1.10.70.35.192.rhs-brick2-cross3_1 -p /var/run/gluster/vols/cross3_1/10.70.35.192-rhs-brick2-cross3_1.pid -S /var/run/gluster/84bc793e9e9abb985e161de39469a0c2.socket --brick-name /rhs/brick2/cross3_1 -l /var/log/glusterfs/bricks/rhs-brick2-cross3_1.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49153 --xlator-option cross3_1-server.listen-port=49153
root      5292     1  0 17:39 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_10.10.70.35.192.rhs-brick3-cross3_10 -p /var/run/gluster/vols/cross3_10/10.70.35.192-rhs-brick3-cross3_10.pid -S /var/run/gluster/b410220a70767da0f7c97bf2b974faf8.socket --brick-name /rhs/brick3/cross3_10 -l /var/log/glusterfs/bricks/rhs-brick3-cross3_10.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49157 --xlator-option cross3_10-server.listen-port=49157
root      5370     1  0 17:39 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_11.10.70.35.192.rhs-brick3-cross3_11 -p /var/run/gluster/vols/cross3_11/10.70.35.192-rhs-brick3-cross3_11.pid -S /var/run/gluster/b3980c8bbb8bfdd08c1d1925b2018da3.socket --brick-name /rhs/brick3/cross3_11 -l /var/log/glusterfs/bricks/rhs-brick3-cross3_11.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49159 --xlator-option cross3_11-server.listen-port=49159
root      5420     1  0 17:39 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_13.10.70.35.192.rhs-brick3-cross3_13 -p /var/run/gluster/vols/cross3_13/10.70.35.192-rhs-brick3-cross3_13.pid -S /var/run/gluster/27e397caae9bbb1db4544a1352178ce1.socket --brick-name /rhs/brick3/cross3_13 -l /var/log/glusterfs/bricks/rhs-brick3-cross3_13.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49161 --xlator-option cross3_13-server.listen-port=49161
root      5472     1  0 17:39 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_3.10.70.35.192.rhs-brick3-cross3_1 -p /var/run/gluster/vols/cross3_3/10.70.35.192-rhs-brick3-cross3_1.pid -S /var/run/gluster/4f31159c847f8c6a64ab4c4b5a1bad83.socket --brick-name /rhs/brick3/cross3_1 -l /var/log/glusterfs/bricks/rhs-brick3-cross3_1.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49166 --xlator-option cross3_3-server.listen-port=49166
root      5523     1  0 17:39 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_5.10.70.35.192.rhs-brick2-cross3_5 -p /var/run/gluster/vols/cross3_5/10.70.35.192-rhs-brick2-cross3_5.pid -S /var/run/gluster/be83f9adc22f592a6d5c6ded82b4d7ae.socket --brick-name /rhs/brick2/cross3_5 -l /var/log/glusterfs/bricks/rhs-brick2-cross3_5.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49167 --xlator-option cross3_5-server.listen-port=49167
root      5579     1  0 17:40 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_6.10.70.35.192.rhs-brick3-cross3_6 -p /var/run/gluster/vols/cross3_6/10.70.35.192-rhs-brick3-cross3_6.pid -S /var/run/gluster/280b10b0762ee6760853a296407bc378.socket --brick-name /rhs/brick3/cross3_6 -l /var/log/glusterfs/bricks/rhs-brick3-cross3_6.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49152 --xlator-option cross3_6-server.listen-port=49152
root      5605     1  0 17:40 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_7.10.70.35.192.rhs-brick1-cross3_7 -p /var/run/gluster/vols/cross3_7/10.70.35.192-rhs-brick1-cross3_7.pid -S /var/run/gluster/90cf59c10778292d7c0b0f6701b9ac66.socket --brick-name /rhs/brick1/cross3_7 -l /var/log/glusterfs/bricks/rhs-brick1-cross3_7.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49155 --xlator-option cross3_7-server.listen-port=49155
root      5636     1  0 17:41 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id cross3_8.10.70.35.192.rhs-brick2-cross3_8 -p /var/run/gluster/vols/cross3_8/10.70.35.192-rhs-brick2-cross3_8.pid -S /var/run/gluster/60e47947cfb6c22f811452a47a96ee0d.socket --brick-name /rhs/brick2/cross3_8 -l /var/log/glusterfs/bricks/rhs-brick2-cross3_8.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49158 --xlator-option cross3_8-server.listen-port=49158
root      5671     1  0 17:41 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id distrep_1.10.70.35.192.rhs-brick1-distrep_1 -p /var/run/gluster/vols/distrep_1/10.70.35.192-rhs-brick1-distrep_1.pid -S /var/run/gluster/3c07c6417997d2b158e4a9a8500adebb.socket --brick-name /rhs/brick1/distrep_1 -l /var/log/glusterfs/bricks/rhs-brick1-distrep_1.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49160 --xlator-option distrep_1-server.listen-port=49160
root      5740     1  0 17:41 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id distrep_2.10.70.35.192.rhs-brick2-distrep_2 -p /var/run/gluster/vols/distrep_2/10.70.35.192-rhs-brick2-distrep_2.pid -S /var/run/gluster/1059b407c449f2d41365855849c22372.socket --brick-name /rhs/brick2/distrep_2 -l /var/log/glusterfs/bricks/rhs-brick2-distrep_2.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49163 --xlator-option distrep_2-server.listen-port=49163
root      5748     1  0 17:41 ?        00:00:00 /usr/sbin/glusterfsd -s 10.70.35.192 --volfile-id distrep_2.10.70.35.192.rhs-brick1-distrep_2 -p /var/run/gluster/vols/distrep_2/10.70.35.192-rhs-brick1-distrep_2.pid -S /var/run/gluster/3ecf6b7f050e35fb61289a4931631b7f.socket --brick-name /rhs/brick1/distrep_2 -l /var/log/glusterfs/bricks/rhs-brick1-distrep_2.log --xlator-option *-posix.glusterd-uuid=71c0dbb2-71ff-485e-af41-c9e6f4904e90 --brick-port 49164 --xlator-option distrep_2-server.listen-port=49164
root      5820  4957  0 17:44 pts/1    00:00:00 grep --color=auto glusterfsd
[root@dhcp35-192 ~]# 
[root@dhcp35-192 ~]# 
[root@dhcp35-192 ~]# gluster v status
Status of volume: cross3_1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick2/cross3_1     49153     0          Y       5251 
Brick 10.70.35.214:/rhs/brick1/cross3_1     49152     0          Y       5407 
Brick 10.70.35.215:/rhs/brick1/cross3_1     49152     0          Y       25031
Brick 10.70.35.192:/rhs/brick3/cross3_1a    49153     0          Y       5251 
Brick 10.70.35.214:/rhs/brick3/cross3_1a    49152     0          Y       5407 
Brick 10.70.35.215:/rhs/brick3/cross3_1a    49152     0          Y       25031
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_1
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 5816c10c-cf86-49b9-8036-b4206bc80786
Status               : completed           
 
Status of volume: cross3_10
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_10    49157     0          Y       5292 
Brick 10.70.35.214:/rhs/brick3/cross3_10    49152     0          Y       5407 
Brick 10.70.35.215:/rhs/brick3/cross3_10    49152     0          Y       25031
Brick 10.70.35.192:/rhs/brick3/cross3_10a   49157     0          Y       5292 
Brick 10.70.35.214:/rhs/brick3/cross3_10a   49153     0          Y       5459 
Brick 10.70.35.215:/rhs/brick3/cross3_10a   49153     0          Y       25082
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_10
------------------------------------------------------------------------------
Task                 : Rebalance           
ID                   : 765668b4-191c-478b-b2ec-0ceabfda13b7
Status               : completed           
 
Status of volume: cross3_11
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_11    49159     0          Y       5370 
Brick 10.70.35.214:/rhs/brick3/cross3_11    49153     0          Y       5459 
Brick 10.70.35.215:/rhs/brick3/cross3_11    49153     0          Y       25082
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_11
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_12
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_12    49159     0          Y       5370 
Brick 10.70.35.214:/rhs/brick3/cross3_12    49153     0          Y       5459 
Brick 10.70.35.215:/rhs/brick3/cross3_12    49153     0          Y       25082
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_12
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_13
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_13    49161     0          Y       5420 
Brick 10.70.35.214:/rhs/brick3/cross3_13    49154     0          Y       5544 
Brick 10.70.35.215:/rhs/brick3/cross3_13    49154     0          Y       25155
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_13
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick2/cross3_2     49161     0          Y       5420 
Brick 10.70.35.214:/rhs/brick2/cross3_2     49154     0          Y       5544 
Brick 10.70.35.215:/rhs/brick2/cross3_2     49154     0          Y       25155
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_2
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_3
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_1     49166     0          Y       5472 
Brick 10.70.35.214:/rhs/brick3/cross3_3     49154     0          Y       5544 
Brick 10.70.35.215:/rhs/brick3/cross3_3     49154     0          Y       25155
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_3
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_4
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick1/cross3_4     49166     0          Y       5472 
Brick 10.70.35.214:/rhs/brick1/cross3_4     49155     0          Y       5626 
Brick 10.70.35.215:/rhs/brick1/cross3_4     49155     0          Y       25237
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_4
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_5
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick2/cross3_5     N/A       N/A        N       N/A  
Brick 10.70.35.214:/rhs/brick2/cross3_5     49155     0          Y       5626 
Brick 10.70.35.215:/rhs/brick2/cross3_5     49155     0          Y       25237
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_5
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_6
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_6     N/A       N/A        N       N/A  
Brick 10.70.35.214:/rhs/brick3/cross3_6     49155     0          Y       5626 
Brick 10.70.35.215:/rhs/brick3/cross3_6     49155     0          Y       25237
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume cross3_6
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_7
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick1/cross3_7     N/A       N/A        N       N/A  
Brick 10.70.35.214:/rhs/brick1/cross3_7     49156     0          Y       5696 
Brick 10.70.35.215:/rhs/brick1/cross3_7     49156     0          Y       25307
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
 
Task Status of Volume cross3_7
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_8
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick2/cross3_8     49158     0          Y       5636 
Brick 10.70.35.214:/rhs/brick2/cross3_8     49156     0          Y       5696 
Brick 10.70.35.215:/rhs/brick2/cross3_8     49156     0          Y       25307
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
 
Task Status of Volume cross3_8
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: cross3_9
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick3/cross3_9     49158     0          Y       5636 
Brick 10.70.35.214:/rhs/brick3/cross3_9     49156     0          Y       5696 
Brick 10.70.35.215:/rhs/brick3/cross3_9     49156     0          Y       25307
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
 
Task Status of Volume cross3_9
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: distrep_1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick2/distrep_1    49158     0          Y       5636 
Brick 10.70.35.214:/rhs/brick3/distrep_1    49157     0          Y       5766 
Brick 10.70.35.192:/rhs/brick1/distrep_1    N/A       N/A        N       N/A  
Brick 10.70.35.215:/rhs/brick3/distrep_1    49157     0          Y       25377
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
 
Task Status of Volume distrep_1
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: distrep_2
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.35.192:/rhs/brick2/distrep_2    49163     0          Y       5740 
Brick 10.70.35.214:/rhs/brick3/distrep_2    49157     0          Y       5766 
Brick 10.70.35.192:/rhs/brick1/distrep_2    49163     0          Y       5740 
Brick 10.70.35.215:/rhs/brick3/distrep_2    49157     0          Y       25377
Self-heal Daemon on localhost               N/A       N/A        Y       5260 
Self-heal Daemon on 10.70.35.215            N/A       N/A        Y       25418
Self-heal Daemon on 10.70.35.214            N/A       N/A        Y       5807 
 
Task Status of Volume distrep_2
------------------------------------------------------------------------------
There are no active volume tasks
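
Cross-referencing the two outputs above: volume status references seven of the twelve glusterfsd PIDs (5251, 5292, 5370, 5420, 5472, 5636 and 5740), while the remaining five (5523, 5579, 5605, 5671 and 5748) are not referenced by any brick entry and correspond to the bricks shown as offline; these appear to be the stale processes this bug describes. A hedged sketch to extract that set automatically (local IP assumed from this setup):

# PIDs of glusterfsd processes actually running
ps -C glusterfsd -o pid= | tr -d ' ' | sort -u > /tmp/running.pids

# PIDs that volume status references for this node's bricks
gluster volume status | awk '/^Brick 10.70.35.192/ && $NF ~ /^[0-9]+$/ {print $NF}' | sort -u > /tmp/referenced.pids

# running but not referenced => stale
comm -23 /tmp/running.pids /tmp/referenced.pids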

Comment 3 Nag Pavan Chilakam 2017-10-23 12:16:36 UTC
glusterd logs:


[2017-10-23 12:09:09.522841] I [MSGID: 100030] [glusterfsd.c:2441:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.8.4 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
[2017-10-23 12:09:09.527750] I [MSGID: 106478] [glusterd.c:1448:init] 0-management: Maximum allowed open file descriptors set to 65536
[2017-10-23 12:09:09.527784] I [MSGID: 106479] [glusterd.c:1508:init] 0-management: Using /var/lib/glusterd as working directory
[2017-10-23 12:09:09.527794] I [MSGID: 106479] [glusterd.c:1513:init] 0-management: Using /var/run/gluster as pid file working directory
[2017-10-23 12:09:09.534871] W [MSGID: 103071] [rdma.c:4596:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed [No such device]
[2017-10-23 12:09:09.534922] W [MSGID: 103055] [rdma.c:4905:init] 0-rdma.management: Failed to initialize IB Device
[2017-10-23 12:09:09.534962] W [rpc-transport.c:350:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
[2017-10-23 12:09:09.535051] W [rpcsvc.c:1646:rpcsvc_create_listener] 0-rpc-service: cannot create listener, initing the transport failed
[2017-10-23 12:09:09.535077] E [MSGID: 106243] [glusterd.c:1796:init] 0-management: creation of 1 listeners failed, continuing with succeeded transport
[2017-10-23 12:09:11.623164] I [MSGID: 106513] [glusterd-store.c:2135:glusterd_restore_op_version] 0-glusterd: retrieved op-version: 31101
[2017-10-23 12:09:11.830677] I [MSGID: 106498] [glusterd-handler.c:3670:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2017-10-23 12:09:11.830832] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:11.834172] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:11.830767] I [MSGID: 106498] [glusterd-handler.c:3670:glusterd_friend_add_from_peerinfo] 0-management: connect returned 0
[2017-10-23 12:09:11.837119] I [MSGID: 106544] [glusterd.c:157:glusterd_uuid_init] 0-management: retrieved UUID: 71c0dbb2-71ff-485e-af41-c9e6f4904e90
Final graph:
+------------------------------------------------------------------------------+
  1: volume management
  2:     type mgmt/glusterd
  3:     option rpc-auth.auth-glusterfs on
  4:     option rpc-auth.auth-unix on
  5:     option rpc-auth.auth-null on
  6:     option rpc-auth-allow-insecure on
  7:     option transport.listen-backlog 10
  8:     option event-threads 1
  9:     option ping-timeout 0
 10:     option transport.socket.read-fail-log off
 11:     option transport.socket.keepalive-interval 2
 12:     option transport.socket.keepalive-time 10
 13:     option transport-type rdma
 14:     option working-directory /var/lib/glusterd
 15: end-volume
 16:  
+------------------------------------------------------------------------------+
[2017-10-23 12:09:11.840503] I [MSGID: 101190] [event-epoll.c:602:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-10-23 12:09:11.870068] I [MSGID: 106493] [glusterd-rpc-ops.c:478:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 9ad7c3ad-f89b-4902-a98c-d3f8c466f185, host: 10.70.35.215, port: 0
[2017-10-23 12:09:11.872564] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-nfs: setting frame-timeout to 600
[2017-10-23 12:09:11.872728] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-10-23 12:09:11.872771] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-10-23 12:09:11.873591] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-glustershd: setting frame-timeout to 600
[2017-10-23 12:09:11.878290] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2017-10-23 12:09:11.878355] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-10-23 12:09:11.878416] I [MSGID: 106567] [glusterd-svc-mgmt.c:197:glusterd_svc_start] 0-management: Starting glustershd service
[2017-10-23 12:09:12.881974] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-quotad: setting frame-timeout to 600
[2017-10-23 12:09:12.882490] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2017-10-23 12:09:12.882600] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: quotad service is stopped
[2017-10-23 12:09:12.882705] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-bitd: setting frame-timeout to 600
[2017-10-23 12:09:12.883138] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-10-23 12:09:12.883173] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-10-23 12:09:12.883215] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-scrub: setting frame-timeout to 600
[2017-10-23 12:09:12.883574] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-10-23 12:09:12.883620] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-10-23 12:09:12.883743] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/cross3_1
[2017-10-23 12:09:12.886097] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:12.886430] I [MSGID: 106493] [glusterd-rpc-ops.c:693:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 9ad7c3ad-f89b-4902-a98c-d3f8c466f185
[2017-10-23 12:09:12.886558] I [MSGID: 106493] [glusterd-rpc-ops.c:478:__glusterd_friend_add_cbk] 0-glusterd: Received ACC from uuid: 8788ec96-cf90-453e-9c73-55af0167a83f, host: 10.70.35.214, port: 0
[2017-10-23 12:09:12.893180] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:12.893393] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick2/cross3_1 has disconnected from glusterd.
[2017-10-23 12:09:12.893847] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2017-10-23 12:09:12.893879] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: nfs service is stopped
[2017-10-23 12:09:12.899617] I [MSGID: 106568] [glusterd-proc-mgmt.c:87:glusterd_proc_stop] 0-management: Stopping glustershd daemon running in pid: 5242
[2017-10-23 12:09:13.899861] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: glustershd service is stopped
[2017-10-23 12:09:13.899987] I [MSGID: 106567] [glusterd-svc-mgmt.c:197:glusterd_svc_start] 0-management: Starting glustershd service
[2017-10-23 12:09:13.902809] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2017-10-23 12:09:13.902853] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: quotad service is stopped
[2017-10-23 12:09:13.903088] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2017-10-23 12:09:13.903116] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: bitd service is stopped
[2017-10-23 12:09:13.903344] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2017-10-23 12:09:13.903367] I [MSGID: 106568] [glusterd-svc-mgmt.c:229:glusterd_svc_stop] 0-management: scrub service is stopped
[2017-10-23 12:09:13.903435] I [glusterd-utils.c:5780:glusterd_brick_start] 0-management: discovered already-running brick /rhs/brick2/cross3_1
[2017-10-23 12:09:13.903458] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick2/cross3_1 on port 49153
[2017-10-23 12:09:13.903656] I [MSGID: 106493] [glusterd-rpc-ops.c:693:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 8788ec96-cf90-453e-9c73-55af0167a83f
[2017-10-23 12:09:13.904042] I [MSGID: 106492] [glusterd-handler.c:2789:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 8788ec96-cf90-453e-9c73-55af0167a83f
[2017-10-23 12:09:13.904107] I [MSGID: 106502] [glusterd-handler.c:2834:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-10-23 12:09:13.906898] I [MSGID: 106163] [glusterd-handshake.c:1275:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31101
[2017-10-23 12:09:13.908810] I [MSGID: 106492] [glusterd-handler.c:2789:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 9ad7c3ad-f89b-4902-a98c-d3f8c466f185
[2017-10-23 12:09:13.908857] I [MSGID: 106502] [glusterd-handler.c:2834:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-10-23 12:09:13.915395] I [MSGID: 106163] [glusterd-handshake.c:1275:__glusterd_mgmt_hndsk_versions_ack] 0-management: using the op-version 31101
[2017-10-23 12:09:13.919283] I [MSGID: 106490] [glusterd-handler.c:2611:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 8788ec96-cf90-453e-9c73-55af0167a83f
[2017-10-23 12:09:13.923950] I [MSGID: 106493] [glusterd-handler.c:3873:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.35.214 (0), ret: 0, op_ret: 0
[2017-10-23 12:09:13.927540] I [MSGID: 106492] [glusterd-handler.c:2789:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 8788ec96-cf90-453e-9c73-55af0167a83f
[2017-10-23 12:09:13.927581] I [MSGID: 106502] [glusterd-handler.c:2834:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-10-23 12:09:13.929396] I [MSGID: 106490] [glusterd-handler.c:2611:__glusterd_handle_incoming_friend_req] 0-glusterd: Received probe from uuid: 9ad7c3ad-f89b-4902-a98c-d3f8c466f185
[2017-10-23 12:09:13.932502] I [MSGID: 106493] [glusterd-handler.c:3873:glusterd_xfer_friend_add_resp] 0-glusterd: Responded to 10.70.35.215 (0), ret: 0, op_ret: 0
[2017-10-23 12:09:13.935731] I [MSGID: 106493] [glusterd-rpc-ops.c:693:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 8788ec96-cf90-453e-9c73-55af0167a83f
[2017-10-23 12:09:13.935796] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick2/cross3_1 on port 49153
[2017-10-23 12:09:13.935880] I [MSGID: 106492] [glusterd-handler.c:2789:__glusterd_handle_friend_update] 0-glusterd: Received friend update from uuid: 9ad7c3ad-f89b-4902-a98c-d3f8c466f185
[2017-10-23 12:09:13.935924] I [MSGID: 106502] [glusterd-handler.c:2834:__glusterd_handle_friend_update] 0-management: Received my uuid as Friend
[2017-10-23 12:09:13.937532] I [MSGID: 106493] [glusterd-rpc-ops.c:693:__glusterd_friend_update_cbk] 0-management: Received ACC from uuid: 9ad7c3ad-f89b-4902-a98c-d3f8c466f185
[2017-10-23 12:09:14.886573] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_1a to existing process for /rhs/brick2/cross3_1
[2017-10-23 12:09:14.886640] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:15.886784] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:15.903935] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_1a to existing process for /rhs/brick2/cross3_1
[2017-10-23 12:09:15.903995] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:16.887417] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick3/cross3_10
[2017-10-23 12:09:16.889993] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:16.890621] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:16.890864] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick3/cross3_10 has disconnected from glusterd.
[2017-10-23 12:09:16.904259] I [glusterd-utils.c:5780:glusterd_brick_start] 0-management: discovered already-running brick /rhs/brick3/cross3_10
[2017-10-23 12:09:16.904295] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_10 on port 49157
[2017-10-23 12:09:18.890391] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_10a to existing process for /rhs/brick3/cross3_10
[2017-10-23 12:09:18.890460] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:18.904579] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_10a to existing process for /rhs/brick3/cross3_10
[2017-10-23 12:09:18.904633] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:19.890584] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:19.904776] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:20.891096] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick3/cross3_11
[2017-10-23 12:09:20.893531] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:20.894107] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:16.920189] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_10 on port 49157
[2017-10-23 12:09:20.894336] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick3/cross3_11 has disconnected from glusterd.
[2017-10-23 12:09:20.905082] I [glusterd-utils.c:5780:glusterd_brick_start] 0-management: discovered already-running brick /rhs/brick3/cross3_11
[2017-10-23 12:09:20.905117] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_11 on port 49159
[2017-10-23 12:09:22.894010] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_12 to existing process for /rhs/brick3/cross3_11
[2017-10-23 12:09:22.894073] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:22.905417] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_12 to existing process for /rhs/brick3/cross3_11
[2017-10-23 12:09:22.905458] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:23.894189] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:23.905601] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:24.894675] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick3/cross3_13
[2017-10-23 12:09:24.897370] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:24.899282] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:20.914787] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_11 on port 49159
[2017-10-23 12:09:24.899517] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick3/cross3_13 has disconnected from glusterd.
[2017-10-23 12:09:24.906106] I [glusterd-utils.c:5780:glusterd_brick_start] 0-management: discovered already-running brick /rhs/brick3/cross3_13
[2017-10-23 12:09:24.906139] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_13 on port 49161
[2017-10-23 12:09:26.897825] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick2/cross3_2 to existing process for /rhs/brick3/cross3_13
[2017-10-23 12:09:26.897908] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:26.906548] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick2/cross3_2 to existing process for /rhs/brick3/cross3_13
[2017-10-23 12:09:26.906581] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:27.898021] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:27.906673] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:28.898531] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick3/cross3_1
[2017-10-23 12:09:28.903621] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:28.904204] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:24.920026] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_13 on port 49161
[2017-10-23 12:09:28.904436] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick3/cross3_1 has disconnected from glusterd.
[2017-10-23 12:09:28.906969] I [glusterd-utils.c:5780:glusterd_brick_start] 0-management: discovered already-running brick /rhs/brick3/cross3_1
[2017-10-23 12:09:28.906996] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_1 on port 49166
[2017-10-23 12:09:30.904042] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick1/cross3_4 to existing process for /rhs/brick3/cross3_1
[2017-10-23 12:09:30.904103] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:30.907340] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick1/cross3_4 to existing process for /rhs/brick3/cross3_1
[2017-10-23 12:09:30.907401] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:31.904202] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:31.907538] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:09:32.904635] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/cross3_5
[2017-10-23 12:09:32.908376] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:32.908932] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:32.909143] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/cross3_5
[2017-10-23 12:09:28.921688] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_1 on port 49166
[2017-10-23 12:09:32.909212] E [MSGID: 101012] [common-utils.c:3830:gf_is_service_running] 0-: Unable to read pidfile: /var/run/gluster/vols/cross3_5/10.70.35.192-rhs-brick2-cross3_5.pid
[2017-10-23 12:09:32.911608] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:09:32.912055] W [glusterd-handler.c:6077:__glusterd_brick_rpc_notify] 0-management: got disconnect from stale rpc on /rhs/brick2/cross3_5
[2017-10-23 12:09:32.912386] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:09:32.912582] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick2/cross3_5 has disconnected from glusterd.
[2017-10-23 12:09:32.926393] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick2/cross3_5 on port 49167
[2017-10-23 12:10:02.910390] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick2/cross3_5
[2017-10-23 12:10:02.910573] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick3/cross3_6
[2017-10-23 12:10:02.913319] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:10:02.914055] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:10:02.914062] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick2/cross3_5
[2017-10-23 12:10:02.914148] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick3/cross3_6
[2017-10-23 12:10:02.916353] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:10:02.917237] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:10:02.917437] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick3/cross3_6 has disconnected from glusterd.
[2017-10-23 12:10:02.935220] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick3/cross3_6 on port 49152
[2017-10-23 12:10:32.915574] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick3/cross3_6
[2017-10-23 12:10:32.915764] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick1/cross3_7
[2017-10-23 12:10:32.918511] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:10:32.919061] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick3/cross3_6
[2017-10-23 12:10:32.919117] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick1/cross3_7
[2017-10-23 12:10:32.921095] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:10:32.921774] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:10:32.922243] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:10:32.922464] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick1/cross3_7 has disconnected from glusterd.
[2017-10-23 12:10:32.940065] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick1/cross3_7 on port 49155
[2017-10-23 12:11:02.920746] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick1/cross3_7
[2017-10-23 12:11:02.920866] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/cross3_8
[2017-10-23 12:11:02.925610] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:11:02.925941] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick1/cross3_7
[2017-10-23 12:11:02.926000] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/cross3_8
[2017-10-23 12:11:02.927023] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:11:02.927237] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick2/cross3_8 has disconnected from glusterd.
[2017-10-23 12:11:02.943577] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick2/cross3_8 on port 49158
[2017-10-23 12:11:04.926107] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_9 to existing process for /rhs/brick2/cross3_8
[2017-10-23 12:11:04.926170] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:11:04.926363] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick3/cross3_9 to existing process for /rhs/brick2/cross3_8
[2017-10-23 12:11:04.926400] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:11:05.926346] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:11:05.926510] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:11:06.928391] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick2/distrep_1 to existing process for /rhs/brick2/cross3_8
[2017-10-23 12:11:06.928631] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick1/distrep_1
[2017-10-23 12:11:06.931002] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:11:06.931572] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:11:06.931759] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick1/distrep_1 has disconnected from glusterd.
[2017-10-23 12:11:06.932130] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick1/distrep_1
[2017-10-23 12:11:06.934206] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:11:06.934750] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:11:06.934941] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick1/distrep_1 has disconnected from glusterd.
[2017-10-23 12:11:06.952954] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick1/distrep_1 on port 49160
[2017-10-23 12:11:36.932947] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick1/distrep_1
[2017-10-23 12:11:36.933000] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/distrep_2
[2017-10-23 12:11:36.938180] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:11:36.938774] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:11:36.938994] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick2/distrep_2 has disconnected from glusterd.
[2017-10-23 12:11:36.939143] I [glusterd-utils.c:5543:find_compat_brick_in_vol] 0-management: brick has not come up so cleaning up dead brick 10.70.35.192:/rhs/brick1/distrep_1
[2017-10-23 12:11:36.939160] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick2/distrep_2
[2017-10-23 12:11:36.939826] I [glusterd-utils.c:5872:glusterd_brick_start] 0-management: starting a fresh brick process for brick /rhs/brick1/distrep_2
[2017-10-23 12:11:36.943449] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-10-23 12:11:36.943625] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.943794] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.943997] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.944162] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.944312] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.944457] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.944601] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.944771] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.944945] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.945106] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.945252] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.945409] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.945558] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.945688] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.945843] I [rpc-clnt.c:1060:rpc_clnt_connection_init] 0-snapd: setting frame-timeout to 600
[2017-10-23 12:11:36.946307] I [socket.c:2465:socket_event_handler] 0-transport: EPOLLERR - disconnecting now
[2017-10-23 12:11:36.946486] I [MSGID: 106005] [glusterd-handler.c:6084:__glusterd_brick_rpc_notify] 0-management: Brick 10.70.35.192:/rhs/brick1/distrep_2 has disconnected from glusterd.
[2017-10-23 12:11:36.956763] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick2/distrep_2 on port 49163
[2017-10-23 12:11:36.960582] I [MSGID: 106143] [glusterd-pmap.c:277:pmap_registry_bind] 0-pmap: adding brick /rhs/brick1/distrep_2 on port 49164
[2017-10-23 12:11:38.938609] I [glusterd-utils.c:5337:attach_brick] 0-management: add brick /rhs/brick1/distrep_2 to existing process for /rhs/brick2/distrep_2
[2017-10-23 12:11:38.938690] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
[2017-10-23 12:11:39.938856] I [glusterd-utils.c:5247:send_attach_req] 0-management: not connected yet
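
The log above shows the problematic sequence for each affected brick: glusterd starts "a fresh brick process", a second code path then reports "discovered already-running brick" for the same path, attach requests repeatedly fail with "not connected yet", and the brick is eventually declared dead and cleaned up even though its process survives. A hedged grep sketch to pull just those lifecycle messages out of the log (default log path assumed):

grep -E 'glusterd_brick_start|attach_brick|send_attach_req|find_compat_brick_in_vol|pmap_registry_bind' /var/log/glusterfs/glusterd.log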

Comment 5 Nag Pavan Chilakam 2017-10-23 13:53:25 UTC
I had hit something similar in 3.3: BZ#1473327 - Brick Multiplexing: Seeing stale brick process when all gluster processes are stopped and then started with glusterd

Comment 6 Atin Mukherjee 2017-10-26 09:22:08 UTC
upstream patch : https://review.gluster.org/#/c/18577

Comment 9 Bala Konda Reddy M 2018-04-12 11:01:17 UTC
Build: 3.12.2-7

Followed the steps mentioned in the description.
Tried 15-18 times, and no stale brick processes were seen.
Tried this scenario during an in-service upgrade as well; all bricks are online and no stale brick processes are seen after the upgrade.

Hence marking it as verified.

Comment 11 errata-xmlrpc 2018-09-04 06:38:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

