Bug 1600145 - [geo-rep]: Worker still ACTIVE after killing bricks
Summary: [geo-rep]: Worker still ACTIVE after killing bricks
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Mohit Agrawal
QA Contact:
URL:
Whiteboard:
Depends On: 1599587
Blocks:
Reported: 2018-07-11 14:09 UTC by Mohit Agrawal
Modified: 2019-03-25 16:30 UTC
CC: 11 users

Fixed In Version: glusterfs-6.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1599587
Environment:
Last Closed: 2019-03-25 16:30:27 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Gluster.org Gerrit 20645 0 None Merged [geo-rep]: Worker still ACTIVE after killing bricks 2018-12-13 04:47:43 UTC

Comment 1 Worker Ant 2018-07-11 14:33:37 UTC
REVIEW: https://review.gluster.org/20494 ([geo-rep]: Worker still ACTIVE after killing bricks) posted (#1) for review on master by MOHIT AGRAWAL

Comment 2 Kotresh HR 2018-07-13 12:56:25 UTC
Description of problem:
=======================
The brick processes backing the ACTIVE workers of a geo-replication session were killed, but the workers remain ACTIVE even after their bricks go down.

Before the bricks were killed:
-----------------------------
[root@dhcp42-18 scripts]# gluster volume geo-replication master 10.70.43.116::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.18     master        /rhs/brick1/b1    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:09:32          
10.70.42.18     master        /rhs/brick2/b4    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:06:17          
10.70.42.18     master        /rhs/brick3/b7    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:06:17          
10.70.41.239    master        /rhs/brick1/b2    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick2/b5    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick3/b8    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick1/b3    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick2/b6    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick3/b9    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
[root@dhcp42-18 scripts]# gluster v status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.41.239:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       28814
Brick 10.70.43.179:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       27173
Brick dhcp42-18.lab.eng.blr.redhat.com:/var
/lib/glusterd/ss_brick                      49152     0          Y       9969 
Self-heal Daemon on localhost               N/A       N/A        Y       10879
Self-heal Daemon on 10.70.41.239            N/A       N/A        Y       29525
Self-heal Daemon on 10.70.43.179            N/A       N/A        Y       27892
 
Task Status of Volume gluster_shared_storage
-----------------------------------------------------------------------------



After the bricks were killed using gf_attach:
---------------------------------------------
[root@dhcp42-18 scripts]# gluster v status
Status of volume: gluster_shared_storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.41.239:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       28814
Brick 10.70.43.179:/var/lib/glusterd/ss_bri
ck                                          49152     0          Y       27173
Brick dhcp42-18.lab.eng.blr.redhat.com:/var
/lib/glusterd/ss_brick                      49152     0          Y       9969 
Self-heal Daemon on localhost               N/A       N/A        Y       10879
Self-heal Daemon on 10.70.41.239            N/A       N/A        Y       29525
Self-heal Daemon on 10.70.43.179            N/A       N/A        Y       27892
 
Task Status of Volume gluster_shared_storage
------------------------------------------------------------------------------
There are no active volume tasks
 
Status of volume: master
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 10.70.42.18:/rhs/brick1/b1            N/A       N/A        N       N/A  
Brick 10.70.41.239:/rhs/brick1/b2           49152     0          Y       28814
Brick 10.70.43.179:/rhs/brick1/b3           49152     0          Y       27173
Brick 10.70.42.18:/rhs/brick2/b4            N/A       N/A        N       N/A  
Brick 10.70.41.239:/rhs/brick2/b5           49152     0          Y       28814
Brick 10.70.43.179:/rhs/brick2/b6           49152     0          Y       27173
Brick 10.70.42.18:/rhs/brick3/b7            N/A       N/A        N       N/A  
Brick 10.70.41.239:/rhs/brick3/b8           49152     0          Y       28814
Brick 10.70.43.179:/rhs/brick3/b9           49152     0          Y       27173
Self-heal Daemon on localhost               N/A       N/A        Y       10879
Self-heal Daemon on 10.70.41.239            N/A       N/A        Y       29525
Self-heal Daemon on 10.70.43.179            N/A       N/A        Y       27892
 
Task Status of Volume master
------------------------------------------------------------------------------
There are no active volume tasks
 
[root@dhcp42-18 scripts]# gluster volume geo-replication master 10.70.43.116::slave status
 
MASTER NODE     MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                  SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED                  
-----------------------------------------------------------------------------------------------------------------------------------------------------
10.70.42.18     master        /rhs/brick1/b1    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:11:33          
10.70.42.18     master        /rhs/brick2/b4    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:12:02          
10.70.42.18     master        /rhs/brick3/b7    root          10.70.43.116::slave    10.70.42.246    Active     Changelog Crawl    2018-07-10 01:12:18          
10.70.41.239    master        /rhs/brick1/b2    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick2/b5    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.41.239    master        /rhs/brick3/b8    root          10.70.43.116::slave    10.70.43.116    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick1/b3    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick2/b6    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          
10.70.43.179    master        /rhs/brick3/b9    root          10.70.43.116::slave    10.70.42.128    Passive    N/A                N/A                          



Version-Release number of selected component (if applicable):
=============================================================
mainline

How reproducible:
=================
2/2


Steps to Reproduce:
1. Create a geo-replication session (3x3 master and slave volumes)
2. Mount the master and slave volumes
3. Create files on the master
4. Kill a brick using gf_attach

Actual results:
===============
The workers whose bricks were killed still remain ACTIVE.


Expected results:
================
The 3 ACTIVE workers should go to FAULTY, and the 3 PASSIVE workers should become ACTIVE and take over the syncing.
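
The expected failover can be sketched as follows. This is an illustrative model only, not the actual gsyncd monitor code: the `Worker` class and `promote_on_brick_failure` function are hypothetical names, and the real implementation tracks brick liveness via glusterd rather than an explicit set of dead bricks.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    brick: str
    replica_group: int
    status: str  # "Active", "Passive", or "Faulty"

def promote_on_brick_failure(workers, dead_bricks):
    """Mark Active workers whose brick died as Faulty, then promote one
    Passive worker in each replica group that lost its Active worker."""
    # Step 1: a worker cannot stay Active without a live brick underneath it.
    for w in workers:
        if w.status == "Active" and w.brick in dead_bricks:
            w.status = "Faulty"
    # Step 2: each replica group should still have exactly one Active worker.
    for group in {w.replica_group for w in workers}:
        members = [w for w in workers if w.replica_group == group]
        if not any(w.status == "Active" for w in members):
            for w in members:
                if w.status == "Passive":
                    w.status = "Active"
                    break
    return workers
```

The bug reported here corresponds to step 1 never happening: the Active worker keeps its status even though its brick is gone, so no Passive worker is ever promoted.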

Comment 3 Worker Ant 2018-09-05 02:18:45 UTC
REVIEW: https://review.gluster.org/21078 ([geo-rep]: Worker still ACTIVE after killing bricks) posted (#1) for review on master by MOHIT AGRAWAL

Comment 4 Worker Ant 2018-09-05 02:22:25 UTC
REVIEW: https://review.gluster.org/21079 ([geo-rep]: Worker still ACTIVE after killing bricks) posted (#1) for review on master by MOHIT AGRAWAL

Comment 5 Worker Ant 2018-09-05 02:29:26 UTC
REVIEW: https://review.gluster.org/20645 ([geo-rep]: Worker still ACTIVE after killing bricks) posted (#20) for review on master by MOHIT AGRAWAL

Comment 6 Worker Ant 2018-12-13 04:47:42 UTC
REVIEW: https://review.gluster.org/20645 ([geo-rep]: Worker still ACTIVE after killing bricks) posted (#35) for review on master by Amar Tumballi

Comment 7 Shyamsundar 2019-03-25 16:30:27 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-6.0, please open a new bug report.

glusterfs-6.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://lists.gluster.org/pipermail/announce/2019-March/000120.html
[2] https://www.gluster.org/pipermail/gluster-users/
