+++ This bug was initially created as a clone of Bug #1230101 +++
Description of problem:
While running remove-brick with replica count 2 on an existing replica 2 volume, glusterd crashes with the following backtrace:
#0 0x00007fcdd03e681c in subvol_matcher_update (req=0x25989cc) at glusterd-brick-ops.c:662
#1 __glusterd_handle_remove_brick (req=0x25989cc) at glusterd-brick-ops.c:985
#2 0x00007fcdd03542bf in glusterd_big_locked_handler (req=0x25989cc, actor_fn=0x7fcdd03e5f90 <__glusterd_handle_remove_brick>) at glusterd-handler.c:83
#3 0x0000003b0d8655b2 in synctask_wrap (old_task=<value optimized out>) at syncop.c:375
#4 0x0000003b028438f0 in ?? () from /lib64/libc.so.6
#5 0x0000000000000000 in ?? ()
[2015-06-10 14:18:01.134630] I [glusterd-handler.c:1404:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-06-10 14:18:01.137158] I [glusterd-handler.c:1404:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-06-10 14:18:28.239515] I [glusterd-brick-ops.c:779:__glusterd_handle_remove_brick] 0-management: Received rem brick req
[2015-06-10 14:18:28.239593] I [glusterd-brick-ops.c:849:__glusterd_handle_remove_brick] 0-management: request to change replica-count to 2
frame : type(0) op(0)
signal received: 11
time of crash:
package-string: glusterfs 3.7.1
Steps to Reproduce:
1. Create 2X3 distributed-replicate volume
2. Start the volume
3. Shrink it to 2X2 distributed-replicate volume by explicitly mentioning replica 2 in 'remove-brick force'
4. Shrink the volume again to 2X1 distribute volume by explicitly mentioning replica 1 in 'remove-brick force'
Removing a brick with replica count 2 from a replica 2 volume is a failure case; it should print usage or fail gracefully.
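The brick arithmetic behind the steps above can be sketched as follows. This is only an illustrative model: the helper name `bricks_to_remove` is hypothetical, and the `ValueError` mirrors the expectation stated in this report (same or larger replica count should be rejected), not glusterd's actual validation logic.

```python
def bricks_to_remove(dist_count: int, old_replica: int, new_replica: int) -> int:
    """Bricks dropped when every replica set shrinks from old_replica to new_replica.

    Hypothetical helper for illustration; not part of glusterd or the gluster CLI.
    """
    if not 1 <= new_replica < old_replica:
        # Models the "fail gracefully" expectation from this report.
        raise ValueError("new replica count must be >= 1 and below the current one")
    return dist_count * (old_replica - new_replica)


# Steps 3 and 4: 2X3 -> 2X2, then 2X2 -> 2X1 (distribute count stays at 2).
steps = [(2, 3, 2), (2, 2, 1)]
removed = [bricks_to_remove(*step) for step in steps]
print(removed)
```

Under this model each shrink removes one brick per replica set, so both steps drop two bricks, while asking for replica 2 on a replica 2 volume is rejected instead of being processed.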
--- Additional comment from Red Hat Bugzilla Rules Engine on 2015-06-10 05:05:33 EDT ---
This bug is automatically being proposed for Red Hat Gluster Storage 3.1.0 by setting the release flag 'rhgs-3.1.0' to '?'.
If this bug should be proposed for a different release, please manually change the proposed release flag.
--- Additional comment from Rahul Hinduja on 2015-06-10 05:07:22 EDT ---
[root@georep1 scripts]# gluster volume info
Volume Name: master
Volume ID: 7156c64c-a44b-40a4-98db-247a06d1f41e
Number of Bricks: 2 x 2 = 4
[root@georep1 scripts]# gluster volume remove-brick master replica 2 10.70.46.97:/rhs/brick1/b1 10.70.46.97:/rhs/brick2/b2 start
Connection failed. Please check if gluster daemon is operational.
--- Additional comment from Rahul Hinduja on 2015-06-10 05:15:20 EDT ---
sosreport @ http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1230101/
Additional Info: this volume is part of geo-rep master cluster.
--- Additional comment from SATHEESARAN on 2015-06-10 05:44:58 EDT ---
I have tried to reproduce the issue.
It's reproducible only in the following case:
1. Create a 2X3 distributed-replicate volume
2. Shrink it to a 2X2 distributed-replicate volume
3. Shrink it from 2X2 to a 2X1 distribute volume
Here are a few more observations:
1. There is no crash observed when creating a 2X2 volume and shrinking it to 2X1
2. There is no crash observed when creating a 2X3 volume and shrinking it to 2X2
3. There is no crash observed when trying to remove each brick from all replica sets; a proper error message is shown
REVIEW: http://review.gluster.org/11165 (glusterd: subvol_count value for replicate volume should be calculate correctly) posted (#1) for review on master by Gaurav Kumar Garg (email@example.com)
REVIEW: http://review.gluster.org/11165 (glusterd: subvol_count value for replicate volume should be calculate correctly) posted (#2) for review on master by Gaurav Kumar Garg (firstname.lastname@example.org)
REVIEW: http://review.gluster.org/11165 (glusterd: subvol_count value for replicate volume should be calculate correctly) posted (#4) for review on master by Gaurav Kumar Garg (email@example.com)
COMMIT: http://review.gluster.org/11165 committed in master by Krishnan Parthasarathi (firstname.lastname@example.org)
Author: Gaurav Kumar Garg <email@example.com>
Date: Wed Jun 10 15:11:39 2015 +0530
glusterd: subvol_count value for replicate volume should be calculate correctly
glusterd was crashing while trying to remove bricks from replica set
after shrinking nx3 replica to nx2 replica to nx1 replica.
This is because volinfo->subvol_count is calculating value from old
replica count value.
Signed-off-by: Gaurav Kumar Garg <firstname.lastname@example.org>
Reviewed-by: Atin Mukherjee <email@example.com>
Reviewed-by: Ravishankar N <firstname.lastname@example.org>
Tested-by: Gluster Build System <email@example.com>
Tested-by: NetBSD Build System <firstname.lastname@example.org>
Reviewed-by: Krishnan Parthasarathi <email@example.com>
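The effect of a stale replica count described in the commit message can be illustrated with a small model. This is a sketch under stated assumptions, not the glusterd code: it assumes the subvolume count is derived as brick count divided by replica count (integer division) and that a brick's subvolume index is its position divided by the replica count. The function names are hypothetical.

```python
def subvol_count(brick_count: int, replica_count: int) -> int:
    """Number of replica sets, assuming count = bricks // replica (assumption)."""
    return brick_count // replica_count


def subvol_index(brick_pos: int, replica_count: int) -> int:
    """Replica set a brick falls in, assuming bricks are laid out set by set."""
    return brick_pos // replica_count


# After shrinking 2X3 -> 2X2 there are 4 bricks and the replica count is 2.
bricks_left = 4
stale = subvol_count(bricks_left, 3)  # computed with the old replica count
fresh = subvol_count(bricks_left, 2)  # computed with the new replica count

# Subvolume indices the next remove-brick would look up for each remaining brick.
indices = [subvol_index(pos, 2) for pos in range(bricks_left)]
out_of_range = [i for i in indices if i >= stale]

print(stale, fresh, indices, out_of_range)
```

In this model the stale count is 1 while the volume really has 2 replica sets, so any per-subvolume structure sized by the stale value would be indexed past its end for the bricks of the second set. That is consistent with the crash appearing only on the second shrink (2X2 to 2X1), although whether glusterd indexes an array exactly this way is an assumption here.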
The fix for this BZ is already present in a GlusterFS release. A clone of this BZ was fixed in a GlusterFS release and closed, hence closing this mainline BZ as well.
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.
glusterfs-3.8.0 has been announced on the Gluster mailing lists; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list and the update infrastructure for your distribution.