Description of problem:
Hit this on build 3.6.0.46-1.

Had a 4*2 distribute-replicate volume 'master' which was in a geo-rep relationship with another 4*2 distribute-replicate volume 'slave'. A replica brick set was removed to make the volume 'master' 3*2, and none of the files present in that replica set were rebalanced to the other bricks. The remove-brick 'status' output shows the 'rebalanced-files' parameter as 0.

Version-Release number of selected component (if applicable):
3.6.0.46-1

How reproducible:
Hit it once. Have not tried it again.

Steps that I followed:
1. Have a 4*2 distribute-replicate volume 'master' in a geo-rep relationship with volume 'slave'
2. Stop the geo-rep session
   gluster volume geo-rep master dhcp42-130::slave stop
3. Remove one of the replica pairs
   gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 start
4. Check the status of the remove-brick operation
   gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 status
5. Commit the remove-brick operation to reflect the correct values
   gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 commit

Expected results:
At step 4, the files present in the replica set should have been rebalanced to the remaining bricks, and the status command should have shown correct values for the 'rebalanced-files', 'scanned files' and 'files-skipped' parameters. Step 5 should have resulted in deletion of the migrated files from the local brick mountpoint.

Actual results:
Step 4 does not show the expected movement of files due to the remove-brick operation. After step 5, all the previously existing files are still present on the local removed-brick mountpoint. (See the verification sketch after the transcript below.)

Additional info:

[root@dhcp43-154 ~]# gluster system:: execute gsec_create
Common secret pub file present at /var/lib/glusterd/geo-replication/common_secret.pem.pub
[root@dhcp43-154 ~]# gluster v geo-rep master dhcp42-130::slave create push-pem
dhcp42-130::slave is not empty. Please delete existing files in dhcp42-130::slave and retry, or use force to continue without deleting the existing files.
geo-replication command failed
[root@dhcp43-154 ~]# gluster v geo-rep master dhcp42-130::slave create push-pem force
Creating geo-replication session between master & dhcp42-130::slave has been successful
[root@dhcp43-154 ~]# gluster v geo-rep master dhcp42-130::slave status

MASTER NODE                          MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                STATUS         CHECKPOINT STATUS    CRAWL STATUS
--------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp43-154.lab.eng.blr.redhat.com    master        /rhs/brick1/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp43-154.lab.eng.blr.redhat.com    master        /rhs/brick2/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp42-182.lab.eng.blr.redhat.com    master        /rhs/brick1/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp42-182.lab.eng.blr.redhat.com    master        /rhs/brick2/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp42-74.lab.eng.blr.redhat.com     master        /rhs/brick1/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp42-74.lab.eng.blr.redhat.com     master        /rhs/brick2/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp43-72.lab.eng.blr.redhat.com     master        /rhs/brick1/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
dhcp43-72.lab.eng.blr.redhat.com     master        /rhs/brick2/d1    root          dhcp42-130::slave    Not Started    N/A                  N/A
[root@dhcp43-154 ~]#
[root@dhcp43-154 ~]# gluster v geo-rep master dhcp42-130::slave start
Starting geo-replication session between master & dhcp42-130::slave has been successful
[root@dhcp43-154 ~]# gluster v geo-rep master dhcp42-130::slave status

MASTER NODE                          MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                STATUS             CHECKPOINT STATUS    CRAWL STATUS
------------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp43-154.lab.eng.blr.redhat.com    master        /rhs/brick1/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp43-154.lab.eng.blr.redhat.com    master        /rhs/brick2/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp42-182.lab.eng.blr.redhat.com    master        /rhs/brick1/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp42-182.lab.eng.blr.redhat.com    master        /rhs/brick2/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp43-72.lab.eng.blr.redhat.com     master        /rhs/brick1/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp43-72.lab.eng.blr.redhat.com     master        /rhs/brick2/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp42-74.lab.eng.blr.redhat.com     master        /rhs/brick1/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
dhcp42-74.lab.eng.blr.redhat.com     master        /rhs/brick2/d1    root          dhcp42-130::slave    Initializing...    N/A                  N/A
[root@dhcp43-154 ~]# gluster v geo-rep master dhcp42-130::slave status

MASTER NODE                          MASTER VOL    MASTER BRICK      SLAVE USER    SLAVE                STATUS     CHECKPOINT STATUS    CRAWL STATUS
-------------------------------------------------------------------------------------------------------------------------------------------------------
dhcp43-154.lab.eng.blr.redhat.com    master        /rhs/brick1/d1    root          dhcp43-93::slave     Active     N/A                  Changelog Crawl
dhcp43-154.lab.eng.blr.redhat.com    master        /rhs/brick2/d1    root          dhcp43-93::slave     Active     N/A                  Changelog Crawl
dhcp42-182.lab.eng.blr.redhat.com    master        /rhs/brick1/d1    root          dhcp42-19::slave     Passive    N/A                  N/A
dhcp42-182.lab.eng.blr.redhat.com    master        /rhs/brick2/d1    root          dhcp42-19::slave     Passive    N/A                  N/A
dhcp43-72.lab.eng.blr.redhat.com     master        /rhs/brick1/d1    root          dhcp42-210::slave    Passive    N/A                  N/A
dhcp43-72.lab.eng.blr.redhat.com     master        /rhs/brick2/d1    root          dhcp42-210::slave    Passive    N/A                  N/A
dhcp42-74.lab.eng.blr.redhat.com     master        /rhs/brick1/d1    root          dhcp42-130::slave    Active     N/A                  Changelog Crawl
dhcp42-74.lab.eng.blr.redhat.com     master        /rhs/brick2/d1    root          dhcp42-130::slave    Active     N/A                  Changelog Crawl
[root@dhcp43-154 ~]# gluster v i

Volume Name: master
Type: Distributed-Replicate
Volume ID: fcf732d1-81d6-42d1-8915-cc2107fd72f2
Status: Started
Snap Volume: no
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: dhcp43-154:/rhs/brick1/d1
Brick2: dhcp43-72:/rhs/brick1/d1
Brick3: dhcp42-74:/rhs/brick1/d1
Brick4: dhcp42-182:/rhs/brick1/d1
Brick5: dhcp43-154:/rhs/brick2/d1
Brick6: dhcp43-72:/rhs/brick2/d1
Brick7: dhcp42-74:/rhs/brick2/d1
Brick8: dhcp42-182:/rhs/brick2/d1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@dhcp43-154 ~]#
[root@dhcp43-154 ~]#
[root@dhcp43-154 ~]# mount
/dev/mapper/vg_dhcp43154-lv_root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/vda1 on /boot type ext4 (rw)
/dev/mapper/RHS_vg1-RHS_lv1 on /rhs/brick1 type xfs (rw,noatime,nodiratime,inode64)
/dev/mapper/RHS_vg2-RHS_lv2 on /rhs/brick2 type xfs (rw,noatime,nodiratime,inode64)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
[root@dhcp43-154 ~]# mount -t glusterfs dhcp43-154:/master /mnt/master
ERROR: Mount point does not exist
Please specify a mount point
Usage: man 8 /sbin/mount.glusterfs
[root@dhcp43-154 ~]# mkdir /mnt/master
[root@dhcp43-154 ~]# mount -t glusterfs dhcp43-154:/master /mnt/master
[root@dhcp43-154 ~]# cd /mnt/master
[root@dhcp43-154 master]# ls -a
.     a15  a23  a31  a4   a48  a56  a64  a72  a80  a89  a97   b14  b22  b30  b39  b47  b55  b63  b71  b8   b88  b96  c19   c39  c59  c76  c91
..    a16  a24  a32  a40  a49  a57  a65  a73  a81  a9   a98   b15  b23  b31  b4   b48  b56  b64  b72  b80  b89  b97  c2    c40  c62  c79  c92
a1    a17  a25  a33  a41  a5   a58  a66  a74  a82  a90  a99   b16  b24  b32  b40  b49  b57  b65  b73  b81  b9   b98  c20   c45  c63  c80  c96
a10   a18  a26  a34  a42  a50  a59  a67  a75  a83  a91  b1    b17  b25  b33  b41  b5   b58  b66  b74  b82  b90  b99  c22   c47  c64  c81  c97
a100  a19  a27  a35  a43  a51  a6   a68  a76  a84  a92  b10   b18  b26  b34  b42  b50  b59  b67  b75  b83  b91  c100 c25   c48  c65  c82  c98
a11   a2   a28  a36  a44  a52  a60  a69  a77  a85  a93  b100  b19  b27  b35  b43  b51  b6   b68  b76  b84  b92  c12  c28   c5   c68  c85  c99
a12   a20  a29  a37  a45  a53  a61  a7   a78  a86  a94  b11   b2   b28  b36  b44  b52  b60  b69  b77  b85  b93  c13  c29   c51  c70  c86
a13   a21  a3   a38  a46  a54  a62  a70  a79  a87  a95  b12   b20  b29  b37  b45  b53  b61  b7   b78  b86  b94  c16  c30   c53  c72  c88
a14   a22  a30  a39  a47  a55  a63  a71  a8   a88  a96  b13   b21  b3   b38  b46  b54  b62  b70  b79  b87  b95  c18  c36   c54  c74  c90
[root@dhcp43-154 master]# ls -l | wc -l
248
[root@dhcp43-154 master]# cd /rh
rhev/ rhs/
[root@dhcp43-154 master]# cd /rh
rhev/ rhs/
[root@dhcp43-154 master]# ls -l /rhs/brick1/d1/ | wc -l
75
[root@dhcp43-154 master]# ls -l /rhs/brick2/d1/ | wc -l
83
[root@dhcp43-154 master]#
[root@dhcp43-154 master]# ls -l /rhs/brick2/d1/
total 0
-rw-r--r-- 2 root root 0 Feb 17 20:00 a100
-rw-r--r-- 2 root root 0 Feb 17 20:00 a13
-rw-r--r-- 2 root root 0 Feb 17 20:00 a19
-rw-r--r-- 2 root root 0 Feb 17 20:00 a2
-rw-r--r-- 2 root root 0 Feb 17 20:00 a23
-rw-r--r-- 2 root root 0 Feb 17 20:00 a30
-rw-r--r-- 2 root root 0 Feb 17 20:00 a31
-rw-r--r-- 2 root root 0 Feb 17 20:00 a35
-rw-r--r-- 2 root root 0 Feb 17 20:00 a38
-rw-r--r-- 2 root root 0 Feb 17 20:00 a41
-rw-r--r-- 2 root root 0 Feb 17 20:00 a46
-rw-r--r-- 2 root root 0 Feb 17 20:00 a47
-rw-r--r-- 2 root root 0 Feb 17 20:00 a49
-rw-r--r-- 2 root root 0 Feb 17 20:00 a50
-rw-r--r-- 2 root root 0 Feb 17 20:00 a54
-rw-r--r-- 2 root root 0 Feb 17 20:00 a56
-rw-r--r-- 2 root root 0 Feb 17 20:00 a67
-rw-r--r-- 2 root root 0 Feb 17 20:00 a69
-rw-r--r-- 2 root root 0 Feb 17 20:00 a7
-rw-r--r-- 2 root root 0 Feb 17 20:00 a93
-rw-r--r-- 2 root root 0 Feb 17 20:00 a95
-rw-r--r-- 2 root root 0 Feb 17 20:00 a99
-rw-r--r-- 2 root root 0 Feb 19 15:16 b10
-rw-r--r-- 2 root root 0 Feb 19 15:16 b11
-rw-r--r-- 2 root root 0 Feb 19 15:16 b13
-rw-r--r-- 2 root root 0 Feb 19 15:16 b19
-rw-r--r-- 2 root root 0 Feb 19 15:16 b20
-rw-r--r-- 2 root root 0 Feb 19 15:16 b25
-rw-r--r-- 2 root root 0 Feb 19 15:16 b28
-rw-r--r-- 2 root root 0 Feb 19 15:16 b33
-rw-r--r-- 2 root root 0 Feb 19 15:16 b38
-rw-r--r-- 2 root root 0 Feb 19 15:16 b40
-rw-r--r-- 2 root root 0 Feb 19 15:16 b41
-rw-r--r-- 2 root root 0 Feb 19 15:16 b42
-rw-r--r-- 2 root root 0 Feb 19 15:16 b5
-rw-r--r-- 2 root root 0 Feb 19 15:16 b50
-rw-r--r-- 2 root root 0 Feb 19 15:16 b52
-rw-r--r-- 2 root root 0 Feb 19 15:16 b53
-rw-r--r-- 2 root root 0 Feb 19 15:16 b54
-rw-r--r-- 2 root root 0 Feb 19 15:16 b58
-rw-r--r-- 2 root root 0 Feb 19 15:16 b6
-rw-r--r-- 2 root root 0 Feb 19 15:16 b7
-rw-r--r-- 2 root root 0 Feb 19 15:16 b70
-rw-r--r-- 2 root root 0 Feb 19 15:16 b72
-rw-r--r-- 2 root root 0 Feb 19 15:16 b74
-rw-r--r-- 2 root root 0 Feb 19 15:16 b75
-rw-r--r-- 2 root root 0 Feb 19 15:16 b77
-rw-r--r-- 2 root root 0 Feb 19 15:16 b79
-rw-r--r-- 2 root root 0 Feb 19 15:16 b8
-rw-r--r-- 2 root root 0 Feb 19 15:16 b80
-rw-r--r-- 2 root root 0 Feb 19 15:16 b81
-rw-r--r-- 2 root root 0 Feb 19 15:16 b82
-rw-r--r-- 2 root root 0 Feb 19 15:16 b83
-rw-r--r-- 2 root root 0 Feb 19 15:16 b84
-rw-r--r-- 2 root root 0 Feb 19 15:16 b86
-rw-r--r-- 2 root root 0 Feb 19 15:16 b87
-rw-r--r-- 2 root root 0 Feb 19 15:16 b95
-rw-r--r-- 2 root root 0 Feb 19 15:16 b96
-rw-r--r-- 2 root root 0 Feb 19 16:37 c12
-rw-r--r-- 2 root root 0 Feb 19 16:37 c19
-rw-r--r-- 2 root root 0 Feb 19 16:37 c22
-rw-r--r-- 2 root root 0 Feb 19 16:37 c25
-rw-r--r-- 2 root root 0 Feb 19 16:37 c28
-rw-r--r-- 2 root root 0 Feb 19 16:37 c29
-rw-r--r-- 2 root root 0 Feb 19 16:37 c39
-rw-r--r-- 2 root root 0 Feb 19 16:37 c5
-rw-r--r-- 2 root root 0 Feb 19 16:37 c51
-rw-r--r-- 2 root root 0 Feb 19 16:37 c53
-rw-r--r-- 2 root root 0 Feb 19 16:37 c54
-rw-r--r-- 2 root root 0 Feb 19 16:37 c59
-rw-r--r-- 2 root root 0 Feb 19 16:37 c62
-rw-r--r-- 2 root root 0 Feb 19 16:37 c65
-rw-r--r-- 2 root root 0 Feb 19 16:37 c70
-rw-r--r-- 2 root root 0 Feb 19 16:37 c72
-rw-r--r-- 2 root root 0 Feb 19 16:37 c82
-rw-r--r-- 2 root root 0 Feb 19 16:37 c85
-rw-r--r-- 2 root root 0 Feb 19 16:37 c88
-rw-r--r-- 2 root root 0 Feb 19 16:37 c90
-rw-r--r-- 2 root root 0 Feb 19 16:37 c91
-rw-r--r-- 2 root root 0 Feb 19 16:37 c92
-rw-r--r-- 2 root root 0 Feb 19 16:37 c97
-rw-r--r-- 2 root root 0 Feb 19 16:37 c98
[root@dhcp43-154 master]#
[root@dhcp43-154 master]# gluster v i

Volume Name: master
Type: Distributed-Replicate
Volume ID: fcf732d1-81d6-42d1-8915-cc2107fd72f2
Status: Started
Snap Volume: no
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: dhcp43-154:/rhs/brick1/d1
Brick2: dhcp43-72:/rhs/brick1/d1
Brick3: dhcp42-74:/rhs/brick1/d1
Brick4: dhcp42-182:/rhs/brick1/d1
Brick5: dhcp43-154:/rhs/brick2/d1
Brick6: dhcp43-72:/rhs/brick2/d1
Brick7: dhcp42-74:/rhs/brick2/d1
Brick8: dhcp42-182:/rhs/brick2/d1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@dhcp43-154 master]#
[root@dhcp43-154 master]# gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 start
volume remove-brick start: success
ID: 0af990dd-684b-4851-91c7-571615d21f08
[root@dhcp43-154 master]# gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 status
     Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
---------  ----------------  ------  -------  --------  -------  ---------  ----------------
localhost                 0  0Bytes      247         0        0  completed              1.00
dhcp43-72                 0  0Bytes      247         0        0  completed              1.00
[root@dhcp43-154 master]# ls /rhs/brick2/d1 | wc -l
82
[root@dhcp43-154 master]# ls /rhs/brick2/d1 | wc -l
82
[root@dhcp43-154 master]# gluster v i

Volume Name: master
Type: Distributed-Replicate
Volume ID: fcf732d1-81d6-42d1-8915-cc2107fd72f2
Status: Started
Snap Volume: no
Number of Bricks: 4 x 2 = 8
Transport-type: tcp
Bricks:
Brick1: dhcp43-154:/rhs/brick1/d1
Brick2: dhcp43-72:/rhs/brick1/d1
Brick3: dhcp42-74:/rhs/brick1/d1
Brick4: dhcp42-182:/rhs/brick1/d1
Brick5: dhcp43-154:/rhs/brick2/d1
Brick6: dhcp43-72:/rhs/brick2/d1
Brick7: dhcp42-74:/rhs/brick2/d1
Brick8: dhcp42-182:/rhs/brick2/d1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@dhcp43-154 master]# gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: failed: geo-replication sessions are active for the volume master.
Stop geo-replication sessions involved in this volume. Use 'volume geo-replication status' command for more info.
[root@dhcp43-154 master]# gluster v geo-rep master dhcp42-130::slave stop
Stopping geo-replication session between master & dhcp42-130::slave has been successful
[root@dhcp43-154 master]# gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 status
     Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
---------  ----------------  ------  -------  --------  -------  ---------  ----------------
localhost                 0  0Bytes      247         0        0  completed              1.00
dhcp43-72                 0  0Bytes      247         0        0  completed              1.00
[root@dhcp43-154 master]# gluster v remove-brick master replica 2 dhcp43-154:/rhs/brick2/d1 dhcp43-72:/rhs/brick2/d1 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
volume remove-brick commit: success
Check the removed bricks to ensure all files are migrated. If files with data are found on the brick path, copy them via a gluster mount point before re-purposing the removed brick.
[root@dhcp43-154 master]#
[root@dhcp43-154 master]#
[root@dhcp43-154 master]# gluster v i

Volume Name: master
Type: Distributed-Replicate
Volume ID: fcf732d1-81d6-42d1-8915-cc2107fd72f2
Status: Started
Snap Volume: no
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: dhcp43-154:/rhs/brick1/d1
Brick2: dhcp43-72:/rhs/brick1/d1
Brick3: dhcp42-74:/rhs/brick1/d1
Brick4: dhcp42-182:/rhs/brick1/d1
Brick5: dhcp42-74:/rhs/brick2/d1
Brick6: dhcp42-182:/rhs/brick2/d1
Options Reconfigured:
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@dhcp43-154 master]# ls -l /rhs/brick2/d1 | wc -l
83
[root@dhcp43-154 master]# ssh dhcp42-72 'ls -l /rhs/brick2/d1 | wc -l'
ssh: connect to host dhcp42-72 port 22: No route to host
[root@dhcp43-154 master]# ssh dhcp43-72 'ls -l /rhs/brick2/d1 | wc -l'
The authenticity of host 'dhcp43-72 (10.70.43.72)' can't be established.
RSA key fingerprint is 73:fd:3a:81:fb:46:ce:d2:ba:38:d3:87:ac:02:6b:ac.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'dhcp43-72' (RSA) to the list of known hosts.
root@dhcp43-72's password:
83
[root@dhcp43-154 master]#
[root@dhcp43-154 master]# ls -l /rhs/brick1/d1 | wc -l
75
[root@dhcp43-154 master]# ssh 10.70.42.74 'ls -l /rhs/brick2/d1 | wc -l'
root@10.70.42.74's password:
49
[root@dhcp43-154 master]# ssh 10.70.42.182 'ls -l /rhs/brick2/d1 | wc -l'
root@10.70.42.182's password:
49
[root@dhcp43-154 master]# ssh 10.70.42.74 'ls -l /rhs/brick1/d1 | wc -l'
root@10.70.42.74's password:
44
[root@dhcp43-154 master]#
[root@dhcp43-154 master]#
[root@dhcp43-154 master]# ls -l | wc -l
166
[root@dhcp43-154 master]#
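For reference, here is a minimal shell sketch of how one can check from the brick backend whether the remove-brick rebalance actually migrated anything. The brick path and the file name 'a100' are taken from the transcript above; the sticky-bit/linkto detail is generic DHT behaviour, not something captured in this log:

# Removed-brick backend directory (assumption: taken from this setup)
BRICK=/rhs/brick2/d1

# Regular files still present on the removed brick, excluding the .glusterfs
# metadata tree. After a successful rebalance, real data should be gone.
find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -type f -print | wc -l

# DHT marks a migrated entry with a zero-byte, sticky-bit (---------T) link
# file; the leftovers in this report are plain -rw-r--r-- files, i.e. the
# entries were never migrated. Count any sticky-bit link files:
find "$BRICK" -path "$BRICK/.glusterfs" -prune -o -type f -perm -1000 -print | wc -l

# Inspect one leftover's xattrs; a link file would carry a
# trusted.glusterfs.dht.linkto pointer to the new subvolume.
getfattr -d -m . -e hex "$BRICK/a100"

Since the test files here are all zero-byte, file size alone cannot distinguish migrated from unmigrated entries; the mode bits and the linkto xattr are the reliable markers.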
SOS reports copied to: http://rhsqe-repo.lab.eng.blr.redhat.com/sosreports/1196029/
A similar issue is seen with a dist-rep volume, but not on pure distribute. Output below:

[root@rhsauto032 mnt]# gluster v info test2

Volume Name: test2
Type: Distribute
Volume ID: e963a35c-037f-48dc-ab7d-ce28fc1b65e0
Status: Started
Snap Volume: no
Number of Bricks: 4
Transport-type: tcp
Bricks:
Brick1: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/t2
Brick2: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/t2
Brick3: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick2/t2
Brick4: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick2/t2
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@rhsauto032 mnt]# gluster v remove-brick test2 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick2/t2 start
volume remove-brick start: success
ID: 1b779bad-38dc-483f-bfff-b3956ce90d71
[root@rhsauto032 mnt]#
[root@rhsauto032 mnt]# gluster v remove-brick test2 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick2/t2 status
                             Node  Rebalanced-files     size  scanned  failures  skipped       status  run time in secs
                        ---------  ----------------  -------  -------  --------  -------  -----------  ----------------
rhsauto034.lab.eng.blr.redhat.com                 7   70.0MB       22         0        0  in progress              6.00
[root@rhsauto032 mnt]# gluster v remove-brick test2 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick2/t2 status
                             Node  Rebalanced-files     size  scanned  failures  skipped       status  run time in secs
                        ---------  ----------------  -------  -------  --------  -------  -----------  ----------------
rhsauto034.lab.eng.blr.redhat.com                24  240.0MB       60         0        0    completed             13.00
[root@rhsauto032 mnt]#

Issue is not seen on a distribute volume.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

A remove-brick on the dist-rep volume reproduced the issue:
[root@rhsauto032 ~]# gluster v create dist-rep replica 2 `hostname`:/rhs/brick1/d1 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d1 `hostname`:/rhs/brick1/d2 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d2 `hostname`:/rhs/brick1/d3 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3
volume create: dist-rep: success: please start the volume to access data
[root@rhsauto032 ~]# gluster v start dist-rep
volume start: dist-rep: success
[root@rhsauto032 ~]# df
Filesystem                                1K-blocks     Used  Available Use% Mounted on
/dev/mapper/vg_rhsauto032-lv_root          17938864  1860484   15160468  11% /
tmpfs                                       2027372        0    2027372   0% /dev/shm
/dev/vda1                                    487652    28454     433598   7% /boot
/dev/mapper/RHS_vg1-RHS_lv1                52135040   207888   51927152   1% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                52135040  2296852   49838188   5% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                52135040    33808   52101232   1% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                52135040    33616   52101424   1% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                52135040    33616   52101424   1% /rhs/brick5
rhsauto032.lab.eng.blr.redhat.com:test2   208540160  3719296  204820864   2% /mnt
[root@rhsauto032 ~]# umount /mnt
[root@rhsauto032 ~]# gluster v info dist-repo
Volume dist-repo does not exist
[root@rhsauto032 ~]# gluster v info dist-rep

Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 8578bce5-0c66-4388-8c6d-176f17aaf83e
Status: Started
Snap Volume: no
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d1
Brick2: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d1
Brick3: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d2
Brick4: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d2
Brick5: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d3
Brick6: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@rhsauto032 ~]# df
Filesystem                                1K-blocks     Used  Available Use% Mounted on
/dev/mapper/vg_rhsauto032-lv_root          17938864  1860456   15160496  11% /
tmpfs                                       2027372        0    2027372   0% /dev/shm
/dev/vda1                                    487652    28454     433598   7% /boot
/dev/mapper/RHS_vg1-RHS_lv1                52135040   207888   51927152   1% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                52135040  2296852   49838188   5% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                52135040    33808   52101232   1% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                52135040    33616   52101424   1% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                52135040    33616   52101424   1% /rhs/brick5
[root@rhsauto032 ~]# mount -t glusterfs `hostname`:dist-rep /mnt
[root@rhsauto032 ~]# df
Filesystem                                  1K-blocks     Used  Available Use% Mounted on
/dev/mapper/vg_rhsauto032-lv_root            17938864  1860416   15160536  11% /
tmpfs                                         2027372        0    2027372   0% /dev/shm
/dev/vda1                                      487652    28454     433598   7% /boot
/dev/mapper/RHS_vg1-RHS_lv1                  52135040   207888   51927152   1% /rhs/brick1
/dev/mapper/RHS_vg2-RHS_lv2                  52135040  2296852   49838188   5% /rhs/brick2
/dev/mapper/RHS_vg3-RHS_lv3                  52135040    33808   52101232   1% /rhs/brick3
/dev/mapper/RHS_vg4-RHS_lv4                  52135040    33616   52101424   1% /rhs/brick4
/dev/mapper/RHS_vg5-RHS_lv5                  52135040    33616   52101424   1% /rhs/brick5
rhsauto032.lab.eng.blr.redhat.com:dist-rep  156405120  3542144  152862976   3% /mnt
[root@rhsauto032 ~]# cd /mnt
[root@rhsauto032 mnt]# ls
[root@rhsauto032 mnt]# for i in {1..100}
> do
> dd if=/dev/urandom of=$i bs=1M count=1
> done
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.228197 s, 4.6 MB/s
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.213948 s, 4.9 MB/s
1+0 records in
1+0 records out
[root@rhsauto032 mnt]# gluster v info dist-rep
Volume Name: dist-rep
Type: Distributed-Replicate
Volume ID: 8578bce5-0c66-4388-8c6d-176f17aaf83e
Status: Started
Snap Volume: no
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d1
Brick2: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d1
Brick3: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d2
Brick4: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d2
Brick5: rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d3
Brick6: rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3
Options Reconfigured:
performance.readdir-ahead: on
snap-max-hard-limit: 256
snap-max-soft-limit: 90
auto-delete: disable
[root@rhsauto032 mnt]# gluster v remove-brick dist-rep rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d3 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3 start
volume remove-brick start: success
ID: 2898eaa5-0ed1-4951-a3ea-b5490c029c32
[root@rhsauto032 mnt]# gluster v remove-brick dist-rep rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d3 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3 status
                             Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
                        ---------  ----------------  ------  -------  --------  -------  ---------  ----------------
                        localhost                 0  0Bytes      100         0        0  completed              0.00
rhsauto034.lab.eng.blr.redhat.com                 0  0Bytes      100         0        0  completed              1.00
[root@rhsauto032 mnt]# gluster v remove-brick dist-rep rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d3 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3 status
                             Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
                        ---------  ----------------  ------  -------  --------  -------  ---------  ----------------
                        localhost                 0  0Bytes      100         0        0  completed              0.00
rhsauto034.lab.eng.blr.redhat.com                 0  0Bytes      100         0        0  completed              1.00
[root@rhsauto032 mnt]# gluster v remove-brick dist-rep rhsauto032.lab.eng.blr.redhat.com:/rhs/brick1/d3 rhsauto034.lab.eng.blr.redhat.com:/rhs/brick1/d3 status
                             Node  Rebalanced-files    size  scanned  failures  skipped     status  run time in secs
                        ---------  ----------------  ------  -------  --------  -------  ---------  ----------------
                        localhost                 0  0Bytes      100         0        0  completed              0.00
rhsauto034.lab.eng.blr.redhat.com                 0  0Bytes      100         0        0  completed              1.00
[root@rhsauto032 mnt]#
[root@rhsauto032 mnt]# ll /rhs/brick1/d3
total 29696
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 16
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 18
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 19
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 22
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 23
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 27
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 28
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 32
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 34
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 37
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 38
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 41
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 43
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 44
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 56
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 57
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 60
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 63
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 69
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 7
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 71
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 78
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 80
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 82
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 83
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 86
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 89
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 92
-rw-r--r-- 2 root root 1048576 Feb 26 00:36 99
[root@rhsauto032 mnt]#
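As a quick cross-check (a sketch assuming the mount point and removed-brick path from this transcript), one can confirm that the leftovers are real unmigrated data files rather than DHT link files:

# Paths are assumptions taken from the transcript above
MNT=/mnt                 # client mount of volume dist-rep
BRICK=/rhs/brick1/d3     # removed brick on this node

# Names visible on the client that still sit on the removed brick
comm -12 <(ls "$MNT" | sort) <(ls "$BRICK" | sort)

# Files on the removed brick that still hold data (size > 0). With the 1 MiB
# test files written above, a completed rebalance should leave none behind.
find "$BRICK" -maxdepth 1 -type f -size +0c | wc -l

Here the 29 leftover entries are full 1 MiB -rw-r--r-- files, so they were never migrated, even though the status output reports the rebalance as completed.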
https://code.engineering.redhat.com/gerrit/#/c/42829/
Verified the above bug on build glusterfs-3.6.0.50-1.

Had a 4*2 distribute-replicate volume set up in a geo-rep relationship with another 2*2 volume. Removed one of the replica pairs on the master to make the volume 3*2, and it succeeded; the files were synced to the other bricks.

Moving the bug to fixed in 3.0.4. Detailed logs are attached.
Created attachment 1000354 [details]
Detailed logs
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0682.html