Description of problem:
In a geo-rep mount-broker setup, status does not show "Paused" when the session is actually paused.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
# gluster v geo master geoaccount.43.170::slave pause
Pausing geo-replication session between master & geoaccount.43.170::slave has been successful

# gluster v geo master geoaccount.43.170::slave status

MASTER NODE                MASTER VOL    MASTER BRICK                  SLAVE                  STATUS     CHECKPOINT STATUS    CRAWL STATUS
---------------------------------------------------------------------------------------------------------------------------------------------
redlake.blr.redhat.com     master        /bricks/brick1/master_b1      10.70.42.172::slave    Active     N/A                  Changelog Crawl
redlake.blr.redhat.com     master        /bricks/brick2/master_b5      10.70.42.172::slave    Active     N/A                  Changelog Crawl
redlake.blr.redhat.com     master        /bricks/brick3/master_b9      10.70.42.172::slave    Active     N/A                  Changelog Crawl
redcloak.blr.redhat.com    master        /bricks/brick1/master_b2      10.70.42.240::slave    Passive    N/A                  N/A
redcloak.blr.redhat.com    master        /bricks/brick2/master_b6      10.70.42.240::slave    Passive    N/A                  N/A
redcloak.blr.redhat.com    master        /bricks/brick3/master_b10     10.70.42.240::slave    Passive    N/A                  N/A
redcell.blr.redhat.com     master        /bricks/brick1/master_b3      10.70.43.170::slave    Active     N/A                  Changelog Crawl
redcell.blr.redhat.com     master        /bricks/brick2/master_b7      10.70.43.170::slave    Active     N/A                  Changelog Crawl
redcell.blr.redhat.com     master        /bricks/brick3/master_b11     10.70.43.170::slave    Active     N/A                  Changelog Crawl
redeye.blr.redhat.com      master        /bricks/brick1/master_b4      10.70.42.208::slave    Passive    N/A                  N/A
redeye.blr.redhat.com      master        /bricks/brick2/master_b8      10.70.42.208::slave    Passive    N/A                  N/A
redeye.blr.redhat.com      master        /bricks/brick3/master_b12     10.70.42.208::slave    Passive    N/A                  N/A
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Version-Release number of selected component (if applicable):
glusterfs-3.6.0.10-1.el6rhs

How reproducible:
Happens every time.

Steps to Reproduce:
1. Create and start a geo-rep mount-broker setup, using the following steps.
2. Create a new group on the slave nodes. For example, geogroup.
3. Create an unprivileged account on the slave nodes. For example, geoaccount. Make it a member of geogroup on all the slave nodes.
4. Create a new directory on all the slave nodes, owned by root and with permissions 0711. Ensure that the location where this directory is created is writable only by root, but that geoaccount is able to access it. For example, create a mountbroker-root directory at /var/mountbroker-root. (A rough command sketch for steps 2-4 is included under Additional info below.)
5. Add the following options to the glusterd volfile on the slave nodes (found in /etc/glusterfs/glusterd.vol), assuming the name of the slave volume is slavevol:
      option mountbroker-root /var/mountbroker-root
      option mountbroker-geo-replication.geoaccount slavevol
      option geo-replication-log-group geogroup
      option rpc-auth-allow-insecure on
6. Restart glusterd on all the slave nodes. Set up passwordless SSH from one of the master nodes to the user on one of the slave nodes, for example to geoaccount.
7. Create the geo-rep relationship between master and slave for that user. For example:
      gluster volume geo-rep MASTERVOL geoaccount@SLAVENODE::slavevol create push-pem
8. On the slave node that was used to create the relationship, run /usr/libexec/glusterfs/set_geo_rep_pem_keys.sh as root with the user name as argument. For example:
      # /usr/libexec/glusterfs/set_geo_rep_pem_keys.sh geoaccount
9. Start geo-rep as the slave user. For example:
      gluster volume geo-rep MASTERVOL geoaccount@SLAVENODE::slavevol start
10. Start creating some data on the master.
11. Pause the geo-rep relationship. For example:
      gluster volume geo-rep MASTERVOL geoaccount@SLAVENODE::slavevol pause
12. Check the status. For example:
      gluster volume geo-rep MASTERVOL geoaccount@SLAVENODE::slavevol status

Actual results:
Status does not show "Paused" when the session is actually paused.

Expected results:
Status should show that the session is paused, even in a mount-broker setup.

Additional info:
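For reference, the slave-side preparation in steps 2-4 amounts to roughly the following commands, run as root on each slave node. This is only a sketch using the group, account, and directory names from the steps above; adjust to your environment.

      groupadd geogroup
      useradd -G geogroup geoaccount        # or: usermod -a -G geogroup geoaccount, if the user already exists
      mkdir /var/mountbroker-root
      chown root:root /var/mountbroker-root
      chmod 0711 /var/mountbroker-root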
A subsequent resume command does not work, failing with "Geo-rep is not paused"; the only work-around is to use the force option with resume (see the example below).
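For illustration, the work-around looks roughly like this (master volume and slave host are placeholders based on the session above, not verbatim output):

      # gluster volume geo-rep master geoaccount@10.70.43.170::slave resume
        ... fails, reporting that geo-rep is not paused
      # gluster volume geo-rep master geoaccount@10.70.43.170::slave resume force
        ... succeeds and the session resumes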
Though the symptoms (effects) are different, the root cause is the same as that of the following bug: https://bugzilla.redhat.com/show_bug.cgi?id=1104649. Hence marking as duplicate.

*** This bug has been marked as a duplicate of bug 1104121 ***
The upstream patch that fixes this issue is under review: http://review.gluster.org/#/c/7977/