Description of problem:
geo-rep status shows Active/Passive for a node even when all the geo-rep related processes on that node have been killed. Active/Passive means that the processes are running properly, but in reality they are not running. The status should be shown as faulty/defunct when those processes are not running.

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.44rhs-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create and start a geo-rep session between 2*2 dist-rep master node and 2*2 dist-rep slave node.
2. Now kill all the gsync related processes on one of the Active nodes.
   ps -aef | grep gluster | grep gluster | awk '{print $2}' | xargs kill -9
3. Run geo-rep status.

Actual results:
geo-rep status detail

MASTER NODE                  MASTER VOL    MASTER BRICK          SLAVE             STATUS     CHECKPOINT STATUS    CRAWL STATUS       FILES SYNCD    FILES PENDING    BYTES PENDING    DELETES PENDING    FILES SKIPPED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pythagoras.blr.redhat.com    master        /rhs/bricks/brick0    euclid::slave     Active     N/A                  Changelog Crawl    2374           0                0                0                  0
aryabhatta.blr.redhat.com    master        /rhs/bricks/brick1    gauss::slave      Passive    N/A                  N/A                0              0                0                0                  0
ramanujan.blr.redhat.com     master        /rhs/bricks/brick2    riemann::slave    Active     N/A                  Changelog Crawl    1144           0                0                0                  0
archimedes.blr.redhat.com    master        /rhs/bricks/brick3    euler::slave      Passive    N/A                  N/A                0              0                0                0                  0

But on node 'ramanujan' the gsync processes are not running:

[root@ramanujan ~]# ps -aef | grep gluster | grep gsync
[root@ramanujan ~]#

Expected results:
When the processes are not running, the status should show faulty/defunct. The user should be informed that the processes are not running; otherwise the user may wrongly believe that geo-rep is syncing data without any issues.

Additional info:
Easy to reproduce.

If the monitor process is killed, no running process is left to update the status files. The status command simply reads the content of the status file and displays it. The monitor process should not be killed manually if Geo-rep is to work effectively. If only the workers are killed, the monitor process takes care of updating the status files and restarting the workers. Showing the previous state after the monitor has been killed is therefore expected behavior. To stop Geo-rep, please use the Geo-rep stop command.

A possible enhancement would be for the status command to check the monitor pid status before showing the status output. The status should be shown as "Stopped" if the respective monitor process is not running.
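
For illustration only, the enhancement suggested above could look roughly like the sketch below. It is not the gsyncd implementation: the function names pid_is_running and effective_status are hypothetical, and it assumes the caller has already read the monitor pid and the recorded status from wherever the session stores them. The liveness probe uses signal 0, which on POSIX systems checks for the process without delivering a signal.

```python
import errno
import os


def pid_is_running(pid):
    """Return True if a process with this pid currently exists (POSIX only)."""
    if pid <= 0:
        # pid 0 and negative pids address process groups, not one process.
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence/permission check, nothing is sent
    except OSError as err:
        # EPERM: the process exists but is owned by another user.
        # ESRCH (or anything else): treat the process as not running.
        return err.errno == errno.EPERM
    return True


def effective_status(recorded_status, monitor_pid):
    """Hypothetical helper: pick the status to display for one brick.

    recorded_status is the value read from the status file; monitor_pid is
    the pid of the monitor process, or None if no pid could be found.
    """
    if monitor_pid is None or not pid_is_running(monitor_pid):
        # Nobody is left to update the status file, so its content is stale.
        return "Stopped"
    return recorded_status
```

With a check like this, the stale "Active" recorded for ramanujan would be displayed as "Stopped" once its monitor is gone, while nodes whose monitor is still alive keep the status from the status file.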
Closing this bug since the RHGS 2.1 release has reached EOL. The required bugs have been cloned to RHGS 3.1. Please re-open this issue if it is found again.