Created attachment 1457584
Geo-replication session that doesn't start

Description of problem:
I have been using Ansible to automate the scheduling of geo-replication sessions using cron and 'python /usr/share/glusterfs/scripts/schedule_georep.py'.

Here is an example of the playbook that sets this cron job:
http://pastebin.test.redhat.com/615005

This method was taken from the following doc:
https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.3/html/administration_guide/chap-managing_geo-replication-schedule_cron_job

Using this method, I have found that the geo-replication session does not always start at the time the cron job is scheduled for. I have seen this happen many times: the cron job is scheduled for, say, 2:30, I start tracking the geo-rep session, the clock passes 2:30, and the job never kicks off.

When I create the geo-replication session, I set the job for 2 minutes in the future. As far as I can tell, the setup is always finished by then and the session is just waiting to start. For the most part it does start, but about 20% of the time the geo-replication session never initiates. I have also extended the delay to 5 minutes after the geo-rep session has been created and still see the same behaviour: the geo-replication session remains in the Created state long after the cron job was supposed to start.

Version-Release number of selected component (if applicable):
gluster 3.8.4
rhhi 1.1

How reproducible:
About 20% of the time

Steps to Reproduce:
1. Configure a geo-rep session.
2. Pick a time 5 minutes into the future and create a cron job to start the geo-replication session at that time.
3. Watch 'gluster v geo-rep status' to check whether the geo-rep session starts at the time the job is set for.

Actual results:
About 20% of the time the geo-replication session never starts.

Expected results:
The geo-replication session should start at the scheduled time 100% of the time.
Additional info:
I have included a screenshot of this issue.

Left side of screen: My Ansible playbook polling the 'gluster v geo-rep status' command. As you can see, it fails if the scheduled time of the cron job has gone past and the status has not changed from Created.

Upper right of screen: The 'watch gluster v geo-rep status' command. As you can see, the geo-rep status is still Created.

Lower right of screen: The 'watch crontab -l' command. The job was scheduled to start at 3:58, but it never did. The top of the screen shows that it is 3:59 and the session has not kicked off; it missed its window of opportunity. The job never kicks off late; it just waits until the next time it is scheduled.
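For anyone trying to reproduce this, step 2 of the reproduction steps can be sketched roughly as follows. This is only an illustration, not the playbook from the pastebin link: the helper name cron_line_in and its parameters are hypothetical, and the command string is whatever you would normally put in the crontab entry. It truncates the current time to a whole minute (cron's granularity) before adding the offset, so the real lead time is between N-1 and N minutes.

```python
from datetime import datetime, timedelta


def cron_line_in(minutes_ahead, command, now=None):
    """Build a crontab line that fires about `minutes_ahead` minutes after `now`.

    Hypothetical helper for illustration; not part of gluster or Ansible.
    """
    now = now or datetime.now()
    # cron only has minute granularity, so drop seconds before adding the offset.
    # The effective lead time is therefore between minutes_ahead - 1 and minutes_ahead.
    target = now.replace(second=0, microsecond=0) + timedelta(minutes=minutes_ahead)
    # Field order: minute hour day-of-month month day-of-week command
    return f"{target.minute} {target.hour} {target.day} {target.month} * {command}"


if __name__ == "__main__":
    # Placeholder command; substitute the real schedule_georep.py invocation.
    print(cron_line_in(5, "echo start-georep"))
```

Pinning the day-of-month and month fields makes the entry match only that one date each year, which is convenient when testing a single kick-off; use '*' in those fields for a recurring schedule.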
Marking priority as medium, as the suggested way to schedule geo-rep is via the UI.

Kotresh, can you take a look at the failure to kick off geo-rep?
Closing as no data available. Please re-open if you have the requested data to debug this.