Description of problem:
The checkpoint is never reached because the checkpoint becomes stale.

Logs from geo-rep:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-01-29 19:04:02.516289] I [master(/bricks/master_brick1):438:crawlwrap] _GMaster: crawl interval: 3 seconds
[2014-01-29 19:04:06.903163] I [master(/bricks/master_brick5):587:checkpt_service] _GMaster: checkpoint now:1391002418.973412 completed
[2014-01-29 19:04:07.96450] W [master(/bricks/master_brick1):580:checkpt_service] _GMaster: completion time 2014-01-29 19:04:02.236168 for checkpoint now:1391002418.973412 became stale
[2014-01-29 19:04:39.211195] I [monitor(monitor):81:set_state] Monitor: new state: Stable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Version-Release number of selected component (if applicable):
glusterfs-3.4.0.58rhs-1

How reproducible:
Intermittent; it does not happen every time.

Steps to Reproduce:
1. Create and start a geo-rep session between master and slave.
2. Create data on the master and set the checkpoint.
3. Check the geo-rep log file for checkpoint messages.

Actual results:
The checkpoint is never reached; the log reports that the checkpoint became stale.

Expected results:
The checkpoint should complete properly once it has been set successfully.

Additional info:
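The log check in step 3 can be done by eye, but a small scan helps when the failure is intermittent. The following is a rough sketch only: the regular expression is inferred from the four log lines quoted in this report, not from the gsyncd source, and the names LOG_LINE, checkpoint_events and sample are hypothetical helpers made up for illustration.

```python
import re

# Assumed log layout, inferred from the excerpts quoted in this report:
# [timestamp] LEVEL [module(brick):lineno:function] Class: message
LOG_LINE = re.compile(
    r"\[(?P<ts>[\d-]+ [\d:.]+)\] (?P<level>[A-Z]) "
    r"\[(?P<module>\w+)\((?P<brick>[^)]*)\):(?P<lineno>\d+):(?P<func>\w+)\] "
    r"(?P<cls>\w+): (?P<msg>.*)"
)


def checkpoint_events(lines):
    """Return (timestamp, brick, state) for each checkpt_service message.

    state is "completed" or "stale"; all other log lines are ignored.
    """
    events = []
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if not m or m.group("func") != "checkpt_service":
            continue
        msg = m.group("msg")
        if msg.endswith("became stale"):
            state = "stale"
        elif msg.endswith("completed"):
            state = "completed"
        else:
            continue
        events.append((m.group("ts"), m.group("brick"), state))
    return events


# The four log lines from this report, used as sample input.
sample = [
    "[2014-01-29 19:04:02.516289] I [master(/bricks/master_brick1):438:crawlwrap] _GMaster: crawl interval: 3 seconds",
    "[2014-01-29 19:04:06.903163] I [master(/bricks/master_brick5):587:checkpt_service] _GMaster: checkpoint now:1391002418.973412 completed",
    "[2014-01-29 19:04:07.96450] W [master(/bricks/master_brick1):580:checkpt_service] _GMaster: completion time 2014-01-29 19:04:02.236168 for checkpoint now:1391002418.973412 became stale",
    "[2014-01-29 19:04:39.211195] I [monitor(monitor):81:set_state] Monitor: new state: Stable",
]

for ts, brick, state in checkpoint_events(sample):
    print(ts, brick, state)
```

Run against the excerpt above, this reports a "completed" event from master_brick5 followed one second later by a "stale" event from master_brick1, which is the sequence this bug describes. To scan a real log, pass an open file object instead of sample.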
Verified with build: glusterfs-3.7.1-7.el6rhs.x86_64

Tried the following 3 cases:
a. Set the checkpoint, and kill the active brick before the checkpoint could be reached.
b. Set the checkpoint, and bring down the active brick node before the checkpoint could be reached.
c. Set the checkpoint, and let the checkpoint be reached so that checkpoint completed shows "YES".

In all of the above scenarios, the checkpoint eventually completed and status detail shows "YES". Did not observe the checkpoint becoming stale.

Moving the bug to verified state.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html