Bug 1059255 - dist-geo-rep : checkpoint doesn't reach because checkpoint became stale.
Summary: dist-geo-rep : checkpoint doesn't reach because checkpoint became stale.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: geo-replication
Version: 2.1
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Aravinda VK
QA Contact: Rahul Hinduja
URL:
Whiteboard: checkpoint
Depends On: 1064309
Blocks: 1202842 1223636
Reported: 2014-01-29 13:54 UTC by Vijaykumar Koppad
Modified: 2015-07-29 04:33 UTC (History)

Fixed In Version: glusterfs-3.7.0-2.el6rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-07-29 04:33:42 UTC
Embargoed:


Links
System: Red Hat Product Errata
ID: RHSA-2015:1495
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Red Hat Gluster Storage 3.1 update
Last Updated: 2015-07-29 08:26:26 UTC

Description Vijaykumar Koppad 2014-01-29 13:54:50 UTC
Description of problem: The checkpoint is never reported as reached because its completion time is marked stale.

Logs from geo-rep:

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
[2014-01-29 19:04:02.516289] I [master(/bricks/master_brick1):438:crawlwrap] _GMaster: crawl interval: 3 seconds
[2014-01-29 19:04:06.903163] I [master(/bricks/master_brick5):587:checkpt_service] _GMaster: checkpoint now:1391002418.973412 completed
[2014-01-29 19:04:07.96450] W [master(/bricks/master_brick1):580:checkpt_service] _GMaster: completion time 2014-01-29 19:04:02.236168 for checkpoint now:1391002418.973412 became stale
[2014-01-29 19:04:39.211195] I [monitor(monitor):81:set_state] Monitor: new state: Stable

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
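The excerpt above shows one brick worker logging the checkpoint as completed while another logs the same checkpoint as stale. A minimal sketch for spotting that pattern in a geo-rep log follows. The regular expressions are inferred only from the sample lines quoted in this report, and the function names (checkpoint_events, stale_labels) are hypothetical helpers, not part of glusterfs.

```python
import re
from collections import defaultdict

# Line layout inferred from the checkpt_service lines quoted in this report.
# This is an assumption; other glusterfs versions may log differently.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[master\((?P<brick>[^)]+)\):\d+:checkpt_service\] _GMaster: "
    r"(?P<msg>.*)$"
)
COMPLETED_RE = re.compile(r"^checkpoint (?P<label>\S+) completed$")
STALE_RE = re.compile(
    r"^completion time .* for checkpoint (?P<label>\S+) became stale$"
)


def checkpoint_events(lines):
    """Return {checkpoint label: [(brick, state), ...]} in log order."""
    events = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a checkpt_service line
        msg = m.group("msg")
        for state, pattern in (("completed", COMPLETED_RE), ("stale", STALE_RE)):
            hit = pattern.match(msg)
            if hit:
                events[hit.group("label")].append((m.group("brick"), state))
    return dict(events)


def stale_labels(events):
    """Return the checkpoint labels that at least one brick reported as stale."""
    return sorted(
        label
        for label, seen in events.items()
        if any(state == "stale" for _, state in seen)
    )


# The four log lines from this report, used as sample input.
SAMPLE = """\
[2014-01-29 19:04:02.516289] I [master(/bricks/master_brick1):438:crawlwrap] _GMaster: crawl interval: 3 seconds
[2014-01-29 19:04:06.903163] I [master(/bricks/master_brick5):587:checkpt_service] _GMaster: checkpoint now:1391002418.973412 completed
[2014-01-29 19:04:07.96450] W [master(/bricks/master_brick1):580:checkpt_service] _GMaster: completion time 2014-01-29 19:04:02.236168 for checkpoint now:1391002418.973412 became stale
[2014-01-29 19:04:39.211195] I [monitor(monitor):81:set_state] Monitor: new state: Stable
"""

events = checkpoint_events(SAMPLE.splitlines())
```

On the sample lines, this groups the "completed" entry from master_brick5 and the "became stale" entry from master_brick1 under the same checkpoint label, which is the symptom reported here.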



Version-Release number of selected component (if applicable): glusterfs-3.4.0.58rhs-1


How reproducible: Intermittent; does not happen every time.


Steps to Reproduce:
1. Create and start a geo-rep session between master and slave.
2. Create data on the master and set the checkpoint.
3. Check the log file for checkpoint logs.
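The steps above can be sketched as a small helper that builds the gluster CLI invocations. The volume and host names are hypothetical placeholders, and the subcommands (create push-pem, start, config checkpoint now, status detail) reflect the geo-replication CLI as generally documented; they are not taken from this report, so check them against the installed version before use.

```python
import subprocess


def georep_commands(master_vol, slave_host, slave_vol, checkpoint="now"):
    """Build argv lists for the reproduction steps (names are placeholders)."""
    slave = f"{slave_host}::{slave_vol}"
    base = ["gluster", "volume", "geo-replication", master_vol, slave]
    return [
        base + ["create", "push-pem"],             # step 1: create the session
        base + ["start"],                          # step 1: start the session
        # step 2: create data on the master mount before this command
        base + ["config", "checkpoint", checkpoint],
        base + ["status", "detail"],               # check checkpoint status
    ]


def run_all(commands, dry_run=True):
    """Print each command, or execute it when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)


# Hypothetical names; dry run only prints the commands.
run_all(georep_commands("mastervol", "slavehost", "slavevol"))
```

Data creation on the master (step 2) and log inspection (step 3) are left manual, since the report does not say which workload or log path was used.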

Actual results: The checkpoint is not reached; the log reports that the checkpoint completion time became stale.
 

Expected results: The checkpoint should complete once it has been set successfully.


Additional info:

Comment 5 Rahul Hinduja 2015-07-07 11:32:44 UTC
Verified with build: glusterfs-3.7.1-7.el6rhs.x86_64

Tried the following 3 cases:

a. Set the checkpoint, and kill the active brick before the checkpoint could be reached.
b. Set the checkpoint, and bring down the active brick node before the checkpoint could be reached.
c. Set the checkpoint, and let it be reached so that checkpoint completed shows "YES".

In all of the above scenarios, the checkpoint eventually completed and status detail shows "YES". Did not observe the checkpoint becoming stale. Moving the bug to verified state.

Comment 8 errata-xmlrpc 2015-07-29 04:33:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-1495.html

