Bug 1330450

Summary: [geo-rep]: schedule_georep.py doesn't touch the mount in every iteration
Product: [Community] GlusterFS
Reporter: Aravinda VK <avishwan>
Component: geo-replication
Assignee: Aravinda VK <avishwan>
Status: CLOSED CURRENTRELEASE
QA Contact:
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.7.11
CC: avishwan, bugs, chrisw, csaba, nlevinki, rhinduja, storage-qa-internal
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: glusterfs-3.7.12
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: 1328399
Environment:
Last Closed: 2016-06-28 12:15:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1328397, 1328399
Bug Blocks:

Description Aravinda VK 2016-04-26 09:15:47 UTC
+++ This bug was initially created as a clone of Bug #1328399 +++

+++ This bug was initially created as a clone of Bug #1328397 +++

Description of problem:
=======================

The script was run while no I/O was in progress. The checkpoint was never reached for a few of the active workers, so the script never completed. The cause is that the script does not touch the mount point in every iteration.

A modified script provided by the developers works:

[root@dhcp37-182 ~]# diff /usr/share/glusterfs/scripts/schedule_georep.py /tmp/schedule_georep.py
134d133
<              "--xlator-option=\"*dht.lookup-unhashed=off\"",
138d136
<              "--client-pid=-1",
142d139
< 
148c145
<     #cleanup(hostname, volname, mnt)
---
>     cleanup(hostname, volname, mnt)
416,422d412
<             if not summary["checkpoints_ok"]:
<                 # If Checkpoint is not complete after a iteration means brick
<                 # was down and came online now. SETATTR on mount is not
<                 # recorded, So again issue touch on mount root So that
<                 # Stime will increase and Checkpoint will complete.
<                 touch_mount_root(args.mastervol)
< 
432a423,428
>         else:
>             # If Checkpoint is not complete after a iteration means brick
>             # was down and came online now. SETATTR on mount is not
>             # recorded, So again issue touch on mount root So that
>             # Stime will increase and Checkpoint will complete.
>             touch_mount_root(args.mastervol)
[root@dhcp37-182 ~]# 
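
The behaviour the report asks for (touch the mount root again whenever an iteration ends with the checkpoint incomplete) can be sketched roughly as below. This is only an illustration, not the code in schedule_georep.py: the function name `wait_for_checkpoint`, the callables `get_summary` and `touch_mount_root`, and the `interval` and `max_iterations` parameters are all assumptions made for the example. Only the `summary["checkpoints_ok"]` key and the idea of touching the mount root come from the diff above.

```python
import time


def wait_for_checkpoint(get_summary, touch_mount_root, mastervol,
                        interval=0.0, max_iterations=10):
    """Poll until the checkpoint completes, touching the mount root
    after every iteration in which it has not.

    get_summary and touch_mount_root are hypothetical callables that
    stand in for the script's status query and its touch helper.
    Returns the number of touches issued; raises TimeoutError if the
    checkpoint is still incomplete after max_iterations polls.
    """
    touches = 0
    for _ in range(max_iterations):
        summary = get_summary()
        if summary["checkpoints_ok"]:
            return touches
        # Checkpoint not complete: a brick may have been down when the
        # previous touch was issued, so touch again on this iteration.
        touch_mount_root(mastervol)
        touches += 1
        time.sleep(interval)
    raise TimeoutError("checkpoint not reached")
```

With a status source that reports the checkpoint as incomplete twice and then complete, the sketch issues exactly two touches before returning.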

Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.7.9-1.el7rhgs.x86_64

How reproducible:
=================

1/1

Steps to Reproduce:
===================
1. Create data on master volume (6x2)
2. Create geo-rep session
3. Run the script

--- Additional comment from Vijay Bellur on 2016-04-19 06:33:45 EDT ---

REVIEW: http://review.gluster.org/14029 (geo-rep: Fix checkpoint issue in scheduler) posted (#1) for review on master by Aravinda VK (avishwan)

--- Additional comment from Vijay Bellur on 2016-04-20 06:59:08 EDT ---

REVIEW: http://review.gluster.org/14029 (geo-rep: Fix checkpoint issue in scheduler) posted (#2) for review on master by Aravinda VK (avishwan)

--- Additional comment from Vijay Bellur on 2016-04-22 03:05:27 EDT ---

REVIEW: http://review.gluster.org/14029 (geo-rep: Fix checkpoint issue in scheduler) posted (#3) for review on master by Aravinda VK (avishwan)

--- Additional comment from Vijay Bellur on 2016-04-26 05:14:32 EDT ---

COMMIT: http://review.gluster.org/14029 committed in master by Aravinda VK (avishwan) 
------
commit 8590c1cf3c27468177c425c920cab01f52b251e5
Author: Aravinda VK <avishwan>
Date:   Tue Apr 19 15:30:19 2016 +0530

    geo-rep: Fix checkpoint issue in scheduler
    
    If checkpoint is not met, Scheduler script should touch the
    Mount point so that SETATTR will get recorded in every brick
    Changelog. Script was not touching the mount point in each iteration.
    
    BUG: 1328399
    Change-Id: I2718a764fb3e550742c9dcd316724683561ddf18
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/14029
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Kotresh HR <khiremat>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
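
The commit message relies on a "touch" of the mount point producing a SETATTR. As a minimal sketch of what such a touch looks like in Python, assuming it is done by updating the directory's timestamps: `os.utime(path, None)` sets the access and modification times to the current time. The helper name `touch_path` is hypothetical, and the temporary directory merely stands in for a mounted volume. Whether the script's own `touch_mount_root` is implemented this way is not stated in this report.

```python
import os
import tempfile


def touch_path(path):
    """Set atime and mtime of path to the current time (like `touch`)
    and return the resulting modification time."""
    os.utime(path, None)
    return os.stat(path).st_mtime


# A temporary directory stands in for the mount root in this sketch.
with tempfile.TemporaryDirectory() as mnt:
    os.utime(mnt, (0, 0))            # force an old timestamp first
    old_mtime = os.stat(mnt).st_mtime
    touched_mtime = touch_path(mnt)  # timestamps move forward to "now"
```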

Comment 1 Vijay Bellur 2016-04-26 09:16:59 UTC
REVIEW: http://review.gluster.org/14071 (geo-rep: Fix checkpoint issue in scheduler) posted (#1) for review on release-3.7 by Aravinda VK (avishwan)

Comment 2 Vijay Bellur 2016-04-28 03:13:09 UTC
REVIEW: http://review.gluster.org/14071 (geo-rep: Fix checkpoint issue in scheduler) posted (#2) for review on release-3.7 by Aravinda VK (avishwan)

Comment 3 Vijay Bellur 2016-04-28 11:48:25 UTC
REVIEW: http://review.gluster.org/14071 (geo-rep: Fix checkpoint issue in scheduler) posted (#3) for review on release-3.7 by Aravinda VK (avishwan)

Comment 4 Vijay Bellur 2016-04-29 06:48:27 UTC
REVIEW: http://review.gluster.org/14071 (geo-rep: Fix checkpoint issue in scheduler) posted (#4) for review on release-3.7 by Aravinda VK (avishwan)

Comment 5 Vijay Bellur 2016-04-29 10:18:00 UTC
COMMIT: http://review.gluster.org/14071 committed in release-3.7 by Aravinda VK (avishwan) 
------
commit 6ca8c94bc614a2fade8aeb49340a59b1195e310c
Author: Aravinda VK <avishwan>
Date:   Tue Apr 19 15:30:19 2016 +0530

    geo-rep: Fix checkpoint issue in scheduler
    
    If checkpoint is not met, Scheduler script should touch the
    Mount point so that SETATTR will get recorded in every brick
    Changelog. Script was not touching the mount point in each iteration.
    
    BUG: 1330450
    Change-Id: I2718a764fb3e550742c9dcd316724683561ddf18
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/14029
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Kotresh HR <khiremat>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    (cherry picked from commit 8590c1cf3c27468177c425c920cab01f52b251e5)
    Reviewed-on: http://review.gluster.org/14071

Comment 6 Kaushal 2016-06-28 12:15:11 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.12, please open a new bug report.

glusterfs-3.7.12 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] https://www.gluster.org/pipermail/gluster-devel/2016-June/049918.html
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user