Bug 1328399 - [geo-rep]: schedule_georep.py doesn't touch the mount in every iteration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: geo-replication
Version: mainline
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Aravinda VK
QA Contact:
URL:
Whiteboard:
Depends On: 1328397
Blocks: 1330450
Reported: 2016-04-19 10:21 UTC by Aravinda VK
Modified: 2016-06-16 14:04 UTC

Fixed In Version: glusterfs-3.8rc2
Clone Of: 1328397
Environment:
Last Closed: 2016-06-16 14:04:01 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Aravinda VK 2016-04-19 10:21:19 UTC
+++ This bug was initially created as a clone of Bug #1328397 +++

Description of problem:
=======================

Ran the script while no I/O was in progress. The checkpoint was never reached for a few of the active workers, so the script never completed. The cause is that the script does not touch the mount point in every iteration.

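The modified script provided by the developer works. In the diff below, lines marked "<" come from the installed script and lines marked ">" come from the modified copy in /tmp: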

[root@dhcp37-182 ~]# diff /usr/share/glusterfs/scripts/schedule_georep.py /tmp/schedule_georep.py
134d133
<              "--xlator-option=\"*dht.lookup-unhashed=off\"",
138d136
<              "--client-pid=-1",
142d139
< 
148c145
<     #cleanup(hostname, volname, mnt)
---
>     cleanup(hostname, volname, mnt)
416,422d412
<             if not summary["checkpoints_ok"]:
<                 # If Checkpoint is not complete after a iteration means brick
<                 # was down and came online now. SETATTR on mount is not
<                 # recorded, So again issue touch on mount root So that
<                 # Stime will increase and Checkpoint will complete.
<                 touch_mount_root(args.mastervol)
< 
432a423,428
>         else:
>             # If Checkpoint is not complete after a iteration means brick
>             # was down and came online now. SETATTR on mount is not
>             # recorded, So again issue touch on mount root So that
>             # Stime will increase and Checkpoint will complete.
>             touch_mount_root(args.mastervol)
[root@dhcp37-182 ~]# 
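The intended behaviour (touch the mount root again in every iteration in which the checkpoint is still incomplete) can be sketched as a small polling loop. This is only an illustration, not the code from schedule_georep.py: the function name wait_for_checkpoint, its parameters, and the two callbacks are hypothetical stand-ins for the script's status check and its touch_mount_root(args.mastervol) call.

```python
import time
from typing import Callable


def wait_for_checkpoint(
    checkpoint_complete: Callable[[], bool],
    touch_mount_root: Callable[[], None],
    interval: float = 1.0,
    max_iterations: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until the checkpoint completes; return how many touches were issued.

    Both callbacks are hypothetical stand-ins (assumptions for this sketch):
    checkpoint_complete() reports whether every active worker has reached
    the checkpoint, and touch_mount_root() touches the master mount root.
    Raises TimeoutError if the checkpoint is still incomplete after
    max_iterations polls.
    """
    touches = 0
    for _ in range(max_iterations):
        if checkpoint_complete():
            return touches
        # Checkpoint not complete in this iteration: touch the mount root
        # again, every time, rather than only once before the loop.
        touch_mount_root()
        touches += 1
        sleep(interval)
    raise TimeoutError("checkpoint not reached")
```

With a status callback that reports the checkpoint as complete on the fourth poll, the loop issues three touches, one for each iteration that found the checkpoint incomplete.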

Version-Release number of selected component (if applicable):
==============================================================

glusterfs-3.7.9-1.el7rhgs.x86_64

How reproducible:
=================

1/1

Steps to Reproduce:
===================
1. Create data on master volume (6x2)
2. Create geo-rep session
3. Run the script

Comment 1 Vijay Bellur 2016-04-19 10:33:45 UTC
REVIEW: http://review.gluster.org/14029 (geo-rep: Fix checkpoint issue in scheduler) posted (#1) for review on master by Aravinda VK (avishwan)

Comment 2 Vijay Bellur 2016-04-20 10:59:08 UTC
REVIEW: http://review.gluster.org/14029 (geo-rep: Fix checkpoint issue in scheduler) posted (#2) for review on master by Aravinda VK (avishwan)

Comment 3 Vijay Bellur 2016-04-22 07:05:27 UTC
REVIEW: http://review.gluster.org/14029 (geo-rep: Fix checkpoint issue in scheduler) posted (#3) for review on master by Aravinda VK (avishwan)

Comment 4 Vijay Bellur 2016-04-26 09:14:32 UTC
COMMIT: http://review.gluster.org/14029 committed in master by Aravinda VK (avishwan) 
------
commit 8590c1cf3c27468177c425c920cab01f52b251e5
Author: Aravinda VK <avishwan>
Date:   Tue Apr 19 15:30:19 2016 +0530

    geo-rep: Fix checkpoint issue in scheduler
    
    If checkpoint is not met, Scheduler script should touch the
    Mount point so that SETATTR will get recorded in every brick
    Changelog. Script was not touching the mount point in each iteration.
    
    BUG: 1328399
    Change-Id: I2718a764fb3e550742c9dcd316724683561ddf18
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/14029
    Smoke: Gluster Build System <jenkins.com>
    Reviewed-by: Kotresh HR <khiremat>
    CentOS-regression: Gluster Build System <jenkins.com>
    NetBSD-regression: NetBSD Build System <jenkins.org>
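For readers unfamiliar with what "touching" a mount point involves, here is a minimal sketch using Python's standard os.utime call. The function name touch_directory is hypothetical and this is not the script's touch_mount_root implementation. Per the commit message above, such a timestamp update on a Gluster mount root is what gets recorded as a SETATTR in each brick's changelog; on an ordinary local directory it only updates the access and modification times.

```python
import os


def touch_directory(path: str) -> None:
    """Set the access and modification times of path to the current time.

    Equivalent in effect to running `touch path`. Passing None as the
    times argument tells os.utime to use the current time.
    """
    os.utime(path, None)
```

For this to have the effect described in the commit message, the path passed in would have to be the root of a mounted master volume, which is an assumption this sketch does not set up.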

Comment 5 Niels de Vos 2016-06-16 14:04:01 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed in glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]. Packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user

