Bug 1359612

Summary: [RFE] Geo-replication Logging Improvements
Product: GlusterFS (Community)
Component: geo-replication
Reporter: Aravinda VK <avishwan>
Assignee: Aravinda VK <avishwan>
Status: CLOSED CURRENTRELEASE
Severity: unspecified
Priority: unspecified
Version: mainline
CC: bugs
Hardware: Unspecified
OS: Unspecified
Fixed In Version: glusterfs-3.10.0
Type: Bug
Last Closed: 2017-03-06 17:21:32 UTC
Bug Blocks: 1359613, 1387990

Description Aravinda VK 2016-07-25 06:50:27 UTC
Description of problem:

Several improvements can be made to the Geo-replication logs to make the current status and other details easier to understand.

- Session Creation/Recreation - will help to understand when a session was first created or recreated

- Log the push-pem status (since the hook script runs in the background)

- On Start,
    . Monitor Start Message
    . Number of local bricks
    . Worker Start Message with the respective Slave node to which it is trying to connect
    . Agent Start Message

- Worker State Changes
    . Change in connected Slave Node
    . Change in Status (Initializing/Active/Passive/Faulty/Stopped/Paused/Resumed)
    . Change in Crawl (Hybrid/History/Changelog)

- Checkpoint Set and Completion Time

- Sync time logs:
  On Rsync failure, a summary of errors (CHANGELOG start-end name and error)

- When a Config value is changed, log CONFIG_NAME, OLD_VALUE and NEW_VALUE

- On worker restart,
   Log the last worker's and agent's uptime (the monitor already knows this)

- Performance metrics (optional, only if enabled):
    - After each Changelog batch is processed:
        - Number of Changelogs processed, and the START and END changelogs
        - Number of ENTRY, DATA and META operations
        - Time taken to parse Changelogs and to perform Entry, Meta and Sync operations
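The event-plus-details messages proposed above could be emitted in a structured, key=value style so they remain grep-able. The following is a minimal Python sketch of that idea, not gsyncd's actual code; the `fmt_event` helper, the logger name and all field names are hypothetical.

```python
import logging

def fmt_event(msg, **kwargs):
    """Render an event name plus sorted key=value pairs as a single
    tab-separated log line (hypothetical helper for illustration)."""
    fields = "\t".join("%s=%s" % (k, kwargs[k]) for k in sorted(kwargs))
    return "%s\t%s" % (msg, fields) if fields else msg

logging.basicConfig(level=logging.INFO,
                    format="[%(asctime)s] %(levelname)s %(message)s")
log = logging.getLogger("geo-rep")

# Worker start: brick and the Slave node it is trying to connect to
log.info(fmt_event("Starting worker",
                   brick="/bricks/b1", slave_node="slave1.example.com"))

# Worker status change: old and new state on one line
log.info(fmt_event("Worker Status Change",
                   prev_status="Initializing", status="Active"))

# Config change: name, old value and new value
log.info(fmt_event("Config Set",
                   config="sync-jobs", old_value="3", new_value="5"))
```

Sorting the keys keeps the field order stable across log lines, which makes the output easier to diff and parse.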

Comment 1 Worker Ant 2016-10-20 09:42:39 UTC
REVIEW: http://review.gluster.org/15684 (geo-rep: Logging improvements) posted (#1) for review on master by Aravinda VK (avishwan)

Comment 2 Worker Ant 2016-10-24 07:01:53 UTC
COMMIT: http://review.gluster.org/15684 committed in master by Aravinda VK (avishwan) 
------
commit cdc30ed8eacb6772e0dabb863ef51cef794d60dd
Author: Aravinda VK <avishwan>
Date:   Thu Oct 20 15:05:38 2016 +0530

    geo-rep: Logging improvements
    
    - Redundant log messages removed.
    - Worker and connected slave node details added in "starting worker" log
    - Added log for Monitor state change
    - Added log for Worker status change(Initializing/Active/Passive/Faulty)
    - Added log for Crawl status Change
    - Added log for config set and reset
    - Added log for checkpoint set, reset and completion
    
    BUG: 1359612
    Change-Id: Icc7173ff3c93de4b862bdb1a61760db7eaf14271
    Signed-off-by: Aravinda VK <avishwan>
    Reviewed-on: http://review.gluster.org/15684
    Smoke: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    Reviewed-by: Kotresh HR <khiremat>

Comment 3 Shyamsundar 2017-03-06 17:21:32 UTC
This bug is being closed because a release that should address the reported issue is now available. If the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/