Bug 1382236

Summary: glusterfind pre session hangs indefinitely
Product: [Community] GlusterFS Reporter: Milind Changire <mchangir>
Component: glusterfind Assignee: Milind Changire <mchangir>
Status: CLOSED CURRENTRELEASE QA Contact: bugs <bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: mainline CC: amukherj, ashah, avishwan, bugs, khiremat, rhs-bugs, rnalakka, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: glusterfs-3.10.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1379790 Environment:
Last Closed: 2017-03-06 17:29:11 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1379790    

Comment 1 Milind Changire 2016-10-12 06:31:54 UTC
Problem:
The glusterfind pre command appears to hang for days when listing millions of files, while changelog.py processes consume 100% CPU.

RCA:
changelog.py processes do consume 100% CPU, because changelog parsing involves CPU-intensive SQLite operations.
Part of the parsing time was also due to a bug that caused previously parsed changelogs to be parsed again.
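
To illustrate how repeated parsing can inflate run time, here is a minimal sketch. It is not the code from changelog.py: the function names, the `batches` input and the `parse` callback are all hypothetical. It only contrasts a loop that keeps appending to one list and re-walks it on every scan with a loop that handles each newly returned batch once.

```python
# Hypothetical illustration only; this is not the glusterfind implementation.

def parse_accumulating(batches, parse):
    """Pattern that re-parses: the list grows and is walked in full each time."""
    changes = []
    for batch in batches:
        changes += batch          # earlier entries stay in the list
        for change in changes:    # ...and get parsed again on every scan
            parse(change)


def parse_per_batch(batches, parse):
    """Each changelog is parsed exactly once, when its batch arrives."""
    for batch in batches:
        for change in batch:
            parse(change)


def count_parses(strategy, batches):
    """Return how many times the parse callback is invoked by a strategy."""
    calls = []
    strategy(batches, calls.append)
    return len(calls)


if __name__ == "__main__":
    # Assumed sample data: three scans returning four changelogs in total.
    sample = [["CHANGELOG.1", "CHANGELOG.2"], ["CHANGELOG.3"], ["CHANGELOG.4"]]
    print("accumulating:", count_parses(parse_accumulating, sample))
    print("per batch:   ", count_parses(parse_per_batch, sample))
```

With many scans the accumulating pattern grows quadratically in the number of changelogs, which is the kind of cost that would be noticeable on volumes with millions of files.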

Comment 2 Worker Ant 2016-10-12 06:33:28 UTC
REVIEW: http://review.gluster.org/15609 (tools/glusterfind: kill remote processes and separate run-time directories) posted (#7) for review on master by Milind Changire (mchangir)

Comment 3 Worker Ant 2016-10-13 05:38:25 UTC
REVIEW: http://review.gluster.org/15609 (tools/glusterfind: kill remote processes and separate run-time directories) posted (#8) for review on master by Milind Changire (mchangir)

Comment 4 Worker Ant 2016-10-14 10:29:22 UTC
REVIEW: http://review.gluster.org/15609 (tools/glusterfind: kill remote processes and separate run-time directories) posted (#9) for review on master by Milind Changire (mchangir)

Comment 5 Worker Ant 2016-10-17 06:47:07 UTC
REVIEW: http://review.gluster.org/15609 (tools/glusterfind: kill remote processes and separate run-time directories) posted (#10) for review on master by Milind Changire (mchangir)

Comment 6 Worker Ant 2016-10-25 12:00:59 UTC
COMMIT: http://review.gluster.org/15609 committed in master by Aravinda VK (avishwan) 
------
commit feea851fad4f89b48bfe89fe3b75250cc7bd6501
Author: Milind Changire <mchangir>
Date:   Mon Oct 17 12:16:36 2016 +0530

    tools/glusterfind: kill remote processes and separate run-time directories
    
    Problem #1:
    Hitting CTRL+C leaves stale processes on remote nodes if glusterfind pre
    has been initiated.
    
    Solution #1:
    Adding "-t -t" to ssh command-line forces pseudo-terminal to be assigned
    to remote process. When local process receives Keyboard Interrupt,
    SIGHUP is immediately conveyed to the remote terminal causing remote
    changelog.py process to terminate immediately.
    
    Problem #2:
    Concurrent glusterfind pre runs are not possible on the same glusterfind
    session in case of a runaway process.
    
    Solution #2:
    glusterfind pre runs now add random directory name to the working
    directory to store and manage temporary database and changelog
    processing.
    If KeyboardInterrupt is received, the function call
    run_cmd_nodes("cleanup", args, tmpfilename=gtmpfilename)
    cleans up the remote run specific directory.
    
    Patch 7571380 (cli/xml: Fix wrong XML format in volume get command)
    broke "gluster volume get <vol> changelog.rollover-time --xml".
    This is now fixed in utils.py::get_changelog_rollover_time()
    
    Fixed spurious trailing space getting written if second path is empty in
    main.py::write_output()
    Fixed repetitive changelog processing in changelog.py::get_changes()
    
    Change-Id: Ia8d96e2cd47bf2a64416bece312e67631a1dbf29
    BUG: 1382236
    Signed-off-by: Milind Changire <mchangir>
    Reviewed-on: http://review.gluster.org/15609
    Smoke: Gluster Build System <jenkins.org>
    CentOS-regression: Gluster Build System <jenkins.org>
    NetBSD-regression: NetBSD Build System <jenkins.org>
    Reviewed-by: Aravinda VK <avishwan>
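
The two solutions described in the commit message can be sketched roughly as follows. This is an assumption-laden illustration, not the patch itself: `build_ssh_command` and `run_specific_dir` are hypothetical helper names, and the host, remote command and directory layout are made up. The only details taken from the commit message are the doubled "-t" option to ssh and the idea of a randomly named per-run working directory that is cleaned up on interrupt.

```python
# Hypothetical sketch; helper names and paths are not from glusterfind.
import os
import shutil
import tempfile
from contextlib import contextmanager


def build_ssh_command(host, remote_command, identity_file=None):
    """Build an ssh argv that forces pseudo-terminal allocation.

    Passing -t twice makes ssh allocate a tty even when the local side
    has none, so the remote process is attached to a terminal that goes
    away when the local process is interrupted.
    """
    cmd = ["ssh", "-t", "-t"]
    if identity_file:
        cmd += ["-i", identity_file]
    cmd.append(host)
    cmd += list(remote_command)
    return cmd


@contextmanager
def run_specific_dir(session_workdir):
    """Create a randomly named directory for one run and always remove it.

    The finally clause also runs on KeyboardInterrupt, so an interrupted
    run does not leave its temporary database or changelog data behind.
    """
    os.makedirs(session_workdir, exist_ok=True)
    run_dir = tempfile.mkdtemp(prefix="run-", dir=session_workdir)
    try:
        yield run_dir
    finally:
        shutil.rmtree(run_dir, ignore_errors=True)


if __name__ == "__main__":
    # Assumed example values for illustration.
    print(build_ssh_command("node1", ["python", "changelog.py"]))
    base = tempfile.mkdtemp()
    with run_specific_dir(base) as workdir:
        print("working in", workdir)
    shutil.rmtree(base, ignore_errors=True)
```

Because every run gets its own directory, two runs on the same session do not share temporary state, which is what allows a new run to proceed even if an earlier one was left behind.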

Comment 7 Shyamsundar 2017-03-06 17:29:11 UTC
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.10.0, please open a new bug report.

glusterfs-3.10.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://lists.gluster.org/pipermail/gluster-users/2017-February/030119.html
[2] https://www.gluster.org/pipermail/gluster-users/