Bug 1046022
Summary: | "gluster volume heal <vol-name> info", doesn't responds till self-heal is complete on the volume | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | SATHEESARAN <sasundar> |
Component: | glusterfs | Assignee: | Pranith Kumar K <pkarampu> |
Status: | CLOSED ERRATA | QA Contact: | spandura |
Severity: | high | Docs Contact: | Anjana Suparna Sriram <asriram> |
Priority: | high | ||
Version: | 2.1 | CC: | grajaiya, pkarampu, sharne, spandura, vagarwal, vbellur |
Target Milestone: | --- | Keywords: | ZStream |
Target Release: | RHGS 2.1.2 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | glusterfs-3.4.0.57rhs | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2014-02-25 08:10:06 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
SATHEESARAN
2013-12-23 08:57:34 UTC
Comment 1, Pranith Kumar K:

As a way to fix false positives in "heal info", we started taking locks to figure out whether files need self-heal or not. The summary is a bit inaccurate: the locks are taken per file. Say files a, b, and c need self-heal. For each file, both "heal info" (to find out whether the file needs self-heal) and the self-heal daemon (to do the actual heal) try to take the lock. If the self-heal daemon wins the lock on every file before "heal info" does, then "heal info" appears not to respond until the heal on the whole volume is complete. Note that there are still false positives for metadata and entry self-heal.

Comment:

Please add DocText for this Known Issue.

Comment:

(In reply to Pranith Kumar K from comment #1)
> As a way to fix false +ves in heal info, we started taking locks to figure
> out whether files need self-heal or not. [...]

Pranith, I hit a scenario where "gluster volume heal <vol-name> info" takes more than 50 minutes to respond, which I think is far too high. Note the timestamps in the prompts below: the command was triggered at 20:08:57 and the prompt only returned at 20:59:50.

[Wed Jan 8 20:08:57 UTC 2014 root.37.187:~ ] # gluster volume heal dr-imgstore info
Brick rhss1.lab.eng.blr.redhat.com:/rhs/brick1/drdir1/
/c33c0d51-e8f5-409d-9a52-fea048db0645/images/94b388b5-2906-43e8-b372-bd6bfce099f6/ff2fffbf-a14f-4727-9bea-8afa672e9bc8
Number of entries: 1

Brick rhss2.lab.eng.blr.redhat.com:/rhs/brick1/drdir1/
/c33c0d51-e8f5-409d-9a52-fea048db0645/images/94b388b5-2906-43e8-b372-bd6bfce099f6/ff2fffbf-a14f-4727-9bea-8afa672e9bc8
Number of entries: 1

Brick rhss1.lab.eng.blr.redhat.com:/rhs/brick2/drdir2/
Number of entries: 0

Brick rhss2.lab.eng.blr.redhat.com:/rhs/brick2/drdir2/
Number of entries: 0

Brick rhss3.lab.eng.blr.redhat.com:/rhs/brick1/addbrick1/
Number of entries: 0

Brick rhss4.lab.eng.blr.redhat.com:/rhs/brick1/addbrick1/
Number of entries: 0

Brick rhss3.lab.eng.blr.redhat.com:/rhs/brick1/addbrick2/
Number of entries: 0

Brick rhss4.lab.eng.blr.redhat.com:/rhs/brick1/addbrick2/
Number of entries: 0

Brick rhss3.lab.eng.blr.redhat.com:/rhs/brick1/addbrick3/
Number of entries: 0

Brick rhss4.lab.eng.blr.redhat.com:/rhs/brick1/addbrick3/
Number of entries: 0

Brick rhss3.lab.eng.blr.redhat.com:/rhs/brick1/addbrick4/
Number of entries: 0

Brick rhss4.lab.eng.blr.redhat.com:/rhs/brick1/addbrick4/
Number of entries: 0

Brick rhss3.lab.eng.blr.redhat.com:/rhs/brick1/addbrick5/
Number of entries: 0

Brick rhss4.lab.eng.blr.redhat.com:/rhs/brick1/addbrick5/
Number of entries: 0

<<<<############### long hang

[Wed Jan 8 20:59:50 UTC 2014 root.37.187:~ ] #

Comment:

This may have an impact on documentation. Please check the relevant sections in the Administration Guide.
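The 50-minute delay above is consistent with the per-file locking described in comment 1: "heal info" blocks on each file's lock while the self-heal daemon holds it. A rough illustration in plain shell using flock(1) follows; the file names and sleep durations are made up, and this is not GlusterFS code:

```
# Three files that "need heal". A stand-in self-heal daemon grabs a
# per-file lock on each; a stand-in 'heal info' then has to wait for
# every lock in turn before it can report anything.
touch a b c
for f in a b c; do
    flock "$f" sleep 10 &      # "self-heal daemon" busy healing $f
done
sleep 1                        # let the daemon win every lock first
for f in a b c; do
    flock "$f" true            # "heal info" blocks until $f is released
done
echo "heal info could answer only now"   # ~10 seconds later, not immediately
```

The longer the daemon's backlog of files to heal, the longer the second loop, and hence "heal info", takes to return.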
Comment:

This bug was introduced after bigbend and fixed before corbett, so there is no need to add any doc text. Please set the doc-text flag to '-'.

Comment:

Tested with glusterfs-3.4.0.57rhs-1.el6rhs: "gluster volume heal <vol-name>" does not hang for a long time but returns immediately.

[Tue Jan 14 15:49:30 UTC 2014 root.37.187:~ ] # gluster volume heal drvol
Launching heal operation to perform index self heal on volume drvol has been successful
Use heal info commands to check status

[Tue Jan 14 15:50:51 UTC 2014 root.37.187:~ ] # gluster volume heal drvol info
Brick rhss1.lab.eng.blr.redhat.com:/rhs/brick1/drdir1/
/0218725d-3846-4c6d-b9d7-c05bd55c031b/images/7e4d8003-9248-4e82-8c41-9c4093de1623/b2dc01a7-4833-41c8-9e0f-84102f97b80d - Possibly undergoing heal
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/1217280d-e8d5-4f79-826f-64514e6f5c56
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/e1503573-d342-442b-902d-f5cb55e48edc
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/fe34200e-3614-4fbf-ab46-62ba6e39b20e
Number of entries: 4

Brick rhss2.lab.eng.blr.redhat.com:/rhs/brick1/drdir1/
/0218725d-3846-4c6d-b9d7-c05bd55c031b/images/7e4d8003-9248-4e82-8c41-9c4093de1623/b2dc01a7-4833-41c8-9e0f-84102f97b80d - Possibly undergoing heal
Number of entries: 1

Brick rhss1.lab.eng.blr.redhat.com:/rhs/brick2/drdir2/
/0218725d-3846-4c6d-b9d7-c05bd55c031b/images/6d845676-0267-4b44-9856-712feda16035/27f64b50-b1c1-4ce7-a3a6-08523efa1dfc - Possibly undergoing heal
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/1217280d-e8d5-4f79-826f-64514e6f5c56
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/e1503573-d342-442b-902d-f5cb55e48edc
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/fe34200e-3614-4fbf-ab46-62ba6e39b20e
Number of entries: 4

Brick rhss2.lab.eng.blr.redhat.com:/rhs/brick2/drdir2/
/0218725d-3846-4c6d-b9d7-c05bd55c031b/images/6d845676-0267-4b44-9856-712feda16035/27f64b50-b1c1-4ce7-a3a6-08523efa1dfc - Possibly undergoing heal
Number of entries: 1

Brick rhss1.lab.eng.blr.redhat.com:/rhs/brick3/drdir3/
Number of entries: 0

Brick rhss2.lab.eng.blr.redhat.com:/rhs/brick3/drdir3/
Number of entries: 0

Brick rhss1.lab.eng.blr.redhat.com:/rhs/brick4/drdir4/
/0218725d-3846-4c6d-b9d7-c05bd55c031b/images/de44188d-1ed1-40cc-9373-cca801b23d6d/2f8fafc7-d755-4b5a-9cfe-fb0ce83b54d8 - Possibly undergoing heal
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/1217280d-e8d5-4f79-826f-64514e6f5c56
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/e1503573-d342-442b-902d-f5cb55e48edc
/0218725d-3846-4c6d-b9d7-c05bd55c031b/master/vms/fe34200e-3614-4fbf-ab46-62ba6e39b20e
Number of entries: 4

Brick rhss2.lab.eng.blr.redhat.com:/rhs/brick4/drdir4/
/0218725d-3846-4c6d-b9d7-c05bd55c031b/images/de44188d-1ed1-40cc-9373-cca801b23d6d/2f8fafc7-d755-4b5a-9cfe-fb0ce83b54d8 - Possibly undergoing heal
Number of entries: 1

Brick rhss3.lab.eng.blr.redhat.com:/rhs/brick1/add-dir1/
Number of entries: 0

Brick rhss4.lab.eng.blr.redhat.com:/rhs/brick1/add-dir1/
Number of entries: 0

A second run of "gluster volume heal drvol info" at 15:51:12 UTC returned identical output.
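To repeat this check without re-running the command by hand, a hypothetical polling loop along these lines (volume name taken from the output above; the loop itself is not part of the product) could wait until every brick reports zero entries:

```
# Poll 'heal info' until no brick reports a non-zero "Number of entries";
# any non-zero count means entries are still pending or undergoing heal.
while gluster volume heal drvol info | grep -q 'Number of entries: [1-9]'; do
    sleep 10
done
echo "no entries pending heal on drvol"
```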
Comment:

As the problem related to this bug is solved, this bug could be closed. However, "gluster volume heal <vol-name> info" reports some entries with the message "Possibly undergoing heal" and other entries without it. What is the significance of the entries that carry this message? In that case, this behaviour has to be documented.

Comment:

Cancelling need_info, as the requires_doc_text flag is set to '-' based on comment 6.

Comment:

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2014-0208.html