Bug 1290653

Summary: | [GlusterD]: GlusterD log is filled with error messages - "Failed to aggregate response from node/brick" | |
---|---|---|---
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Byreddy <bsrirama>
Component: | glusterd | Assignee: | Atin Mukherjee <amukherj>
Status: | CLOSED ERRATA | QA Contact: | Byreddy <bsrirama>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | rhgs-3.1 | CC: | amukherj, asrivast, bsrirama, nicolas, nlevinki, rhinduja, sasundar, vbellur
Target Milestone: | --- | Keywords: | ZStream
Target Release: | RHGS 3.1.3 | |
Hardware: | x86_64 | |
OS: | Linux | |
Whiteboard: | glusterd | |
Fixed In Version: | glusterfs-3.7.9-1 | Doc Type: | Bug Fix
Doc Text: | | Story Points: | ---
Clone Of: | | |
Clones: | 1290734 (view as bug list) | Environment: |
Last Closed: | 2016-06-23 04:59:02 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | | |
Bug Blocks: | 1268895, 1290734, 1299184, 1310999 | |
Description

Byreddy 2015-12-11 04:40:26 UTC

Atin Mukherjee (comment #2):

I don't think this has anything to do with the RHEVM setup. I remember seeing this log in another setup, and on further analysis I found that the logging on this code path is inadequate to determine the actual reason for the failure. We need to improve the logging here; you can expect a patch upstream soon. However, if I set up a two-node cluster, create a volume, and run volume status, I don't see this log.

Do you have a reproducer for this?

I have seen this issue in RHGS 3.1.1 (glusterfs-3.7.1-16.el7rhgs) too.

(In reply to Atin Mukherjee from comment #2)
> I don't think this has anything to do with the RHEVM setup. I remember
> seeing this log in another setup, and on further analysis I found that the
> logging on this code path is inadequate to determine the actual reason for
> the failure. We need to improve the logging here; you can expect a patch
> upstream soon. However, if I set up a two-node cluster, create a volume,
> and run volume status, I don't see this log.
>
> Do you have a reproducer for this?

Atin,

You are right, this is not happening with RHEVM only; the issue can be reproduced consistently with the steps below (a scripted version is sketched at the end of this report):

1. Set up a two-node cluster.
2. Create a sample distributed volume using both nodes and start it.
3. Issue the "gluster volume status all tasks" command on one of the nodes.
4. Check the glusterd logs.

Atin Mukherjee:

An upstream patch, http://review.gluster.org/#/c/12950/, has been posted for review.

Nicolas Ecarnot (comment #6):

Hi,

I'm witnessing the same repeated "Failed to aggregate response from node/brick" messages on replica-3 CentOS 7.2 nodes with gluster 3.7.6-1, approximately every 10 seconds.

(In reply to Nicolas Ecarnot from comment #6)
> I'm witnessing the same repeated "Failed to aggregate response from
> node/brick" messages on replica-3 CentOS 7.2 nodes with gluster 3.7.6-1,
> approximately every 10 seconds.

Nicolas, FYI: a patch was already sent to the upstream master branch and is in the "needs-code-review" state. That change is tracked by bug https://bugzilla.redhat.com/show_bug.cgi?id=1290734, and the issue is expected to be fixed in glusterfs-3.7.7. This bug tracks the issue for the product "Red Hat Gluster Storage".

Thanks

Looks good to me.

The fix is now available in the rhgs-3.1.3 branch, hence moving the state to Modified.

Byreddy:

Verified this bug using the build glusterfs-3.7.9-1: checked the glusterd log after executing "gluster volume status all tasks", and no error message like "Failed to aggregate response from node/brick" appears in the log (a minimal check is sketched at the end of this report). Moving to the Verified state with the above details.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1240
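For reference, the reproducer described above can be expressed as a short shell session. This is a minimal sketch rather than the exact commands from the report: the hostnames (node1, node2), the volume name (testvol), and the brick paths are hypothetical placeholders, and the final grep assumes glusterd logs to its default location.

```shell
# On node1: form the two-node trusted pool (assumes node2 is resolvable)
gluster peer probe node2

# Create and start a plain distributed volume with one brick per node
# (brick paths are placeholders)
gluster volume create testvol node1:/bricks/brick1 node2:/bricks/brick1
gluster volume start testvol

# Trigger the code path that produced the error
gluster volume status all tasks

# Look for the error in the glusterd log (default log location assumed)
grep "Failed to aggregate response from node/brick" \
    /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
```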
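Similarly, the verification on glusterfs-3.7.9-1 can be sketched as a quick check. This is an illustration of the verification logic, not the exact QA procedure; it again assumes the default glusterd log path and that the fixed build is already installed.

```shell
# Confirm the fixed build is installed
rpm -q glusterfs-server    # expect glusterfs-server-3.7.9-1 or later

# Re-run the command that used to trigger the error
gluster volume status all tasks

# The error message should no longer appear in the glusterd log
if grep -q "Failed to aggregate response from node/brick" \
       /var/log/glusterfs/etc-glusterfs-glusterd.vol.log; then
    echo "FAIL: aggregate-response errors still present"
else
    echo "PASS: no aggregate-response errors logged"
fi
```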