Bug 1384316
Summary: | [Eventing]: Events not seen when command is triggered from one of the peer nodes | |||
---|---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Sweta Anandpara <sanandpa> | |
Component: | glusterfs | Assignee: | Aravinda VK <avishwan> | |
Status: | CLOSED ERRATA | QA Contact: | Sweta Anandpara <sanandpa> | |
Severity: | high | Docs Contact: | ||
Priority: | unspecified | |||
Version: | rhgs-3.2 | CC: | amukherj, avishwan, rhinduja, vbellur | |
Target Milestone: | --- | |||
Target Release: | RHGS 3.2.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | glusterfs-3.8.4-6 | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1388862 | Environment: | |
Last Closed: | 2017-03-23 06:09:39 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1351528, 1388862, 1399482 |
Description
Sweta Anandpara
2016-10-13 05:55:40 UTC
Added the debuginfo package, and Atin figured out that the event IS actually being sent. Did a glustereventsd reload on the affected node N4, and started receiving events.

node-reload is one of the programs called when we do a webhook-add, which in turn does a glustereventsd reload. For some reason, when I did a webhook-add in this setup, the glustereventsd reload must have failed. Just a hypothesis as of now.

Will create a new webhook and add it in this same setup. Will observe the success/failure/errors seen while doing so, and will update. Until then, anyone seeing a similar issue can use the workaround of running 'service glustereventsd reload' on the impacted node, after which the cluster and its events should work as expected.

Deleted the said webhook, and tried to add the same webhook to the cluster again. That did show an exception, where it failed to run 'gluster system:: execute eventsapi.py node-reload'. It fails on the same node N4 every time, and I am unable to figure out why. It works on all the other nodes of the cluster.

[root@dhcp35-101 yum.repos.d]# gluster-eventsapi webhook-del http://10.70.35.109:9000/listen
Traceback (most recent call last):
  File "/usr/sbin/gluster-eventsapi", line 459, in <module>
    runcli()
  File "/usr/lib/python2.6/site-packages/gluster/cliutils/cliutils.py", line 212, in runcli
    cls.run(args)
  File "/usr/sbin/gluster-eventsapi", line 274, in run
    sync_to_peers()
  File "/usr/sbin/gluster-eventsapi", line 129, in sync_to_peers
    out = execute_in_peers("node-reload")
  File "/usr/lib/python2.6/site-packages/gluster/cliutils/cliutils.py", line 125, in execute_in_peers
    raise GlusterCmdException((rc, out, err, " ".join(cmd)))
gluster.cliutils.cliutils.GlusterCmdException: (1, '', 'Commit failed on 10.70.35.104. Error: Unable to end. Error : Success\n', 'gluster system:: execute eventsapi.py node-reload')

[root@dhcp35-101 yum.repos.d]# gluster-eventsapi status
Webhooks: None
+-----------------------------------+-------------+-----------------------+
| NODE                              | NODE STATUS | GLUSTEREVENTSD STATUS |
+-----------------------------------+-------------+-----------------------+
| 10.70.35.100                      | UP          | UP                    |
| 10.70.35.104                      | UP          | UP                    |
| dhcp35-115.lab.eng.blr.redhat.com | UP          | UP                    |
| localhost                         | UP          | UP                    |
+-----------------------------------+-------------+-----------------------+

[root@dhcp35-101 yum.repos.d]# gluster-eventsapi webhook-test http://10.70.35.109:9000/listen
+-----------------------------------+-------------+----------------+
| NODE                              | NODE STATUS | WEBHOOK STATUS |
+-----------------------------------+-------------+----------------+
| 10.70.35.100                      | UP          | OK             |
| 10.70.35.104                      | UP          | OK             |
| dhcp35-115.lab.eng.blr.redhat.com | UP          | OK             |
| localhost                         | UP          | OK             |
+-----------------------------------+-------------+----------------+

[root@dhcp35-101 yum.repos.d]# gluster-eventsapi webhook-add http://10.70.35.109:9000/listen
Traceback (most recent call last):
  File "/usr/sbin/gluster-eventsapi", line 459, in <module>
    runcli()
  File "/usr/lib/python2.6/site-packages/gluster/cliutils/cliutils.py", line 212, in runcli
    cls.run(args)
  File "/usr/sbin/gluster-eventsapi", line 232, in run
    sync_to_peers()
  File "/usr/sbin/gluster-eventsapi", line 129, in sync_to_peers
    out = execute_in_peers("node-reload")
  File "/usr/lib/python2.6/site-packages/gluster/cliutils/cliutils.py", line 125, in execute_in_peers
    raise GlusterCmdException((rc, out, err, " ".join(cmd)))
gluster.cliutils.cliutils.GlusterCmdException: (1, '', 'Commit failed on 10.70.35.104. Error: Unable to end. Error : Success\n', 'gluster system:: execute eventsapi.py node-reload')

[root@dhcp35-101 yum.repos.d]# gluster-eventsapi status
Webhooks: http://10.70.35.109:9000/listen
+-----------------------------------+-------------+-----------------------+
| NODE                              | NODE STATUS | GLUSTEREVENTSD STATUS |
+-----------------------------------+-------------+-----------------------+
| 10.70.35.100                      | UP          | UP                    |
| 10.70.35.104                      | UP          | UP                    |
| dhcp35-115.lab.eng.blr.redhat.com | UP          | UP                    |
| localhost                         | UP          | UP                    |
+-----------------------------------+-------------+-----------------------+

This is similar to BZ 1379963. `glustereventsd` on one node is not reloaded, so it does not know about the newly added webhook.

Upstream patch sent to auto-reload the webhooks configuration if the file changes: http://review.gluster.org/15731

Upstream patches:
(master) http://review.gluster.org/15731
(3.9) http://review.gluster.org/15963

Downstream patch: https://code.engineering.redhat.com/gerrit/91515

Will not be able to verify this until the selinux issue with events is fixed (BZ 1379963).

Tested this on the glusterfs-3.8.4-13 build with selinux-policy-3.7.19-292.el6_8.3.

BZ 1419869 talks about a new AVC seen, and with the workaround mentioned there, no traceback is seen. Commands executed from any of the peer nodes result in events being seen on the registered webhook.

Moving this BZ to verified in 3.2.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
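
For readers wondering what "auto reload webhooks configuration if file changes" could look like, here is a minimal sketch of the general idea. It is not the code from the upstream patch. It assumes the webhooks are stored in a JSON file mapping URL to token, and the class name `WebhookConfig` and method `reload_if_changed` are hypothetical names chosen for illustration. A daemon that checks the file's modification time before publishing each event would no longer depend on a successful node-reload from a peer.

```python
import json
import os


class WebhookConfig:
    """Hypothetical holder that re-reads a JSON webhooks file when it changes."""

    def __init__(self, path):
        self.path = path      # assumed location of the webhooks file
        self._mtime = None    # mtime (ns) of the last version we loaded
        self.webhooks = {}    # assumed layout: {url: token}

    def reload_if_changed(self):
        """Reload the file if its mtime differs from the last load.

        Returns True when the in-memory webhooks were refreshed.
        """
        try:
            mtime = os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            # A missing file is treated as "no webhooks configured".
            changed = self._mtime is not None
            self._mtime = None
            self.webhooks = {}
            return changed
        if mtime == self._mtime:
            return False
        with open(self.path) as f:
            self.webhooks = json.load(f)
        self._mtime = mtime
        return True
```

Calling `reload_if_changed()` just before each event is sent keeps the check cheap (a single `stat` call) while still picking up a webhook added from any peer. Comparing modification times can miss two writes that land within the same timestamp, so this is only an approximation of change detection.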