Bug 1395613
| Summary: | Delayed Events if any one Webhook is slow | | |
| --- | --- | --- | --- |
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Aravinda VK <avishwan> |
| Component: | eventsapi | Assignee: | Aravinda VK <avishwan> |
| Status: | CLOSED ERRATA | QA Contact: | Sweta Anandpara <sanandpa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.2 | CC: | amukherj, rhinduja |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.2.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | glusterfs-3.8.4-7 | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | 1357754 | Environment: | |
| Last Closed: | 2017-03-23 06:19:38 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1357754, 1401261 | | |
| Bug Blocks: | 1351528 | | |
Description: Aravinda VK, 2016-11-16 09:53:29 UTC
Upstream patch sent to master branch: http://review.gluster.org/15966

Separate threads are now maintained for each webhook, so a slow webhook does not affect the other webhooks. With this change, a configurable timeout option is no longer needed.

upstream mainline : http://review.gluster.org/15966
upstream 3.9      : http://review.gluster.org/#/c/16021
downstream        : https://code.engineering.redhat.com/gerrit/92047

Tested and verified this on the build 3.8.4-8. Registered 2 webhooks in my 4-node cluster setup and configured a delay of 5 seconds in one of them. Performed multiple operations (volume start/stop, geo-rep start/stop, geo-rep config set, bitrot enable/disable, quota enable/disable), which in turn generated the corresponding events. The delay was seen only in the webhook in which the 5-second sleep was configured; the other webhook always displayed the events as and when they were generated. Moving this BZ to verified in 3.2.

```
[root@dhcp47-60 ~]# rpm -qa | grep gluster
glusterfs-3.8.4-8.el7rhgs.x86_64
glusterfs-cli-3.8.4-8.el7rhgs.x86_64
glusterfs-api-3.8.4-8.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-8.el7rhgs.x86_64
vdsm-gluster-4.17.33-1.el7rhgs.noarch
gluster-nagios-addons-0.2.8-1.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-8.el7rhgs.x86_64
glusterfs-server-3.8.4-8.el7rhgs.x86_64
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-fuse-3.8.4-8.el7rhgs.x86_64
glusterfs-events-3.8.4-8.el7rhgs.x86_64
glusterfs-libs-3.8.4-8.el7rhgs.x86_64
python-gluster-3.8.4-8.el7rhgs.noarch

[root@dhcp47-60 ~]# gluster peer status
Number of Peers: 3

Hostname: 10.70.47.61
Uuid: f4b259db-7add-4d01-bb5e-3c7f9c077bb4
State: Peer in Cluster (Connected)

Hostname: 10.70.47.26
Uuid: 95c24075-02aa-49c1-a1e4-c7e0775e7128
State: Peer in Cluster (Connected)

Hostname: 10.70.47.27
Uuid: 8d1aaf3a-059e-41c2-871b-6c7f5c0dd90b
State: Peer in Cluster (Connected)

[root@dhcp47-60 ~]# gluster-eventsapi status
Webhooks:
http://10.70.46.245:9000/listen
http://10.70.46.246:9000/listen

+-------------+-------------+-----------------------+
|    NODE     | NODE STATUS | GLUSTEREVENTSD STATUS |
+-------------+-------------+-----------------------+
| 10.70.47.61 | UP          | OK                    |
| 10.70.47.26 | UP          | OK                    |
| 10.70.47.27 | UP          | OK                    |
| localhost   | UP          | OK                    |
+-------------+-------------+-----------------------+

[root@dhcp47-60 ~]# gluster v list
gluster_shared_storage
ozone

[root@dhcp47-60 ~]# gluster v info

Volume Name: gluster_shared_storage
Type: Replicate
Volume ID: 78323062-1c40-4153-8c65-5235450ca620
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: 10.70.47.61:/var/lib/glusterd/ss_brick
Brick2: 10.70.47.26:/var/lib/glusterd/ss_brick
Brick3: dhcp47-60.lab.eng.blr.redhat.com:/var/lib/glusterd/ss_brick
Options Reconfigured:
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: enable

Volume Name: ozone
Type: Distributed-Replicate
Volume ID: 2a014bec-4feb-45f8-b2c3-4741a64b2e45
Status: Started
Snapshot Count: 0
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.60:/bricks/brick0/ozone0
Brick2: 10.70.47.61:/bricks/brick0/ozone1
Brick3: 10.70.47.26:/bricks/brick0/ozone2
Brick4: 10.70.47.27:/bricks/brick0/ozone3
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
features.scrub-throttle: aggressive
features.scrub-freq: hourly
features.scrub: Inactive
features.bitrot: off
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
transport.address-family: inet
performance.readdir-ahead: on
nfs.disable: on
cluster.enable-shared-storage: enable

[root@dhcp47-60 ~]# gluster v geo-rep status

MASTER NODE    MASTER VOL    MASTER BRICK             SLAVE USER    SLAVE                        SLAVE NODE      STATUS     CRAWL STATUS       LAST_SYNCED
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
10.70.47.60    ozone         /bricks/brick0/ozone0    root          ssh://10.70.46.239::slave    10.70.46.242    Active     Changelog Crawl    2016-12-17 22:03:43
10.70.47.26    ozone         /bricks/brick0/ozone2    root          ssh://10.70.46.239::slave    10.70.46.239    Passive    N/A                N/A
10.70.47.61    ozone         /bricks/brick0/ozone1    root          ssh://10.70.46.239::slave    10.70.46.240    Passive    N/A                N/A
10.70.47.27    ozone         /bricks/brick0/ozone3    root          ssh://10.70.46.239::slave    10.70.46.218    Active     Changelog Crawl    2016-12-17 22:03:37
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, please open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html
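The fix described in the patch comment, one dedicated delivery thread per webhook, can be sketched roughly as below. This is a minimal illustration of the threading model, not glustereventsd's actual code; the names `WebhookWorker` and `dispatch` are hypothetical, and real delivery would POST the event to `self.url` instead of recording it locally.

```python
import queue
import threading


class WebhookWorker(threading.Thread):
    """One worker per registered webhook: a slow endpoint only
    delays its own queue, never the other webhooks."""

    def __init__(self, url):
        super().__init__(daemon=True)
        self.url = url
        self.events = queue.Queue()
        self.delivered = []  # stand-in for the HTTP POST side effect

    def run(self):
        while True:
            event = self.events.get()
            if event is None:          # sentinel: shut the worker down
                break
            self.deliver(event)

    def deliver(self, event):
        # Real code would POST the JSON event to self.url with a
        # per-webhook timeout; recorded locally here so the sketch
        # stays self-contained.
        self.delivered.append(event)


def dispatch(workers, event):
    # Enqueueing is non-blocking, so publishing an event never waits
    # on any webhook's HTTP round trip.
    for w in workers:
        w.events.put(event)
```

Because `dispatch` only enqueues, an event reaches every worker's queue immediately; a webhook that takes 5 seconds per delivery backs up only its own queue, which is exactly the behaviour the verification observed.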
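For reference, a slow webhook like the one used in the verification can be simulated with a minimal listener along these lines. Port 9000 and the `/listen` path match the webhooks registered above; the `delay` attribute and the `Listener`/`serve` names are illustrative, not part of any Gluster tooling.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class Listener(BaseHTTPRequestHandler):
    # Seconds to wait before acknowledging an event; the slow webhook
    # in the verification used a 5-second sleep, the fast one none.
    delay = 5

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        time.sleep(self.delay)             # simulate a slow endpoint
        print("received:", body.decode("utf-8", "replace"))
        self.send_response(200)
        self.end_headers()


def serve(port=9000):
    # Register with: gluster-eventsapi webhook-add http://<host>:9000/listen
    HTTPServer(("0.0.0.0", port), Listener).serve_forever()
```

Running `serve()` on two hosts, one with `delay = 5` and one with `delay = 0`, reproduces the test setup: events should arrive promptly on the fast listener regardless of how far behind the slow one falls.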