Bug 1576794
Summary: | Gluster native event webhook fails sometimes | ||
---|---|---|---|
Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Rohan Kanade <rkanade> |
Component: | web-admin-tendrl-gluster-integration | Assignee: | Shubhendu Tripathi <shtripat> |
Status: | CLOSED ERRATA | QA Contact: | Filip Balák <fbalak> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | rhgs-3.4 | CC: | fbalak, mbukatov, nthomas, rhs-bugs, sankarshan |
Target Milestone: | --- | ||
Target Release: | RHGS 3.4.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | tendrl-ui-1.6.3-2.el7rhgs tendrl-ansible-1.6.3-4.el7rhgs tendrl-notifier-1.6.3-3.el7rhgs tendrl-commons-1.6.3-5.el7rhgs tendrl-api-1.6.3-3.el7rhgs tendrl-monitoring-integration-1.6.3-3.el7rhgs tendrl-node-agent-1.6.3-5.el7rhgs | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2018-09-04 07:05:38 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1503137 |
Description
Rohan Kanade
2018-05-10 12:08:28 UTC
Package version where we saw this was tendrl-gluster-integration-1.6.3-2.el7rhgs

I tried to reproduce it by calling peer probe and peer detach in a loop on several nodes at the same time, but I was unable to reproduce it with tendrl-gluster-integration-1.6.3-2.el7rhgs.

`for x in {1..1000}; do gluster peer detach <node>; gluster peer probe <node>; done&`

Rohan, can you please provide better reproducer steps to generate a large number of gluster native events?

I don't have any more info about this, please close this if not required.

Apologies, I missed out one detail.

To reproduce this issue:
1) Send a very large number of HTTP POST requests to "http://$storage-node:8697/listen"
2) Check the tendrl-monitoring-integration error logs, or check the HTTP responses for error codes (500, 404 etc.), or check whether any request has been dropped and not processed

I tested this with old version:
tendrl-gluster-integration-1.5.4-14.el7rhgs.noarch
tendrl-ansible-1.5.4-7.el7rhgs.noarch
tendrl-api-1.5.4-4.el7rhgs.noarch
tendrl-api-httpd-1.5.4-4.el7rhgs.noarch
tendrl-commons-1.5.4-9.el7rhgs.noarch
tendrl-grafana-plugins-1.5.4-14.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.5.4-14.el7rhgs.noarch
tendrl-node-agent-1.5.4-16.el7rhgs.noarch
tendrl-notifier-1.5.4-6.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.5.4-6.el7rhgs.noarch

and with new version:
tendrl-gluster-integration-1.6.3-7.el7rhgs.noarch
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-4.el7rhgs.noarch
tendrl-api-httpd-1.6.3-4.el7rhgs.noarch
tendrl-commons-1.6.3-9.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-7.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-7.el7rhgs.noarch
tendrl-node-agent-1.6.3-9.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-8.el7rhgs.noarch

In both cases the server was able to process all requests that I sent to it, but I have also done load testing with the ApacheBench tool (ab):

$ ab -c 10 -n 50000 -p d.json -T application/json http://<gluster-node>:8697/listen
$ cat d.json
{"event": "CLIENT_DISCONNECT", "message": {"brick_path": "/gluster/brick1/brick1", "client_identifier": "172.28.128.204:49132", "client_uid": "tendrl-node-1-1340-2018/05/02-07:01:16:694187-glustervol-client-0-0-0", "server_identifier": "172.28.128.204:49152"}, "nodeid": "3f7532a7-cd02-4536-9371-c97a00a2fa3e", "ts": 1525244478}

Results of this load testing suggest that after the switch to cherrypy the server can handle a significantly greater number of requests: the same 50000 requests completed in 74.399 seconds instead of 101.149 seconds, with no failed requests in either run.
--> VERIFIED

Old version
-----------
Time taken for tests: 101.149 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Time per request: 20.230 [ms] (mean)
Time per request: 2.023 [ms] (mean, across all concurrent requests)
Transfer rate: 80.13 [Kbytes/sec] received
               244.26 kb/s sent
               324.40 kb/s total

New version
-----------
Time taken for tests: 74.399 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Requests per second: 672.05 [#/sec] (mean)
Time per request: 14.880 [ms] (mean)
Time per request: 1.488 [ms] (mean, across all concurrent requests)
Transfer rate: 95.82 [Kbytes/sec] received
               332.09 kb/s sent
               427.91 kb/s total

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2616
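
For anyone who wants to repeat the reproducer steps without ApacheBench, here is a minimal sketch that sends many POST requests to the /listen endpoint and tallies the response codes, so that 500s, 404s and dropped requests show up in one place. It assumes Python 3 and uses only the standard library. The helper names `post_event`, `flood` and `requests_per_second` are hypothetical and are not part of tendrl; the payload is the d.json body quoted in the load test.

```python
import http.client
import json
import urllib.error
import urllib.request
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Payload copied from the d.json file used in the ApacheBench run.
SAMPLE_EVENT = {
    "event": "CLIENT_DISCONNECT",
    "message": {
        "brick_path": "/gluster/brick1/brick1",
        "client_identifier": "172.28.128.204:49132",
        "client_uid": "tendrl-node-1-1340-2018/05/02-07:01:16:694187-glustervol-client-0-0-0",
        "server_identifier": "172.28.128.204:49152",
    },
    "nodeid": "3f7532a7-cd02-4536-9371-c97a00a2fa3e",
    "ts": 1525244478,
}


def post_event(url, event, timeout=10.0):
    """POST one JSON event (hypothetical helper).

    Returns the HTTP status code, or the string "dropped" when no HTTP
    response was received at all (refused, reset, timed out).
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error code such as 404 or 500.
        return err.code
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return "dropped"


def flood(url, total=1000, concurrency=10, event=SAMPLE_EVENT):
    """Send `total` POSTs using `concurrency` worker threads; tally outcomes."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return Counter(pool.map(lambda _: post_event(url, event), range(total)))


def requests_per_second(completed, seconds):
    """Mean throughput, computed the same way as ab's "Requests per second"."""
    return completed / seconds
```

Calling `flood("http://<gluster-node>:8697/listen", total=50000, concurrency=10)` mirrors the `-n` and `-c` values of the ab run; any key other than 200 in the returned counter points at a failed or dropped request. As a cross-check, `requests_per_second(50000, 74.399)` gives about 672.05, matching the figure ab reported for the new version. The same formula applied to the old version's 101.149 seconds works out to roughly 494 requests per second, a derived value that does not appear in the ab output quoted in this bug.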