| Summary: | Scalability Testing: 150 + users logged into, and deploying instances from, one conductor instance results in "Response code: 503" - Response message: Service Temporarily Unavailable | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Retired] CloudForms Cloud Engine | Reporter: | Ronelle Landy <rlandy> | ||||||
| Component: | aeolus-conductor | Assignee: | Tzu-Mainn Chen <tzumainn> | ||||||
| Status: | CLOSED ERRATA | QA Contact: | wes hayutin <whayutin> | ||||||
| Severity: | medium | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 1.0.0 | CC: | akarol, asettle, athomas, brad, cpelland, deltacloud-maint, hbrock, juwu, ssachdev, tzumainn | ||||||
| Target Milestone: | rc | Keywords: | Triaged | ||||||
| Target Release: | --- | ||||||||
| Hardware: | Unspecified | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: |
During scale testing with 150+ users logging in to Conductor and deploying instances, some instances returned the response message "Service Temporarily Unavailable." This bug fix reduces the number of queries and adds eager loading so that fewer instances return this error.
|
Story Points: | --- | ||||||
| Clone Of: | Environment: | ||||||||
| Last Closed: | 2012-12-04 14:58:35 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: |
|
||||||||
|
Description
Ronelle Landy
2012-03-12 22:37:08 UTC
Created attachment 569514 [details]
Test Results Table
Just to be sure, can you tell me which jmeter test you used? Was it aeolus-performance-testing/jmeter/create-deployment-and-launch?

I used a mix of tests and scripts:
- To add the 200 users, I used https://github.com/aeolusproject/aeolus-performance-testing/blob/master/jmeter/scripts/configure-and-create-users.sh
- To build and push the 1 image to mock, I used https://github.com/aeolusproject/aeolus-performance-testing/blob/master/jmeter/build-and-push/fedora15.jmx (only one thread needed)
- To log in the 170 users and deploy the instances concurrently, I used the attached .jmx testplan.

I still need to place all these jmeter testplans on github, but in the meantime I'm attaching this one to the BZ.

Created attachment 570697 [details]
jmeter testplan
Patch created: https://fedorahosted.org/pipermail/aeolus-devel/2012-March/009670.html

I don't claim it fixes everything - a comprehensive review feels out of scope pre-1.1, and would probably involve things like revisiting permissions and caching - but it does fix some pretty big issues.

Patches pushed to master:
- commit d951bad80e60b4aaee3db859210f9e2a8601f571 BZ 802571 refactor provider_account sort-by-priority to correctly sort nils, regardless of db
- commit 9f2ee9f1cf28be988b02ab3d650d919856fff70d BZ 802571 don't use deployment.as_json unnecessarily
- commit 7b3f91e5d8bdbc2e9db0ec6d051fad9a1f647873 BZ 802571 don't query provider multiple times
- commit bc9ef2f278a7d96ce3b7c02e072f141a04c89d87 BZ 802571 added eager loading and other minor efficiency fixes

Added fix (https://fedorahosted.org/pipermail/aeolus-devel/2012-March/009742.html), which brings the total number of patches needed to five:
- commit d951bad80e60b4aaee3db859210f9e2a8601f571 BZ 802571 refactor provider_account sort-by-priority to correctly sort nils, regardless of db
- commit 9f2ee9f1cf28be988b02ab3d650d919856fff70d BZ 802571 don't use deployment.as_json unnecessarily
- commit 7b3f91e5d8bdbc2e9db0ec6d051fad9a1f647873 BZ 802571 don't query provider multiple times
- commit bc9ef2f278a7d96ce3b7c02e072f141a04c89d87 BZ 802571 added eager loading and other minor efficiency fixes
- commit 1c35c34898731c016e95bc158d2d9ee977f81235 BZ 802571 fix to previous patch so that list_for_user works

Latest run of these tests resulted in just 22 instances of "Service Temporarily Unavailable", all during a call to /conductor/users.

    $ grep "Service Temporarily Unavailable" resultsTable200.csv | wc -l
    22

This will probably be highly dependent on the system that is hosting Conductor, and how many thin servers are running. I'm ok with closing this for now, as it is a marked improvement over previous runs of this same test script.

Verified.
aeolus-all-0.13.22-1.el6cf.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHEA-2012-1516.html
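
The grep count quoted in the comments gives only a total. To confirm that the remaining errors are confined to one URL, such as /conductor/users, the same results file could be tallied per request. The following is a minimal Python sketch, not part of the original test suite. It assumes the results CSV has a header row with `label` and `responseMessage` columns; those column names and the sample rows are illustrative assumptions, so adjust them to match the actual file.

```python
import csv
import io
from collections import Counter

MESSAGE = "Service Temporarily Unavailable"


def count_unavailable(csv_text, label_column="label",
                      message_column="responseMessage"):
    """Count rows whose message column contains MESSAGE, grouped by label.

    The column names are assumptions about the results file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        if MESSAGE in (row.get(message_column) or ""):
            counts[row.get(label_column) or "(unknown)"] += 1
    return counts


# Made-up sample rows for illustration only, not real test output.
sample = """label,responseCode,responseMessage
/conductor/users,503,Service Temporarily Unavailable
/conductor/deployments,200,OK
/conductor/users,503,Service Temporarily Unavailable
"""

counts = count_unavailable(sample)
for label, n in counts.most_common():
    print(n, label)
```

For a real run, read the file contents (for example `open("resultsTable200.csv").read()`) and pass them in place of `sample`.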