Bug 1489507
Summary: | Simultaneous service catalog request do not honour quotas | ||
---|---|---|---|
Product: | Red Hat CloudForms Management Engine | Reporter: | Satoe Imaishi <simaishi> |
Component: | Automate | Assignee: | Tina Fitzgerald <tfitzger> |
Status: | CLOSED ERRATA | QA Contact: | Anurag <ansinha> |
Severity: | high | Docs Contact: | |
Priority: | medium | ||
Version: | 5.7.0 | CC: | akarol, alsilva, ansinha, cpelland, dajohnso, gmccullo, greartes, jhardy, jsisul, mfeifer, mkanoor, obarenbo, rmanes, rspagnol, sacpatil, simaishi, tfitzger, vparekh, wfitzger |
Target Milestone: | GA | Keywords: | ZStream |
Target Release: | 5.8.4 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | quota:service | ||
Fixed In Version: | 5.8.4.4 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | 1456819 | Environment: | |
Last Closed: | 2018-06-25 14:17:08 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | CFME Core | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1456819, 1497912 | ||
Bug Blocks: |
Comment 2
CFME Bot
2017-09-07 18:16:17 UTC
Until https://bugzilla.redhat.com/show_bug.cgi?id=1497912 is fixed, you can't test this BZ Hi Sachin/Vatsal, There are 2 separate issue here. The instructions above specify: Created a service catalog with vm_name, instance_type & number_of_vms as fields. There is an open issue with using Service dialog values to override quota related items such the number_of_vms and instance type. This fix can be tested by using specifying the number_of_vms and instance_type in the service item, rather than specifying it in the Service dialog. These changes support the quota calculations for active provisions, but there is an Automate method change required as well. I'll be happy to work with whoever will be testing this fix. Thanks, Tina (In reply to Tina Fitzgerald from comment #6) > Hi Sachin/Vatsal, > > There are 2 separate issue here. > > The instructions above specify: > Created a service catalog with vm_name, instance_type & number_of_vms as > fields. > > There is an open issue with using Service dialog values to override quota > related items such the number_of_vms and instance type. > > This fix can be tested by using specifying the number_of_vms and > instance_type in the service item, rather than specifying it in the Service > dialog. > > These changes support the quota calculations for active provisions, but > there is an Automate method change required as well. > > I'll be happy to work with whoever will be testing this fix. > > Thanks, > Tina I agree with Tina. Simultaneous request issue should be tested when, - Provisioning VMs using lifecycle(BZ#1465087) - Provisioning VMs using Service dialog/Catalog(BZ#1497912) @Vatsal: Let me know if you need help from Tina or me for testing those cases. Of course a lot more need to be covered during testing like, - Maximum parallel requests from, 1. same browser but private windows with same user 2. Same machine but different browsers with same user 3. Different machine with same user 4. Multiple users belonging to same group &/or tenant(across providers) covering 1, 2 and 3. - Storage quotas(attached disk) - Quotas for flavors(instance_types) - Memory - CPUs - and retirement days. This will be hard to test manually. You need to come up with test cases. (In reply to Tina Fitzgerald from comment #6) > Hi Sachin/Vatsal, > > There are 2 separate issue here. > > The instructions above specify: > Created a service catalog with vm_name, instance_type & number_of_vms as > fields. > > There is an open issue with using Service dialog values to override quota > related items such the number_of_vms and instance type. > > This fix can be tested by using specifying the number_of_vms and > instance_type in the service item, rather than specifying it in the Service > dialog. > > These changes support the quota calculations for active provisions, but > there is an Automate method change required as well. > > I'll be happy to work with whoever will be testing this fix. > > Thanks, > Tina Tina, If this fix was for using number_of_vms via service items, it seems to be working. But doing it via dialog still seems broken. Vatsal, The dialog override for number_of_vms is being tracked in a separate ticket. Satoe, Can we backport these 3 PR to resolve this issue? https://github.com/ManageIQ/manageiq-automation_engine/pull/69 https://github.com/ManageIQ/manageiq-content/pull/191 - This needs to be applied first. https://github.com/ManageIQ/manageiq-content/pull/196 https://github.com/ManageIQ/manageiq/pull/16170 - This PR is part of the solution but has already been backported to fine. Let me know if you have any questions. Thanks Billy New commit detected on ManageIQ/manageiq-content/fine: https://github.com/ManageIQ/manageiq-content/commit/137152446e08d45ccaaf326715e95cae169603f8 commit 137152446e08d45ccaaf326715e95cae169603f8 Author: Madhu Kanoor <mkanoor> AuthorDate: Thu Oct 5 09:59:18 2017 -0400 Commit: Satoe Imaishi <simaishi> CommitDate: Mon Nov 20 10:23:20 2017 -0500 Merge pull request #191 from billfitzgerald0120/used_active_provisions Refactor used quota method and test. (cherry picked from commit 5cc73ae823ff9ec8ec3bf2c4253cb8f416a25fc8) https://bugzilla.redhat.com/show_bug.cgi?id=1489507 .../QuotaMethods.class/__methods__/used.rb | 51 ++++++++++++++--- .../QuotaMethods.class/__methods__/used_spec.rb | 65 ++++++++++++++++++++++ 2 files changed, 107 insertions(+), 9 deletions(-) create mode 100644 spec/content/automate/ManageIQ/System/CommonMethods/QuotaMethods.class/__methods__/used_spec.rb New commit detected on ManageIQ/manageiq-content/fine: https://github.com/ManageIQ/manageiq-content/commit/0db92735f28c731ff45652ec8ddc1164cb243165 commit 0db92735f28c731ff45652ec8ddc1164cb243165 Author: Madhu Kanoor <mkanoor> AuthorDate: Fri Oct 13 10:05:54 2017 -0400 Commit: Satoe Imaishi <simaishi> CommitDate: Mon Nov 20 10:28:09 2017 -0500 Merge pull request #196 from billfitzgerald0120/used_new_active_provisions Added active provisions to quota count. (cherry picked from commit a38ebc9efffa62117cf717ca89ce168501736c53) https://bugzilla.redhat.com/show_bug.cgi?id=1489507 .../QuotaMethods.class/__methods__/used.rb | 29 +++++++++++++++++++- .../unit/method_validation/calculate_used_spec.rb | 26 ------------------ .../QuotaMethods.class/__methods__/used_spec.rb | 32 +++++++++++++++++----- 3 files changed, 53 insertions(+), 34 deletions(-) delete mode 100644 spec/automation/unit/method_validation/calculate_used_spec.rb Vatsal, I checked your appliance and logs. Your provision requests are 6 seconds apart. It appears that the first request didn't add the 10 requested vm's to the active provision request fast enough to stop/block the second one. Log lines below ... [----] I, [2017-12-15T04:33:19.645508 #2142:12450f8] INFO -- : Q-task_id([service_template_provision_request_7]) <AEMethod validate_quota> Item: vms Used: (0) Requested: (10) Max: (10) Warn: (10) [----] I, [2017-12-15T04:33:25.778621 #2135:1245f6c] INFO -- : Q-task_id([service_template_provision_request_8]) <AEMethod validate_quota> Item: vms Used: (0) Requested: (10) Max: (10) Warn: (10) I need to look at this further ... Thanks Billy Vatsal, Can you try the same test but wait 1 minute before you start the second provision? Please let me know. Thanks Billy (In reply to William Fitzgerald from comment #15) > Vatsal, > > Can you try the same test but wait 1 minute before you start the second > provision? > > Please let me know. > > Thanks > > Billy I tried, it seems to work when requests are 1 minute+ apart. Hi Vatsal, Our current Quota implementation does a best effort to calculate active provisions. The active provision code looks for MiqQueue entries with: 1. class_name: MiqProvisionRequest or ServiceTemplateProvisionRequest method_name: create_request_tasks state: dequeue or 2. class_name: MiqAeEngine method_name: deliver tracking_label includes: "_provision_" It is possible that when 2 requests are submitted at exactly the same time, one of the requests could have an MiqQueue entry that falls outside the above criteria. As you saw when you staggered the requests, the active provisions were successfully calculated. We have plans for enhancing quota in the next major release, but the current MiqQueue calculations have this very small window where the request would not be included as you have described. The main issue reported here was no active provisions were included in quota at all. Since the code has been changed to calculate active provisions, can we validate this ticket? Let me know if you have any questions. Thanks, Tina (In reply to Tina Fitzgerald from comment #17) > Hi Vatsal, > > Our current Quota implementation does a best effort to calculate active > provisions. > > The active provision code looks for MiqQueue entries with: > > 1. class_name: MiqProvisionRequest or ServiceTemplateProvisionRequest > method_name: create_request_tasks > state: dequeue > or > 2. class_name: MiqAeEngine > method_name: deliver > tracking_label includes: "_provision_" > > It is possible that when 2 requests are submitted at exactly the same time, > one of the requests could have an MiqQueue entry that falls outside the > above criteria. As you saw when you staggered the requests, the active > provisions were successfully calculated. > > We have plans for enhancing quota in the next major release, but the current > MiqQueue calculations have this very small window where the request would > not be included as you have described. > > The main issue reported here was no active provisions were included in quota > at all. Since the code has been changed to calculate active provisions, can > we validate this ticket? > > Let me know if you have any questions. > > Thanks, > Tina Hey Tina, According to the BZ reported here, the issue is not reported for requests being 1 minute apart, it says specifically for simultaneous requests from 2 different user at the same time. And for that scenario, I still find quota failing. Update - I changed the active provision method to calculate counts based on active MiqRequests instead of specific MiqQueue entries. Created a WIP PR and are still testing changes. New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/34a83cfa8d554c08cb811113a06d4bc06847d9f2 commit 34a83cfa8d554c08cb811113a06d4bc06847d9f2 Author: Tina Fitzgerald <tfitzger> AuthorDate: Tue Jan 16 13:30:37 2018 -0500 Commit: Tina Fitzgerald <tfitzger> CommitDate: Thu Jan 18 11:53:17 2018 -0500 Change the way active_provisions are calculated by using MiqRequests instead of MiqQueue entries. Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1489507 app/models/mixins/miq_provision_quota_mixin.rb | 26 ++----- spec/models/miq_provision_request_spec.rb | 82 ++++++---------------- ...ervice_template_provision_request_quota_spec.rb | 55 +++------------ spec/support/service_template_helper.rb | 3 + 4 files changed, 41 insertions(+), 125 deletions(-) New commit detected on ManageIQ/manageiq/fine: https://github.com/ManageIQ/manageiq/commit/1371379942d7500723842ebbea64312892bfe30d commit 1371379942d7500723842ebbea64312892bfe30d Author: Tina Fitzgerald <tfitzger> AuthorDate: Tue Jan 16 13:30:37 2018 -0500 Commit: Tina Fitzgerald <tfitzger> CommitDate: Fri Jan 19 17:46:19 2018 -0500 Resolve conflicts from Cherry-pick 34a83cfa8d554c08cb811113a06d4bc06847d9f2 Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1489507 app/models/mixins/miq_provision_quota_mixin.rb | 27 ++----- spec/models/miq_provision_request_spec.rb | 82 ++++++---------------- ...ervice_template_provision_request_quota_spec.rb | 55 +++------------ spec/support/service_template_helper.rb | 3 + 4 files changed, 41 insertions(+), 126 deletions(-) Hi Vatsal, In the case of requests 6 and 7, I see the active count, counting each other, but I don't see the 1 used. Request 6 It says 0 used: [----] I, [2018-01-31T00:49:10.540519 #2115:57ffd3c] INFO -- : Q-task_id([service_template_provision_request_6]) <AEMethod used> Quota Used: {:cpu=>0, :memory=>0, :vms=>0, :storage=>0, :provisioned_storage=>0} [----] I, [2018-01-31T00:49:10.544002 #2115:57ffd3c] INFO -- : Q-task_id([service_template_provision_request_6]) <AEMethod used> Quota source type: group Plus 1 active(It's counting request 7) [----] I, [2018-01-31T00:49:10.588145 #2115:57ffd3c] INFO -- : Q-task_id([service_template_provision_request_6]) <AEMethod used> Quota active_provisions_by_group: {:cpu=>1, :memory=>1073741824, :vms=>1, :storage=>201326592, :provisioned_storage=>0} Equals total used of 1 [----] I, [2018-01-31T00:49:10.591723 #2115:57ffd3c] INFO -- : Q-task_id([service_template_provision_request_6]) <AEMethod used> Quota Totals: {:cpu=>1, :memory=>1073741824, :vms=>1, :storage=>201326592, :provisioned_storage=>0} Request 7 It says 0 used: [----] I, [2018-01-31T00:49:18.796192 #2115:54488ec] INFO -- : Q-task_id([service_template_provision_request_7]) <AEMethod used> Quota Used: {:cpu=>0, :memory=>0, :vms=>0, :storage=>0, :provisioned_storage=>0} [----] I, [2018-01-31T00:49:18.800805 #2115:54488ec] INFO -- : Q-task_id([service_template_provision_request_7]) <AEMethod used> Quota source type: group Plus 1 active(It's counting request 6) [----] I, [2018-01-31T00:49:18.850577 #2115:54488ec] INFO -- : Q-task_id([service_template_provision_request_7]) <AEMethod used> Quota active_provisions_by_group: {:cpu=>1, :memory=>1073741824, :vms=>1, :storage=>201326592, :provisioned_storage=>0} Equals total used of 1 [----] I, [2018-01-31T00:49:18.854801 #2115:54488ec] INFO -- : Q-task_id([service_template_provision_request_7]) <AEMethod used> Quota Totals: {:cpu=>1, :memory=>1073741824, :vms=>1, :storage=>201326592, :provisioned_storage=>0} With 0 used, and 2 active, it makes sense that they both pass quota. Assuming the same group is used in request 8, it properly denies the request with 2 used(requests 6 and 7). Thanks, Tina A similar issue for Tenant quota: https://bugzilla.redhat.com/show_bug.cgi?id=1540888 Vatsal and I had a BlueJeans session this morning to discuss quota and how active provisions are calculated. The current quota calculations for active provisions includes all active requests and each active request considers all other active requests. The following example was taken from the test appliance Vatsal setup for quota. The automation.log shows 2 requests, request 8 and 9, both active. The quota calculations for request 8 would include counts from request 9, and the quota calculations for request 9 would include counts from request 8. This is by design, the goal being to not over allocate quota as previously reported. [----] I, [2018-02-01T10:17:04.034729 #1985:5c62320] INFO -- : Q-task_id([service_template_provision_request_8]) <AEMethod used> Quota Used: {:cpu=>0, :memory=>0, :vms=>0, :storage=>0, :provisioned_storage=>0} [----] I, [2018-02-01T10:17:04.040796 #1985:5c62320] INFO -- : Q-task_id([service_template_provision_request_8]) <AEMethod used> Quota source type: group [----] I, [2018-02-01T10:17:04.108723 #1985:5c62320] INFO -- : Q- task_id([service_template_provision_request_8]) <AEMethod used> Quota active_provisions_by_group: {:cpu=>3, :memory=>3221225472, :vms=>3, :storage=>603979776, :provisioned_storage=>0} [----] I, [2018-02-01T10:17:04.112748 #1985:5c62320] INFO -- : Q-task_id([service_template_provision_request_8]) <AEMethod used> Quota Totals: {:cpu=>3, :memory=>3221225472, :vms=>3, :storage=>603979776, :provisioned_storage=>0} ----] I, [2018-02-01T10:17:05.611724 #1995:c24659c] INFO -- : Q-task_id([service_template_provision_request_9]) <AEMethod used> Quota Used: {:cpu=>0, :memory=>0, :vms=>0, :storage=>0, :provisioned_storage=>0} [----] I, [2018-02-01T10:17:05.615621 #1995:c24659c] INFO -- : Q-task_id([service_template_provision_request_9]) <AEMethod used> Quota source type: group [----] I, [2018-02-01T10:17:05.660655 #1995:c24659c] INFO -- : Q-task_id([service_template_provision_request_9]) <AEMethod used> Quota active_provisions_by_group: {:cpu=>3, :memory=>3221225472, :vms=>3, :storage=>603979776, :provisioned_storage=>0} [----] I, [2018-02-01T10:17:05.664677 #1995:c24659c] INFO -- : Q-task_id([service_template_provision_request_9]) <AEMethod used> Quota Totals: {:cpu=>3, :memory=>3221225472, :vms=>3, :storage=>603979776, :provisioned_storage=>0} So from the discussion with Tina about this BZ and active provisioning, some points to notice here, 1. The current fix is not for the case where you pass number_of_vms(or other resources?) via dialog to the service. If group quota is 10 and you pass 8 from dialog to service, and try to provision 2 such service at the same time(or say quite near that they would count as active_provision) from 2 different users in the same group, it lets both of them pass and provisions 16 even when quota was 10. Current fix works only when the resource values you are requesting are hard-coded in the service you made. 2. A observation that I filed in https://bugzilla.redhat.com/show_bug.cgi?id=1540888 If there are 'n' resources available and 2 requests come, both requesting <= 'n' resources, both gets error quota exceeded. They both see the other as active, and each other's resources requested as used, so they both fail saying quota exceeded. Tina confirmed that this is not a BZ, but a hotfix. This would stop provisioning over the quota, but would also give false errors even when quota was available. @Tina, this seems a half fix, should I file a new ticket specifically for passing values from dialog? Also, I think point 2 should be documented properly before going GA. Hi Vatsal, Your observations are correct. Yes, please open a new ticket specifically for passing the number of VMs from the service dialog. Based on the above discussions, can we validate this ticket? Thanks, Tina Hey Tina, after having some discussion with QEs about this (half) fix, I still don't feel confident to verify this ticket. The description of this ticket itself mentions about ordering service catalog with passing values from the dialog. So the current fix doesn't seem to solve the primary issue here. And provisioning services with just using values given while creating, I think that issue was fixed in this BZ https://bugzilla.redhat.com/show_bug.cgi?id=1497912 itself. *** Bug 1540888 has been marked as a duplicate of this bug. *** New commit detected on ManageIQ/manageiq-content/fine: https://github.com/ManageIQ/manageiq-content/commit/f9b5e6a4904df2ed5bbe463d3a01c210c63484ea commit f9b5e6a4904df2ed5bbe463d3a01c210c63484ea Author: Madhu Kanoor <mkanoor> AuthorDate: Wed Apr 18 15:31:51 2018 -0400 Commit: Madhu Kanoor <mkanoor> CommitDate: Wed Apr 18 15:31:51 2018 -0400 Merge pull request #203 from billfitzgerald0120/quota_overrides Calculate quota using service dialogs overrides. (cherry picked from commit daf5f5031dbe9b516350ae37a1d5a9f43ad1afa0) Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1489507 content/automate/ManageIQ/System/CommonMethods/QuotaMethods.class/__methods__/requested.rb | 59 +- spec/automation/unit/method_validation/calculate_requested_spec.rb | 99 +- 2 files changed, 141 insertions(+), 17 deletions(-) Hi Team, This issue is resolved and now working fine. I have verified this on version: Version 5.8.4.4.20180611155620_d2f7ab6 hence changing the status of bug from on_qa to verified. Thanks for fixing this issue. Thanks, -Anurag Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2018:1972 |