Thanks, Gellert. We can reproduce this and are in the process of determining the correct fix.
Here's the first part: https://github.com/ManageIQ/activerecord-virtual_attributes/pull/21
https://github.com/ManageIQ/manageiq/pull/18798
The second PR was opened to leverage the first one. We're still reviewing it and working to get it merged.
New commit detected on ManageIQ/manageiq/master: https://github.com/ManageIQ/manageiq/commit/168f3ad530627603d1478aa6c3cff306e3f8b610

commit 168f3ad530627603d1478aa6c3cff306e3f8b610
Author:     Keenan Brock <keenan>
AuthorDate: Thu May 9 16:01:12 2019 -0400
Commit:     Keenan Brock <keenan>
CommitDate: Thu May 9 16:01:12 2019 -0400

    specify virtual_delegate types to avoid deadlock

    deriving the attribute type for delegates required the target class
    to be loaded. This forced a cascade of load_schema calls that end up
    introducing a race condition. This PR explicitly declares the
    attribute type so the target class no longer needs to be loaded and
    the race condition (and subsequent deadlocks) are avoided.

    https://bugzilla.redhat.com/show_bug.cgi?id=1700378

 Gemfile                                   |  2 +-
 app/models/ems_cluster.rb                 |  2 +-
 app/models/entitlement.rb                 |  2 +-
 app/models/host.rb                        | 14 +-
 app/models/miq_group.rb                   |  2 +-
 app/models/miq_product_feature.rb         |  2 +-
 app/models/miq_report_result.rb           |  2 +-
 app/models/miq_server.rb                  |  2 +-
 app/models/miq_widget.rb                  |  4 +-
 app/models/mixins/authentication_mixin.rb |  1 +
 app/models/mixins/compliance_mixin.rb     |  2 +
 app/models/mixins/drift_state_mixin.rb    |  4 +-
 app/models/mixins/ownership_mixin.rb      |  4 +-
 app/models/vm_or_template.rb              | 24 +-
 14 files changed, 35 insertions(+), 32 deletions(-)
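The shape of the fix can be sketched with a toy model: when a virtual delegate's type is derived, the target class's schema has to be loaded first (the racy path), but when the type is declared up front, that load never happens. The names below (load_schema, delegate_type) are illustrative stand-ins, not the actual activerecord-virtual_attributes API.

```ruby
SCHEMA_LOADS = []

# Stand-in for ActiveRecord's schema loading; in the real bug, concurrent
# calls into this path from multiple threads raced and could deadlock.
def load_schema(klass_name)
  SCHEMA_LOADS << klass_name
  { name: :string } # pretend column types for the target class
end

# Deriving the delegate's type forces the target's schema to load;
# declaring the type explicitly (the fix) skips that lookup entirely.
def delegate_type(target:, column:, type: nil)
  return type if type
  load_schema(target)[column]
end

derived  = delegate_type(target: "Host", column: :name)                # loads "Host"
explicit = delegate_type(target: "Host", column: :name, type: :string) # no load
```

This is why the PR touches so many model files: each virtual_delegate call site gains an explicit type declaration so no target class has to be loaded at definition time.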
I think I can move this to post now? I guess I'll find out pretty quick if I'm wrong...
Hi Parthvi,

To recreate this on a version that doesn't have this fix, you'll need to repeatedly try to get two requests to hit the web service worker at exactly the right time. If the requests complete successfully, the timing wasn't right and you'll need to kill the web service worker and try again.

Before you run this, you'll want to decrease the web service worker count to 1. Then, tail -f log/production.log and run the script below. You'll be looking for a log message that shows up after a request "hangs" for 30+ seconds:

    "Long running http(s) request"

Examples of this log line can be found in the PR that added it: https://github.com/ManageIQ/manageiq/pull/17842

If you don't get this message after you run the script below and wait 30+ seconds, you'll need to kill the web service worker and try again.

# Note: the URLs are quoted so the shell spawned by the backticks doesn't
# treat "&" as a background operator, and the threads are joined so the
# script doesn't exit before the requests complete.
threads = []
2.times do
  threads << Thread.new { `curl -L "https://admin:smartvm@localhost/api/vms"` }
  threads << Thread.new { `curl -L "https://admin:smartvm@localhost/api/notifications?expand=resources&attributes=details&sort_by=id&sort_order=desc&limit=100"` }
end
threads.each(&:join)
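If you'd rather fire the two requests without Ruby, a shell equivalent is sketched below. It assumes the same default admin:smartvm credentials and a self-signed appliance certificate (hence -k); note the second URL must be quoted so the shell doesn't treat "&" as a background operator.

```shell
#!/bin/sh
# Fire both API requests concurrently; -k accepts a self-signed
# certificate, -m 30 bounds each request at 30 seconds.
curl -skL -m 30 "https://admin:smartvm@localhost/api/vms" > /dev/null &
curl -skL -m 30 "https://admin:smartvm@localhost/api/notifications?expand=resources&attributes=details&sort_by=id&sort_order=desc&limit=100" > /dev/null &
wait # block until both background requests return (or time out)
echo "both requests returned"
```

If the deadlock is triggered, the "both requests returned" line won't print until the requests time out, and the "Long running http(s) request" message should appear in production.log.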
FIXED. Verified on 5.11.0.8.