Bug 2027947
Summary: | HypervisorHeartbeatUpdateJob is taking long time to process and updates wrong consumer records | | |
---|---|---|---|
Product: | Red Hat Satellite | Reporter: | Hao Chang Yu <hyu> |
Component: | Candlepin | Assignee: | satellite6-bugs <satellite6-bugs> |
Status: | CLOSED ERRATA | QA Contact: | jcallaha |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 6.10.0 | CC: | juwatts, nmoumoul, redakkan, wpoteat, zhunting |
Target Milestone: | 6.12.0 | Keywords: | Triaged |
Target Release: | Unused | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | candlepin-4.0.14-1, candlepin-4.1.9-1 | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 2028765 2028766 | Environment: | |
Last Closed: | 2022-11-16 13:33:03 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 2028765, 2028766 | | |
Bug Blocks: | | | |
Description
Hao Chang Yu
2021-12-01 06:14:33 UTC
Can we also get an explain plan on the updated query? Thanks.

"If I understand correctly, the query is supposed to update only the "lastcheckin" of the reported hypervisors"

It is supposed to update any hypervisor with the corresponding reporter id and org. It is not determined by the contents of the hypervisor report. In scenarios where none of the hypervisors have changed, the lastcheckin date is updated via the heartbeat, but no hypervisor report is sent at all for the HypervisorUpdateJob.

I see where the current query gets it wrong and updates all rows for the org. Will fix.

Verified in Satellite 6.12 Snap 14

Ran the hypervisor/guest flood script provided by https://github.com/JacobCallahan/content-host-d

    python flood.py -s my.sat.host.com -m host --hypervisors 3000 --guests 1 -t ubi7 --exit-criteria reg --limit 25

With some additional test hypervisors included, this brought the total to 3,010.

    candlepin=# select count(*) from cp_consumer a join cp_consumer_hypervisor b on a.id = b.consumer_id join cp_owner c on c.id = a.owner_id where c.account = 'Default_Organization';
     count
    -------
      3010
    (1 row)

Later, additional testing would add an additional 1,010 hypervisors. These final 1,000 are what was repeatedly submitted to the Satellite for updates.

The overall update job completed twice as fast as the initial report, in about 2m.

    INFO org.candlepin.async.JobManager - Job "Hypervisor Update" completed in 29693ms

Additionally, I decompiled the ConsumerCurator.class file and found the updated query associated with the heartbeat update. That query matches the suggested changes.

    query = "UPDATE cp_consumer consumer SET lastcheckin = :checkin FROM cp_consumer_hypervisor hypervisor, cp_owner owner WHERE consumer.id = hypervisor.consumer_id AND hypervisor.reporter_id = :reporter AND consumer.owner_id = owner.id AND owner.account = :ownerKey";

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Important: Satellite 6.12 Release), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:8506

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days
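For anyone wanting to check the intended scoping of the heartbeat update without a Satellite at hand, here is a minimal sketch using Python's built-in sqlite3 module. It is not Candlepin code: the three tables carry only the columns named in the query above, the sample rows and the `build_sample_db` and `heartbeat` helpers are made up for illustration, and the statement is rewritten with subqueries (an assumption made for portability, since `UPDATE ... FROM` support in SQLite depends on the library version). The point is only that the update should touch hypervisors matching both the reporter id and the org, and nothing else.

```python
import sqlite3

# Hypothetical, reduced schema: only the columns referenced by the heartbeat query.
SCHEMA_AND_DATA = """
CREATE TABLE cp_owner (id TEXT PRIMARY KEY, account TEXT);
CREATE TABLE cp_consumer (id TEXT PRIMARY KEY, owner_id TEXT, lastcheckin TEXT);
CREATE TABLE cp_consumer_hypervisor (consumer_id TEXT, reporter_id TEXT);

INSERT INTO cp_owner VALUES ('o1', 'Default_Organization');
INSERT INTO cp_owner VALUES ('o2', 'Other_Org');

INSERT INTO cp_consumer VALUES ('c1', 'o1', 'old');  -- hypervisor, reporter-A, target org
INSERT INTO cp_consumer VALUES ('c2', 'o1', 'old');  -- hypervisor, reporter-B, target org
INSERT INTO cp_consumer VALUES ('c3', 'o1', 'old');  -- not a hypervisor, target org
INSERT INTO cp_consumer VALUES ('c4', 'o2', 'old');  -- hypervisor, reporter-A, other org

INSERT INTO cp_consumer_hypervisor VALUES ('c1', 'reporter-A');
INSERT INTO cp_consumer_hypervisor VALUES ('c2', 'reporter-B');
INSERT INTO cp_consumer_hypervisor VALUES ('c4', 'reporter-A');
"""

# Subquery form of the heartbeat update (assumption: same intent as the
# UPDATE ... FROM statement quoted above, restricted by reporter AND org).
HEARTBEAT_SQL = """
UPDATE cp_consumer
   SET lastcheckin = :checkin
 WHERE id IN (SELECT consumer_id
                FROM cp_consumer_hypervisor
               WHERE reporter_id = :reporter)
   AND owner_id IN (SELECT id FROM cp_owner WHERE account = :ownerKey)
"""


def build_sample_db():
    """Create an in-memory database with the illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA_AND_DATA)
    return conn


def heartbeat(conn, reporter, owner_key, checkin):
    """Run the heartbeat update and return the number of rows it changed."""
    cur = conn.execute(
        HEARTBEAT_SQL,
        {"checkin": checkin, "reporter": reporter, "ownerKey": owner_key},
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    db = build_sample_db()
    changed = heartbeat(db, "reporter-A", "Default_Organization", "new")
    print("rows updated:", changed)
    for row in db.execute("SELECT id, lastcheckin FROM cp_consumer ORDER BY id"):
        print(row)
```

With this sample data only `c1` should get the new lastcheckin value. A query that filtered on the org alone would also change `c2` and `c3`, which is the "updates all rows for the org" behaviour described in the comments above.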