Description of problem:
I think there may be a regression in the following commit.
The transaction can be huge and take a long time to commit (I guess) if there are many hosts with thousands of guests to update. It seems that during the commit, most of the rows in the katello_subscription_facet table are locked due to the following line. If I comment out this line in my reproducer, the "/rhsm/consumers/<uuid>/certificates/serials" requests are no longer blocked while the hypervisor update is running.
To minimize the performance issue, I think we may need to move the transaction inside the per-host loop or remove the transaction completely.
@hosts.each do |uuid, host|
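To illustrate the suggestion, here is a minimal sketch of the two transaction scopes. It does not use the real Katello code or ActiveRecord: `FakeDb`, `update_row`, `update_in_one_transaction` and `update_per_host` are hypothetical names, and the fake transaction only counts how many row updates are grouped into each commit. The assumption is that rows touched inside a transaction stay locked until that transaction commits, so smaller transactions should release their locks sooner.

```ruby
# Hypothetical stand-in for a database connection. It only records how many
# row updates were made inside each transaction (i.e. how many rows would
# stay locked until that transaction commits).
class FakeDb
  attr_reader :transaction_sizes

  def initialize
    @transaction_sizes = []
    @open_rows = nil
  end

  def transaction
    @open_rows = 0
    yield
    @transaction_sizes << @open_rows # "commit": locks released here
  ensure
    @open_rows = nil
  end

  def update_row(_uuid)
    @open_rows += 1 if @open_rows
  end
end

# Current shape (as I understand it): one transaction around all hosts.
def update_in_one_transaction(db, hosts)
  db.transaction do
    hosts.each do |uuid, guest_count|
      guest_count.times { db.update_row(uuid) }
    end
  end
end

# Suggested shape: one short transaction per host.
def update_per_host(db, hosts)
  hosts.each do |uuid, guest_count|
    db.transaction do
      guest_count.times { db.update_row(uuid) }
    end
  end
end
```

With the outer transaction, every updated row belongs to a single commit. With the per-host variant, each commit covers only that host's rows, at the cost of losing all-or-nothing behaviour across hosts.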
I used a crude way to reproduce the issue, so it might not accurately reflect a real environment.
1. I modified the code to run the update 100 times within the transaction.
@hosts.each do |uuid, host|
2. Then trigger "virt-who -do".
3. On the Satellite, run the following to capture the passenger requests
watch passenger-status --show=requests
4. On a content host, run the following request repeatedly until it is blocked.
curl -k --cert /etc/pki/consumer/cert.pem --key /etc/pki/consumer/key.pem https://my_satellite_fqdn/rhsm/consumers/<uuid>/certificates/serials
The RHSM cert check request is stuck:
Version : 4.0.18
Date : 2019-09-30 14:34:34 +1000
host = my_satellite.com
uri = /rhsm/consumers/a40cc335-8ba9-481c-8d10-59bc5420601a/certificates/serials
connected at = 2019-09-30 14:33:50 (43 sec ago)
state = FORWARDING_BODY_TO_APP
RHSM cert check requests should be processed more quickly.
*** This bug has been marked as a duplicate of bug 1600201 ***
Moving to the subscription management component; as per comment 10, this BZ is being used to track a Katello-side issue.
Moving this bug to POST for triage into Satellite 6 since the upstream issue https://projects.theforeman.org/issues/27974 has been resolved.
Created attachment 1652270
hotfix RPM for Satellite 6.6
Hotfix RPM is available for Satellite 6.6.1. To install it:
1. Take a snapshot or complete backup of the Satellite server
2. Download the attached hotfix RPM and copy it to the Satellite server
3. # satellite-maintain packages unlock
4. # yum install tfm-rubygem-katello-18.104.22.168-2.HOTFIXRHBZ1756955.el7sat.noarch.rpm
5. # satellite-maintain packages lock
6. # systemctl restart httpd
Verified in Satellite 6.7 Snap 10
Approximately followed the reproducer steps found in the original bug.
After performing the setup modifications, I looped the cert check 1000 times.
Each completed without any issues, with an average runtime of 0.35s.
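For anyone who wants to repeat this check, a loop like the following could be used. This is only a sketch, not the exact script used for verification: `cert_check_command` and `average_runtime` are hypothetical helper names, and the curl arguments are taken from the reproducer in the description, with `--silent` and `--output /dev/null` added to keep the loop quiet.

```ruby
require "benchmark"

# Build the curl argv from the reproducer. fqdn and uuid are placeholders
# that must be replaced with the Satellite FQDN and the consumer UUID.
def cert_check_command(fqdn, uuid)
  [
    "curl", "-k", "--silent", "--output", "/dev/null",
    "--cert", "/etc/pki/consumer/cert.pem",
    "--key", "/etc/pki/consumer/key.pem",
    "https://#{fqdn}/rhsm/consumers/#{uuid}/certificates/serials"
  ]
end

# Run the given block `iterations` times and return the mean wall-clock
# time per call, in seconds.
def average_runtime(iterations)
  total = 0.0
  iterations.times { total += Benchmark.realtime { yield } }
  total / iterations
end

# Example usage on a content host (not executed here):
#   cmd = cert_check_command("my_satellite_fqdn", "<uuid>")
#   puts average_runtime(1000) { system(*cmd) }
```

Running this while a hypervisor update is in progress should show whether individual requests still stall.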
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.