Description of problem:
I think there may be a regression in the following commit.
https://github.com/Katello/katello/commit/81530a06de177a78275b229d0ec491579ce016f4#diff-bf897becee6d218f2e9b589c5f66dcfdR21
The transaction can be huge and, I guess, takes a long time to commit if there are many hosts with thousands of guests to update. It seems that during the commit, most of the rows in the katello_subscription_facet table are locked due to the following line. If I comment out this line in my reproducer, the "/rhsm/consumers/<uuid>/certificates/serials" requests are not blocked while the hypervisor update is running.
https://github.com/Katello/katello/blob/master/app/models/katello/host/subscription_facet.rb#L131
To minimize the performance issue, I think we may need to move the transaction inside the per-host loop or remove the transaction completely.
For example:
@hosts.each do |uuid, host|
  ActiveRecord::Base.transaction do
    update_subscription_facet(uuid, host)
  end
end
How reproducible:
I used a crude way to reproduce the issue, so it might not accurately reflect a real environment.
1. I modified the code to run the update 100 times within the transaction:
ActiveRecord::Base.transaction do
  100.times do
    @hosts.each do |uuid, host|
      update_subscription_facet(uuid, host)
    end
  end
end
2. Then trigger "virt-who -do".
3. On the Satellite, run the following to capture the passenger requests:
watch passenger-status --show=requests
4. On a content host, run the following request repeatedly until it is blocked:
curl -k --cert /etc/pki/consumer/cert.pem --key /etc/pki/consumer/key.pem https://my_satellite_fqdn/rhsm/consumers/<uuid>/certificates/serials
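Step 4 can be automated with a small loop. The following is a minimal sketch, not part of the original reproducer: the function name probe_until_blocked is hypothetical, and the 10-second --max-time threshold is an assumed cutoff for treating a request as blocked. The host name and <uuid> are the same placeholders used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original report): repeat a command
# until one attempt fails or times out, then report which attempt it was.
# Usage: probe_until_blocked MAX_ATTEMPTS COMMAND [ARGS...]
probe_until_blocked() {
  local max_attempts=$1
  shift
  local i
  for ((i = 1; i <= max_attempts; i++)); do
    # A non-zero exit status covers both errors and timeouts
    # (curl exits with 28 when --max-time is exceeded).
    if ! "$@" > /dev/null 2>&1; then
      echo "attempt $i failed or timed out"
      return 1
    fi
  done
  echo "all $max_attempts attempts completed"
  return 0
}

# Example invocation; the 10-second limit is an assumption, and the
# host name and <uuid> are placeholders to replace:
# probe_until_blocked 1000 curl -k --silent --max-time 10 \
#   --cert /etc/pki/consumer/cert.pem --key /etc/pki/consumer/key.pem \
#   "https://my_satellite_fqdn/rhsm/consumers/<uuid>/certificates/serials"
```

When the loop stops early, the passenger-status output from step 3 should show the stuck request at about the same time.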
Actual results:
The RHSM certificate check request is stuck:
passenger-status --show=requests
Version : 4.0.18
Date : 2019-09-30 14:34:34 +1000
Instance: 30428
1 clients:
Client 19:
host = my_satellite.com
uri = /rhsm/consumers/a40cc335-8ba9-481c-8d10-59bc5420601a/certificates/serials
connected at = 2019-09-30 14:33:50 (43 sec ago)
state = FORWARDING_BODY_TO_APP
Expected results:
RHSM certificate check requests should be processed more quickly.
Hotfix RPM is available for Satellite 6.6.1. To install it:
1. Take a snapshot or complete backup of Satellite server
2. Download the attached hotfix RPM and copy it to Satellite server
3. # satellite-maintain packages unlock
4. # yum install tfm-rubygem-katello-3.12.0.30-2.HOTFIXRHBZ1756955.el7sat.noarch.rpm
5. # satellite-maintain packages lock
6. # systemctl restart httpd
Verified in Satellite 6.7 Snap 10
I approximately followed the reproducer steps found in the original bug.
After performing the setup modifications, I looped the cert check 1000 times.
Each request completed without any issues, with an average runtime of 0.35s.
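A timing loop like the one described could be sketched as follows. This is an illustration only, not the script used for verification: the function name average_runtime is hypothetical, and it assumes GNU date (for the %N nanosecond format) and awk are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the verifier's actual script): run a command
# COUNT times and print the average wall-clock runtime in seconds.
# Assumes GNU date, which supports %N (nanoseconds).
# Usage: average_runtime COUNT COMMAND [ARGS...]
average_runtime() {
  local count=$1
  shift
  if ((count <= 0)); then
    echo "count must be positive" >&2
    return 1
  fi
  local start end total_ns=0 i
  for ((i = 0; i < count; i++)); do
    start=$(date +%s%N)
    "$@" > /dev/null 2>&1
    end=$(date +%s%N)
    total_ns=$((total_ns + end - start))
  done
  # Convert accumulated nanoseconds to average seconds per run.
  awk -v total="$total_ns" -v n="$count" \
    'BEGIN { printf "%.2f\n", total / n / 1e9 }'
}

# Example invocation; host name and <uuid> are placeholders to replace:
# average_runtime 1000 curl -k --silent \
#   --cert /etc/pki/consumer/cert.pem --key /etc/pki/consumer/key.pem \
#   "https://my_satellite_fqdn/rhsm/consumers/<uuid>/certificates/serials"
```

The measured time includes process startup for each curl call, so it slightly overstates the server-side response time.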
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2020:1454