Bug 1982970 - Fact updates causing unnecessary compliance recalculation in Candlepin
Summary: Fact updates causing unnecessary compliance recalculation in Candlepin
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Candlepin
Version: 6.9.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: 6.11.0
Assignee: satellite6-bugs
QA Contact: Lai
URL:
Whiteboard:
Depends On: 1991960 2044821 2060927
Blocks:
Reported: 2021-07-16 07:04 UTC by Hao Chang Yu
Modified: 2024-12-20 20:29 UTC (History)

Fixed In Version: candlepin-4.0.17-1, candlepin-4.1.12-1, candlepin-4.2.1-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1991960 2044821 2060927
Environment:
Last Closed: 2022-07-05 14:29:34 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 2044944 1 None None None 2022-02-22 13:15:31 UTC
Red Hat Bugzilla 2044946 1 None None None 2022-02-22 13:15:31 UTC
Red Hat Bugzilla 2059131 1 high CLOSED Fact updates sending unnecessary compliance created events 2022-04-19 12:11:58 UTC
Red Hat Bugzilla 2059135 1 high CLOSED Fact updates sending unnecessary compliance created events 2022-04-14 11:06:41 UTC
Red Hat Bugzilla 2059137 1 unspecified CLOSED Fact updates sending unnecessary compliance created events 2022-04-15 13:53:38 UTC
Red Hat Issue Tracker ENT-4790 0 None None None 2022-03-04 15:36:58 UTC
Red Hat Product Errata RHSA-2022:5498 0 None None None 2022-07-05 14:29:49 UTC

Internal Links: 2059131

Description Hao Chang Yu 2021-07-16 07:04:07 UTC
Description of problem:
When a consumer updates one or more facts, Candlepin recalculates its compliance. This is very expensive, especially when many consumers (such as 20k+) are registered to the Satellite and each of them updates some facts very frequently. It may also create many "compliance.created" events, which put a lot of pressure on the messaging broker and can eventually cause paging.

https://github.com/candlepin/candlepin/blob/master/src/main/java/org/candlepin/policy/js/compliance/hash/HashableStringGenerators.java#L216

Steps to Reproduce:
1. On the client, ensure the facts are updated

subscription-manager facts --update

2. Set any custom fact in /etc/rhsm/facts/
3. Stop the rhsmcertd service. We will trigger it manually later.

systemctl stop rhsmcertd

4. On Satellite, tail the candlepin audit log

tail -f /var/log/candlepin/audit.log

5. On the client, run rhsmcertd immediately.

rhsmcertd -n

6. Wait for 1 minute and then kill the rhsmcertd process

Actual results:
### Recalculated twice here ###
2021-07-16 16:42:12,531 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"reasons":[],"status":"valid"}
2021-07-16 16:42:13,970 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"reasons":[],"status":"valid"}

2021-07-16 16:42:13,977 principalType=trusteduser principal=foreman_admin target=SYSTEM_PURPOSE_COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"nonCompliantUsage":null,"compliantAddOns":{},"nonCompliantRole":null,"reasons":[],"compliantSLA":{},"nonCompliantAddOns":[],"compliantRole":{},"nonCompliantSLA":null,"compliantUsage":{},"status":"not specified"}

2021-07-16 16:42:13,983 principalType=trusteduser principal=foreman_admin target=CONSUMER entityId=8ac705087a3e1670017a3e1796520001 type=MODIFIED owner=8ac705086e2c97c4016e2c9863b60001 eventData=null

### Recalculate compliance again!! This is also a bug. It seems that "syspurpose compliance" and the "subscription compliance" are sharing the same "compliancestatushash" column in the cp_consumer table so after calculating the syspurpose compliance, Candlepin will replace the column with its digest. ###
2021-07-16 16:42:19,315 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"reasons":[],"status":"valid"} 
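The repeated recalculation can also be spotted mechanically rather than by eye. Below is a minimal sketch, assuming the audit.log line shape shown above; the helper names (parse_events, repeated_compliance) and the 10-second window are illustrative choices, not part of Candlepin or Satellite.

```python
import re
from datetime import datetime, timedelta

# Captures only the fields needed here from an audit.log line of the shape shown above.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r".*?\btarget=(?P<target>\S+) entityId=(?P<entity>\S+) type=(?P<type>\S+)"
)

# The three COMPLIANCE lines from the actual results above, copied verbatim.
SAMPLE = [
    '2021-07-16 16:42:12,531 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"reasons":[],"status":"valid"}',
    '2021-07-16 16:42:13,970 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"reasons":[],"status":"valid"}',
    '2021-07-16 16:42:19,315 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=f8e8f51a-c20e-4d42-aa6d-0f5d621b530d type=CREATED owner=8ac705086e2c97c4016e2c9863b60001 eventData={"reasons":[],"status":"valid"}',
]


def parse_events(lines):
    """Return (timestamp, target, entity_id, type) tuples for lines that match."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, match.group("target"), match.group("entity"), match.group("type")))
    return events


def repeated_compliance(events, window_seconds=10):
    """List COMPLIANCE CREATED events that follow another one for the same entity within the window."""
    window = timedelta(seconds=window_seconds)
    last_seen = {}
    repeats = []
    for ts, target, entity, event_type in events:
        if target != "COMPLIANCE" or event_type != "CREATED":
            continue
        previous = last_seen.get(entity)
        if previous is not None and ts - previous <= window:
            repeats.append((entity, previous, ts))
        last_seen[entity] = ts
    return repeats


if __name__ == "__main__":
    for entity, first, second in repeated_compliance(parse_events(SAMPLE)):
        print(f"{entity}: COMPLIANCE CREATED at {first} and again at {second}")
```

On the three sample lines this flags two repeats, one for each recalculation that follows an earlier one within the window. SYSTEM_PURPOSE_COMPLIANCE lines are skipped because the target must match COMPLIANCE exactly.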


Expected results:
In my opinion, compliance shouldn't be re-calculated on every fact update. Or maybe it should only be re-calculated when one of the facts that Candlepin cares about has changed.
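To illustrate the second option, here is a rough sketch that hashes only compliance-relevant facts and compares digests before recalculating. It is not Candlepin's implementation: the function names are hypothetical, and the fact names are simply the ones listed in the retest steps in comment 10, assumed here to be the relevant set.

```python
import hashlib
import json

# Assumption: these names come from the retest list in comment 10,
# not from Candlepin's source. The real set may differ.
COMPLIANCE_FACTS = frozenset({
    "cpu.core(s)_per_socket",
    "memory.memtotal",
    "uname.machine",
    "band.storage.usage",
    "cpu.cpu_socket(s)",
    "virt.is_guest",
})


def compliance_fact_digest(facts):
    """Hash only the facts assumed to affect compliance, in a stable key order."""
    relevant = {name: value for name, value in facts.items() if name in COMPLIANCE_FACTS}
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_recalculation(old_facts, new_facts):
    """True only when a compliance-relevant fact was added, removed, or changed."""
    return compliance_fact_digest(old_facts) != compliance_fact_digest(new_facts)


if __name__ == "__main__":
    before = {"uname.machine": "x86_64", "custom.fact": "a"}
    print(needs_recalculation(before, {"uname.machine": "x86_64", "custom.fact": "b"}))
    print(needs_recalculation(before, {"uname.machine": "bobby", "custom.fact": "a"}))
```

With this kind of check, a change to an unrelated custom fact leaves the digest untouched, while a change to something like uname.machine does not.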


Additional info:
Let me know if you need a separate bugzilla for the syspurpose compliance hash issue above.

Comment 2 Nikos Moumoulidis 2021-12-16 14:09:43 UTC
While this is something we are investigating and planning to fix, it looks to be more of a performance enhancement than a bug,
so I would not consider backporting to Satellite 6.9; fixing in 6.10+ seems more appropriate.

Comment 4 Nikos Moumoulidis 2021-12-21 15:52:13 UTC
(In reply to Hao Chang Yu from comment #0)
> ### Recalculate compliance again!! This is also a bug. It seems that
> "syspurpose compliance" and the "subscription compliance" are sharing the
> same "compliancestatushash" column in the cp_consumer table so after
> calculating the syspurpose compliance, Candlepin will replace the column
> with its digest. ###

Hi Hao,

You were right about this. There are actually 2 different columns for these, but only one of them (compliancestatushash) is used right now, shared by both, while the other is ignored. Can you please file a separate bug for that? That bug alone should be easy to fix, while the general
reduction of unnecessary compliance calculations, which needs a bit of a redesign effort, will be tracked in this bug.

Thanks,
Nikos

Comment 5 Nikos Moumoulidis 2022-01-25 11:27:11 UTC
(In reply to Hao Chang Yu from comment #0)
> ### Recalculate compliance again!! This is also a bug. It seems that
> "syspurpose compliance" and the "subscription compliance" are sharing the
> same "compliancestatushash" column in the cp_consumer table so after
> calculating the syspurpose compliance, Candlepin will replace the column
> with its digest. ###

FYI I have created https://bugzilla.redhat.com/show_bug.cgi?id=2044944 and https://bugzilla.redhat.com/show_bug.cgi?id=2044946 for fixing the hash column overwrite,
and we will use this Satellite bug for a longer term effort of reducing the compliance recalculations.
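For readers who want to see why a shared hash column causes repeated events, here is a toy simulation. It only assumes what is described above (two compliance types, one stored digest); the Consumer class, the column names other than compliancestatushash, and the change-detection logic are hypothetical and not taken from Candlepin.

```python
import hashlib


def digest(status):
    """Stable digest of a status dict with string keys and values."""
    return hashlib.sha256(repr(sorted(status.items())).encode("utf-8")).hexdigest()


class Consumer:
    """Toy model: emits an event whenever the stored digest differs from the new one."""

    def __init__(self, shared_column):
        self.shared_column = shared_column
        self.columns = {}   # column name -> last stored digest
        self.events = []    # compliance kinds for which an event was emitted

    def _column_for(self, kind):
        # Shared mode mimics both compliance types writing to compliancestatushash.
        return "compliancestatushash" if self.shared_column else f"{kind}_hash"

    def record(self, kind, status):
        column = self._column_for(kind)
        new_digest = digest(status)
        if self.columns.get(column) != new_digest:
            self.columns[column] = new_digest
            self.events.append(kind)


def simulate(shared_column):
    consumer = Consumer(shared_column)
    # Two check-ins in a row with unchanged statuses.
    for _ in range(2):
        consumer.record("compliance", {"status": "valid"})
        consumer.record("syspurpose", {"status": "not specified"})
    return consumer.events


if __name__ == "__main__":
    print("shared column:   ", simulate(shared_column=True))
    print("separate columns:", simulate(shared_column=False))
```

In the shared case each write clobbers the other type's digest, so every check looks like a change and all four calls emit an event. With separate columns only the first two calls do.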

Comment 6 Nikos Moumoulidis 2022-02-28 09:59:20 UTC
For fixing the sub-issue of unnecessary compliance.created event generation (but not the compliance recalculation itself), I have created the following:
https://bugzilla.redhat.com/show_bug.cgi?id=2059131
https://bugzilla.redhat.com/show_bug.cgi?id=2059135
https://bugzilla.redhat.com/show_bug.cgi?id=2059137

Comment 10 Lai 2022-06-07 08:45:41 UTC
Steps to Retest:
1. Get a client machine up and running and register it to Satellite (I used a capsule)
2. Enable rhsmcertd: systemctl enable rhsmcertd
3. Set a custom fact in /etc/rhsm/facts/capsule.fact using one of the following fact names so that "target=COMPLIANCE" can be triggered (I used uname.machine: echo '{"uname.machine": "bobby"}'):

cpu.core(s)_per_socket
memory.memtotal
uname.machine
band.storage.usage
cpu.cpu_socket(s)
virt.is_guest

4. On the client, ensure the facts are updated

subscription-manager facts --update

5. Stop the rhsmcertd service. We will trigger it manually later.

systemctl stop rhsmcertd

6. On Satellite, tail the candlepin audit log

tail -f /var/log/candlepin/audit.log

7. On the client, run rhsmcertd immediately.

rhsmcertd -n

8. Wait for 1 minute and then kill the rhsmcertd process
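The custom fact in step 3 is just a small JSON file. A minimal sketch of writing one is below; write_custom_fact is a hypothetical helper, the file name is a placeholder, and the demo writes to a temporary directory instead of /etc/rhsm/facts/ so it can be run without touching a real client.

```python
import json
import tempfile
from pathlib import Path


def write_custom_fact(facts_dir, file_name, facts):
    """Write a dict of fact name -> value as JSON and return the file path."""
    path = Path(facts_dir) / file_name
    path.write_text(json.dumps(facts) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    # Temporary directory stands in for /etc/rhsm/facts/ in this sketch.
    with tempfile.TemporaryDirectory() as facts_dir:
        fact_file = write_custom_fact(facts_dir, "capsule.fact", {"uname.machine": "bobby"})
        print(fact_file.read_text(encoding="utf-8"), end="")
```

On a real client the directory would be /etc/rhsm/facts/, and the remaining steps (subscription-manager facts --update, rhsmcertd -n) stay as listed above.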

Expected result:
There shouldn't be two compliance recalculations right after one another in the same timeframe.

Actual result:
There aren't two compliance recalculations right after one another in the same timeframe.

# tail -f /var/log/candlepin/audit.log
2022-06-07 04:27:20,431 principalType=trusteduser principal=foreman_admin target=SYSTEM_PURPOSE_COMPLIANCE entityId=ce3b20d2-81dc-4b85-a35e-769879a3f1b6 type=CREATED owner=8a818230812a075601812a0eee270001 eventData={"nonCompliantUsage":null,"compliantAddOns":{},"nonCompliantRole":null,"reasons":[],"nonCompliantServiceType":null,"compliantSLA":{},"nonCompliantAddOns":[],"compliantRole":{},"nonCompliantSLA":null,"compliantUsage":{},"status":"not specified","compliantServiceType":{}}
2022-06-07 04:27:20,439 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=ce3b20d2-81dc-4b85-a35e-769879a3f1b6 type=CREATED owner=8a818230812a075601812a0eee270001 eventData={"reasons":[],"status":"valid"}
2022-06-07 04:27:20,444 principalType=trusteduser principal=foreman_admin target=SYSTEM_PURPOSE_COMPLIANCE entityId=ce3b20d2-81dc-4b85-a35e-769879a3f1b6 type=CREATED owner=8a818230812a075601812a0eee270001 eventData={"nonCompliantUsage":null,"compliantAddOns":{},"nonCompliantRole":null,"reasons":[],"nonCompliantServiceType":null,"compliantSLA":{},"nonCompliantAddOns":[],"compliantRole":{},"nonCompliantSLA":null,"compliantUsage":{},"status":"not specified","compliantServiceType":{}}
2022-06-07 04:27:20,450 principalType=trusteduser principal=foreman_admin target=ENTITLEMENT entityId=037b97a6123d4ec3aa3ff047eb9dc7da type=CREATED owner=8a818230812a075601812a0eee270001 eventData=null
2022-06-07 04:29:41,419 principalType=trusteduser principal=foreman_admin target=CONSUMER entityId=8a81822d813d175401813d467b8c0e94 type=MODIFIED owner=8a818230812a075601812a0eee270001 eventData=null
2022-06-07 04:29:41,426 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=ce3b20d2-81dc-4b85-a35e-769879a3f1b6 type=CREATED owner=8a818230812a075601812a0eee270001 eventData={"reasons":[{"productName":"Red Hat Satellite Infrastructure Subscription","message":"Supports architecture aarch64,ia64,ppc,ppc64,ppc64le,s390,s390x,x86,x86_64 but the system is bobby machine."}],"status":"partial"}
2022-06-07 04:32:27,018 principalType=trusteduser principal=foreman_admin target=CONSUMER entityId=8a81822d813d175401813d467b8c0e94 type=MODIFIED owner=8a818230812a075601812a0eee270001 eventData=null
2022-06-07 04:32:27,037 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=ce3b20d2-81dc-4b85-a35e-769879a3f1b6 type=CREATED owner=8a818230812a075601812a0eee270001 eventData={"reasons":[{"productName":"Red Hat Satellite Infrastructure Subscription","message":"Supports architecture aarch64,ia64,ppc,ppc64,ppc64le,s390,s390x,x86,x86_64 but the system is some kind of name."}],"status":"partial"}
2022-06-07 04:40:13,635 principalType=trusteduser principal=foreman_admin target=CONSUMER entityId=8a81822d813d175401813d467b8c0e94 type=MODIFIED owner=8a818230812a075601812a0eee270001 eventData=null
2022-06-07 04:40:13,642 principalType=trusteduser principal=foreman_admin target=COMPLIANCE entityId=ce3b20d2-81dc-4b85-a35e-769879a3f1b6 type=CREATED owner=8a818230812a075601812a0eee270001 eventData={"reasons":[{"productName":"Red Hat Satellite Infrastructure Subscription","message":"Supports architecture aarch64,ia64,ppc,ppc64,ppc64le,s390,s390x,x86,x86_64 but the system is a name worth naming."}],"status":"partial"}

There are a couple of `target=COMPLIANCE` entries, but note that at 04:40:13 there is only one, and it comes from the most recent change. The other compliance entries are from past setup.

Verified on 6.11 snap 23 with candlepin-4.1.13-1.el8sat.noarch on rhel7 and rhel8

Comment 13 errata-xmlrpc 2022-07-05 14:29:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498

