1972501 – After promoting the content view, Candlepin failed to mark the entitlement certificates as dirty

Bug 1972501 - After promoting the content view, Candlepin failed to mark the entitlement certificates as dirty

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Satellite
Classification:	Red Hat
Component:	Candlepin
Sub Component:
Version:	6.9.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	6.11.0
Assignee:	satellite6-bugs
QA Contact:	Imaan
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	2026504
Depends On:	1973257 2016418
Blocks:

Reported:	2021-06-16 04:52 UTC by Hao Chang Yu
Modified:	2024-10-01 18:38 UTC
CC List:	12 users
Fixed In Version:	candlepin-3.1.28-2, candlepin-4.1.8-1
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1973257 2016418
Environment:
Last Closed:	2022-07-05 14:29:32 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Knowledge Base (Solution)	6956231	0	None	None	None	2022-07-09 06:53:37 UTC
Red Hat Product Errata	RHSA-2022:5498	0	None	None	None	2022-07-05 14:29:45 UTC

Internal Links: 2059660

Description Hao Chang Yu 2021-06-16 04:52:11 UTC

Description of problem:
There are 2 issues here.
- When many entitlements are attached to the environment and many new content_ids are added, Candlepin takes a very long time (45+ minutes) to run the "RegenEnvEntitlementCertsJob".
- "RegenEnvEntitlementCertsJob" might fail if one or more of the affected entitlements are revoked while it runs. For example, a guest migration triggers entitlement revocation.

Steps to Reproduce:
1. Create a new content view. Attach 1 repo to it and publish the 1.0 version.
2. Register 1000 hosts to the content view. Attach 10 or more subscriptions to the hosts. You can create multiple custom products to attach, but one subscription must be the virtual subscription (guest of <hypervisor> pool).
3. Attach 3 or more new repositories from the same product (for example rhel7 server, optional, extras, satellite tools) to the content view and publish the 2.0 version.
4. Tail /var/log/candlepin/candlepin.log. You should see the following:

[thread=Thread-502 (ActiveMQ-client-global-threads)] [job=xxxxx, job_key=RegenEnvEntitlementCertsJob, org=redhat, csid=] INFO  org.candlepin.async.JobManager - Starting job "Regenerate Environment Entitlement Certificates" using class: org.candlepin.async.tasks.RegenEnvEntitlementCertsJob
[thread=Thread-502 (ActiveMQ-client-global-threads)] [job=xxxxx, job_key=RegenEnvEntitlementCertsJob, org=redhat, csid=] INFO  org.candlepin.controller.EntitlementCertificateGenerator - Regenerating relevant certificates in environment: xxxxxxxxxxxxxxxxxxxxx

5. Wait for about 2 to 3 minutes, then use the virt-who fake report to simulate the VM migrations.
6. Then run the following curl command to simulate the client check-in:

curl -v -k -u <admin>:<pass> https://satellite.example.com/rhsm/consumers/<subscription uuid of the migrated VM>/certificates/serials

7. You should see the following entitlement revocation and auto healing in candlepin.log:

[req=xxxxxxxx, org=, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=GET, uri=/candlepin/consumers/<uuid>/certificates/serials
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Batch revoking 1 entitlements
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Starting batch delete of pools
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Starting batch delete of entitlements
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Starting delete flush
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - All deletes flushed successfully
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - Recomputing status for 1 consumers.
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.CandlepinPoolManager - All statuses recomputed.
[req=xxxxxxxx, org=my_org, csid=] INFO  org.candlepin.controller.Entitler - Attempting to heal host machine with UUID "<uuid>" for guest with UUID "<uuid>"
[req=xxxxxxxx, org=redhat, csid=] INFO  org.candlepin.policy.js.autobind.AutobindRules - Rules did not select a pool for products: [] and consumer installed products: []
<snip>

8. If you don't want to do steps 6 and 7, I think you can also remove the affected entitlements manually from the VMs via the Satellite web UI.
9. After a while, the RegenEnvEntitlementCertsJob will fail with the following error:

[thread=QuartzScheduler_Worker-1] [job=regen_entitlement_cert_of_envXXXX-XXXX-XXXX-XXXX-XXXXXXXX, org=, csid=] INFO  org.candlepin.controller.EntitlementCertificateGenerator - Found 1000 certificates to regenerate.
[thread=QuartzScheduler_Worker-1] [job=regen_entitlement_cert_of_envXXXX-XXXX-XXXX-XXXX-XXXXXXXX, org=, csid=] ERROR org.hibernate.internal.ExceptionMapperStandardImpl - HHH000346: Error during managed flush [Row was updated or deleted by another transaction (or unsaved-value mapping was incorrect) : [org.candlepin.model.Entitlement#XXXXXXXXXXXXXXXXXXXXXXXXX]]  <================
...
[thread=QuartzScheduler_Worker-1] [job=regen_entitlement_cert_of_envXXXX-XXXX-XXXX-XXXX-XXXXXXXX, org=, csid=] ERROR org.candlepin.pinsetter.tasks.KingpinJob - Job: org.candlepin.pinsetter.tasks.RegenEnvEntitlementCertsJob encountered a problem.
...
[thread=QuartzScheduler_Worker-1] [job=regen_entitlement_cert_of_envXXXX-XXXX-XXXX-XXXX-XXXXXXXX, org=, csid=] INFO  org.candlepin.pinsetter.tasks.KingpinJob - Job completed: time=2748452  <=========== 45 minutes
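
The HHH000346 error above is what Hibernate logs when a flush touches a row that another transaction has already updated or deleted. Here the job is still holding the revoked entitlement in its list of 1000 certificates. A minimal sketch of one way a batch could tolerate this is shown below. It is only an illustration, not the change shipped in the fixed Candlepin versions: the in-memory map is a hypothetical stand-in for the entitlement table, and `markDirtySkippingRevoked` is an invented name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sketch only; nothing here is a Candlepin or Hibernate API. */
final class StaleEntitlementSketch {

    /**
     * Marks every entitlement in the snapshot as dirty, but only if it still exists.
     * The store maps entitlement id to its dirty flag (hypothetical stand-in for the DB).
     * Returns the ids that were skipped because they had been revoked in the meantime.
     */
    static List<String> markDirtySkippingRevoked(Map<String, Boolean> store, List<String> snapshot) {
        List<String> skipped = new ArrayList<>();
        for (String id : snapshot) {
            // computeIfPresent updates only existing keys and returns null for missing ones,
            // so a revoked entitlement is skipped instead of failing the whole batch.
            if (store.computeIfPresent(id, (key, dirty) -> Boolean.TRUE) == null) {
                skipped.add(id);
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        Map<String, Boolean> table = new ConcurrentHashMap<>();
        List<String> snapshot = List.of("e1", "e2", "e3"); // listed when the job starts
        for (String id : snapshot) {
            table.put(id, Boolean.FALSE);
        }
        table.remove("e2"); // simulates the revocation caused by the guest migration

        List<String> skipped = markDirtySkippingRevoked(table, snapshot);
        // prints skipped=[e2] dirty={e1=true, e3=true}
        System.out.println("skipped=" + skipped + " dirty=" + new TreeMap<>(table));
    }
}
```

The point of the sketch is that the remaining entitlements still get marked dirty even though one of them disappeared after the list was taken.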


Actual results:
Failed to mark entitlement as dirty. Clients are unable to see and enable new repositories.


Expected results:
No error. Clients can see and enable new repositories.

Additional info:
In my opinion, the slowness has the following causes:
- A large number of entitlements are attached to the environment. For example, each host has 10+ entitlements attached.
- Multiple new contents/repositories are added to the content view.

# src/main/java/org/candlepin/controller/EntitlementCertificateGenerator.java
    public void regenerateCertificatesOf(String environmentId, Collection<String> contentIds,
        boolean lazy) {

        log.info("Regenerating relevant certificates in environment: {}", environmentId);

        Set<Entitlement> entsToRegen = new HashSet<>();

        entLoop: for (Entitlement entitlement : this.entitlementCurator.listByEnvironment(environmentId)) { // <=======
            // Impl note:
            // Since the entitlements came from the DB, we should be safe to traverse the graph as
            // necessary without any sanity checks (so long as our model's restrictions aren't
            // broken).

            for (String contentId : contentIds) {   // <======== Every entitlement loops over all contentIds here, which doesn't seem efficient
                if (entitlement.getPool().getProduct().hasContent(contentId)) {
                    entsToRegen.add(entitlement);
                    continue entLoop;
                }
                Collection<Product> providedProducts = entitlement.getPool().getProduct()
                    .getProvidedProducts();
                for (Product provided : providedProducts) {
                    if (provided.hasContent(contentId)) {
                        entsToRegen.add(entitlement);
                        continue entLoop;
                    }
                }
            }
        }
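
One way to cut down the repeated work in the loop above is to put the new content IDs in a hash set and to evaluate each distinct product only once, since many entitlements in the environment share the same pool product. The following is a minimal sketch under stated assumptions, not the fix that was shipped: `Product` and `Entitlement` are simplified stand-ins for the Candlepin model classes, and `carriesAny` and `selectAffected` are invented names.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; these types are simplified stand-ins, not Candlepin classes. */
final class RegenSelectionSketch {

    /** Hypothetical minimal product: an id, its own content ids, and its provided products. */
    static final class Product {
        final String id;
        final Set<String> contentIds;
        final List<Product> provided;

        Product(String id, Set<String> contentIds, List<Product> provided) {
            this.id = id;
            this.contentIds = contentIds;
            this.provided = provided;
        }
    }

    /** Hypothetical minimal entitlement: an id and the product of its pool. */
    static final class Entitlement {
        final String id;
        final Product poolProduct;

        Entitlement(String id, Product poolProduct) {
            this.id = id;
            this.poolProduct = poolProduct;
        }
    }

    /** True if the product, or one of its provided products, carries any changed content id. */
    static boolean carriesAny(Product product, Set<String> changed) {
        if (!Collections.disjoint(product.contentIds, changed)) {
            return true;
        }
        for (Product p : product.provided) {
            if (!Collections.disjoint(p.contentIds, changed)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the ids of entitlements to regenerate, checking each distinct product once. */
    static Set<String> selectAffected(Collection<Entitlement> ents, Collection<String> contentIds) {
        Set<String> changed = new HashSet<>(contentIds);
        Map<String, Boolean> verdictByProductId = new HashMap<>(); // memoized per product
        Set<String> toRegen = new TreeSet<>();
        for (Entitlement ent : ents) {
            Product product = ent.poolProduct;
            boolean affected = verdictByProductId
                .computeIfAbsent(product.id, key -> carriesAny(product, changed));
            if (affected) {
                toRegen.add(ent.id);
            }
        }
        return toRegen;
    }

    public static void main(String[] args) {
        Product extras = new Product("extras", Set.of("c-new"), List.of());
        Product rhel = new Product("rhel", Set.of("c-old"), List.of(extras));
        Product custom = new Product("custom", Set.of("c-other"), List.of());
        List<Entitlement> ents = List.of(
            new Entitlement("e1", rhel), new Entitlement("e2", rhel), new Entitlement("e3", custom));
        // prints [e1, e2]
        System.out.println(selectAffected(ents, List.of("c-new", "c-extra")));
    }
}
```

With 1000 hosts holding 10+ entitlements each but only a handful of distinct products, the per-product cache means the content check runs a few times rather than once per entitlement. Like the original loop, the sketch only looks one level deep into provided products.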

Comment 3 Nikos Moumoulidis 2021-12-15 08:44:13 UTC

*** Bug 2026504 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2022-07-05 14:29:32 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Satellite 6.11 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5498
