Bug 1931027

Summary: Entitlement certificate is missing content section for a custom product
Product: Red Hat Satellite
Reporter: Alexey Masolov <amasolov>
Component: Candlepin
Assignee: Chris Roberts <chrobert>
Status: CLOSED ERRATA
QA Contact: Vladimír Sedmík <vsedmik>
Severity: high
Priority: high
Version: 6.8.0
CC: ahumbe, ajambhul, akapse, alsouza, anrussel, aperotti, archvilen, avnkumar, aymeric.marchal, bbuckingham, bcourt, benjamin.hunt, bnerickson87, bshahu, caitslin, casl, cdonnell, chrobert, crog, dsynk, elepape, francesco.trentini, fratto, goetz.dirk, gscarbor, hakon.gislason, jalviso, jan.vanmullem, jason.grantz, jbhatia, jlenz, JONATHAN.SATTELBERGER, jpasqual, jrichards2, juholmes, karnsing, kkinge, marco.verschuur, matthias.zoeschg, mhjacks, mjia, momran, msunil, nicolas.marcotte, nmoumoul, onerleka, osousa, paji, pcreech, pdudley, pmendezh, pmoravec, rcavalca, redakkan, redhatbugs, risantam, rrajput, sadas, saydas, sfroemer, shughes, smajumda, swachira, tharring, thomas.zajic, timo.alatalo, vcojot, wpoteat
Target Milestone: 6.13.0
Keywords: PrioBumpGSS, Triaged
Target Release: Unused
Hardware: Unspecified
OS: Unspecified
URL: https://projects.theforeman.org/issues/35599
Fixed In Version: candlepin-4.0.20-1, candlepin-4.1.19-1, candlepin-4.2.11-1
Clone Of:
: 1931913 1931923 2142944 2150116 2166748
Last Closed: 2023-05-03 13:20:33 UTC
Type: Bug
Bug Depends On: 1931913, 1931923    
Bug Blocks: 2142944    
Attachments:
Description | Flags
Candlepin Log | none
candlepin standalone reproducer | none
improved standalone candlepin reproducer | none
reproducer with manual client | none

Description Alexey Masolov 2021-02-20 06:55:22 UTC
Description of problem:
Some custom repositories are not available to clients. Their entitlement certificates are missing the Content section.

=== working cert ===
+-------------------------------------------+
	Entitlement Certificate
+-------------------------------------------+

Certificate:
	Path: /etc/pki/entitlement/2995025920494102558.pem
	Version: 3.4
	Serial: 2995025920494102558
	Start Date: 2020-04-23 15:06:04+00:00
	End Date: 2049-12-01 00:00:00+00:00
	Pool ID: 8a13135d7186020a0171a7930ed93e2f
...
Product:
	ID: 146332194096
	Name: PostgreSQL-10
	Version:
	Arch: ALL
	Tags:
	Brand Type:
	Brand Name:
...
Authorized Content URLs:
	/ORG/DEV/CCV_RHEL7_7/custom/PostgreSQL-10/PostgreSQL-10-RHEL7

Content:
	Type: yum
	Name: PostgreSQL-10-RHEL7
	Label: ORG_PostgreSQL-10_PostgreSQL-10-RHEL7
	Vendor: Custom
	URL: /ORG/DEV/CCV_RHEL7_7/custom/PostgreSQL-10/PostgreSQL-10-RHEL7
	GPG: ../../katello/api/v2/repositories/6044/gpg_key_content
	Enabled: True
	Expires: 1
	Required Tags:
	Arches: ALL

=== broken cert ===
+-------------------------------------------+
	Entitlement Certificate
+-------------------------------------------+

Certificate:
	Path: /etc/pki/entitlement/3202744398838568589.pem
	Version: 3.4
	Serial: 3202744398838568589
	Start Date: 2020-04-23 15:21:31+00:00
	End Date: 2049-12-01 00:00:00+00:00
	Pool ID: 8a13135d7186020a0171a7a132aa3fc8
...
Product:
	ID: 755521270373
	Name: PostgreSQL-11
	Version:
	Arch: ALL
	Tags:
	Brand Type:
	Brand Name:

Order:
	Name: PostgreSQL-11
	Number:
	SKU: 755521270373
	Contract:
	Account:
	Service Type:
	Roles:
	Service Level:
	Usage:
	Add-ons:
	Quantity: Unlimited
	Quantity Used: 1
	Socket Limit:
	RAM Limit:
	Core Limit:
	Virt Only: False
	Stacking ID:
	Warning Period: 0
	Provides Management: False
===

Further investigation indicated missing data in cp2_product_content. 
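
A quick way to tell the two cases apart on a client is to check whether the certificate dump contains a top-level "Content:" block. The following is only a sketch: it assumes the dump has the layout shown above, and the function name cert_has_content is hypothetical. It reads the dump from standard input.

```shell
# Hypothetical helper: succeed (exit 0) if the certificate dump on stdin
# contains a top-level "Content:" section header, fail (exit 1) otherwise.
# Assumes the header sits alone at the start of a line, as in the dumps above,
# so "Authorized Content URLs:" does not count as a match.
cert_has_content() {
    grep -q '^Content:[[:space:]]*$'
}

# Example with an inline dump abbreviated from the broken cert above.
if printf 'Product:\n\tID: 755521270373\n\tName: PostgreSQL-11\n' | cert_has_content; then
    echo "content section present"
else
    echo "content section missing"
fi
```

If the dumps above were produced with rct cat-cert (an assumption, the report does not say), the output of that command for a given .pem file could be piped into the helper in the same way.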

Version-Release number of selected component (if applicable):
Satellite 6.8.3

How reproducible:
We do have a reproducer based on customer data.

Actual results:
Custom content is not available for clients.

Expected results:
Custom content should be available for clients.

Comment 12 Nikos Moumoulidis 2021-02-23 15:07:16 UTC
*** Bug 1928837 has been marked as a duplicate of this bug. ***

Comment 14 Chris "Ceiu" Rog 2021-02-23 16:51:37 UTC
For those of you still hitting this issue, I would like the following, with as much detail as your organization permits:

- What steps/commands are used to create the custom products and repos (content)? Does the custom product specify an architecture?

- What steps/commands are the clients using to register? Are they registering using an environment, or an activation key? Do the client facts line up with those specified by the affected product(s) (i.e. arch)?

- If the affected clients are registering to an environment, what content (repos) have been promoted to that environment? If using an activation key, what content overrides are specified (if any), and what products are attached to the key?



If one or more terms are not clear, by all means, feel free to ask. The terminology deviates between each layer, and it's sometimes hard to clearly make the distinction between what Candlepin uses internally, and what client tooling uses externally.

Comment 21 Dirk Götz 2021-02-24 09:18:38 UTC
I have no access to the environment at the moment, but I will answer your questions from what I remember.

- What steps/commands are used to create the custom products and repos (content)? Does the custom product specify an architecture?

The customer was hit by this bug when using the foreman_scc_manager plugin to add SLES repositories. Because of the structure of SUSE repositories and the way the plugin handles repositories, it was decided to limit it to architecture x86_64.

- What steps/commands are the clients using to register? Are they registering using an environment, or an activation key? Do the client facts line up with those specified by the affected product(s) (i.e. arch)?

The systems were registered using subscription-manager both on already installed systems and during provisioning of new ones. In both cases an activation key was used. The architecture was correctly reported as x86_64 from what I was told.

- If the affected clients are registering to an environment, what content (repos) have been promoted to that environment? If using an activation key, what content overrides are specified (if any), and what products are attached to the key?

A content view and a composite content view were used to stage the content and promote it to all lifecycle environments; all of this looked fine when I verified it. The key had no overrides and all products were attached, but an override for repo_gpgcheck = 0 was applied by a script on the systems to help with https://bugzilla.redhat.com/show_bug.cgi?id=1858231.

I hope this helps!

Comment 24 caitslin 2021-02-25 16:14:38 UTC
- What steps/commands are used to create the custom products and repos (content)? Does the custom product specify an architecture?

To create a Product:

$ hammer product create --organization "My Awesome Org" \
    --name "Icinga2" \
    --description "Icinga2 Client and Server Packages" \
    --sync-plan "Weekly"

To create a Product Repository:

$ hammer repository create --organization "My Awesome Org" \
    --content-type yum \
    --download-policy immediate \
    --http-proxy-policy global_default_http_proxy \
    --publish-via-http false \
    --product "Icinga2" \
    --name "EL8 Client" \
    --url "https://packages.icinga.com/epel/8Client/release" \
    --ignorable-content "drpm,srpm,distribution"

No architecture restrictions are set when creating Product Repos. The running configuration shows the "Restrict to architecture" setting as "Default".

- What steps/commands are the clients using to register? Are they registering using an environment, or an activation key? Do the client facts line up with those specified by the affected product(s) (i.e. arch)?

Content Hosts register with an Activation Key:

$ subscription-manager register --org="My_Awesome_Org" --activationkey="rhel8"

I have confirmed that client facts (namely, architecture) do match. I compared the output of "subscription-manager facts" against the data reported during client host registration.

- If the affected clients are registering to an environment, what content (repos) have been promoted to that environment? If using an activation key, what content overrides are specified (if any), and what products are attached to the key?

Products attached to "rhel8" Activation Key:

Custom: CrowdStrike, Elastic Stack, EPEL, Icinga2, OCS Inventory
Red Hat: RHEL Academic Site Subscription

Regarding Content Overrides, if a Product provides multiple repositories (EL6, EL7, EL8, etc.), a RHEL 8 Activation Key will have the EL{6,7} repos disabled.

For what it's worth, Product attachments may vary depending on the function of the client host. All client hosts will be attached to the above Custom Products, but could also have an attachment to other Custom Products (Docker, or PostgreSQL, for example) that are not explicitly provided by the Activation Key. These are enabled by running:

$ subscription-manager attach --pool=POOL-ID

I'm not sure if this sheds light onto anything or muddles the issue. Nonetheless, I wanted to provide as much data as possible.

Comment 25 Chris "Ceiu" Rog 2021-02-25 21:12:34 UTC
If you look directly in the Candlepin database, what are the results of the following queries (replacing <PRODUCT_NAME> with the name of an affected product)?

[1]
SELECT product.uuid AS "Product UUID", product.name AS "Product Name", product.product_id, content.* FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON content.uuid = pc.content_uuid WHERE product.name LIKE '%<PRODUCT_NAME>%' ORDER BY product.name ASC;

[2]
SELECT ak.name AS "AK Name", akp.product_uuid, product.product_id, pc.content_uuid, content.content_id, content.name FROM cp_activation_key ak JOIN cp2_activation_key_products akp ON akp.key_id = ak.id JOIN cp2_products product ON product.uuid = akp.product_uuid LEFT JOIN (cp2_product_content pc JOIN cp2_content content ON pc.content_uuid = content.uuid) ON pc.product_uuid = product.uuid WHERE product.name LIKE '%<PRODUCT_NAME>%' ORDER BY ak.name ASC;


Also, I'm assuming from your answer that you are not, but are you using environments at all with the affected clients?

Comment 26 caitslin 2021-02-26 15:06:51 UTC
Here is the output from the Candlepin database:

candlepin=# SELECT product.uuid AS "Product UUID", product.name AS "Product Name", product.product_id, content.* FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON content.uuid = pc.content_uuid WHERE product.name LIKE '%Icinga2%' ORDER BY product.name ASC;
           Product UUID           | Product Name |  product_id  |               uuid               |  content_id   |          created           |          updated           |         contenturl         |                        gpgurl                        |                 label                 | metadataexpire |    name    | releasever | requiredtags | type | vendor | arches | entity_version | locked 
----------------------------------+--------------+--------------+----------------------------------+---------------+----------------------------+----------------------------+----------------------------+------------------------------------------------------+---------------------------------------+----------------+------------+------------+--------------+------+--------+--------+----------------+--------
 8a9084ac774028f001776470230f044c | Icinga2      | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | 2021-01-15 11:48:48.671-05 | 2021-02-02 15:30:06.593-05 | /custom/Icinga2/EL7_Server | ../../katello/api/v2/repositories/29/gpg_key_content | My_Awesome_Org_Icinga2_EL7_Server |              1 | EL7 Server |            |              | yum  | Custom |        |    -1107742830 |      0
 8a9084ac774028f001776470230f044c | Icinga2      | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | 2021-01-15 11:48:44.226-05 | 2021-02-02 15:30:06.564-05 | /custom/Icinga2/EL7_Client | ../../katello/api/v2/repositories/28/gpg_key_content | My_Awesome_Org_Icinga2_EL7_Client |              1 | EL7 Client |            |              | yum  | Custom |        |    -1749933833 |      0
(2 rows)

candlepin=# SELECT ak.name AS "AK Name", akp.product_uuid, product.product_id, pc.content_uuid, content.content_id, content.name FROM cp_activation_key ak JOIN cp2_activation_key_products akp ON akp.key_id = ak.id JOIN cp2_products product ON product.uuid = akp.product_uuid LEFT JOIN (cp2_product_content pc JOIN cp2_content content ON pc.content_uuid = content.uuid) ON pc.product_uuid = product.uuid WHERE product.name LIKE '%Icinga2%' ORDER BY ak.name ASC;
               AK Name                |           product_uuid           |  product_id  |           content_uuid           |  content_id   |    name    
--------------------------------------+----------------------------------+--------------+----------------------------------+---------------+------------
 04297ad2-e62e-4fb0-816c-8272cdd26f69 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 04297ad2-e62e-4fb0-816c-8272cdd26f69 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 16a5ded5-d348-43d8-a288-b15261c27c82 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 16a5ded5-d348-43d8-a288-b15261c27c82 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 17575732-ead8-475a-a74a-4698f0dde727 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 17575732-ead8-475a-a74a-4698f0dde727 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 5b1762a1-04a7-4476-b849-179ec3941638 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 5b1762a1-04a7-4476-b849-179ec3941638 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 6051fa03-33af-4a7e-b10b-9aaba91a182c | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 6051fa03-33af-4a7e-b10b-9aaba91a182c | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 7b515fc3-aa5b-4e7f-8e1e-3d0b2984512b | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 7b515fc3-aa5b-4e7f-8e1e-3d0b2984512b | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 866fbed6-f419-4b7b-8005-12df13425202 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 866fbed6-f419-4b7b-8005-12df13425202 | 8a9084ac774028f001776470230f044c | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
(14 rows)

candlepin=# 


The first thing I noticed is that the "EL6 Client" and "EL8 Client" repositories aren't listed above. No changes have been made to the Product or its Repositories. The Icinga2 Product Repositories are synced on a weekly basis and Content Views are published daily. The only substantial change has been that over 100 client hosts have been registered since I reported on the previous bug that the issue was resolved. At that time, I only had four client hosts registered.

Regarding Environments, we are using Lifecycle Environments so that we can control the Content Views made available to Content Hosts. If that's not the response you were expecting then I might not be thinking of the same "Environment" feature as you.

Comment 27 Chris "Ceiu" Rog 2021-03-01 14:23:11 UTC
Alright, two more queries:

1) SELECT product.uuid, product.name, content.uuid, content.name FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.name LIKE '%Icinga2%';

2) SELECT env.name, content.uuid, content.content_id, content.name FROM cp_environment env JOIN cp2_environment_content ec ON ec.environment_id = env.id JOIN cp2_content content on ec.content_uuid = content.uuid WHERE content.name LIKE '%Icinga2%';



I think I'm starting to get a handle on what's going on here, if the above queries return what I expect.

For the environments, I mostly just want to make sure that if one is in use, that it's not filtering out content that is otherwise linked properly. There's probably some slight deviation in our definitions, but I don't think it'll matter here -- we'll let the query do the explaining and focus more on that if needed.

Comment 28 Chris "Ceiu" Rog 2021-03-01 14:25:21 UTC
Sorry, just noticed that the content name does not contain the Icinga string. In the above two queries, change the "WHERE content.name LIKE" bit to check content.contenturl instead.

Comment 29 caitslin 2021-03-01 17:52:21 UTC
Hey Chris:

Here's the output of the two queries with the content.contenturl adjustments:

candlepin=# SELECT product.uuid, product.name, content.uuid, content.name FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';
               uuid               |  name   |               uuid               |    name    
----------------------------------+---------+----------------------------------+------------
 8a9084ac774028f001776470230f044c | Icinga2 | 8a9084ac774028f00177647023010446 | EL7 Server
 8a9084ac774028f001776470230f044c | Icinga2 | 8a9084ac774028f00177647022e4043f | EL7 Client
                                  |         | 8a9084ac774028f00177647023180451 | EL6 Client
                                  |         | 8a9084ac774028f00177647022fd0445 | EL8 Client
(4 rows)

candlepin=# SELECT env.name, content.uuid, content.content_id, content.name FROM cp_environment env JOIN cp2_environment_content ec ON ec.environment_id = env.id JOIN cp2_content content on ec.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';
                 name                  |               uuid               |  content_id   |    name    
---------------------------------------+----------------------------------+---------------+------------
 Library                               | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 RHEL_7/RHEL_7_Rolling                 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 Library/RHEL_7_Oracle                 | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 Library/RHEL_7                        | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 RHEL_7/RHEL_7                         | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 Library/RHEL_7_Rolling                | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 RHEL_7_Oracle/RHEL_7_Oracle           | 8a9084ac774028f00177647022e4043f | 1610729324252 | EL7 Client
 Library                               | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 CentOS_8/CentOS_8_Rolling             | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Library/Fedora_30                     | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Library/RHEL_8                        | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 RHEL_8/RHEL_8                         | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Fedora_30/Fedora_30                   | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Library/RHEL_8_Rolling                | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Library/CentOS_8                      | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 CentOS_8/CentOS_8                     | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Library/CentOS_8_Rolling              | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 RHEL_8/RHEL_8_Rolling                 | 8a9084ac774028f00177647022fd0445 | 1610729332842 | EL8 Client
 Library                               | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 RHEL_7/RHEL_7_Rolling                 | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 Library/RHEL_7                        | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 RHEL_7/RHEL_7                         | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 Library/RHEL_7_Rolling                | 8a9084ac774028f00177647023010446 | 1610729328698 | EL7 Server
 Library                               | 8a9084ac774028f00177647023180451 | 1610729319629 | EL6 Client
 Library/RHEL_6                        | 8a9084ac774028f00177647023180451 | 1610729319629 | EL6 Client
 RHEL_6/RHEL_6                         | 8a9084ac774028f00177647023180451 | 1610729319629 | EL6 Client
 Library/RHEL_6_Oracle                 | 8a9084ac774028f00177647023180451 | 1610729319629 | EL6 Client
 RHEL_6_Oracle/RHEL_6_Oracle           | 8a9084ac774028f00177647023180451 | 1610729319629 | EL6 Client
(28 rows)

candlepin=#

Comment 30 Chris "Ceiu" Rog 2021-03-01 19:47:54 UTC
Okay, that's what I was expecting. Somewhere along the line, something is unlinking (or failing to link) the EL6 and EL8 content from the product, which is why it never shows up (and was the case in the other successful reproducer as well). More work will need to be done to determine exactly *why* that happened, but at the very least we have a firm direction to move in.
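
The unlinked rows can be picked out of the query output mechanically. This is a minimal sketch, assuming psql's default aligned output with "|" separators as pasted in comment 29 and the column order product uuid, product name, content uuid, content name. The function name orphaned_content is hypothetical.

```shell
# Hypothetical helper: read psql aligned output on stdin and print
# "content_uuid content_name" for every row whose product uuid column is
# empty, i.e. content that has no row in cp2_product_content.
orphaned_content() {
    awk -F'|' '
        NF == 4 {
            # Trim surrounding whitespace from each of the four columns.
            for (i = 1; i <= 4; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            # The header row has a non-empty first column, so it is skipped too.
            if ($1 == "" && $3 != "") print $3, $4
        }
    '
}

# Example input copied from the output in comment 29.
orphaned_content <<'EOF'
               uuid               |  name   |               uuid               |    name    
----------------------------------+---------+----------------------------------+------------
 8a9084ac774028f001776470230f044c | Icinga2 | 8a9084ac774028f00177647023010446 | EL7 Server
 8a9084ac774028f001776470230f044c | Icinga2 | 8a9084ac774028f00177647022e4043f | EL7 Client
                                  |         | 8a9084ac774028f00177647023180451 | EL6 Client
                                  |         | 8a9084ac774028f00177647022fd0445 | EL8 Client
(4 rows)
EOF
```

For the sample input this prints the EL6 Client and EL8 Client content UUIDs, which are the two rows without a product.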

As far as getting you a fix immediately, a quick DB update and restart of CP (and likely Sat as a whole) should get you going:

INSERT INTO cp2_product_content(product_uuid, content_uuid) VALUES ('8a9084ac774028f001776470230f044c', '8a9084ac774028f00177647023180451'), ('8a9084ac774028f001776470230f044c', '8a9084ac774028f00177647022fd0445');

This will create the link from the product which contains the EL7 client/server to the "missing" content, and regenerating the entitlement cert should then enable the repo for clients. Note that this technically violates some of the product versioning bits, but it shouldn't have a meaningful impact on anything for one-off custom products like this. That said, try to avoid doing this in favor of using intended tooling wherever possible.
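
For anyone who has to repeat this for several products, the statement can be generated rather than typed by hand. The sketch below only prints SQL in the same shape as the INSERT above; it does not touch the database, and the caveats about manual edits still apply. The function name link_content_sql is hypothetical, and the check that every argument is lowercase hex is an assumption based on the UUIDs seen in this bug.

```shell
# Hypothetical helper: print an INSERT statement that links one product to
# one or more content rows in cp2_product_content.
# Usage: link_content_sql PRODUCT_UUID CONTENT_UUID [CONTENT_UUID...]
link_content_sql() {
    if [ "$#" -lt 2 ]; then
        echo "usage: link_content_sql PRODUCT_UUID CONTENT_UUID..." >&2
        return 1
    fi
    # Refuse anything that is not plain lowercase hex, so that no quoting
    # characters can end up inside the generated statement.
    for u in "$@"; do
        case $u in
            ''|*[!0-9a-f]*) echo "not a hex uuid: $u" >&2; return 1 ;;
        esac
    done

    product_uuid=$1
    shift
    printf 'INSERT INTO cp2_product_content(product_uuid, content_uuid) VALUES'
    sep=' '
    for content_uuid in "$@"; do
        printf "%s('%s', '%s')" "$sep" "$product_uuid" "$content_uuid"
        sep=', '
    done
    printf ';\n'
}

# Reproduces the statement above for the Icinga2 product.
link_content_sql 8a9084ac774028f001776470230f044c \
    8a9084ac774028f00177647023180451 \
    8a9084ac774028f00177647022fd0445
```

Reviewing the printed statement before running it keeps the manual step explicit.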

As a (hopefully) final request, I know you provided some steps above, but could you list your exact commands/steps for creating this particular product repo, end-to-end? I'm hoping to be able to spot something in the operations generated by the tooling that would lead to this state.

Comment 31 caitslin 2021-03-01 20:43:36 UTC
I have built an installation and configuration role in Ansible. Ansible uses the shell module to run hammer commands on the application server. For architectural reference, the databases are stored on a separate server. Nevertheless, these are the commands that were run (organization obfuscated):

$ hammer product create --organization "My Awesome Org" \
    --name "Icinga2" \
    --description "Icinga2 Client and Server Packages" \
    --sync-plan "Weekly"

$ hammer repository create --organization "My Awesome Org" \
    --content-type yum \
    --download-policy immediate \
    --http-proxy-policy global_default_http_proxy \
    --publish-via-http false \
    --product "Icinga2" \
    --name "EL8 Client" \
    --url "https://packages.icinga.com/epel/8Client/release" \
    --ignorable-content "drpm,srpm,distribution"

For what it's worth, not all, but some, other Custom products are exhibiting the same behavior as what we're seeing with Icinga2. The same commands above were run for all Custom products. Should I run similar queries on the CP database to restore those linkages? Or is that too dangerous?

Initially, all was well; shortly after building this system I only had four client hosts registered. A few weeks passed and I had since registered over 100 client hosts. I then noticed the issue as reports started coming in that repos were missing. I'm not sure if that could have anything to do with it.

Comment 32 Chris "Ceiu" Rog 2021-03-01 21:12:16 UTC
Hrmm... maybe. First, regarding the above, what about the EL7 server, and EL6 and EL7 clients? How were those added to the product/repo?

Comment 33 caitslin 2021-03-02 14:45:03 UTC
They were added the same way:

$ hammer repository create --organization "My Awesome Org" \
    --content-type yum \
    --download-policy immediate \
    --http-proxy-policy global_default_http_proxy \
    --publish-via-http false \
    --product "Icinga2" \
    --name "EL6 Client" \
    --url "https://packages.icinga.com/epel/6Client/release" \
    --ignorable-content "drpm,srpm,distribution"

$ hammer repository create --organization "My Awesome Org" \
    --content-type yum \
    --download-policy immediate \
    --http-proxy-policy global_default_http_proxy \
    --publish-via-http false \
    --product "Icinga2" \
    --name "EL7 Client" \
    --url "https://packages.icinga.com/epel/7Client/release" \
    --ignorable-content "drpm,srpm,distribution"

$ hammer repository create --organization "My Awesome Org" \
    --content-type yum \
    --download-policy immediate \
    --http-proxy-policy global_default_http_proxy \
    --publish-via-http false \
    --product "Icinga2" \
    --name "EL7 Server" \
    --url "https://packages.icinga.com/epel/7Server/release" \
    --ignorable-content "drpm,srpm,distribution"

Comment 34 Chris "Ceiu" Rog 2021-03-02 15:19:43 UTC
Alright, so nothing special or fancy.

Rewinding a bit, to summarize your situation, after you initially created these, the first handful of clients were receiving the content as expected but somewhere after that some of the content went missing? Were there any other operations done between initial creation and when you noticed they no longer received content (such as a manifest import, or other content/product affecting operation)? 

If you run the same set of commands listed above for a new dummy product (test-icinga, icinga3, or some such) and then run the two content queries [1][2], do you get the expected result?

The goal with the above questions is to determine if the linkage issue is happening during/around creation time, or if something later is coming along and breaking it.





[1] SELECT product.uuid AS "Product UUID", product.name AS "Product Name", product.product_id, content.* FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON content.uuid = pc.content_uuid WHERE product.name LIKE '%Icinga2%' ORDER BY product.name ASC;

[2] SELECT product.uuid, product.name, content.uuid, content.name FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';

Comment 35 caitslin 2021-03-02 16:16:12 UTC
> Rewinding a bit, to summarize your situation, after you initially created these, 
> the first handful of clients were receiving the content as expected but somewhere 
> after that some of the content went missing?

Correct. Our previous system was experiencing the same issue. I initially set up four clients on this new system and ensured all was working as expected over a span of one to two weeks. I then set up the remainder of our clients with this system. Recently, it was reported to me that certain errata and packages were unavailable. I found it was due to missing repo content, which turned out to be the entitlement certificate missing the content section. I watched this happen live: I confirmed that a RHEL 8 test system had the content section in its certificate, and then the section was missing after a "subscription-manager refresh". All subsequent attempts to fix the issue failed. I was so desperate that I created an API client using the Candlepin documentation to trigger lazy certificate regeneration (https://www.candlepinproject.org/docs/candlepin/lazy_cert_regen.html). Since that didn't work, I decided to file a BZ ticket (https://bugzilla.redhat.com/show_bug.cgi?id=1928837), which got merged into this ticket.

> Were there any other operations done between initial creation and when you noticed 
> they no longer received content (such as a manifest import, or other content/product affecting operation)?

The "Duo Security" custom product and repos were added for RHEL 7 and RHEL 8, but via web GUI instead of the hammer command. Shortly after that product was added is when repositories appeared to randomly go missing. So far the issue has been found with Icinga2 and CentOS custom products and repos.

> If you run the same set of commands listed above for a new dummy product (test-icinga, 
> icinga3, or some such) and then run the two content queries [1][2], do you get the expected result?

I will run the test and post the result ASAP.

A big thanks in advance for all your efforts!

Comment 38 Chris "Ceiu" Rog 2021-03-03 20:06:46 UTC
@caitslin 

Looks like it created the new product and linked it as expected. If you were to add the other repos to fully mirror the real product (EL6, EL7 client/server), does that retain the correct linkage (that is, do all lines look like the first three lines of your output in comment 36)? If so, I would ask that you make a note somewhere of the current time, and then periodically monitor it. If those links for the DummyIcinga2 ever change, I would like to see the full candlepin.log from the time they are created to when you notice the breakage. Somewhere in there I'd expect to see a request targeting the products or content (as Candlepin calls them) that would cause the link to be broken. A quick search for the uuids and/or the product IDs should show something in the logs -- especially if debug logging is enabled.
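
The periodic check could be scripted so that the time of the breakage is recorded automatically. This is a rough sketch only: the function name record_if_changed and the snapshot file are hypothetical, and it expects the output of the linkage query on standard input.

```shell
# Hypothetical helper: compare a fresh snapshot (stdin) against the previous
# one stored in FILE. On the first run, or when the snapshot differs, store
# the new snapshot and print a UTC timestamp so the matching span of
# candlepin.log can be located later.
# Usage: some_query_command | record_if_changed FILE
record_if_changed() {
    snapshot_file=$1
    tmp=$(mktemp) || return 1
    cat > "$tmp"
    if [ -f "$snapshot_file" ] && cmp -s "$tmp" "$snapshot_file"; then
        rm -f "$tmp"
        return 0    # unchanged: nothing to report
    fi
    mv "$tmp" "$snapshot_file"
    echo "links changed at $(date -u '+%Y-%m-%d %H:%M:%S UTC')"
}
```

It could be fed from cron with something like psql -At -d candlepin -c "<the linkage query>" piped into record_if_changed with a snapshot path of your choosing; the database name is assumed from the prompt shown in this bug. Adding an ORDER BY to the query would be advisable, since row order is otherwise not guaranteed and could cause false reports.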

If you happen to have the full candlepin.log history from the time you created an affected product to when you noticed them being broken, that would also probably work. If you'd rather we do that, I would recommend opening a customer case or attaching the file privately here if that's an option available to you. There's a lot of sensitive bits in there though, so I would highly recommend the former. If neither of those works, let me know and we can figure out another option.

As far as what we're doing on our end -- we're stepping through logs sent in by others hitting this issue to see if anything stands out in that regard.

Also, I've privated both of those comments for you to mitigate/minimize the accidental data leak. Probably minor at this point, but hopefully that helps.

Comment 39 caitslin 2021-03-05 20:12:25 UTC
@crog 

> If you were to add the other repos to fully mirror the real product 
> (EL6, EL7 client/server), does that retain the correct linkage (that 
> is, do all lines look like the first three lines of your output in 
> comment 36)?

Actually, unless I'm misreading this, it looks like duplicates were created.

candlepin=# SELECT product.uuid, product.name, content.uuid, content.name FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';
               uuid               |     name     |               uuid               |    name    
----------------------------------+--------------+----------------------------------+------------
 8a9084ac7797e7c40177f8e7acc50a02 | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac7797e7c4017803f50d520a7d | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac7797e7c4017803f50d520a7d | DummyIcinga2 | 8a9084ac7797e7c4017803f50d070a7c | EL6 Client
 8a9084ac7797e7c4017803f655370a83 | DummyIcinga2 | 8a9084ac7797e7c4017803f654ee0a82 | EL7 Client
 8a9084ac7797e7c4017803f655370a83 | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac7797e7c4017803f655370a83 | DummyIcinga2 | 8a9084ac7797e7c4017803f50d070a7c | EL6 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f654ee0a82 | EL7 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f76fd20a89 | EL7 Server
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f50d070a7c | EL6 Client
 8a9084ac774028f001776470230f044c | Icinga2      | 8a9084ac774028f00177647023010446 | EL7 Server
 8a9084ac774028f001776470230f044c | Icinga2      | 8a9084ac774028f00177647022e4043f | EL7 Client
                                  |              | 8a9084ac774028f00177647023180451 | EL6 Client
                                  |              | 8a9084ac774028f00177647022fd0445 | EL8 Client
(14 rows)

However, it looks like all the Content for the DummyIcinga2 product is there:

candlepin=# SELECT product.uuid AS "Product UUID", product.name AS "Product Name", product.product_id, content.* FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON content.uuid = pc.content_uuid WHERE product.name LIKE '%Icinga2%' ORDER BY product.name ASC;
           Product UUID           | Product Name |  product_id  |               uuid               |  content_id   |          created           |          updated           |           contenturl            |                        gpgurl                        |                   label                    | metadataexpire |    name    | releasever | requiredtags | type | vendor | arches | entity_version | locked 
----------------------------------+--------------+--------------+----------------------------------+---------------+----------------------------+----------------------------+---------------------------------+------------------------------------------------------+--------------------------------------------+----------------+------------+------------+--------------+------+--------+--------+----------------+--------
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 82070718890  | 8a9084ac7797e7c4017803f654ee0a82 | 1614974178542 | 2021-03-05 14:56:18.542-05 | 2021-03-05 14:56:18.542-05 | /custom/DummyIcinga2/EL7_Client |                                                      | My_Awesome_Org_DummyIcinga2_EL7_Client |              1 | EL7 Client |            |              | yum  | Custom |        |     1614698953 |      0
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 82070718890  | 8a9084ac7797e7c40177f8e7ab0c0a01 | 1614788668158 | 2021-03-03 11:24:28.172-05 | 2021-03-03 11:24:28.172-05 | /custom/DummyIcinga2/EL8_Client |                                                      | My_Awesome_Org_DummyIcinga2_EL8_Client |              1 | EL8 Client |            |              | yum  | Custom |        |    -2099817882 |      0
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 82070718890  | 8a9084ac7797e7c4017803f76fd20a89 | 1614974250963 | 2021-03-05 14:57:30.962-05 | 2021-03-05 14:57:30.962-05 | /custom/DummyIcinga2/EL7_Server |                                                      | My_Awesome_Org_DummyIcinga2_EL7_Server |              1 | EL7 Server |            |              | yum  | Custom |        |    -2083073351 |      0
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 82070718890  | 8a9084ac7797e7c4017803f50d070a7c | 1614974094572 | 2021-03-05 14:54:54.599-05 | 2021-03-05 14:54:54.599-05 | /custom/DummyIcinga2/EL6_Client |                                                      | My_Awesome_Org_DummyIcinga2_EL6_Client |              1 | EL6 Client |            |              | yum  | Custom |        |    -1527547636 |      0
 8a9084ac774028f001776470230f044c | Icinga2      | 275076129832 | 8a9084ac774028f00177647023010446 | 1610729328698 | 2021-01-15 11:48:48.671-05 | 2021-02-02 15:30:06.593-05 | /custom/Icinga2/EL7_Server      | ../../katello/api/v2/repositories/29/gpg_key_content | My_Awesome_Org_Icinga2_EL7_Server      |              1 | EL7 Server |            |              | yum  | Custom |        |    -1107742830 |      0
 8a9084ac774028f001776470230f044c | Icinga2      | 275076129832 | 8a9084ac774028f00177647022e4043f | 1610729324252 | 2021-01-15 11:48:44.226-05 | 2021-02-02 15:30:06.564-05 | /custom/Icinga2/EL7_Client      | ../../katello/api/v2/repositories/28/gpg_key_content | My_Awesome_Org_Icinga2_EL7_Client      |              1 | EL7 Client |            |              | yum  | Custom |        |    -1749933833 |      0
(6 rows)

Regarding the logs, I'm stepping through them. I have all the Candlepin logs from the point of when this system was built until now. After the last experience, I have been religiously archiving them. I'll let you know if I find anything. I'm happy to share them if you or your team would like to take a gander.

Comment 40 Jonathon Turel 2021-03-08 19:56:42 UTC
Hi all. Thanks to those who have provided some details around the issue. We're still investigating and I'm replying with a condensed info-gathering procedure given the discussion so far to (hopefully) make things simpler since there are a lot of comments. If you suspect you're running into this bug then follow the below steps and report back:


For the products with missing content, look up the product by name via "foreman-rake console" on the server:

irb(main):017:0> product = Katello::Product.find_by_name('MyProduct')

Note the 'cp_id' which is 146332194096 in this case and referred to as $PRODUCT_CP_ID below
irb(main):022:0> product.cp_id
=> "146332194096"

Get the Candlepin content ids for the product:

irb(main):024:0> product.contents.pluck('cp_content_id')
=> ["1587654605985", "1587654953887"]


Exit the rake console and connect to postgres on the server. Compare the state of the Candlepin DB to what was found above:


candlepin=# SELECT product.name, product.uuid, product.product_id, content.content_id, content.contenturl,content.arches, content.requiredtags FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON content.uuid = pc.content_uuid WHERE product.product_id = '$PRODUCT_CP_ID';


There should be one row returned for each cp_content_id above (two in my example), otherwise the product to content linkage has been broken.

The final step is to determine which Candlepin API requests have been issued with respect to this product. Having the Candlepin logs from the time the products and repos were created would be beneficial here, so we can see the full history.

cd /var/log/candlepin on the server
extract all candlepin log archives: gunzip *.gz
grep $PRODUCT_CP_ID *

Share the query and log results here for analysis!

Comment 41 Chris "Ceiu" Rog 2021-03-08 21:09:12 UTC
(In reply to caitslin from comment #39)

> Actually, unless I'm misreading this, it looks like duplicates were created.

Possibly, but probably not. At the application level, products and content/repos are considered immutable, so any change made to them creates a copy and updates all the links accordingly. You can verify whether or not they are duplicates by examining the links in cp2_owner_products (and cp2_owner_content). I would expect the older instances to not appear in those tables at all, whereas the currently active ones show up exactly once. The unlinked older versions of those products (internally referred to as "orphans") will be removed the next time the OrphanCleanupJob runs, so don't be surprised if they vanish within a week or so.

There's a possibility this is where the problem lies, though this area has a ton of automated testing surrounding it, and there's nothing special about custom products that would give me concern. However, for the sake of completeness, we can verify the links end-to-end by following from cp_pool->cp2_products->cp2_product_content->cp2_content:

SELECT pool.id, product.uuid, product.product_id, product.name, content.uuid, content.content_id, content.name FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%DummyIc%';

Then you can compare the UUIDs from that result to those from:

SELECT op.* FROM cp2_owner_products op JOIN cp_owner owner ON op.owner_id = owner.id WHERE owner.displayname='...' OR owner.account='...';
SELECT oc.* FROM cp2_owner_content oc JOIN cp_owner owner ON oc.owner_id = owner.id WHERE owner.displayname='...' OR owner.account='...';

Where owner.displayname or owner.account are set to the display name or key/account name of the owner/org in question. If your deployment only has a single org, you can omit the join on cp_owner and everything after it.

I would expect that all of the product/content UUIDs from the first query are the ones returned in owner_products/content queries. If those *don't* line up, we've some deep CP problems to investigate.

Comment 42 caitslin 2021-03-09 16:25:02 UTC
(In reply to Chris "Ceiu" Rog from comment #41)

> There's a possibility this is where the problem lies, though this area has a
> ton of automated testing surrounding it, and there's nothing special about
> custom products that would give me concern. However for the sake of
> completeness, we can verify the links end-to-end by following from
> cp_pool->cp2_products->cp2_product_content->cp2_content:
> 
> SELECT pool.id, product.uuid, product.product_id, product.name,
> content.uuid, content.content_id, content.name FROM cp_pool pool JOIN
> cp2_products product ON pool.product_uuid = product.uuid JOIN
> cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content
> content ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE
> '%DummyIc%';
> 
> Then you can compare the UUIDs from that result to those from:
> 
> SELECT op.* FROM cp2_owner_products op JOIN cp_owner owner ON op.owner_id =
> owner.id WHERE owner.displayname='...' OR owner.account='...';
> SELECT oc.* FROM cp2_owner_content oc JOIN cp_owner owner ON oc.owner_id =
> owner.id WHERE owner.displayname='...' OR owner.account='...';

It looks like everything returned as expected, unless I overlooked something:

candlepin=# SELECT pool.id, product.uuid, product.product_id, product.name, content.uuid, content.content_id, content.name FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%DummyIc%';
                id                |               uuid               | product_id  |     name     |               uuid               |  content_id   |    name    
----------------------------------+----------------------------------+-------------+--------------+----------------------------------+---------------+------------
 8a9084ac7797e7c40177f8e45cb60a00 | 8a9084ac7797e7c4017803f770160a8a | 82070718890 | DummyIcinga2 | 8a9084ac7797e7c4017803f654ee0a82 | 1614974178542 | EL7 Client
 8a9084ac7797e7c40177f8e45cb60a00 | 8a9084ac7797e7c4017803f770160a8a | 82070718890 | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | 1614788668158 | EL8 Client
 8a9084ac7797e7c40177f8e45cb60a00 | 8a9084ac7797e7c4017803f770160a8a | 82070718890 | DummyIcinga2 | 8a9084ac7797e7c4017803f76fd20a89 | 1614974250963 | EL7 Server
 8a9084ac7797e7c40177f8e45cb60a00 | 8a9084ac7797e7c4017803f770160a8a | 82070718890 | DummyIcinga2 | 8a9084ac7797e7c4017803f50d070a7c | 1614974094572 | EL6 Client
(4 rows)

candlepin=# SELECT op.* FROM cp2_owner_products op WHERE product_uuid='8a9084ac7797e7c4017803f770160a8a';
             owner_id             |           product_uuid           
----------------------------------+----------------------------------
 8a9084ac7706e6a3017706e8b6140001 | 8a9084ac7797e7c4017803f770160a8a
(1 row)

candlepin=# SELECT oc.* FROM cp2_owner_content oc WHERE content_uuid='8a9084ac7797e7c4017803f654ee0a82' or content_uuid='8a9084ac7797e7c40177f8e7ab0c0a01' or content_uuid='8a9084ac7797e7c4017803f76fd20a89' or content_uuid='8a9084ac7797e7c4017803f50d070a7c';
             owner_id             |           content_uuid           
----------------------------------+----------------------------------
 8a9084ac7706e6a3017706e8b6140001 | 8a9084ac7797e7c40177f8e7ab0c0a01
 8a9084ac7706e6a3017706e8b6140001 | 8a9084ac7797e7c4017803f50d070a7c
 8a9084ac7706e6a3017706e8b6140001 | 8a9084ac7797e7c4017803f654ee0a82
 8a9084ac7706e6a3017706e8b6140001 | 8a9084ac7797e7c4017803f76fd20a89
(4 rows)
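
For anyone repeating this check, the "active versus orphan" distinction described in comment 41 can be sketched in a few lines of Python. The function name is made up for illustration. The product UUIDs are the four DummyIcinga2 versions from the query in comment 39. The owner-linked list is an assumption: the query above only confirmed that the ...0a8a UUID is present in cp2_owner_products; it did not check that the other three are absent.

```python
def split_active_and_orphaned(product_uuids, owner_linked_uuids):
    """Split product version UUIDs into (active, orphaned).

    A version counts as active when its UUID appears in cp2_owner_products;
    anything else is treated as an unlinked older copy.
    """
    linked = set(owner_linked_uuids)
    active = [u for u in product_uuids if u in linked]
    orphaned = [u for u in product_uuids if u not in linked]
    return active, orphaned


# DummyIcinga2 product UUIDs seen in cp2_product_content (comment 39).
dummy_versions = [
    "8a9084ac7797e7c40177f8e7acc50a02",
    "8a9084ac7797e7c4017803f50d520a7d",
    "8a9084ac7797e7c4017803f655370a83",
    "8a9084ac7797e7c4017803f770160a8a",
]
# Assumption: only the UUID confirmed above is linked to the owner.
owner_linked = ["8a9084ac7797e7c4017803f770160a8a"]

active, orphaned = split_active_and_orphaned(dummy_versions, owner_linked)
print("active:", active)
print("orphaned:", orphaned)
```

If the assumption holds, the single active version is the one the pool points at, and the remaining three are the copies expected to be cleaned up later.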

Comment 43 caitslin 2021-03-09 16:38:02 UTC
(In reply to Jonathon Turel from comment #40)

> For the products with missing content, look up the product by name via
> "foreman-rake console" on the server:

# foreman-rake console
Loading production environment (Rails 6.0.3.1)
irb(main):001:0> product = Katello::Product.find_by_name('Icinga2')
=> #<Katello::Product id: 14, name: "Icinga2", description: "Icinga2 Client and Server Packages", cp_id: "275076129832", multiplier: nil, provider_id: 1, created_at: "2021-01-15 16:45:14", updated_at: "2021-02-02 20:30:05", gpg_key_id: 15, sync_plan_id: 2, label: "Icinga2", organization_id: 1, ssl_ca_cert_id: nil, ssl_client_cert_id: nil, ssl_client_key_id: nil>

> Get the Candlepin content ids for the product:

irb(main):002:0> product.contents.pluck('cp_content_id')
=> ["1610729319629", "1610729324252", "1610729328698", "1610729332842"]

> Exit the rake console and connect to postgres on the server. Compare the
> state of the Candlepin DB to what was found above:

candlepin=# SELECT product.name, product.uuid, product.product_id, content.content_id, content.contenturl,content.arches, content.requiredtags FROM cp_pool pool JOIN cp2_products product ON pool.product_uuid = product.uuid JOIN cp2_product_content pc ON product.uuid = pc.product_uuid JOIN cp2_content content ON content.uuid = pc.content_uuid WHERE product.product_id = '275076129832';
  name   |               uuid               |  product_id  |  content_id   |         contenturl         | arches | requiredtags 
---------+----------------------------------+--------------+---------------+----------------------------+--------+--------------
 Icinga2 | 8a9084ac774028f001776470230f044c | 275076129832 | 1610729328698 | /custom/Icinga2/EL7_Server |        | 
 Icinga2 | 8a9084ac774028f001776470230f044c | 275076129832 | 1610729324252 | /custom/Icinga2/EL7_Client |        | 
(2 rows)
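
The comparison between the two results can be sketched as a simple difference in Python. This is only an illustration: the helper name is hypothetical, and the two lists are copied by hand from the rake console output and the query result above.

```python
def missing_content_ids(katello_ids, candlepin_ids):
    """Return the Katello cp_content_ids with no matching Candlepin row.

    The result keeps the order in which Katello reported the IDs.
    """
    present = set(candlepin_ids)
    return [cid for cid in katello_ids if cid not in present]


# From product.contents.pluck('cp_content_id') in the rake console.
katello_ids = ["1610729319629", "1610729324252", "1610729328698", "1610729332842"]
# From the content_id column of the Candlepin query result.
candlepin_ids = ["1610729328698", "1610729324252"]

missing = missing_content_ids(katello_ids, candlepin_ids)
print(missing)
```

For the values above, two of the four content IDs known to Katello (1610729319629 and 1610729332842) have no linked row in Candlepin.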


> There should be one row returned for each cp_content_id above (two in my
> example), otherwise the product to content linkage has been broken.
> 
> The final step is to determine which Candlepin APIs have been issued with
> respect to this product and having Candlepin logs since the products+repos
> were created would be beneficial here, so we can see the full history.

$ for f in $(ls -tr1 candlepin.log*.gz); do zcat $f | egrep '275076129832' >> candlepin-275076129832.log; done; egrep '275076129832' candlepin.log >> candlepin-275076129832.log

I have attached the candlepin-275076129832.log to this ticket.

Comment 44 caitslin 2021-03-09 16:39:14 UTC
Created attachment 1762034 [details]
Candlepin Log

Comment 45 Chris "Ceiu" Rog 2021-03-16 18:57:28 UTC
@caitslin
Can you check the logs to see if there has been a manifest import or manifest removal between the time the custom product was setup and the time content goes missing/unlinked? These operations will appear in the logs as a POST or DELETE to /owners/{owner_key}/imports in candlepin.log.

Comment 46 caitslin 2021-03-19 14:28:57 UTC
@crog 

I found two POST events from before the custom product was added and one after the linkage broke. I found zero DELETE events.

2021-01-15 11:51:57,995 [thread=http-bio-127.0.0.1-8443-exec-10] [req=266b8b5b-a3c8-48ce-97b1-67d7122c10ec, org=, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=POST, uri=/candlepin/owners/My_Awesome_Org/imports

2021-01-15 11:52:20,760 [thread=http-bio-127.0.0.1-8443-exec-7] [req=b8dc85d0-a88b-4532-8e0a-b5351bbb2aaa, org=, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=POST, uri=/candlepin/owners/My_Awesome_Org/imports

2021-02-24 12:13:45,319 [thread=http-bio-127.0.0.1-8443-exec-7] [req=8ebb12ab-32db-458a-b577-97c14d4c1622, org=, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=POST, uri=/candlepin/owners/My_Awesome_Org/imports
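
A search like this can also be scripted. The sketch below is a rough Python illustration that assumes every request is logged by LoggingFilter in the same shape as the three lines above (timestamp first, then "Request: verb=..., uri=..."). The function name and the regular expression are my own and not part of Candlepin.

```python
import re
from datetime import datetime

# Assumed line shape, based on the LoggingFilter entries quoted above:
#   <timestamp> [...] INFO ... - Request: verb=POST, uri=/candlepin/owners/<owner_key>/imports
IMPORT_REQUEST = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) .*"
    r"Request: verb=(?P<verb>POST|DELETE), uri=\S*/owners/(?P<owner>[^/\s]+)/imports\b"
)


def find_manifest_requests(lines):
    """Return (timestamp, verb, owner_key) for each manifest import/removal request."""
    events = []
    for line in lines:
        match = IMPORT_REQUEST.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, match.group("verb"), match.group("owner")))
    return events


sample = [
    "2021-01-15 11:51:57,995 [thread=http-bio-127.0.0.1-8443-exec-10] "
    "[req=266b8b5b-a3c8-48ce-97b1-67d7122c10ec, org=, csid=] INFO  "
    "org.candlepin.common.filter.LoggingFilter - Request: verb=POST, "
    "uri=/candlepin/owners/My_Awesome_Org/imports",
]
events = find_manifest_requests(sample)
for ts, verb, owner in events:
    print(ts.isoformat(), verb, owner)
```

Feeding it the lines of candlepin.log (and the extracted archives) gives a short list of import and removal requests that can be lined up against the dates below.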

I have narrowed down the timeline of events to this:

01/15 - System was built, software installed, etc.
02/05 - Migrated systems from previous server to this one
02/10 - First issue reported of Icinga repo missing
02/11 - Confirmed issue on the reported host and witnessed issue first-hand on one of my hosts. Noted that other repos were missing as well; the issue didn't affect just Icinga.

I am in the process of stepping through the Candlepin logs for all of these days to see if I can find anything that stands out as anomalous. I hope to report back in the next day or so with anything I find/don't find.

I am happy to share the log, too, if you'd like to take a gander.

Comment 48 Chris "Ceiu" Rog 2021-04-01 17:57:43 UTC
@caitslin
Sorry about the delay in getting back to you on this. If it's still possible, could I get the portion of the Candlepin log from the time at which the custom product is created to the earliest point at which it's noticed the associated content is unlinked or missing?

Due to the potentially sensitive nature of the contents of the log, please attach it to the support case for this issue if you have one. Alternatively, we can use Red Hat's SFTP service [1] for this if creating a support case isn't desirable. If neither of those works, we can figure out a third option.


[1] https://access.redhat.com/articles/5594481

Comment 49 caitslin 2021-04-05 15:52:26 UTC
@crog 

I have uploaded the log file via Red Hat SFTP. 

The file name is: 1931027_candlepin-aggregate-2021-01-15-2021-02-11-sorted.log.gz

The product was added when the Candlepin, et al., software was installed on 01/15. The issue was noticed by a coworker on 2/10 at roughly 10:55a. I confirmed the issue on 2/11 at roughly 08:51a by viewing the repos on a host: the Icinga2 repo was available, then disappeared after a 'subscription-manager refresh'.

I stepped through every Candlepin log entry from 02/01 - 02/11. It took me almost three weeks; I don't claim to be an expert, but the following items stood out to me:

1) CRL serial blacklist events were frequent
2) Batch revocations happened occasionally
3) Consumer unregister/register appears to trigger certificate regeneration

I'm assuming some of those events are caused by daily Content View creation & publication. 

I aggregated the three Candlepin logs (candlepin, audit, and error) so I could provide you with as much data as possible.

Let me know if there's anything else you need.

Comment 50 Chris "Ceiu" Rog 2021-04-28 16:21:03 UTC
(In reply to caitslin from comment #49)
> @crog 
> 
> I have uploaded the log file via Red Hat SFTP. 
> 
> The file name is:
> 1931027_candlepin-aggregate-2021-01-15-2021-02-11-sorted.log.gz
> 
> The product was added when the Candlepin, et al., software was installed on
> 01/15. The issue was noticed by a coworker on 2/10 at roughly 10:55a. I
> confirmed the issue on 2/11 at roughly 08:51a by viewing the repos on a
> host, noticed the Icinga2 repo was available, then disappeared after a
> 'subscription-manager refresh'.
> 
> I stepped through every Candlepin log entry from 02/01 - 02/11. It took me
> almost three weeks; I don't claim to be an expert, but the following items
> stood out to me:
> 
> 1) CRL serial blacklist events were frequent
> 2) Batch revocations happened occasionally
> 3) Consumer unregister/register appears to trigger certificate regeneration
> 
> I'm assuming some of those events are caused by daily Content View creation
> & publication. 
> 
> I aggregated the three Candlepin logs (candlepin, audit, and error) so I
> could provide you with as much data as possible.
> 
> Let me know if there's anything else you need.

Sorry for the delay on any updates here. We've looked through the provided log and nothing immediately stood out, and so far have still been unsuccessful at reproducing the issue on our end.

That said, is this something that still comes up in your environment? Has it occurred again since the manual repair of the product-content linkage? If so, how frequently is it occurring?

If this is something that still comes up, I want to try the following:
- Wait until the issue occurs again
- Once noticed, perform the manual repair to get the product and content in a functional state
- Clear the candlepin log or otherwise mark the point at which the product is fixed
- Turn on debug logging by adding the line "log4j.logger.org.candlepin=DEBUG" to /etc/candlepin/candlepin.conf and restart Candlepin and/or the entire Sat environment to get the change to take effect
- Wait until the issue occurs again
- Grab candlepin.log for the duration between the last fix and the breakage

The audit.log and error.log are not important for debugging this issue and can be safely ignored for the time being. Hopefully there will be some details in the debug output that will indicate what is happening.
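
If truncating the log isn't practical, the last step of the list above (grabbing candlepin.log between the fix and the breakage) could be done by timestamp instead. The following is a minimal Python sketch under the assumption that every entry starts with a "YYYY-MM-DD HH:MM:SS,mmm" timestamp, as in the candlepin.log lines posted earlier in this bug. The function name and the sample lines are hypothetical.

```python
import re
from datetime import datetime

# Leading timestamp of a log entry, e.g. "2021-05-03 10:26:22,798 ".
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")


def entries_between(lines, start, end):
    """Yield log lines whose entry timestamp falls within [start, end].

    Lines without a leading timestamp (stack trace frames, for example)
    are kept or dropped together with the entry that precedes them.
    """
    keep = False
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            keep = start <= ts <= end
        if keep:
            yield line


# Hypothetical sample lines, for illustration only.
sample_lines = [
    "2021-05-03 10:00:00,000 [thread=t1] INFO  entry before the repair",
    "2021-05-03 10:26:22,798 [thread=t2] ERROR entry after the repair",
    "        at org.example.Frame.call(Frame.java:1)",
    "2021-05-04 09:00:00,000 [thread=t3] INFO  entry after the breakage was noticed",
]
window = list(entries_between(
    sample_lines,
    start=datetime(2021, 5, 3, 10, 15, 0),
    end=datetime(2021, 5, 3, 23, 59, 59),
))
print("\n".join(window))
```

Noting the time of the repair and the time the breakage is noticed is then enough to cut out the relevant window, stack traces included.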

Comment 51 caitslin 2021-04-30 16:53:58 UTC
@crog 

Yes, we have noticed it with other products, too. For example, CentOS 8 content is no longer available to CentOS 8 consumers. It went missing around the same time as Icinga2, but I didn't bring it up because I didn't want to distract from resolving what was (hopefully) a larger issue.

Regarding the fix, I held off on applying that in the event you needed more diagnostic data prior to making any manual changes.

I attempted to apply the manual fix for Icinga2 today and it was not successful.

First, I noticed I still had the DummyIcinga2 product you requested I create as part of our diagnostics.

candlepin=# SELECT product.uuid, product.name, content.uuid, content.name FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';
               uuid               |     name     |               uuid               |    name    
----------------------------------+--------------+----------------------------------+------------
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f654ee0a82 | EL7 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f76fd20a89 | EL7 Server
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f50d070a7c | EL6 Client
 8a9084ac774028f001776470230f044c | Icinga2      | 8a9084ac774028f00177647023010446 | EL7 Server
 8a9084ac774028f001776470230f044c | Icinga2      | 8a9084ac774028f00177647022e4043f | EL7 Client
                                  |              | 8a9084ac774028f00177647023180451 | EL6 Client
                                  |              | 8a9084ac774028f00177647022fd0445 | EL8 Client
(8 rows)

candlepin=#

I then deleted the content and the product. I wanted to verify it had been removed, but instead, it looks like duplicates were created. From the web interface, the product and content are gone. But from the DB, it appears more entries were made:

candlepin=# SELECT product.uuid, product.name, content.uuid, content.name FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';
               uuid               |     name     |               uuid               |    name    
----------------------------------+--------------+----------------------------------+------------
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f654ee0a82 | EL7 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f76fd20a89 | EL7 Server
 8a9084ac7797e7c4017803f770160a8a | DummyIcinga2 | 8a9084ac7797e7c4017803f50d070a7c | EL6 Client
 8a9084ac787dfb43017923ad4ab406f6 | DummyIcinga2 | 8a9084ac7797e7c4017803f654ee0a82 | EL7 Client
 8a9084ac787dfb43017923ad4ab406f6 | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac787dfb43017923ad4ab406f6 | DummyIcinga2 | 8a9084ac7797e7c4017803f76fd20a89 | EL7 Server
 8a9084ac787dfb43017923ad4ec106fd | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac787dfb43017923ad4cb806fa | DummyIcinga2 | 8a9084ac7797e7c40177f8e7ab0c0a01 | EL8 Client
 8a9084ac787dfb43017923ad4cb806fa | DummyIcinga2 | 8a9084ac7797e7c4017803f76fd20a89 | EL7 Server
 8a9084ac774028f001776470230f044c | Icinga2      | 8a9084ac774028f00177647023010446 | EL7 Server
 8a9084ac774028f001776470230f044c | Icinga2      | 8a9084ac774028f00177647022e4043f | EL7 Client
                                  |              | 8a9084ac774028f00177647023180451 | EL6 Client
                                  |              | 8a9084ac774028f00177647022fd0445 | EL8 Client
(14 rows)

candlepin=#

I decided to ignore it for the moment and try the manual fix you suggested. But that failed:

candlepin=# INSERT INTO cp2_product_content(product_uuid, content_uuid) VALUES ('8a9084ac774028f001776470230f044c', '8a9084ac774028f00177647023180451'), ('8a9084ac774028f001776470230f044c', '8a9084ac774028f00177647022fd0445');
ERROR:  null value in column "id" violates not-null constraint
DETAIL:  Failing row contains (8a9084ac774028f001776470230f044c, 8a9084ac774028f00177647023180451, null, null, null, null).
candlepin=#

How would you like me to proceed? Should I enable debug logging and restart Candlepin, or is there something else you'd like me to do first?

Thanks again for all your help with this!

Comment 52 Chris "Ceiu" Rog 2021-04-30 18:18:03 UTC
(In reply to caitslin from comment #51)
> I then deleted the content and the product. I wanted to verify it had been
> removed, but instead, it looks like duplicates were created. From the web
> interface, the product and content are gone. But from the DB, it appears
> more entries were made:

This is kind of outside the scope of this problem, but these are expected if changes are made to the product (or content). Due to some caching and other ORM configuration surrounding products and content, any change made ends up creating an entirely new row in the DB with a new UUID. The obsolete rows typically aren't active long and will eventually get cleaned up. You can check which is active by relating it back to the cp2_owner_products and cp2_owner_content tables, respectively:

SELECT DISTINCT prod.uuid AS product_uuid, prod.name AS product_name, content.uuid AS content_uuid, content.contenturl FROM cp2_owner_products op JOIN cp2_products prod ON op.product_uuid = prod.uuid JOIN cp2_product_content pc ON pc.product_uuid = prod.uuid JOIN cp2_content content ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%Icinga2%';

This will indicate which UUID represents the "active" version(s) of the product. Note that as-written, the query is only correct for single-org deployments. If you have multiple organizations being managed by Sat/CP, then the WHERE clause will need to be updated accordingly.

 
> I decided to ignore it for the moment and try the manual fix you suggested.
> But that failed:
> 
> candlepin=# INSERT INTO cp2_product_content(product_uuid, content_uuid)
> VALUES ('8a9084ac774028f001776470230f044c',
> '8a9084ac774028f00177647023180451'), ('8a9084ac774028f001776470230f044c',
> '8a9084ac774028f00177647022fd0445');
> ERROR:  null value in column "id" violates not-null constraint
> DETAIL:  Failing row contains (8a9084ac774028f001776470230f044c,
> 8a9084ac774028f00177647023180451, null, null, null, null).
> candlepin=#

My mistake here -- the build of CP you're on must be right after we changed the primary key on that table. Basically, just provide any unique ID for each row and it should work:

INSERT INTO cp2_product_content(id, product_uuid, content_uuid) VALUES 
  ('manual_fix-20210430-a', '8a9084ac774028f001776470230f044c','8a9084ac774028f00177647023180451'), 
  ('manual_fix-20210430-b', '8a9084ac774028f001776470230f044c','8a9084ac774028f00177647022fd0445');

Also, before applying this, it might be a good idea to verify UUIDs in question are the desired ones with the above query.

> 
> How would you like me to proceed? Should I enable debug logging and restart
> Candlepin, or is there something else you'd like me to do first?

Sure, debug logging can be turned on at any point. The important part is once the fix is applied to a product that has been shown to eventually break, the log needs to be either truncated or marked accordingly so we know exactly where to begin looking for anomalies.

My goal here is to get as many details as possible about what Candlepin is doing, and being told to do, between the point at which a product is in its desired state and the point at which it breaks.


> 
> Thanks again for all your help with this!

np, thanks for your patience and the data you've been providing.

Comment 53 caitslin 2021-05-03 14:38:10 UTC
@crog 

I validated those UUIDs are correct - I went ahead and applied the manual fix to the DB. I restarted Tomcat/Candlepin, and I logged into one of our affected hosts. I ran 'subscription-manager refresh' and noticed no changes to entitlement certificates. I listed attached products and noticed Icinga2 was still listed. So I thought maybe a remove/attach would fix that. I removed the subscription to the affected product:

# subscription-manager remove --pool=8a9084ac7706ede4017706efce4c001b
2 local certificates have been deleted.
The entitlement server successfully removed these pools:
   8a9084ac7706ede4017706efce4c001b
The entitlement server successfully removed these serial numbers:
   8089056837674183752
   2619436609108234225

I then attempted to re-attach:

# subscription-manager attach --pool=8a9084ac7706ede4017706efce4c001b
Runtime Error Null value was assigned to a property [class org.candlepin.model.ProductContent.enabled] of primitive type setter of org.candlepin.model.ProductContent.enabled at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException:167

Looking at the Candlepin log I see:

2021-05-03 10:26:22,708 [thread=http-bio-127.0.0.1-8443-exec-2] [req=80d1e13b-7bd0-401a-b74f-75d28db24df5, org=My_Awesome_Org, csid=] INFO  org.candlepin.service.impl.DefaultEntitlementCertServiceAdapter - Generating entitlement cert for pool: Pool [id=8a9084ac7706ede4017706efce4c001b, type=NORMAL, product=275076129832, productName=Icinga2, quantity=-1] quantity: 1 entitlement id: a76f3ed9090b460b9e50cf697f93e8af
2021-05-03 10:26:22,714 [thread=http-bio-127.0.0.1-8443-exec-2] [req=80d1e13b-7bd0-401a-b74f-75d28db24df5, org=My_Awesome_Org, csid=] INFO  org.candlepin.service.impl.DefaultEntitlementCertServiceAdapter - Creating X509 cert for product: Product [uuid: 8a9084ac774028f001776470230f044c, id: 275076129832, name: Icinga2]
2021-05-03 10:26:22,797 [thread=http-bio-127.0.0.1-8443-exec-2] [req=80d1e13b-7bd0-401a-b74f-75d28db24df5, org=My_Awesome_Org, csid=] WARN  org.hibernate.engine.loading.internal.LoadContexts - HHH000100: Fail-safe cleanup (collections) : org.hibernate.engine.loading.internal.CollectionLoadContext@118730a<rs=com.mchange.v2.c3p0.impl.NewProxyResultSet@5494132f [wrapping: null]>
2021-05-03 10:26:22,797 [thread=http-bio-127.0.0.1-8443-exec-2] [req=80d1e13b-7bd0-401a-b74f-75d28db24df5, org=My_Awesome_Org, csid=] WARN  org.hibernate.engine.loading.internal.CollectionLoadContext - HHH000160: On CollectionLoadContext#cleanup, localLoadingCollectionKeys contained [1] entries
2021-05-03 10:26:22,798 [thread=http-bio-127.0.0.1-8443-exec-2] [req=80d1e13b-7bd0-401a-b74f-75d28db24df5, org=My_Awesome_Org, csid=] ERROR org.candlepin.common.exceptions.mappers.CandlepinExceptionMapper - Runtime Error Null value was assigned to a property [class org.candlepin.model.ProductContent.enabled] of primitive type setter of org.candlepin.model.ProductContent.enabled at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException:167
org.hibernate.PropertyAccessException: Null value was assigned to a property [class org.candlepin.model.ProductContent.enabled] of primitive type setter of org.candlepin.model.ProductContent.enabled
        at org.hibernate.property.access.spi.SetterFieldImpl.set(SetterFieldImpl.java:47)
        at org.hibernate.tuple.entity.AbstractEntityTuplizer.setPropertyValues(AbstractEntityTuplizer.java:661)
        at org.hibernate.tuple.entity.PojoEntityTuplizer.setPropertyValues(PojoEntityTuplizer.java:206)
        at org.hibernate.persister.entity.AbstractEntityPersister.setPropertyValues(AbstractEntityPersister.java:5045)
        at org.hibernate.engine.internal.TwoPhaseLoad.doInitializeEntity(TwoPhaseLoad.java:243)
        at org.hibernate.engine.internal.TwoPhaseLoad.initializeEntity(TwoPhaseLoad.java:160)
        at org.hibernate.loader.Loader.initializeEntitiesAndCollections(Loader.java:1179)
        at org.hibernate.loader.Loader.processResultSet(Loader.java:1028)
        at org.hibernate.loader.Loader.doQuery(Loader.java:964)
        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:354)
        at org.hibernate.loader.Loader.doQueryAndInitializeNonLazyCollections(Loader.java:324)
        at org.hibernate.loader.Loader.loadCollection(Loader.java:2528)
        at org.hibernate.loader.collection.plan.LegacyBatchingCollectionInitializerBuilder$LegacyBatchingCollectionInitializer.initialize(LegacyBatchingCollectionInitializerBuilder.java:92)
        at org.hibernate.persister.collection.AbstractCollectionPersister.initialize(AbstractCollectionPersister.java:707)
        at org.hibernate.event.internal.DefaultInitializeCollectionEventListener.onInitializeCollection(DefaultInitializeCollectionEventListener.java:76)
        at org.hibernate.event.service.internal.EventListenerGroupImpl.fireEventOnEachListener(EventListenerGroupImpl.java:108)
        at org.hibernate.internal.SessionImpl.initializeCollection(SessionImpl.java:2145)
        at org.hibernate.collection.internal.AbstractPersistentCollection$4.doWork(AbstractPersistentCollection.java:589)
        at org.hibernate.collection.internal.AbstractPersistentCollection.withTemporarySessionIfNeeded(AbstractPersistentCollection.java:264)
        at org.hibernate.collection.internal.AbstractPersistentCollection.initialize(AbstractPersistentCollection.java:585)
        at org.hibernate.collection.internal.AbstractPersistentCollection.read(AbstractPersistentCollection.java:149)
        at org.hibernate.collection.internal.PersistentBag.iterator(PersistentBag.java:387)
        at org.candlepin.util.CollectionView.iterator(CollectionView.java:113)
        at org.candlepin.util.X509Util.filterProductContent(X509Util.java:95)
        at org.candlepin.util.X509V3ExtensionUtil.mapProduct(X509V3ExtensionUtil.java:325)
        at org.candlepin.util.X509V3ExtensionUtil.createProducts(X509V3ExtensionUtil.java:285)
        at org.candlepin.service.impl.DefaultEntitlementCertServiceAdapter.doEntitlementCertGeneration(DefaultEntitlementCertServiceAdapter.java:477)
        at org.candlepin.service.impl.DefaultEntitlementCertServiceAdapter.generateEntitlementCerts(DefaultEntitlementCertServiceAdapter.java:151)
        at org.candlepin.controller.EntitlementCertificateGenerator.generateEntitlementCertificates(EntitlementCertificateGenerator.java:119)
        at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:62)
        at org.candlepin.bind.HandleCertificatesOp.preProcess(HandleCertificatesOp.java:75)
        at org.candlepin.bind.BindChain.preProcess(BindChain.java:69)
        at org.candlepin.bind.BindChain.run(BindChain.java:108)
        at org.candlepin.controller.CandlepinPoolManager.createEntitlements(CandlepinPoolManager.java:1751)
        at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:62)
        at org.candlepin.controller.CandlepinPoolManager.entitleByPools(CandlepinPoolManager.java:1631)
        at com.google.inject.persist.jpa.JpaLocalTxnInterceptor.invoke(JpaLocalTxnInterceptor.java:70)
        at org.candlepin.controller.Entitler.bindByPoolQuantities(Entitler.java:143)
        at org.candlepin.controller.Entitler.bindByPoolQuantity(Entitler.java:121)
        at org.candlepin.resource.ConsumerResource.bind(ConsumerResource.java:2130)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:151)
        at org.jboss.resteasy.core.MethodInjectorImpl.lambda$invoke$3(MethodInjectorImpl.java:122)
        at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
        at java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:628)
        at java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:1996)
        at java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:110)
        at org.jboss.resteasy.core.MethodInjectorImpl.invoke(MethodInjectorImpl.java:122)
        at org.jboss.resteasy.core.ResourceMethodInvoker.internalInvokeOnTarget(ResourceMethodInvoker.java:594)
        at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTargetAfterFilter(ResourceMethodInvoker.java:468)
        at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invokeOnTarget$2(ResourceMethodInvoker.java:421)
        at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:363)
        at org.jboss.resteasy.core.ResourceMethodInvoker.invokeOnTarget(ResourceMethodInvoker.java:423)
        at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:391)
        at org.jboss.resteasy.core.ResourceMethodInvoker.lambda$invoke$1(ResourceMethodInvoker.java:365)
        at java.util.concurrent.CompletableFuture.uniComposeStage(CompletableFuture.java:995)
        at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:2137)
        at java.util.concurrent.CompletableFuture.thenCompose(CompletableFuture.java:110)
        at org.jboss.resteasy.core.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:365)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:477)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:252)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:153)
        at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:363)
        at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:156)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:238)
        at org.jboss.resteasy.plugins.server.servlet.ServletContainerDispatcher.service(ServletContainerDispatcher.java:249)
        at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:60)
        at org.jboss.resteasy.plugins.server.servlet.HttpServletDispatcher.service(HttpServletDispatcher.java:55)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:731)
        at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286)
        at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276)
        at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181)
        at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
        at org.candlepin.servlet.filter.EventFilter.doFilter(EventFilter.java:65)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at org.candlepin.common.filter.LoggingFilter.doFilter(LoggingFilter.java:125)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at org.candlepin.servlet.filter.CandlepinPersistFilter.doFilter(CandlepinPersistFilter.java:48)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at org.candlepin.servlet.filter.CandlepinScopeFilter.doFilter(CandlepinScopeFilter.java:68)
        at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:218)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:110)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:498)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:169)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:445)
        at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1091)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:637)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.IllegalArgumentException: Can not set boolean field org.candlepin.model.ProductContent.enabled to null value
        at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:167)
        at sun.reflect.UnsafeFieldAccessorImpl.throwSetIllegalArgumentException(UnsafeFieldAccessorImpl.java:171)
        at sun.reflect.UnsafeBooleanFieldAccessorImpl.set(UnsafeBooleanFieldAccessorImpl.java:80)
        at java.lang.reflect.Field.set(Field.java:764)
        at org.hibernate.property.access.spi.SetterFieldImpl.set(SetterFieldImpl.java:41)
        ... 102 common frames omitted
2021-05-03 10:26:22,799 [thread=http-bio-127.0.0.1-8443-exec-2] [req=80d1e13b-7bd0-401a-b74f-75d28db24df5, org=My_Awesome_Org, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Response: status=500, content-type="application/json", time=172


How should I proceed from here?

Thanks!

Comment 54 caitslin 2021-05-03 14:41:02 UTC
@crog 

Sorry for any additional noise, but when I submitted my last update the bug system reported:


Changes submitted for bug 1931027

    Email sent to:
        no one 


So, I just wanted to make sure someone saw that I posted an update above.

Comment 55 Chris "Ceiu" Rog 2021-05-03 17:23:49 UTC
(In reply to caitslin from comment #53)
> @crog 
> 
> I validated those UUIDs are correct - I went ahead and applied the manual
> fix to the DB. I restarted Tomcat/Candlepin, and I logged into one of our
> affected hosts. I ran 'subscription-manager refresh' and noticed no changes
> to entitlement certificates. I listed attached products and noticed Icinga2
> was still listed. So I thought maybe a remove/attach would fix that. I
> removed the subscription to the affected product:
> 

This part was fine, but apparently we need to also set the enabled flag explicitly. We'll also set the other "optional" fields just in case some other part of CP treats them like required fields:

UPDATE cp2_product_content SET enabled = true, created = now(), updated = now() WHERE id = 'ID USED TO CREATE ROW';

Then do a SELECT on the same ID to make sure all columns for that row are now populated.

Restart CP and then do your subman reattach to generate a new entitlement + cert and verify the content is present.
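
The "SELECT on the same ID" check above can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite table as a simplified stand-in for cp2_product_content (the real table lives in Candlepin's PostgreSQL database and may have additional columns), and the row ID and UUID values are hypothetical placeholders.

```python
import sqlite3

# Assumption: a simplified stand-in for cp2_product_content with only the
# columns mentioned in this bug. The real schema may differ.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute(
    "CREATE TABLE cp2_product_content ("
    " id TEXT PRIMARY KEY, product_uuid TEXT, content_uuid TEXT,"
    " enabled BOOLEAN, created TIMESTAMP, updated TIMESTAMP)"
)
# A manually inserted link row that only populated the link columns
# (hypothetical values).
conn.execute(
    "INSERT INTO cp2_product_content (id, product_uuid, content_uuid)"
    " VALUES ('row-1', 'prod-uuid', 'content-uuid')"
)


def null_columns(conn, row_id):
    """Return the names of the columns that are still NULL for one row."""
    row = conn.execute(
        "SELECT * FROM cp2_product_content WHERE id = ?", (row_id,)
    ).fetchone()
    return [name for name in row.keys() if row[name] is None]


before = null_columns(conn, "row-1")

# Same shape as the UPDATE above; SQLite uses 1 and CURRENT_TIMESTAMP
# where PostgreSQL would use true and now().
conn.execute(
    "UPDATE cp2_product_content"
    " SET enabled = 1, created = CURRENT_TIMESTAMP, updated = CURRENT_TIMESTAMP"
    " WHERE id = ?",
    ("row-1",),
)
after = null_columns(conn, "row-1")

print("NULL before:", before)
print("NULL after:", after)
```

Any column still reported as NULL after the UPDATE is a candidate for the same kind of PropertyAccessException shown in the stack trace, since the trace indicates that ProductContent.enabled is a primitive boolean field.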

Comment 56 caitslin 2021-05-04 16:20:45 UTC
@crog 

Thanks for the update. That resolved the issue. I have confirmed the Icinga2 repo is now available to my CentOS 8 and RHEL 8 content consumers. I will apply the same fix strategy to the CentOS 8 repos that "went missing" like Icinga2.

I will also enable DEBUG level logging for Candlepin and keep a close eye for any linkages that break.

Do we want to keep this ticket open, or should I necrobump it, or file a new one and link to this issue, should the issue appear again?

Thanks!

Comment 57 Chris "Ceiu" Rog 2021-05-04 17:36:49 UTC
Let's keep this BZ going for a bit. The issue seemed to crop up fairly quickly before, so hopefully that's what happens here if it's going to occur again. If this ends up going silent for a few weeks or longer, it might get closed during some triage passes. If that occurs, then just open a new BZ referencing this one when/if the issue comes up again.

Comment 61 Dirk Götz 2021-07-26 13:42:15 UTC
Just for your information: another user has reported the bug and provided a script to fix a system where the error occurred: https://community.theforeman.org/t/entitlement-certificate-not-containing-all-content/20764/3

Comment 62 Chris "Ceiu" Rog 2021-08-12 14:03:10 UTC
I would strongly recommend that any organization hitting this issue often enough to feel it necessary to use a script to semi-automate fixing the issue instead turn on debug logging [1] in Candlepin and provide logs from the period when a given product is created/fixed, and when it is noticed that the product has been broken.

We've been unable to reproduce locally under a number of environments and loads, so any information that can point us in the direction of the source of this problem would be greatly appreciated. 



[1] To turn on debug logging, open /etc/candlepin/candlepin.conf, and modify the entry for "log4j.logger.org.candlepin" to read "log4j.logger.org.candlepin=DEBUG". If no such entry exists, the line can be added to the end of the file. Once modified, restart tomcat for the change to take effect. To disable debug logging, either remove the line, or change the logging level to one of: INFO, WARN, or ERROR.
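
The edit described in [1] can be sketched as a small text transformation. This is a minimal illustration, not a supported tool: the function name is hypothetical, it works only on the file contents passed in as a string, and writing the result back to /etc/candlepin/candlepin.conf and restarting tomcat are left to the operator.

```python
def set_candlepin_log_level(conf_text, level="DEBUG"):
    """Return conf_text with log4j.logger.org.candlepin set to `level`.

    An existing entry is rewritten in place; if none exists, the line is
    appended to the end, as described in footnote [1].
    """
    key = "log4j.logger.org.candlepin"
    new_line = f"{key}={level}"
    lines = conf_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Match the exact key only, so more specific logger entries
        # (for example a sub-package) and commented-out lines are untouched.
        if line.split("=", 1)[0].strip() == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "candlepin.example.setting=1\nlog4j.logger.org.candlepin=INFO\n"
    print(set_candlepin_log_level(sample), end="")
```

Calling the function again with level="INFO" reverses the change, which corresponds to the "change the logging level" option for disabling debug logging.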

Comment 63 Marco Verschuur 2021-08-20 13:47:42 UTC
Hi Chris,

Can you please take a look in this support case: https://access.redhat.com/support/cases/#/case/02939104
I've explained my debug scenario and included candlepin debug logging.

We are again/still facing this issue and we need to get this fixed. We cannot keep using workarounds, so let's see if we can tackle this problem once and for all.

Comment 64 Chris "Ceiu" Rog 2021-08-20 16:21:44 UTC
Hi Marco,

Thanks for the info provided. I looked through the logs and I believe your issue is not related to product-content linkage, but, instead, is an issue we discovered with repos without an architecture being erroneously filtered (https://bugzilla.redhat.com/show_bug.cgi?id=1985360). This bug has been fixed and will be present in Candlepin 3.1.30 and newer, which will be shipped with Satellite 6.10+.

In the interim, you can try updating the product in question to explicitly list the arches for the clients you expect to register (x86_64, likely). I suspect by adding that and doing a subman refresh, the missing repos will show up again. Please give that a try and let me know how it goes.

Comment 65 Marco Verschuur 2021-08-20 17:49:08 UTC
(In reply to Chris "Ceiu" Rog from comment #64)

Chris,

Thanks for your swift response. I did configure "Restrict to Architecture" to x86_64, saved it, and performed the subman refresh on the client, but no luck; still no product.

Comment 66 Chris "Ceiu" Rog 2021-08-20 19:24:15 UTC
(In reply to Marco Verschuur from comment #65)

That being the case, could you please update the support case with the additional logging from candlepin.log for the additional actions performed? I'll take a look and see if it's as I'm expecting from the CP side of things. I'm thinking my instructions may have been lacking a necessary step or two to fully regenerate the cert once the product is fixed.

Comment 67 Marco Verschuur 2021-08-21 08:48:16 UTC
The new logs have been uploaded to the case.

I did try setting the OS Restriction as well and re-generating repo metadata, but that didn't help.

Comment 73 Jason Grantz 2022-04-20 04:01:04 UTC
https://access.redhat.com/support/cases/#/case/03200881

Triggered this on Red Hat Satellite (build: 6.10.3)

Comment 78 Paul Donohue 2022-07-01 17:42:34 UTC
Another case that ran into this on Satellite 6.10.5.1:
https://access.redhat.com/support/cases/#/case/03256890

Comment 80 Paul Donohue 2022-07-06 22:27:42 UTC
Some notes on my experience with this:

We upgraded from Satellite 6.9.something to Satellite 6.10.5.1 on 05/12/2022.  Hosts that were registered or refreshed on 05/12 shortly after the Satellite upgrade did not have any issues.
Between 05/13/2022 and 05/18/2022, we did not register or refresh any additional Hosts, but we did make a number of changes in Satellite, including re-importing the manifest, deleting some old repos, and adding some new repos.  Unfortunately, due to the number of changes we made, it is not easy for us to narrow down the specific sequence of events that broke things.  A review of the candlepin logs didn't find anything interesting/relevant between those dates.
Hosts that were registered or refreshed on or after 05/19/2022 had problems with a number of our custom repos.

We used the following to identify the affected repos:
[root@satellite ~]# sudo -u postgres psql candlepin
candlepin=# SELECT content.* from cp2_content content LEFT JOIN cp2_product_content pc on content.uuid = pc.content_uuid where pc.content_uuid IS NULL;
(This found 22 affected repos, out of the 46 custom repos we had configured.)
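
The query above is an anti-join: it keeps only the content rows that have no matching row in cp2_product_content. A self-contained sketch of the same pattern follows, assuming drastically simplified versions of both tables in an in-memory SQLite database with hypothetical sample rows (the real tables are in PostgreSQL and have more columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: minimal stand-ins for the two Candlepin tables, with
# made-up rows: one linked repo and one repo whose link row is missing.
conn.executescript(
    """
    CREATE TABLE cp2_content (uuid TEXT PRIMARY KEY, content_id TEXT, name TEXT);
    CREATE TABLE cp2_product_content (
        id TEXT PRIMARY KEY, product_uuid TEXT, content_uuid TEXT);
    INSERT INTO cp2_content VALUES
        ('c1', '1001', 'linked-repo'),
        ('c2', '1002', 'orphaned-repo');
    INSERT INTO cp2_product_content VALUES ('pc1', 'p1', 'c1');
    """
)

# LEFT JOIN keeps every content row; where no link row matched, the pc.*
# columns are NULL, so the WHERE clause selects exactly the unlinked content.
orphans = conn.execute(
    """
    SELECT content.content_id, content.name
    FROM cp2_content content
    LEFT JOIN cp2_product_content pc ON content.uuid = pc.content_uuid
    WHERE pc.content_uuid IS NULL
    """
).fetchall()

print(orphans)
```

The content_id values returned this way are the ones that get pasted into the Katello::RootRepository lookup in the next step.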

Based on a suggestion from Red Hat support, we tried the following:
[root@satellite ~]# foreman-rake console
irb> roots = Katello::RootRepository.where(content_id: ['1234', '5678', ...])   # Manually copy/paste all of the content_id values printed by the above query into this command
irb> roots.each{|root| ForemanTasks.async_task(::Actions::Candlepin::Product::ContentAdd, owner: root.product.organization.label, product_id: root.product.cp_id, content_id: root.content_id) }

That worked for 13 of the affected repos, but failed for 9 of the repos.
In the cases where it worked, candlepin appeared to automatically regenerate the entitlement certificates, so simply running `subscription-manager refresh` on the Hosts was enough to get those 13 repos working again on our systems.
In the cases where it failed, candlepin logged messages like this:
2022-07-01 11:22:07,980 [thread=http-bio-127.0.0.1-23443-exec-17] [req=66e5bb52-ed2f-45dd-ad3b-6fffb50fe688, org=, csid=] INFO  org.candlepin.common.filter.LoggingFilter - Request: verb=POST, uri=/candlepin/owners/X/products/Y/content/Z?enabled=true
2022-07-01 11:22:07,993 [thread=http-bio-127.0.0.1-23443-exec-17] [req=66e5bb52-ed2f-45dd-ad3b-6fffb50fe688, org=, csid=] WARN  org.hibernate.engine.jdbc.spi.SqlExceptionHelper - SQL Error: 0, SQLState: 23505
2022-07-01 11:22:07,993 [thread=http-bio-127.0.0.1-23443-exec-17] [req=66e5bb52-ed2f-45dd-ad3b-6fffb50fe688, org=, csid=] ERROR org.hibernate.engine.jdbc.spi.SqlExceptionHelper - ERROR: duplicate key value violates unique constraint "cp2_product_entity_version"
  Detail: Key (product_id, entity_version)=(1234, -4321) already exists.

We then tried using the script from https://community.theforeman.org/t/entitlement-certificate-not-containing-all-content/20764/3 to fix the remaining 9 repos.  This script ran successfully and populated the cp2_product_content table as expected, but candlepin did not automatically regenerate the entitlement certificates after running this, so our Hosts still did not see these repos.  We eventually ended up forcing candlepin to regenerate all entitlement certificates by running:
[root@satellite ~]# sudo -u postgres psql candlepin
candlepin=# UPDATE cp_entitlement SET dirty='t';

Finally, we used a Satellite Job to run `subscription-manager refresh` on all of our Hosts, and everything seemed to work properly again.

Comment 81 Timo Alatalo 2022-07-18 06:07:08 UTC
We have the same kind of issue that Paul described in Comment #80, with Satellite 6.10.7.

https://access.redhat.com/support/cases/#/case/03264411

In our case the select from the candlepin database (mentioned in Comment #80) found 14 custom repos, and we were able to fix a few of them with the ContentAdd task. However, with 7 repos the ContentAdd task paused with the same kind of SQL error (duplicate key value violates unique constraint "cp2_product_entity_version"). We didn't try to use the script https://community.theforeman.org/t/entitlement-certificate-not-containing-all-content/20764/3 (we are waiting for suggestions from Red Hat).

Comment 83 Jonathan Sattelberger 2022-07-20 17:03:33 UTC
I am not entirely sure what happened, but this morning a good portion of custom repos were missing from RHEL (7Server and 7Workstation, 8), CentOS Linux 7, and Stream 8 installations. About 20 different repositories are affected, across 5 different OS releases and 5 different composite content views. Content views were published two days ago without issue. Unused content views were removed yesterday morning to conserve space. Machines in the Default Organizational View are also unable to see unfiltered repositories. Those started complaining around 7:00 pm yesterday evening; the general population followed around 8:00 am this morning. We are waiting for suggestions from Red Hat.

Comment 84 Chris "Ceiu" Rog 2022-07-20 18:50:30 UTC
(In reply to Jonathan Sattelberger from comment #83)
> I am not entirely sure what happened, but this morning a good portion of
> custom repos were missing from RHEL (7Server and 7Workstation, 8), CentOS
> Linux 7, and Stream 8 installations this morning. About 20 different
> repositories across 5 different OSes releases, and 5 different composite
> content views. Content views were published two days ago without issue.
> Unused content views were removed yesterday morning to conserve space.
> Machines in the Default Organizational View are also unable to see
> unfiltered repositories. Those started complaining around 7:00 pm yesterday
> evening. The general population around 8:00 am this morning. We are waiting
> for suggestions from Red Hat.

If you haven't already, please open an issue that includes Candlepin logs (a foreman-debug dump or sosreport should include this) between when the product was last known to be in a good or functional state, and 7p or 8a when users started noticing the issue. I'm hoping you caught it quickly enough that we can finally get a set of logs that go end-to-end and we can build a reproducer on our end. If you don't have logs available or are otherwise unable to open an issue, can you investigate and report what major write operations were performed between the last known time the product was in a functional state, and when content went missing? Specifically I'm looking for things like: manifest imports, or create/update/delete operations on content views, or custom products.

This also applies to others who have reported above, or may report in the future. At present the critical info we're seeking is the common operations occurring between good and bad states.

As an aside, and perhaps a light at the end of this exceedingly long tunnel, we did notice and fix an issue in the CP4 branch that could lead to objects being erroneously unlinked under specific circumstances during a manifest import or refresh operation. However, without a reproducer for what has been reported here, it's too early to say with any certainty if it was the cause of this issue.

Comment 85 Jonathan Sattelberger 2022-07-20 22:35:44 UTC
(In reply to Chris "Ceiu" Rog from comment #84)
> If you haven't already, please open an issue that includes Candlepin logs (a
> foreman-debug dump or sosreport should include this) between when the
> product was last known to be in a good or functional state, and 7p or 8a
> when users started noticing the issue. I'm hoping you caught it quickly
> enough that we can finally get a set of logs that go end-to-end and we can
> build a reproducer on our end. If you don't have logs available or are
> otherwise unable to open an issue, can you investigate and report what major
> write operations were performed between the last known time the product was
> in a functional state, and when content went missing? Specifically I'm
> looking for things like: manifest imports, or create/update/delete
> operations on content views, or custom products.

https://access.redhat.com/support/cases/#/case/03271498

Created support case for Red Hat Satellite 6.11.0.

The [composite] content views I removed were a few months old, probably before the pulp2 to pulp3 migration, or perhaps just after migration.

Comment 86 Paul Donohue 2022-07-21 16:55:15 UTC
I've uploaded complete candlepin logs from 5/12-5/19 (before last known good until after last known bad) to https://access.redhat.com/support/cases/#/case/03256890
Unfortunately, as I mentioned in Comment #80 above, we did a lot of stuff in that time window, so our logs may not be very useful for reproducing this.

For reference:
We have been using Satellite 6 since the beginning (6.0), but my comment above was the first/only time we've run into this.  We use custom repositories and content views rather heavily, so refreshing manifests, create/update/delete/publish/promote operations on Content Views, and create/update operations on Custom Products are things we do fairly regularly.  While we did do those things in the time window when this broke, I believe we have also done all of them again since then with no issues.
The one thing we don't do very often is delete Custom Repos or Custom Products (we usually just abandon them, and leave them as-is), but we did delete some within the window where this broke.  That was probably the first time we had deleted any in a few years, and we haven't deleted any since it broke.  So, anecdotally, it seems likely that deleting a Custom Repo or Custom Product is probably what broke this in our case.

Comment 87 Paul Donohue 2022-07-21 23:05:30 UTC
I just tried deleting a few more Custom Repos and Custom Products (including removing them from Content Views, then re-publishing/promoting the Content Views, and deleting old Versions), but it did not reproduce the problem.  So, it seems like it's more complicated/subtle than just performing a specific operation.

Comment 88 Chris "Ceiu" Rog 2022-07-22 20:27:07 UTC
(In reply to Paul Donohue from comment #87)
> I just tried deleting a few more Custom Repos and Custom Products (including
> removing them from Content Views, then re-publishing/promoting the Content
> Views, and deleting old Versions), but it did not reproduce the problem. 
> So, it seems like it's more complicated/subtle than just performing a
> specific operation.

Yeah, this observation is pretty common in the customers we've seen report the issue and successfully work around it. Seems once the issue is fixed, it doesn't come up again.

For those of you who've hit this, did you happen to perform a Satellite (or Candlepin) update between the time the custom products were created and when they were found to be broken?

Comment 89 Paul Donohue 2022-07-23 16:49:49 UTC
> did you happen to perform a Satellite (or Candlepin) update between the time the custom products were created and when they were found to be broken?

For me, yes.  The repos I deleted (both when this broke, and when I deleted a few more repos a couple days ago without breaking anything) were all very old (probably 4-6 years old) and have been through many Satellite upgrades.

Comment 96 Nicolas 2022-08-16 15:51:16 UTC
I had the same problems as Paul Donohue, but it was after the 6.11 upgrade. 
His solution posted on 2022-07-06 22:27:42 UTC worked. 

Thank you Paul !

Comment 99 Jan Vanmullem 2022-09-20 12:51:39 UTC
We also encountered the same problem after an upgrade to 6.10 (rh support case 03312857).
The solution posted in comment #80 has also worked for us.

Thanks Paul!

Comment 100 Bryan Kearney 2022-10-05 20:02:37 UTC
Upstream bug assigned to chrobert

Comment 101 hakon.gislason 2022-10-06 15:29:55 UTC
The solution posted in comment #80 fixed 17 out of 19 repositories for us. Alexey, do you want some info on the 2 repositories that were not fixed by the script?

Comment 103 Bryan Kearney 2022-10-18 16:02:37 UTC
Moving this bug to POST for triage into Satellite since the upstream issue https://projects.theforeman.org/issues/35599 has been resolved.

Comment 104 Pavel Moravec 2022-10-27 10:10:32 UTC
Some TL;DR summary:

- there is no known reproducer of the problem; Red Hat has only some cloned customer Satellites that demonstrate the already-present problem, but the steps leading to that state are unknown
  - the linked PR only attempts to alleviate/fix the consequences; it does not fix the root cause
- it is not version specific: every version from 6.8 through 6.11 seems to be affected
  - some customers reported that multiple instances happened after the upgrade to 6.10 (but the 6.9->6.10 migration is rather pulp-centric and not katello/candlepin-centric, so this might be a random coincidence?)
- only custom repos are affected
- the bug often (only?) happens for repos cloned into content views
- the bug often (only?) happens for repos with a GPG key (after it is changed?)


I will work with one customer to focus on finding a *reproducer*.

Comment 105 Paul Donohue 2022-10-27 16:20:08 UTC
If it helps, I still have full disk snapshots of my Satellite system from both before and after the problem appeared (from both before and after my upgrade on 05/12/2022, and from 05/18/2022, relative to the timeline in Comment #80).  In theory, between the logs and the Task history in Satellite in the "after" snapshot, someone could probably reconstruct the sequence of events and reproduce it from the "before" snapshot.  However, as noted above, it is possible we made too many changes for this to be practical to reconstruct manually.  I intended to attempt this myself, but I have not yet found the time to do it. If someone from Red Hat wants to try, I can share the data.

Comment 106 Charles Slivkoff 2022-10-28 19:00:15 UTC
RH Support pointed me to this script to correct the DB inconsistency:

https://raw.githubusercontent.com/ATIX-AG/orcharhino-scripts/main/find_missing_candlepin_product_contents/find_missing_candlepin_product_contents.rb



This seems to have fixed the issue in my case whereas the solution in Comment #80 did not.  

I also needed to `subscription-manager clean` and re-register to actually get the updated entitlement certificates.


We encountered the problem on 1 custom repository (Oracle 8 Instant Client) only.
This is being used in content-views and does have a GPG key associated with it.

Upstream URL: https://yum.oracle.com/repo/OracleLinux/OL8/oracle/instantclient21/x86_64/

Comment 107 Pavel Moravec 2022-11-07 14:57:58 UTC
Hello,
I have a reproducer that I am still simplifying and finding its mandatory parts and gotchas.

TL;DR idea is to concurrently modify multiple repos - if some repos are from the same product, then either repo can lose its relation to its product within candlepin. Then, when candlepin generates new certificates for the pool / for the product, it does not see this repo within the product and omits it.

I will provide a precise reproducer within a day, but the basics are: 10 products with 2 repos each, concurrently set any/all of GPG key, arch and/or version limitation. Do it a few times in a row, and check whether some repo has been forgotten (either via subscription-manager refresh on a client and checking enabled repos, or directly):


su - postgres -c "psql candlepin -c \"SELECT owner.owner_id, product.uuid, product.name, product.product_id, content.uuid, content.name, content.content_id FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid INNER JOIN cp2_owner_content AS owner ON owner.content_uuid = content.uuid WHERE content.vendor = 'Custom' AND product.product_id IS NULL;\""


I am not sure if this is the *only* reproducer of that bug, as I saw some customer set-ups that do not directly follow this scenario.
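
For repeated checks, the direct query above could be wrapped in a small shell helper. This is only a sketch: the function names orphan_query, count_orphans and check_orphans are made up for illustration, the SELECT list is reduced to the content columns, and it assumes (as the command above does) that the "candlepin" database is local and reachable as the postgres user.

```bash
# Hypothetical helper names; the SQL is the query from this comment,
# reduced to the content columns.

# Print the SQL that lists custom contents with no product linkage.
orphan_query() {
  cat <<'SQL'
SELECT content.uuid, content.name, content.content_id
FROM cp2_content content
LEFT JOIN (cp2_product_content pc
           JOIN cp2_products product ON product.uuid = pc.product_uuid)
       ON pc.content_uuid = content.uuid
INNER JOIN cp2_owner_content AS owner ON owner.content_uuid = content.uuid
WHERE content.vendor = 'Custom' AND product.product_id IS NULL;
SQL
}

# Count the non-empty rows of tuples-only, unaligned psql output on stdin.
count_orphans() {
  grep -c . || true
}

# Run the query against the local candlepin database and print the row count.
check_orphans() {
  orphan_query | su - postgres -c "psql candlepin -t -A" | count_orphans
}
```

Calling check_orphans prints the number of orphaned custom contents; anything other than 0 means at least one content has lost its product.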

Comment 108 Pavel Moravec 2022-11-07 17:15:57 UTC
Created attachment 1922803 [details]
candlepin standalone reproducer

Standalone candlepin reproducer that can be run on a Satellite deployment.

It assumes an owner and environment exist (configurable), and assumes the "candlepin" postgres is local and candlepin is listening on localhost:23443.

The script performs:
1) setup phase, where it creates 10 products (and their pools), and 2 contents for each product (and associates them to the product and environment)

2) reproducer phase, where it concurrently updates all 20 contents by assigning some gpg key and/or arch and/or requiredTag. Once assigned, it prints linkages between products and contents (that all have an owner), AND also contents that have no product - if such content exists, script terminates.


Some more observations:
- content updates *must* be done concurrently
- very first iteration has *never* reproduced the issue, so far (I *think* a subsequent content update after a concurrent one does the break..?)
- update of two contents in one product is sufficient reproducer - just due to concurrency and randomness, you need more iterations to hit the bug
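
The "must be done concurrently" part can be sketched with plain shell job control. This is not the attached reproducer; run_concurrently and update_content are hypothetical names, and update_content is only a placeholder to be replaced with whatever call really modifies one content (gpg key, arch, requiredTag).

```bash
# Placeholder (hypothetical): replace the body with the real update of one content.
update_content() {
  echo "would update content $1"
}

# Run the given command once per remaining argument, all in parallel,
# then wait until every background job has finished.
run_concurrently() {
  local cmd=$1
  shift
  local id
  for id in "$@"; do
    "$cmd" "$id" &
  done
  wait
}

# Example: update both contents of one product at the same time.
# run_concurrently update_content 1668028785140 1668028785402
```

Because the outcome depends on timing, a single call is unlikely to be enough; the call would have to be repeated in a loop, as the observations above suggest.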

Enjoy!

Comment 109 Pavel Moravec 2022-11-07 21:37:59 UTC
Created attachment 1922893 [details]
improved standalone candlepin reproducer

The previous script version had a few issues, all of which are fixed now. Usage and description are the same.

Comment 110 Pavel Moravec 2022-11-09 21:37:11 UTC
Created attachment 1923432 [details]
reproducer with manual client

The previous reproducer just managed to break some records in a particular place in postgres, BUT didn't end up with missing content on clients - when running alone.

The key is, I was always testing the missing repositories on clients using a real client registered to the Satellite/candlepin, and running sub-man refresh (and inspecting /etc/yum.repos.d/redhat.repo). THIS IS A MANDATORY STEP.

Attaching a further improved reproducer:

0) set candlepin to call OrphanCleanupJob every 5 minutes, not once per week (*just* for step 7, not mandatory):
# grep OrphanCleanupJob /etc/candlepin/candlepin.conf
candlepin.async.jobs.OrphanCleanupJob.schedule=0 0/5 * * * ?
#
1) have an organization and some default environment created in candlepin.
2) have a client system registered to the candlepin, within that Org/Env.
3) setup the reproducer and run a few times (once might be enough):

./bz1931027_reproducer.sh SOMELABEL 1 5 alter

Options stand for:
- SOMELABEL = unique label for products and repos/contents
- 1 = run setup phase
- 5 = # of iterations of "concurrently modify all repos/content"
- "alter" = do alter gpg/arch/requiredTags to some values, if empty, then all gpg/arch/requiredTags is reset to empty string for all repos

4) Meanwhile, regularly check the enabled custom repos on the client:
while true; do date; subscription-manager refresh ; grep name /etc/yum.repos.d/redhat.repo  | grep SOMELABEL | sort | nl; sleep 5; done

5) reset gpg/arch/requiredTags to empty, to ensure no repo is hidden due to different arch or similar:

./bz1931027_reproducer.sh SOMELABEL 0 1

6) THIS will already raise errors:
Verifying product-content linkage...
Checking product-content pair: PROD_SOMELABEL_1 (17571734123340) => 1668028785140...	OK!
Checking product-content pair: PROD_SOMELABEL_1 (17571734123340) => 1668028785402...	OK!
Checking product-content pair: PROD_SOMELABEL_2 (17952571229670) => 1668028785814...	OK!
Checking product-content pair: PROD_SOMELABEL_2 (17952571229670) => 1668028786054...	OK!
Checking product-content pair: PROD_SOMELABEL_3 (162623143519709) => 1668028786439...	FAILED!
  Content 1668028786439 is not linked to product PROD_SOMELABEL_3 (162623143519709)!
Checking product-content pair: PROD_SOMELABEL_3 (162623143519709) => 1668028786676...	OK!
Checking product-content pair: PROD_SOMELABEL_4 (12833101834903) => 1668028787086...	FAILED!
  Content 1668028787086 is not linked to product PROD_SOMELABEL_4 (12833101834903)!
Checking product-content pair: PROD_SOMELABEL_4 (12833101834903) => 1668028787344...	OK!
Checking product-content pair: PROD_SOMELABEL_5 (29911971016007) => 1668028787728...	OK!
Checking product-content pair: PROD_SOMELABEL_5 (29911971016007) => 1668028787978...	FAILED!
  Content 1668028787978 is not linked to product PROD_SOMELABEL_5 (29911971016007)!
Checking product-content pair: PROD_SOMELABEL_6 (11599906726345) => 1668028788347...	OK!
Checking product-content pair: PROD_SOMELABEL_6 (11599906726345) => 1668028788618...	OK!
Checking product-content pair: PROD_SOMELABEL_7 (216791790320486) => 1668028789025...	OK!
Checking product-content pair: PROD_SOMELABEL_7 (216791790320486) => 1668028789269...	OK!
Checking product-content pair: PROD_SOMELABEL_8 (191791492911651) => 1668028789594...	OK!
Checking product-content pair: PROD_SOMELABEL_8 (191791492911651) => 1668028789782...	OK!
Checking product-content pair: PROD_SOMELABEL_9 (49661447114529) => 1668028790135...	OK!
Checking product-content pair: PROD_SOMELABEL_9 (49661447114529) => 1668028790394...	FAILED!
  Content 1668028790394 is not linked to product PROD_SOMELABEL_9 (49661447114529)!
Checking product-content pair: PROD_SOMELABEL_10 (1162016468220) => 1668028790774...	OK!
Checking product-content pair: PROD_SOMELABEL_10 (1162016468220) => 1668028790983...	OK!

7) Wait 5 minutes for OrphanCleanupJob to run, and check that the postgres queries also return missing entries in the cp2_product_content table (an observed but not key fact, which I misleadingly focused on):

str=SOMELABEL
(
  echo "SELECT product.uuid, product.name, product.product_id, content.uuid, content.name, content.content_id FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid WHERE content.contenturl LIKE '%${str}%';"
  echo "SELECT owner.owner_id, product.uuid, product.name, product.product_id, content.uuid, content.name, content.content_id FROM cp2_content content LEFT JOIN (cp2_product_content pc JOIN cp2_products product ON product.uuid = pc.product_uuid) ON pc.content_uuid = content.uuid INNER JOIN cp2_owner_content AS owner ON owner.content_uuid = content.uuid WHERE content.contenturl LIKE '%${str}%' AND product.product_id IS NULL;"
) | su - postgres -c "psql candlepin"

               uuid               |       name        |   product_id    |               uuid               |           name           |  content_id   
----------------------------------+-------------------+-----------------+----------------------------------+--------------------------+---------------
 8aac0157845b767c01845e45296d1437 | PROD_SOMELABEL_10 | 1162016468220   | 8aac0157845b767c01845e4419b91323 | PROD_10-CONT_SOMELABEL_1 | 1668028790774
 8aac0157845b767c01845e441988130a | PROD_SOMELABEL_4  | 12833101834903  | 8aac0157845b767c01845e4419791303 | PROD_4-CONT_SOMELABEL_2  | 1668028787344
 8aac0157845b767c01845e44198c130e | PROD_SOMELABEL_5  | 29911971016007  | 8aac0157845b767c01845e44196c1301 | PROD_5-CONT_SOMELABEL_1  | 1668028787728
 8aac0157845b767c01845e445bc01334 | PROD_SOMELABEL_7  | 216791790320486 | 8aac0157845b767c01845e4419781302 | PROD_7-CONT_SOMELABEL_1  | 1668028789025
 8aac0157845b767c01845e445bc01334 | PROD_SOMELABEL_7  | 216791790320486 | 8aac0157845b767c01845e4419881308 | PROD_7-CONT_SOMELABEL_2  | 1668028789269
 8aac0157845b767c01845e441989130c | PROD_SOMELABEL_3  | 162623143519709 | 8aac0157845b767c01845e44196a1300 | PROD_3-CONT_SOMELABEL_2  | 1668028786676
 8aac0157845b767c01845e445b70132b | PROD_SOMELABEL_2  | 17952571229670  | 8aac0157845b767c01845e44196912ff | PROD_2-CONT_SOMELABEL_1  | 1668028785814
 8aac0157845b767c01845e445b70132b | PROD_SOMELABEL_2  | 17952571229670  | 8aac0157845b767c01845e44195512f9 | PROD_2-CONT_SOMELABEL_2  | 1668028786054
 8aac0157845b767c01845e45294a1431 | PROD_SOMELABEL_8  | 191791492911651 | 8aac0157845b767c01845e4419c31327 | PROD_8-CONT_SOMELABEL_2  | 1668028789782
 8aac0157845b767c01845e4419981315 | PROD_SOMELABEL_9  | 49661447114529  | 8aac0157845b767c01845e4419881309 | PROD_9-CONT_SOMELABEL_1  | 1668028790135
 8aac0157845b767c01845e452922142e | PROD_SOMELABEL_1  | 17571734123340  | 8aac0157845b767c01845e44195e12fa | PROD_1-CONT_SOMELABEL_2  | 1668028785402
 8aac0157845b767c01845e4529571434 | PROD_SOMELABEL_6  | 11599906726345  | 8aac0157845b767c01845e44199c131a | PROD_6-CONT_SOMELABEL_1  | 1668028788347
                                  |                   |                 | 8aac0157845b767c01845e4419b11322 | PROD_9-CONT_SOMELABEL_2  | 1668028790394
                                  |                   |                 | 8aac0157845b767c01845e44192a12f8 | PROD_1-CONT_SOMELABEL_1  | 1668028785140
                                  |                   |                 | 8aac0157845b767c01845e4419801307 | PROD_5-CONT_SOMELABEL_2  | 1668028787978
                                  |                   |                 | 8aac0157845b767c01845e4419901311 | PROD_3-CONT_SOMELABEL_1  | 1668028786439
                                  |                   |                 | 8aac0157845b767c01845e44196912fe | PROD_6-CONT_SOMELABEL_2  | 1668028788618
                                  |                   |                 | 8aac0157845b767c01845e4419ad131f | PROD_8-CONT_SOMELABEL_1  | 1668028789594
                                  |                   |                 | 8aac0157845b767c01845e44192912f7 | PROD_4-CONT_SOMELABEL_1  | 1668028787086
                                  |                   |                 | 8aac0157845b767c01845e44198c1310 | PROD_10-CONT_SOMELABEL_2 | 1668028790983
(20 rows)

             owner_id             | uuid | name | product_id |               uuid               |           name           |  content_id   
----------------------------------+------+------+------------+----------------------------------+--------------------------+---------------
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e4419b11322 | PROD_9-CONT_SOMELABEL_2  | 1668028790394
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e44192a12f8 | PROD_1-CONT_SOMELABEL_1  | 1668028785140
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e4419801307 | PROD_5-CONT_SOMELABEL_2  | 1668028787978
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e4419901311 | PROD_3-CONT_SOMELABEL_1  | 1668028786439
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e44196912fe | PROD_6-CONT_SOMELABEL_2  | 1668028788618
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e4419ad131f | PROD_8-CONT_SOMELABEL_1  | 1668028789594
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e44192912f7 | PROD_4-CONT_SOMELABEL_1  | 1668028787086
 8aac0157821faa5e01821fb392f80001 |      |      |            | 8aac0157845b767c01845e44198c1310 | PROD_10-CONT_SOMELABEL_2 | 1668028790983
(8 rows)

(the first output has been asked around #c40, the second output is a list of orphaned contents with no linkage to a product (but still owned by the owner))

8) Adding the orphaned contents back to their products via:

curl -H "Content-Type:application/json" -X POST -u admin:admin -ks "${BASE_URL}/candlepin/owners/${OWNER}/products/${prod_id}/content/${cont_id}?enabled=true"

(for proper pairs of prod_id and cont_id, per the original pairing from product_content_id.txt) does work around the bug - THIS is the ::Actions::Candlepin::Product::ContentAdd task in the workaround in https://access.redhat.com/solutions/5960501
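
Steps 6) and 8) could be glued together roughly as follows. This is a sketch, not part of the attached script: failed_pairs and relink_pairs are hypothetical names, the parsing relies on the exact "is not linked to product" wording shown in step 6), and BASE_URL, OWNER and the admin:admin credentials are taken from the curl command above and will differ per deployment. By default nothing is sent; the requests are only printed.

```bash
# Extract "<product_id> <content_id>" pairs from the verification output on stdin.
failed_pairs() {
  sed -n 's/^ *Content \([0-9][0-9]*\) is not linked to product .* (\([0-9][0-9]*\))!$/\2 \1/p'
}

# Read "<product_id> <content_id>" pairs from stdin and re-link each pair.
# Mode "run" sends the POST from step 8); any other mode only prints it.
# BASE_URL and OWNER must be set by the caller.
relink_pairs() {
  local mode=${1:-dry-run} prod_id cont_id url
  while read -r prod_id cont_id; do
    url="${BASE_URL}/candlepin/owners/${OWNER}/products/${prod_id}/content/${cont_id}?enabled=true"
    if [ "$mode" = "run" ]; then
      curl -H "Content-Type:application/json" -X POST -u admin:admin -ks "$url"
    else
      echo "POST $url"
    fi
  done
}

# Example (dry run), with the step 6) output saved to verify.log:
# failed_pairs < verify.log | relink_pairs
```

Reviewing the dry-run output first seems prudent, since a wrong product/content pairing would attach a repo to the wrong product.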



Still, the reproducer is needlessly complex. It should be simplified to *something* like:
- have a few products with 2+ repos/contents in each
- do some real updates of repos/contents (not a no-op change)
- meanwhile, sub-man refresh to regenerate client cert - THIS is the crucial point
- reset updates to repos/contents just to ensure all repos/contents should be visible on the client

I will try to simplify the reproducer and also prepare either Satellite-only reproducer or candlepin-only reproducer (incl. mimicking a client via API).
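
Instead of eyeballing the loop from step 4), the client-side check can be done as a before/after comparison of /etc/yum.repos.d/redhat.repo. A minimal sketch, assuming the repo names contain the label passed to the reproducer; repo_names and missing_repos are names invented here.

```bash
# Print the sorted "name = ..." values from a yum repo file that contain LABEL.
repo_names() {
  local file=$1 label=$2
  grep '^name' "$file" | grep -- "$label" | sed 's/^name *= *//' | sort
}

# Print names present in the BEFORE snapshot but absent from the AFTER one.
# Both files must be sorted, which repo_names already guarantees.
missing_repos() {
  comm -23 "$1" "$2"
}

# Example:
#   repo_names /etc/yum.repos.d/redhat.repo SOMELABEL > before.txt
#   ... run the reproducer, then "subscription-manager refresh" on the client ...
#   repo_names /etc/yum.repos.d/redhat.repo SOMELABEL > after.txt
#   missing_repos before.txt after.txt
```

Any line printed by missing_repos is a repo the client has lost. The "after" snapshot should be taken once gpg/arch/requiredTags are reset, so that no repo is hidden for a legitimate reason.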

Comment 112 Pavel Moravec 2022-11-12 13:54:31 UTC
An update about the reproducer status:

Some good news first:
we have a reproducer confirmed also by engineering. It is based on the below scenario:
1) multiple repositories within the same product are updated *concurrently*
2) the weekly candlepin internal job OrphanCleanupJob, triggered every Sunday morning, breaks product<->repository relations in candlepin, as a consequence of 1)
3) once a client system calls "subscription-manager refresh" or re-registers itself, the client loses track of the repositories that lost their relation to their product in candlepin
4) a user detects the problem any time later

The tricky part is there can be long delays between these steps:
- up to 7 days delay between 1) and 2)
- an arbitrary delay between 2) and 3) (when candlepin is already doomed but clients still use previously cached information about repositories - until they fetch the current one via sub-man refresh)
- again an arbitrary delay between 3) and 4)

The sequence of random delays between mandatory steps of the reproducer sometimes makes it hard to confirm if a customer hit this scenario or not.


The bad news is, there are reported cases where we *know* that scenario did not happen. I.e. logs never contain concurrent modifications of repos.

That means my reproducer must be just one of several independent reproducers that lead to the same bug / to the same symptoms. Now engineering is investigating the root cause of the bug, also in order to find out the other reproducer(s), that would fit the remaining customer data.

If we succeed in finding that reproducer, it will be enough to fix the one common root cause. If we do not succeed, we will (so far) be able to fix only some of the paths that lead to the same symptoms described in the bug.

Comment 115 Paul Donohue 2022-11-22 18:57:09 UTC
This just happened to me again, and I know the exact sequence of user events this time (because it just so happened that I personally made the changes, and I remember what I was doing at the time).

* Satellite 6.11.4
* We had an existing "EPEL" product with "EPEL6", "EPEL7", and "EPEL8" repos configured with "Mirroring Policy" set to "Additive".
* 11/08/2022:
* 20:06:52 - I created a new "EPEL9" repo matching the other existing repos except without a GPG key.  Tasks completed within a few seconds.
* 20:08:27 - I updated the EPEL9 repo with the GPG key.  Task completed in <1 second.
* 20:08:38 - Started sync for EPEL9.
* Then I decided to change the "Mirroring Policy" from "Additive" to "Content Only" on all of our EPEL repos.
* 20:10:57 - Changed "Mirroring Policy" on RHEL6.  Update task completed in 2 seconds, metadata generate task ran for another 2 minutes.
* 20:11:03 - Changed "Mirroring Policy" on RHEL7.  Update task completed in 3 seconds, metadata generate task ran for another 4 minutes.
* 20:11:10 - Changed "Mirroring Policy" on RHEL8.  Update task completed in 2 seconds, metadata generate task ran for another 2 minutes.
* 20:11:16 - Attempted to change "Mirroring Policy" on EPEL9, but this failed due to running sync task.
* After a while, the EPEL9 sync task seemed to get stuck.  DynFlow still showed that "sync.downloading.artifacts" and "sync.parsing.packages" were running, but Pulp appeared to be idle and nothing changed in DynFlow for about 10 minutes.
* 20:52:37 - Canceled EPEL9 sync task through the Satellite Task UI.
* 20:53:45 - Started another EPEL9 sync.
* 21:17:08 - EPEL9 sync completed successfully.
* 21:18:29 - Changed "Mirroring Policy" on EPEL9.  Update task completed in 2 seconds, metadata generate task ran for another 38 seconds.
* 11/13/2022 22:00:39 - "Remove orphans" job ran.
* Somewhere in there, the EPEL6 and EPEL7 repos were broken in candlepin.  EPEL8 and EPEL9 were not broken.
* 11/17/2022 06:14:26 - Client system problems with EPEL6/EPEL7 started appearing.

Note that the repo updates were closely timed, but as far as I can tell they were NOT concurrent.

Let me know if you want further details.  As before, I have logs and disk snapshots, although there is nothing interesting in the candlepin logs.

Comment 116 Pavel Moravec 2022-11-22 20:09:53 UTC
(In reply to Paul Donohue from comment #115)

Hello,
this sounds very interesting, thanks for the details. Half of these activities were not related to candlepin so we can rule them out (I can elaborate if requested). Some other customers also modified some/many repos from the same product *sequentially* but in a bulk action, so that should be a hint (anyway, I tried mimicking this many times without luck..).

I will ask you for logs and some more data via mail.

Comment 117 Paul Donohue 2022-11-22 20:21:57 UTC
Small correction to the above: EPEL6, EPEL7, and EPEL8 were all broken.  EPEL9 was not broken.

Comment 119 Vincent S. Cojot 2022-12-02 20:32:09 UTC
Hitting this on 6.11.4.1 after upgrade from 6.9.10.
Suddenly some custom products would no longer show up for client systems RHEL7 and RHEL8.
Found this BZ and https://community.theforeman.org/t/entitlement-certificate-not-containing-all-content/20764/3

I downloaded https://raw.githubusercontent.com/ATIX-AG/orcharhino-scripts/main/find_missing_candlepin_product_contents/find_missing_candlepin_product_contents.rb

For the script to work on RHEL7.9 + 6.11.4.1, I had to patch it like this:

[root@sat6 ~]# diff -u find_missing_candlepin_product_contents.rb.orig find_missing_candlepin_product_contents.rb
--- find_missing_candlepin_product_contents.rb.orig     2022-12-02 15:22:31.192117015 -0500
+++ find_missing_candlepin_product_contents.rb  2022-12-02 15:21:06.200541724 -0500
@@ -10,6 +10,14 @@
 
 require 'set'
 
+unless [].respond_to? :to_h
+  class Array
+    def to_h
+      Hash[self]
+    end
+  end
+end
+
 def query_database(dbname, sql)
   res = []
   IO.popen("su - postgres -c 'psql #{dbname} -t -A -z'", 'w+') do |io|


this is because sat 6.11.4.1 is still on ruby 2.0:
[root@sat6 ~]# ruby --version
ruby 2.0.0p648 (2015-12-16) [x86_64-linux]

After fixing the script, it reported 553 unique repos missing.
(All of the RHOSP container repos for OSP13, OSP15, OSP16 and OSP17).

Once I ran the script with --repair, it would no longer report any repos which needed to be 'repaired'.

But it still shows some inconsistencies:

[root@sat6 ~]# ruby ./find_missing_candlepin_product_contents.rb 2>&1|tail -6

Product "GitLab Runner"(cp_id=932577792545) has 2 content in foremanDB, but 3 content in candlepinDB
Product "Dell OMSA"(cp_id=757933989515) has 2 content in foremanDB, but 3 content in candlepinDB
Product "Extra Packages"(cp_id=256062657870) has 12 content in foremanDB, but 24 content in candlepinDB
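
To get a quick overview of such mismatch lines, they can be reduced to a per-product difference. This is only a sketch that parses the exact wording printed above; content_surplus is a made-up name, and what the surplus on the candlepin side actually means is not established in this comment.

```bash
# Read the script's output on stdin and print, for each mismatch line,
# how many more contents candlepinDB holds than foremanDB for that product.
content_surplus() {
  sed -n 's/^Product "\(.*\)"(cp_id=[0-9]*) has \([0-9]*\) content in foremanDB, but \([0-9]*\) content in candlepinDB$/\1|\2|\3/p' |
    awk -F'|' '{ printf "%s: %d more in candlepinDB\n", $1, $3 - $2 }'
}

# Example:
#   ruby ./find_missing_candlepin_product_contents.rb 2>&1 | content_surplus
```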

Comment 120 Vincent S. Cojot 2022-12-02 20:54:59 UTC
After running the script on my 6.11.4.1 Sat6, the custom products are back:

#  rct cat-cert /etc/pki/entitlement/7138422137888002631.pem |grep -i extra
        Label: krynn_Extra_Packages_rhel8-compat-rpms
        URL: /krynn/Library/RHEL_8_HVM_Servers/custom/Extra_Packages/rhel8-compat-rpms
        Label: krynn_Extra_Packages_rhel8-noarch-rpms
        URL: /krynn/Library/RHEL_8_HVM_Servers/custom/Extra_Packages/rhel8-noarch-rpms
        Label: krynn_Extra_Packages_rhel8-x86_64-rpms
        URL: /krynn/Library/RHEL_8_HVM_Servers/custom/Extra_Packages/rhel8-x86_64-rpms

# dnf repolist|grep -i extra
krynn_Extra_Packages_rhel8-compat-rpms       rhel8-compat-rpms
krynn_Extra_Packages_rhel8-noarch-rpms       rhel8-noarch-rpms
krynn_Extra_Packages_rhel8-x86_64-rpms       rhel8-x86_64-rpms
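
The same check can be scripted so that it fails loudly when an expected repo is missing from the entitlement certificate. A rough sketch that only parses the "Label:" lines shown above; cert_labels and cert_has_labels are hypothetical names, and the certificate path differs per client.

```bash
# Print the unique content labels found in "rct cat-cert" output on stdin.
cert_labels() {
  sed -n 's/^[[:space:]]*Label: //p' | sort -u
}

# Succeed only if every expected label (given as arguments) appears in the
# "rct cat-cert" output on stdin; print each label that is missing.
cert_has_labels() {
  local labels want missing=0
  labels=$(cert_labels)
  for want in "$@"; do
    if ! printf '%s\n' "$labels" | grep -qxF -- "$want"; then
      echo "missing: $want"
      missing=1
    fi
  done
  return "$missing"
}

# Example:
#   rct cat-cert /etc/pki/entitlement/7138422137888002631.pem |
#     cert_has_labels krynn_Extra_Packages_rhel8-compat-rpms krynn_Extra_Packages_rhel8-noarch-rpms
```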

Comment 126 Jeremy Lenz 2023-01-27 19:36:51 UTC
Current workaround: Upstream (Katello 4.7+) and on Satellite builds that have the new rake task, you can run `foreman-rake katello:check_candlepin_content` to find broken repos.

a) If the bad repo is a Red Hat repository, we recommend refreshing the manifest and running the rake task again. Alternately, you may also need to delete repositories and re-enable.
b) If it's a custom repository, you need to delete and recreate the repository.

Comment 127 Steffen Froemer 2023-02-01 06:07:34 UTC
Please correct me if I'm wrong, but as far as I remember, you can't delete repos when part of a published content-view, which should be the case in all scenarios. That said, the workaround isn't a workaround anymore.

Comment 128 Sayan Das 2023-02-01 07:23:52 UTC
(In reply to Steffen Froemer from comment #127)
> Please correct me if I'm wrong, but as far as I remember, you can't delete
> repos when part of a published content-view, which should be the case in all
> scenarios. That said, the workaround isn't a workaround anymore.

About the repo being part of the CV, 
You can delete such repos if it's satellite 6.11 or above. We have a feature for that.

Take a look at the 6.11 section from https://access.redhat.com/solutions/3180551

Comment 131 Vladimír Sedmík 2023-02-10 16:09:09 UTC
Tested in 6.13.0 snap 10 (candlepin-4.2.13-1.el8sat.noarch) using this (the only known) reproducer: https://bugzilla.redhat.com/show_bug.cgi?id=2150116#c1

Similar to 6.12.1 testing (https://bugzilla.redhat.com/show_bug.cgi?id=2150116#c2), after 300 runs of the reproducer script the content host still showed all repositories.

Also checked the new rake task for missing content identification:
[root@sat ~]# foreman-rake katello:check_candlepin_content
I, [2023-02-10T10:15:59.322465 #124444]  INFO -- : Checked 20 repositories.

And once again after manual content removal:
[root@sat ~]# foreman-rake console
> root = ::Katello::RootRepository.last
> ::Katello::Resources::Candlepin::Product.remove_content(root.product.organization.label, root.product.cp_id, root.content_id)
> ::Katello::Resources::Candlepin::Content.destroy(root.product.organization.label, root.content_id)

[root@sat ~]# foreman-rake katello:check_candlepin_content
I, [2023-02-10T08:46:17.194914 #123592]  INFO -- : Checked 20 repositories.
I, [2023-02-10T08:46:17.195116 #123592]  INFO -- : There were 1 repositories that do not exist in the backend system [Candlepin]
I, [2023-02-10T08:46:17.222229 #123592]  INFO -- : Organization - "Default Organization", Product - "CUSTOM__ZOO_1", Repository: "CUSTOM__ZOO_1-20"

Given this result, I consider this particular reproducer as resolved in 6.13.0, although it may not be the only reproducer of the issue (see comment#112). Also the rake task works as expected.
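
For anyone who wants to run the rake task periodically, its output can be checked mechanically. The sketch below only matches the log wording shown in this comment, which may differ between versions; has_missing_content and broken_repo_lines are names invented here, and since it is not stated whether the task logs to stdout or stderr, the example merges both.

```bash
# Succeed if the rake task output on stdin reports repositories missing in Candlepin.
has_missing_content() {
  grep -q 'do not exist in the backend system'
}

# Print the "Organization - ..., Product - ..., Repository: ..." part of each
# log line that names a broken repository.
broken_repo_lines() {
  sed -n 's/^.*INFO -- : \(Organization - .*\)$/\1/p'
}

# Example:
#   foreman-rake katello:check_candlepin_content > /tmp/check.log 2>&1
#   if has_missing_content < /tmp/check.log; then
#     broken_repo_lines < /tmp/check.log
#   fi
```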

Comment 132 Pavel Moravec 2023-03-15 08:04:32 UTC
To prevent some potential confusion:

The underlying bug is *NOT* (fully) fixed, despite the status being verified etc.

What is fixed/improved:
- one minor scenario has been reproduced, fixed and the fix verified (concurrently modifying multiple repos in a custom product)
- a script detecting broken linkage between repo and product will be shipped; this helps *detect* that the problem has occurred


What is *not* fixed: there is an unknown scenario where even a sequential update of a custom repository can trigger the situation where the entitlement certificate for a custom repo is missing at a client. Despite a huge effort, we have been unable to reproduce it internally.


So even on the latest 6.11.z (with cloned https://bugzilla.redhat.com/show_bug.cgi?id=2166748 fixed) or e.g. in future Satellite 6.13, the bug will still be present.

Comment 136 errata-xmlrpc 2023-05-03 13:20:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Satellite 6.13 Release), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2097

Comment 137 Red Hat Bugzilla 2023-09-18 00:24:48 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days