Bug 1011100

Summary: a 'derived' pool without a 'master' causes trouble
Product: [Community] Candlepin Reporter: Dennis Crissman <dcrissman>
Component: candlepin Assignee: Devan Goodwin <dgoodwin>
Status: CLOSED CURRENTRELEASE QA Contact: Katello QA List <katello-qa-list>
Severity: high Docs Contact:
Priority: high    
Version: 0.9 CC: dgoodwin, mstead, rmunilla
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-11-06 14:26:14 UTC Type: Bug
Attachments:
Description Flags
derived pools in stage w/out master none

Description Dennis Crissman 2013-09-23 15:44:29 UTC
This is a spin-off of https://bugzilla.redhat.com/show_bug.cgi?id=1010616.

In stage we have several times hit a case where a subscription appears to have been removed from the service layer and a pool refresh was run, but the 'derived' pool was not deleted. This leaves us in an inconsistent state: subscription-manager and rhsm-web report the pool as available to subscribe from, but attempting the bind throws an error because the subscription no longer exists on the services side.

We do not have a reproducible path yet, but we have now seen this a couple of times.

Here is an example from the stage database:
select * from cp_pool where owner_id = '8a99f98340114f88014032c84fbc1aee' and productId = 'MCT1339F3';
+----------------------------------+---------------------+---------------------+---------------+--------------------+----------------+---------------------+-----------+---------------------------------------------------------------------------------------------------------+----------+----------------------+---------------------+----------------+----------------------------------+----------------------+--------------------+---------+-------------+------------------+--------------------+---------------+-------------------+
| id                               | created             | updated             | accountNumber | activeSubscription | contractNumber | endDate             | productId | productName                                                                                             | quantity | restrictedToUsername | startDate           | subscriptionId | owner_id                         | sourceEntitlement_id | subscriptionSubKey | version | orderNumber | derivedProductId | derivedProductName | sourceStackId | sourceConsumer_id |
+----------------------------------+---------------------+---------------------+---------------+--------------------+----------------+---------------------+-----------+---------------------------------------------------------------------------------------------------------+----------+----------------------+---------------------+----------------+----------------------------------+----------------------+--------------------+---------+-------------+------------------+--------------------+---------------+-------------------+
| 8a99f98340a67e350140b91d78dd5c89 | 2013-08-26 01:34:11 | 2013-09-05 21:12:10 | NULL          |                   | NULL           | 2014-07-24 23:59:59 | MCT1339F3 | Red Hat Enterprise Linux Advanced Platform for IBM POWER + pAVE, Standard L3 (unlimited sockets) 3 year |       -1 | NULL                 | 2013-07-25 00:00:00 | 2690204        | 8a99f98340114f88014032c84fbc1aee | NULL                 | derived            |       7 | NULL        | NULL             | NULL               | NULL          | NULL              |
+----------------------------------+---------------------+---------------------+---------------+--------------------+----------------+---------------------+-----------+---------------------------------------------------------------------------------------------------------+----------+----------------------+---------------------+----------------+----------------------------------+----------------------+--------------------+---------+-------------+------------------+--------------------+---------------+-------------------+
1 row in set (0.00 sec)

Comment 1 Dennis Crissman 2013-09-24 13:16:01 UTC
Here is another case, slightly different. Only a 'derived' pool exists for this subscription, but here a matching subscription does exist in the service layer. When I run a pool refresh, the job ends in state 3 with no exceptions.

I will note that the subscription in the service layer does not seem to have a derived product on it, so perhaps it was removed.


mysql> select * from cp_pool where subscriptionId = 2691073;
+----------------------------------+---------------------+---------------------+---------------+--------------------+----------------+---------------------+-----------+--------------------------------+----------+----------------------+---------------------+----------------+----------------------------------+----------------------+--------------------+---------+-------------+------------------+--------------------+---------------+-------------------+
| id                               | created             | updated             | accountNumber | activeSubscription | contractNumber | endDate             | productId | productName                    | quantity | restrictedToUsername | startDate           | subscriptionId | owner_id                         | sourceEntitlement_id | subscriptionSubKey | version | orderNumber | derivedProductId | derivedProductName | sourceStackId | sourceConsumer_id |
+----------------------------------+---------------------+---------------------+---------------+--------------------+----------------+---------------------+-----------+--------------------------------+----------+----------------------+---------------------+----------------+----------------------------------+----------------------+--------------------+---------+-------------+------------------+--------------------+---------------+-------------------+
| 8a99f98440e442240140e70ecf865d98 | 2013-09-03 23:40:42 | 2013-09-12 18:21:27 | 5206818       |                   | NULL           | 2014-07-24 23:59:59 | SER0406   | Red Hat OpenStack Tech Preview |       -1 | NULL                 | 2013-07-25 00:00:00 | 2691073        | 8a99f9843a7a3a9d013a8c12f1f738cc | NULL                 | derived            |       6 | NULL        | NULL             | NULL               | NULL          | NULL              |
+----------------------------------+---------------------+---------------------+---------------+--------------------+----------------+---------------------+-----------+--------------------------------+----------+----------------------+---------------------+----------------+----------------------------------+----------------------+--------------------+---------+-------------+------------------+--------------------+---------------+-------------------+
1 row in set (0.81 sec)


mysql> select * from cp_owner where id = '8a99f9843a7a3a9d013a8c12f1f738cc';
+----------------------------------+---------------------+---------------------+-------------+---------+--------------+---------------+---------------------+-------------+
| id                               | created             | updated             | displayName | account | parent_owner | contentPrefix | defaultServiceLevel | upstream_id |
+----------------------------------+---------------------+---------------------+-------------+---------+--------------+---------------+---------------------+-------------+
| 8a99f9843a7a3a9d013a8c12f1f738cc | 2012-10-23 01:23:02 | 2012-10-23 01:23:02 | 6752652     | 6752652 | NULL         | NULL          | NULL                | NULL        |
+----------------------------------+---------------------+---------------------+-------------+---------+--------------+---------------+---------------------+-------------+
1 row in set (0.00 sec)


mysql> select * from cp_job where targetId = '6752652' order by created DESC limit 5;
+----------------------------------------------------+---------------------+---------------------+---------------------+-----------------------------------+---------------------+-------+-------------+-----------------+----------+------------+
| id                                                 | created             | updated             | finishTime          | result                            | startTime           | state | jobGroup    | principalName   | targetId | targetType |
+----------------------------------------------------+---------------------+---------------------+---------------------+-----------------------------------+---------------------+-------+-------------+-----------------+----------+------------+
| refresh_pools_6a3e6d18-bea0-4836-bce5-36d403021d85 | 2013-09-24 09:01:11 | 2013-09-24 09:01:11 | 2013-09-24 09:01:11 | Pools refreshed for owner 6752652 | 2013-09-24 09:01:11 |     3 | async group | candlepin_admin | 6752652  |          0 |
| refresh_pools_2e61a004-d7a0-452d-968e-3d8168844d3f | 2013-09-24 09:00:14 | 2013-09-24 09:00:17 | 2013-09-24 09:00:17 | Pools refreshed for owner 6752652 | 2013-09-24 09:00:14 |     3 | async group | candlepin_admin | 6752652  |          0 |
+----------------------------------------------------+---------------------+---------------------+---------------------+-----------------------------------+---------------------+-------+-------------+-----------------+----------+------------+
2 rows in set (0.00 sec)

Comment 2 Dennis Crissman 2013-09-24 14:48:20 UTC
May or may not be related to https://bugzilla.redhat.com/show_bug.cgi?id=1009536, but they share some symptoms: a pool refresh on an account that has only a derived pool can sometimes put the system into the same state.

Comment 3 Dennis Crissman 2013-09-24 15:01:15 UTC
Quick query to see how many derived only subscriptions we have.

select id, created, productId, productName, subscriptionId, subscriptionSubKey from cp_pool where subscriptionSubKey = 'derived' and subscriptionId not in (select subscriptionId from cp_pool where subscriptionSubKey = 'master');

4202 rows in set (1.22 sec)
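
For anyone re-running this check, here is a minimal sketch of the same orphan search written with NOT EXISTS, run against an in-memory SQLite table so it can be tried without access to stage. The three-column table, the sample pool ids, and the function names are assumptions for illustration only; the real cp_pool has many more columns. One general SQL property to keep in mind: a NOT IN subquery that yields even one NULL matches nothing, so if any 'master' row had a NULL subscriptionId the original form would return an empty set. Whether that can happen in this schema is not established in this report.

```python
import sqlite3

# Same shape as the query above (NOT IN form), kept for comparison.
NOT_IN_QUERY = """
    SELECT id, subscriptionId FROM cp_pool
    WHERE subscriptionSubKey = 'derived'
      AND subscriptionId NOT IN (
          SELECT subscriptionId FROM cp_pool
          WHERE subscriptionSubKey = 'master')
    ORDER BY id
"""

# NOT EXISTS form: unaffected by NULL subscriptionId values on master rows.
NOT_EXISTS_QUERY = """
    SELECT d.id, d.subscriptionId FROM cp_pool d
    WHERE d.subscriptionSubKey = 'derived'
      AND NOT EXISTS (
          SELECT 1 FROM cp_pool m
          WHERE m.subscriptionSubKey = 'master'
            AND m.subscriptionId = d.subscriptionId)
    ORDER BY d.id
"""


def make_sample_db():
    """Build a toy cp_pool table (hypothetical data, reduced columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cp_pool ("
        "id TEXT PRIMARY KEY, subscriptionId TEXT, subscriptionSubKey TEXT)"
    )
    conn.executemany(
        "INSERT INTO cp_pool VALUES (?, ?, ?)",
        [
            ("pool-a-master", "100", "master"),
            ("pool-a-derived", "100", "derived"),
            ("pool-b-derived", "200", "derived"),  # no master: the orphan
            ("pool-c-master", None, "master"),     # NULL subscriptionId
        ],
    )
    return conn


def find_orphaned_derived(conn, query=NOT_EXISTS_QUERY):
    """Return (id, subscriptionId) rows for derived pools lacking a master."""
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    db = make_sample_db()
    print("NOT EXISTS:", find_orphaned_derived(db))
    print("NOT IN:    ", find_orphaned_derived(db, NOT_IN_QUERY))
```

On the sample data the NOT EXISTS form reports the orphaned derived pool, while the NOT IN form reports nothing because of the single NULL on a master row.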

Comment 4 Dennis Crissman 2013-09-24 15:53:46 UTC
(In reply to Dennis Crissman from comment #3)
> Quick query to see how many derived only subscriptions we have.
> 
> select id, created, productId, productName, subscriptionId,
> subscriptionSubKey from cp_pool where subscriptionSubKey = 'derived' and
> subscriptionId not in (select subscriptionId from cp_pool where
> subscriptionSubKey = 'master');
> 
> 4202 rows in set (1.22 sec)

Ran this same query in production and got an empty set back.

Comment 5 Dennis Crissman 2013-09-24 20:02:49 UTC
Created attachment 802439
derived pools in stage w/out master

Comment 6 Devan Goodwin 2013-09-25 12:20:12 UTC
Fixed in candlepin.git master: 80ce0976512f72451a89f64dfcb826152e24bebb

Also cherry picked to 0.8.28 hotfix branch and built for IT.

Caused by a recent change that stopped cleaning up derived pools during refresh when the subscription is gone or expired. That is fine for derived pools tied to entitlements, since they are cleaned up along with the main pool, but it is not OK for virt bonus pools, which are not tied to any specific pool.

Modified to clean up virt bonus pools as well.
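
The rule described above can be sketched roughly as follows. This is an illustration of the decision only, not the code from the commit: Candlepin itself is Java, and the Pool class, its field names, and the function name here are all hypothetical. The sketch assumes that a derived pool tied to an entitlement can be recognised by a non-empty source entitlement id, which matches the sourceEntitlement_id column shown in the query output earlier in this bug (NULL for both orphaned pools).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Pool:
    """Hypothetical, reduced stand-in for a cp_pool row."""
    id: str
    subscription_id: Optional[str]
    sub_key: str                               # 'master' or 'derived'
    source_entitlement_id: Optional[str] = None


def pools_to_delete_on_refresh(
    pools: Iterable[Pool], live_subscription_ids: Set[str]
) -> List[Pool]:
    """Pick the pools a refresh should delete once their subscription is gone.

    Entitlement-derived pools are skipped because they go away when the
    main pool is cleaned up; virt bonus pools have no such owner, so they
    must be deleted here along with the master pool.
    """
    doomed = []
    for pool in pools:
        if pool.subscription_id is None:
            continue  # not backed by a subscription at all
        if pool.subscription_id in live_subscription_ids:
            continue  # subscription still exists upstream
        if pool.source_entitlement_id is not None:
            continue  # removed as part of the main pool's cleanup
        doomed.append(pool)  # master pool or virt bonus pool
    return doomed


if __name__ == "__main__":
    sample = [
        Pool("m1", "2690204", "master"),
        Pool("bonus1", "2690204", "derived"),
        Pool("ent1", "2690204", "derived", source_entitlement_id="e-1"),
        Pool("m2", "2691073", "master"),
    ]
    for p in pools_to_delete_on_refresh(sample, {"2691073"}):
        print(p.id)
```

With subscription 2690204 absent from the live set, the sketch selects the master pool and the virt bonus pool and leaves the entitlement-derived pool to the main pool's cleanup.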

Comment 7 Devan Goodwin 2013-09-27 16:19:37 UTC
Moving to ON_QA; this is in stage and production now. However, it will probably be impossible for QE to reproduce, and we may have to take IT's word on its effectiveness.

Comment 8 Dennis Crissman 2013-11-06 14:26:14 UTC
I am no longer able to find a 'derived' pool w/out a 'master'. Closing this ticket.