Bug 1659549

Summary: productid is not published in the content view if that is the only item which changed in the sync
Product: Red Hat Satellite 6
Reporter: sthirugn <sthirugn>
Component: Content Views
Assignee: Partha Aji <paji>
Status: CLOSED ERRATA
QA Contact: Lai <ltran>
Severity: high
Docs Contact:
Priority: urgent
Version: 6.3.4
CC: andrew.schofield, bkearney, dgross, ehelms, jsherril, kabbott, ktordeur, ltran, mmccune, mtenheuv, paji, sadas, zhunting
Target Milestone: 6.5.0
Keywords: Triaged
Target Release: Unused
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: tfm-rubygem-katello-3.10.0.28-1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1664127 (view as bug list) Environment:
Last Closed: 2019-05-14 12:39:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1664127, 1664129

Description sthirugn@redhat.com 2018-12-14 16:28:16 UTC
Description of problem:
productid is not published in the content view if that is the only item which changed in the sync

Version-Release number of selected component (if applicable):
Satellite 6.3.5

How reproducible:
Always

Steps to Reproduce:
1. We have seen this happen often with kickstart repos, such as the RHEL 7.1 and RHEL 6.9 kickstart repos.
Customer scenario:
- Content view is published and promoted to a lifecycle environment successfully.
- Capsule sync started after the promotion completed

Actual results:
Capsule sync tasks finished with a stopped/warning status, and /var/log/messages showed the errors below. It appears that the productid of the kickstart repo changed upstream while no other content did. With the current logic for publishing to a lifecycle environment, Katello concludes that there is nothing new to publish to the target lifecycle environment. The orphan cleanup job then removes the old productid file in the content view version, assuming it will not be used anymore. In reality the Capsule still refers to this old productid file, hence the error.


Dec 14 03:57:41 uslp2546403 pulp: nectar.downloaders.threaded:INFO: Download failed: Download of https://satellite.example.com/pulp/repos/Default_Organization/UAT/CV_TEST4/content/dist/rhel/server/7/7.1/x86_64/kickstart/repodata/4c4bc87d3301fd34ca1d49b4787c6f8ee4528e298f031bca447f803b1e374f28-productid.gz failed with code 404: Not Found
...
...
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688) Not Found
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688) Traceback (most recent call last): 
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/sync.py", line 263, in run
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688)     metadata_files = self.get_metadata(metadata_files)
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/sync.py", line 450, in get_metadata
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688)     metadata_files.download_metadata_files()
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688)   File "/usr/lib/python2.7/site-packages/pulp_rpm/plugins/importers/yum/repomd/metadata.py", line 217, in download_metadata_files
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688)     raise IOError(error_report.error_msg)
Dec 14 03:57:46 uslp2546403 pulp: pulp_rpm.plugins.importers.yum.sync:ERROR: [e28dec63] (10008-26688) IOError: Not Found
Dec 14 03:57:46 uslp2546403 pulp: pulp.server.async.tasks:INFO: [e28dec63] Task failed : [e28dec63-a96d-4668-a9ee-879fa8a4ef07]
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688) Task pulp.server.managers.repo.sync.sync[e28dec63-a96d-4668-a9ee-879fa8a4ef07] raised unexpected: PulpExecutionException('Importer indicated a failed response',)
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688) Traceback (most recent call last): 
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)     R = retval = fun(*args, **kwargs)
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 527, in __call__
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)     return super(Task, self).__call__(*args, **kwargs)
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 107, in __call__
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)     return super(PulpTask, self).__call__(*args, **kwargs)
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)     return self.run(*args, **kwargs)
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)   File "/usr/lib/python2.7/site-packages/pulp/server/controllers/repository.py", line 827, in sync
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688)     raise pulp_exceptions.PulpExecutionException(_('Importer indicated a failed response'))
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:ERROR: (9494-26688) PulpExecutionException: Importer indicated a failed response
Dec 14 03:57:46 uslp2546403 pulp: celery.worker.job:INFO: Task pulp.server.async.tasks._release_resource[0b79efb6-2771-4d27-890c-a39cab43f7f9] succeeded in 0.00446466822177s: None
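
To check whether a Capsule is affected, the log can be scanned for failed productid downloads. The following is a minimal sketch, assuming the log lines follow the nectar "Download failed" format quoted above; the function names are hypothetical and are not part of pulp or Katello.

```python
import re

# Pattern modelled on the nectar "Download failed" line quoted in this report.
DOWNLOAD_FAILED = re.compile(
    r"Download failed: Download of (?P<url>\S+) failed with code (?P<code>\d+)"
)


def failed_downloads(lines, code=404):
    """Return the URLs whose download failed with the given HTTP status."""
    urls = []
    for line in lines:
        match = DOWNLOAD_FAILED.search(line)
        if match and int(match.group("code")) == code:
            urls.append(match.group("url"))
    return urls


def missing_productid(lines):
    """Keep only failed URLs whose file name looks like a productid file."""
    return [
        url
        for url in failed_downloads(lines)
        if "-productid" in url.rsplit("/", 1)[-1]
    ]
```

Passing an open file handle for /var/log/messages to missing_productid() would list the productid URLs that returned 404, if any.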

Expected results:
When the RPM content is unchanged and only the productid or other metadata has changed, those changes should still be pushed out to the right lifecycle environments during publish.

Additional info:
A temporary workaround is to promote the content view version to the desired lifecycle environment again, with forced yum metadata regeneration:
hammer content-view version promote --organization-id <id> --content-view-id <id> --version <version_number> --to-lifecycle-environment-id <lce_id> --force-yum-metadata-regeneration

Comment 3 Justin Sherrill 2018-12-14 16:54:32 UTC
More details on how you might reproduce the bug:

1.  Create a yum repo on disk with a product_id file
2.  Create and sync this repo
3.  Publish the repo in a content view (version 1)
4.  Update/modify the product_id file within the yum repo
5.  Resync the yum repository
6.  Publish the content view again (version 2)
7.  Completely delete version 1
8.  Run the pulp orphan cleanup
9.  Try to download the product id file from the repository or use a client with it

Result: 404 Not Found
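
For step 9, the path of the productid file can be read from the repository's repomd.xml. This is a minimal sketch, assuming the standard repomd namespace and a `<data type="productid">` entry as written by modifyrepo; the SAMPLE document and its checksum are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by repomd.xml documents.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Hypothetical, trimmed-down repomd.xml used only for illustration.
SAMPLE = """<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <revision>1</revision>
  <data type="primary">
    <location href="repodata/aaaa-primary.xml.gz"/>
  </data>
  <data type="productid">
    <location href="repodata/0123abcd-productid.gz"/>
  </data>
</repomd>"""


def metadata_href(repomd_xml, mdtype="productid"):
    """Return the location href for the given metadata type, or None."""
    root = ET.fromstring(repomd_xml)
    for data in root.findall("repo:data", REPO_NS):
        if data.get("type") == mdtype:
            location = data.find("repo:location", REPO_NS)
            return location.get("href") if location is not None else None
    return None
```

Joining the returned href to the published repository URL gives the file to request; with this bug, that request returns 404 once the orphan cleanup has run.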

Comment 4 Justin Sherrill 2018-12-17 21:45:17 UTC
Created redmine issue https://projects.theforeman.org/issues/25718 from this bug

Comment 5 pm-sat@redhat.com 2018-12-19 19:11:16 UTC
Upstream bug assigned to paji@redhat.com

Comment 7 Zach Huntington-Meath 2019-01-08 17:43:37 UTC
Partha, https://github.com/Katello/katello/pull/7903 failed to pick cleanly. Do you mind making an MR for it downstream?

Comment 9 Partha Aji 2019-03-01 22:23:34 UTC
0) Create a repo with the following bash script (note that it does not have a productid file)
DIR=/tmp/my-data
mkdir $DIR
cd  $DIR
wget https://partha.fedorapeople.org/test-repos/rpm-with-productid/elephant-0.3-0.8.noarch.rpm
createrepo .
#start serving this dir
python -m SimpleHTTPServer 5050

1) Create and sync repo in sat with feed pointing to http://<fqdn>:5050 
2) put it in a CV and publish and promote to an env
3) now update the repo with a product id. Something like
DIR=/tmp/my-data
cd  $DIR
echo "100000" > productid
modifyrepo  --mdtype=productid productid repodata 

4) resync, publish the CV and promote the new version
5) Go to Monitor -> Tasks view and look for the latest promote 
6) Go to dynflow console Run tab
7) Search for Actions::Katello::Repository::CheckMatchingContent
8) Expand it and check the value of matching_content
Expected:
 matching_content: false
Actual:
 matching_content: true


The point here is that the "matching_content" value is used to determine whether metadata needs to be republished. However, since Katello was not tracking the contents of the "repodata" directory (the yum metadata), it ignored the fact that a productid file had been added there. This fix indexes the yum metadata, so Katello is able to track changes in the repodata directory, including productid.
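
The difference between the two behaviours can be illustrated with a small model. This is only a sketch of the idea, not Katello's implementation: RepoSnapshot and both comparison functions are hypothetical names, and the checksum is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepoSnapshot:
    """Hypothetical view of a repository: its RPMs and its repodata files."""

    packages: frozenset
    # (mdtype, checksum) pairs for files in repodata, e.g. productid.
    metadata: frozenset = frozenset()


def matches_packages_only(source, target):
    """Old behaviour in this model: repodata is not tracked, so it is ignored."""
    return source.packages == target.packages


def matches_with_metadata(source, target):
    """Fixed behaviour in this model: indexed yum metadata is compared too."""
    return (
        source.packages == target.packages
        and source.metadata == target.metadata
    )


rpms = frozenset({"elephant-0.3-0.8.noarch"})
before = RepoSnapshot(packages=rpms)
after = RepoSnapshot(packages=rpms, metadata=frozenset({("productid", "0123abcd")}))
```

In this model, matches_packages_only(before, after) is true even though a productid was added, which corresponds to the incorrect "matching_content: true"; matches_with_metadata(before, after) is false.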

Comment 11 Lai 2019-03-26 14:41:20 UTC
Tested using Partha's steps:

0) Create a repo with the following bash script (note that it does not have a productid file)
DIR=/tmp/my-data
mkdir $DIR
cd  $DIR
wget https://partha.fedorapeople.org/test-repos/rpm-with-productid/elephant-0.3-0.8.noarch.rpm
createrepo .
#start serving this dir
python -m SimpleHTTPServer 5050

1) Create and sync repo in sat with feed pointing to http://<fqdn>:5050 
2) put it in a CV and publish and promote to an env
3) now update the repo with a product id. Something like
DIR=/tmp/my-data
cd  $DIR
echo "100000" > productid
modifyrepo  --mdtype=productid productid repodata 

4) resync, publish the CV and promote the new version
5) Go to Monitor -> Tasks view and look for the latest promote 
6) Go to dynflow console Run tab
7) Search for Actions::Katello::Repository::CheckMatchingContent
8) Expand it and check the value of matching_content

Expected:
  matching_content: false
Actual:
  matching_content: false

9) Repeat steps 3 - 8

Expected:
  matching_content: false
Actual:
  matching_content: true

Updating the productid, resyncing, and republishing multiple times does not produce matching_content: false; it returned true every time. The only time it goes back to false is when you toggle the download policy between Immediate and On Demand after following the above steps. Failing this bug due to inconsistent behavior.

Comment 12 Lai 2019-03-28 19:23:10 UTC
Forgot to mention: for step 9, when repeating step 3, make sure to update the productid. I updated the productid multiple times but still got matching_content: true.

Comment 15 Partha Aji 2019-04-02 14:03:31 UTC
After discussing with Lai I propose a slight modification to the steps in -> https://bugzilla.redhat.com/show_bug.cgi?id=1659549#c11

3) now update the repo with a product id. Something like
DIR=/tmp/my-data
cd  $DIR
echo "100000" >> productid
createrepo .
modifyrepo  --mdtype=productid productid repodata 


Add a "createrepo" step so that a new revision number shows up in repomd.xml.
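
Whether the extra createrepo run actually produced a new revision can be checked by comparing the `<revision>` element of repomd.xml before and after step 3. A minimal sketch, assuming the standard repomd namespace; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace prefix in ElementTree's "{uri}tag" notation.
REPO_NS = "{http://linux.duke.edu/metadata/repo}"


def repomd_revision(repomd_xml):
    """Return the text of the <revision> element, or None if it is absent."""
    root = ET.fromstring(repomd_xml)
    return root.findtext(REPO_NS + "revision")


def revision_changed(old_repomd_xml, new_repomd_xml):
    """True when the two repomd.xml documents carry different revisions."""
    return repomd_revision(old_repomd_xml) != repomd_revision(new_repomd_xml)
```

Saving a copy of repodata/repomd.xml before step 3 and passing both versions to revision_changed() shows whether the resync will see a new revision.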

Comment 16 Lai 2019-04-17 15:45:38 UTC
Tested using Partha's steps:

0) Create a repo with the following bash script (note that it does not have a productid file)
DIR=/tmp/my-data
mkdir $DIR
cd  $DIR
wget https://partha.fedorapeople.org/test-repos/rpm-with-productid/elephant-0.3-0.8.noarch.rpm
createrepo .
#start serving this dir
python -m SimpleHTTPServer 5050

1) Create and sync repo in sat with feed pointing to http://<fqdn>:5050 
2) put it in a CV and publish and promote to an env
3) now update the repo with a product id. Something like
DIR=/tmp/my-data
cd  $DIR
echo "100000" > productid
createrepo .
modifyrepo  --mdtype=productid productid repodata 

4) resync, publish the CV and promote the new version
5) Go to Monitor -> Tasks view and look for the latest promote 
6) Go to dynflow console Run tab
7) Search for Actions::Katello::Repository::CheckMatchingContent
8) Expand it and check the value of matching_content

Expected:
  matching_content: false
Actual:
  matching_content: false

Verified on 6.5.0_24

Comment 18 errata-xmlrpc 2019-05-14 12:39:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:1222