Bug 1428541

Summary: [RFE] Need a file-system integrity report for /var/lib/pulp
Product: Red Hat Satellite
Reporter: Mike McCune <mmccune>
Component: Pulp
Assignee: Patrick Creech <pcreech>
Status: CLOSED ERRATA
QA Contact: Sanket Jagtap <sjagtap>
Severity: high
Docs Contact:
Priority: high
Version: 6.2.7
CC: aupadhye, bkearney, bmbouter, cmarinea, daniele, daviddavis, dkliban, egolov, fgarciad, ggainey, ipanova, jbhatia, jcallaha, jentrena, jortel, jswensso, kabbott, kdixon, ktordeur, mhrivnak, mmccune, mvanderw, omaciel, pcreech, rchan, rjerrido, satellite6-bugs, sjagtap, swa, syangsao, tdaianov, ttereshc, vanhoof
Target Milestone: 6.4.0
Keywords: FutureFeature, PrioBumpGSS, PrioBumpQA, Triaged
Target Release: Unused
Flags: cmarinea: needinfo?
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: pulp-rpm-2.16.4.1-1,satellite-6.4.0-15,pulp-rpm-2.16.4.1-5,satellite-6.4.1-1
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-16 15:27:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1122832, 1385841

Description Mike McCune 2017-03-02 19:26:00 UTC
In certain cases, due to historical issues with Satellite or ongoing problems during content manipulation, various inconsistencies can exist in the files contained within /var/lib/pulp.

We have customers who are going to start deploying the new 'repair' facilities in this feature we are adding:

https://bugzilla.redhat.com/show_bug.cgi?id=1223023

* [RFE] Allow Pulp to verify/repair corrupted packages in a repository 

With the addition of the repair side of this feature, we need a way to identify the following conditions:

 * Missing RPMs from /var/lib/pulp/content
 * Corrupt/NOT OK md5sums on any unit in /var/lib/pulp/content
 * Invalid repositories contained within /var/lib/pulp/published where the yum metadata points at symlinks that are missing
 * Missing or broken symlinks for published repositories for Content Views

EG:

Source: /var/lib/pulp/published/yum/master/yum_distributor/Default_Organization-Library-rhel7-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server/1472243944.1

Target: /var/lib/pulp/published/yum/https/repos/Default_Organization/Library/rhel7/content/dist/rhel/server/7/7Server/x86_64/os
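
As a rough illustration of the symlink condition, here is a minimal sketch (not the pulp-integrity implementation) that walks a directory tree and reports symlinks whose target no longer exists. The function name find_broken_symlinks is hypothetical, and the sketch assumes the process can read the whole tree.

```python
import os


def find_broken_symlinks(root):
    """Yield (link_path, target) for each symlink under root whose target is missing."""
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        # os.walk puts symlinks that resolve to directories in dirnames and
        # every other symlink (including dangling ones) in filenames, so
        # both lists are inspected.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # os.path.exists() follows symlinks, so it is False for a dangling link.
            if os.path.islink(path) and not os.path.exists(path):
                yield path, os.readlink(path)
```

Calling list(find_broken_symlinks("/var/lib/pulp/published")) would then return the dangling links, if any. It does not check that the yum metadata itself references those links, which is a separate condition.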

We may add more criteria to check, but in order to restore confidence in the integrity of /var/lib/pulp, we need to be able to report on the state of this sub-directory.

Runtimes to generate this report are expected to be very long, but this should not be a blocker for the implementation.
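
For the missing-file and bad-checksum conditions, a per-file check could look roughly like the sketch below. This is an assumption-laden illustration, not how Pulp does it: the function name check_unit_file is hypothetical, and the expected digest and algorithm would have to come from the unit's stored metadata, which is not shown here. The algorithm is a parameter (md5, sha256, etc.) because units may carry different checksum types.

```python
import hashlib
import os


def check_unit_file(path, expected_hexdigest, algorithm="sha256", chunk_size=1024 * 1024):
    """Return 'missing', 'corrupt' or 'ok' for a single content file."""
    if not os.path.isfile(path):
        return "missing"
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read in chunks so that large RPMs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_hexdigest.lower():
        return "corrupt"
    return "ok"
```

Hashing every file under /var/lib/pulp/content this way reads all of the content from disk, which is consistent with the long runtimes expected above.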

Comment 1 pulp-infra@redhat.com 2017-03-03 17:02:00 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 2 pulp-infra@redhat.com 2017-03-03 17:02:04 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 10 Mike McCune 2018-02-06 16:09:54 UTC
Jeff, answered in the upstream issue.

Comment 14 pulp-infra@redhat.com 2018-04-09 14:04:07 UTC
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 15 pulp-infra@redhat.com 2018-04-24 09:34:05 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 17 pulp-infra@redhat.com 2018-07-12 16:36:33 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 18 pulp-infra@redhat.com 2018-07-12 17:11:49 UTC
All upstream Pulp bugs are at MODIFIED+. Moving this bug to POST.

Comment 25 Mike McCune 2018-09-17 19:19:16 UTC
Sanket, looks like this didn't make it downstream. Moving back to assigned to get it in the next snap.

Comment 26 Patrick Creech 2018-09-17 19:42:16 UTC
Looks like everything is there; we just need to ensure it makes it into the compose. Working on that now.

Comment 27 Patrick Creech 2018-09-18 00:27:58 UTC
Updated satellite rpm to pull in pulp-integrity by default in satellite-common (server and capsule)

Comment 30 pulp-infra@redhat.com 2018-09-18 18:34:39 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

Comment 31 Patrick Creech 2018-09-24 14:46:01 UTC
snap 23, not 63

Comment 32 Mike McCune 2018-09-25 19:11:04 UTC
fails pretty quickly on SNAP 23

# pulp-integrity 
Traceback (most recent call last):
  File "/usr/bin/pulp-integrity", line 9, in <module>
    load_entry_point('pulp-integrity==2.16.4.1', 'console_scripts', 'pulp-integrity')()
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 378, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 2566, in load_entry_point
    return ep.load()
  File "/usr/lib/python2.7/site-packages/pkg_resources.py", line 2260, in load
    entry = __import__(self.module_name, globals(),globals(), ['__name__'])
  File "/usr/lib/python2.7/site-packages/pulp_integrity/integrity.py", line 3, in <module>
    import pyparsing as pp
ImportError: No module named pyparsing

Comment 33 pulp-infra@redhat.com 2018-09-25 19:35:43 UTC
Requesting needsinfo from upstream developer dkliban, ttereshc, daviddavis because the 'FailedQA' flag is set.

Comment 34 Patrick Creech 2018-09-25 22:17:57 UTC
So, I did some digging:

    https://github.com/pulp/pulp_rpm/blob/2-master/pulp-integrity/pulp_integrity/integrity.py#L3-L5

    https://github.com/pulp/pulp_rpm/blob/2-master/pulp-integrity/pulp_integrity/validator.py#L5

    https://github.com/pulp/pulp_rpm/blob/2-master/pulp-integrity/pulp_integrity/generic.py#L5

    https://github.com/pulp/pulp_rpm/blob/2-master/pulp-integrity/pulp_integrity/rpm.py#L4-L5

These imports in the pulp-integrity codebase don't live in the python stdlib, and don't live in the only Requires that pulp-integrity pulls in:

    https://github.com/pulp/pulp-packaging/blob/master/packages/pulp-rpm/pulp-rpm.spec#L349

Here's the pulp-packaging commit that introduced this package:

    https://github.com/pulp/pulp-packaging/commit/80a1d586b5e2bf0632894453376aefda5fabd005


From what I can tell, it needs to have the following requires added at a minimum:

    Requires:  pyparsing
    Requires:  pulp-rpm-plugins

pulp-rpm-plugins is needed because pulp-integrity depends on one of its distributors, and it should pull in the rest of the dependent stack.

It does import from other locations in pulp; I'm not sure at the moment whether packaging best practices recommend setting a Requires: on the others:

    from pulp_rpm.plugins.distributors.yum import configuration as yum_config
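
A quick way to confirm this kind of gap on an installed system is to ask the interpreter whether each needed module can be located. The sketch below is only an illustration: missing_modules is a hypothetical helper, and it uses importlib.util.find_spec, which is Python 3 only, whereas the traceback above is from Python 2.7, so it would need adapting there.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            # For a dotted name, find_spec imports the parent package first.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # ImportError: a parent package of a dotted name is itself missing.
            # ValueError: the module is loaded but has no usable __spec__.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

For example, missing_modules(["pyparsing", "pulp_rpm.plugins.distributors.yum"]) returns whichever of the two names cannot be found on that system.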

Comment 36 Sanket Jagtap 2018-09-28 09:03:31 UTC
Build: Satellite 6.4.0 snap24

Satellite: 
pulp-integrity  --validation "((dark_content existence) size)"
{
  "report": [
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "5c3ac436-cca5-4dd2-85ca-85c54740f0af" ,
    "repo_id": "ae8076c8-0911-4b66-b494-700416dc613b" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/da/68f1cc23168dcbc7c8796aa876756ceec5c38e061a4ceb0a3cd37c995da1c1" 
  } ,
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "5c3ac436-cca5-4dd2-85ca-85c54740f0af" ,
    "repo_id": "1-RHEL_6_CV-v1_0-ae8076c8-0911-4b66-b494-700416dc613b" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/da/68f1cc23168dcbc7c8796aa876756ceec5c38e061a4ceb0a3cd37c995da1c1" 
  } ,
<snip>
...

 pulp-integrity --check {checksum,size,broken_symlinks,dark_content,existence}
{
  "report": [
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "5c3ac436-cca5-4dd2-85ca-85c54740f0af" ,
    "repo_id": "ae8076c8-0911-4b66-b494-700416dc613b" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/da/68f1cc23168dcbc7c8796aa876756ceec5c38e061a4ceb0a3cd37c995da1c1" 
  } ,
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "5c3ac436-cca5-4dd2-85ca-85c54740f0af" ,
    "repo_id": "1-RHEL_6_CV-v1_0-ae8076c8-0911-4b66-b494-700416dc613b" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/da/68f1cc23168dcbc7c8796aa876756ceec5c38e061a4ceb0a3cd37c995da1c1" 
  } ,

Capsule:

pulp-integrity --validation "((dark_content existence) size)"
{
  "report": [
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "b679e46a-297b-4a84-a021-15f7ecefa990" ,
    "repo_id": "1-RHEL_6_CV-DEV-ae8076c8-0911-4b66-b494-700416dc613b" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/da/68f1cc23168dcbc7c8796aa876756ceec5c38e061a4ceb0a3cd37c995da1c1" 
  } ,
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "09ffeeb5-a11b-4d9c-8c12-3046f4704033" ,
    "repo_id": "1-RHEL_7_CV-DEV-f349e969-3959-4710-9641-f49f914224d7" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/d9/220b57c3d9f08a3b4aabb5f7b90bac9190507a3ea13990d8f6ebdbc9bdda53"
<snip>

pulp-integrity --check {checksum,size,broken_symlinks,dark_content,existence}
{
  "report": [
  {
    "validator": "dark_content, pulp-integrity 2.16.4.1, pulp_integrity.generic:DarkContentValidator" ,
    "unit": "distribution: <no-filename>" ,
    "unit_id": "b679e46a-297b-4a84-a021-15f7ecefa990" ,
    "repo_id": "1-RHEL_6_CV-DEV-ae8076c8-0911-4b66-b494-700416dc613b" ,
    "error": "The path was not found on the filesystem." ,
    "path": "/var/lib/pulp/content/units/distribution/da/68f1cc23168dcbc7c8796aa876756ceec5c38e061a4ceb0a3cd37c995da1c1" 
  } ,
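
Since the report is long, it can help to summarize it. The following is a minimal sketch that counts entries per validator and repo_id. It assumes the complete, untruncated output is valid JSON with the "report" list shape shown above and that it has already been captured into a string; summarize_report is a hypothetical helper, not part of pulp-integrity.

```python
import json
from collections import Counter


def summarize_report(text):
    """Count reported errors per (validator name, repo_id) pair."""
    counts = Counter()
    for entry in json.loads(text)["report"]:
        # The "validator" field appears to be "<name>, <package version>, <class path>";
        # only the leading name is used here.
        validator_name = entry["validator"].split(",")[0].strip()
        counts[(validator_name, entry["repo_id"])] += 1
    return counts
```

The returned Counter can be sorted with most_common() to see which repositories account for the most errors.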

Comment 38 errata-xmlrpc 2018-10-16 15:27:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2927