Bug 1184853

Summary: [RFE] Don't publish the same repositories over and over again
Product: Red Hat Satellite Reporter: Stephen Benjamin <stbenjam>
Component: Repositories Assignee: John Mitsch <jomitsch>
Status: CLOSED DUPLICATE QA Contact: Katello QA List <katello-qa-list>
Severity: high Docs Contact: Russell Dickenson <rdickens>
Priority: high    
Version: 6.0.7 CC: abradshaw, barry.gestwicki.ctr, bbuckingham, bhinson, bkearney, cwelton, dkaylor, erik-fedora, jomitsch, jsherril, mhrivnak, mmccune, nstrug, peter.vreman, prescott, rdickens, riehecky, sauchter, sreber, walden, xdmoon
Target Milestone: Unspecified Keywords: FutureFeature, ReleaseNotes, Triaged
Target Release: Unused   
Hardware: Unspecified   
OS: Unspecified   
URL: http://projects.theforeman.org/issues/15798
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
Issue: Publishing and promoting Content View performance is not optimal and will be improved in future releases. Currently the publish and promotion process operates on all the repositories in the Content View, not only on those repositories whose content has changed. Workaround: None at this time.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-03-05 15:30:53 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 1122832, 1190823    

Description Stephen Benjamin 2015-01-22 10:58:19 UTC
Publishing content views is slow.

If I just want to add a single package to a single repo, the publishing process republishes the entire Content View.

Would it be possible instead to only publish that one repo that had changes when creating the new Content View Version?
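
The request above can be sketched as a comparison between two Content View Versions. This is only an illustration of the idea, not how Katello or Pulp work: the function name, the repo names, and the "repo name to set of package identifiers" mapping are all hypothetical.

```python
def repos_needing_publish(previous, current):
    """Return the names of repos whose content differs from the previous version.

    Both arguments are hypothetical mappings of repo name -> set of package
    identifiers. A repo that did not exist before also counts as changed.
    """
    return sorted(
        name for name, packages in current.items()
        if previous.get(name) != packages
    )


# Hypothetical example data: one package was added to the "custom" repo only.
previous = {"rhel-6-server": {"bash-4.1", "glibc-2.12"}, "custom": {"app-1.0"}}
current = {"rhel-6-server": {"bash-4.1", "glibc-2.12"}, "custom": {"app-1.0", "app-1.1"}}

print(repos_needing_publish(previous, current))
```

Under these assumptions only `custom` would have to be republished; the unchanged repo could be reused from the previous version.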

Comment 1 RHEL Program Management 2015-01-22 11:03:41 UTC
Since this issue was entered in Red Hat Bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 3 Stephen Benjamin 2015-01-29 10:49:52 UTC
Also, this is the same for promotion: if I'm promoting, why isn't it just a single symlink?
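
A minimal sketch of what "just a single symlink" could look like on a POSIX filesystem. The directory layout and the `promote` helper are hypothetical and are not taken from Satellite; the sketch only shows that repointing an environment at an already published version can be one atomic rename.

```python
import os
import tempfile


def promote(env_link, version_dir):
    """Point env_link at version_dir, replacing any existing link atomically."""
    tmp_link = env_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(version_dir, tmp_link)
    # Renaming over an existing symlink replaces the link itself (POSIX rename).
    os.replace(tmp_link, env_link)


# Hypothetical layout: two published versions and one environment link.
root = tempfile.mkdtemp()
v1 = os.path.join(root, "version-1")
v2 = os.path.join(root, "version-2")
os.mkdir(v1)
os.mkdir(v2)
env = os.path.join(root, "production")

promote(env, v1)
promote(env, v2)
print(os.readlink(env) == v2)
```

With this approach a promotion costs one inode for the link, independent of how many packages the version contains.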

Comment 4 Peter Vreman 2015-02-27 11:26:56 UTC
Inodes on a node fill up quickly. On our system, 20 million inodes are used by Sat6.

# df -h /var/lib/pulp
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/vg_li_lc_1017_app1-lv_app1
                      291G  239G   38G  87% /lvapp1

# df -i /var/lib/pulp
Filesystem             Inodes    IUsed IFree IUse% Mounted on
/dev/mapper/vg_li_lc_1017_app1-lv_app1
                     19333120 19332915   205  100% /lvapp1

A rough calculation of the number of symlinks (= inodes):

Library:
  RedHat repos
    4 releases (6.x,6Server,7.1,7Server) = 4 * 30000 = 120.000
Regular CVs using RedHat repos
  - Number of CVs: 6, at least 1 for each RedHat release
  - Average has 5 versions
  - Additional lifecycles (including library) => 4 versions
  => 6 * 30000 * (5 + 4) =~ 1.600.000
Composite CVs
  - Number of Composite CVs: 10, one for each Application group
  - Inherits at least one RedHat Content View
  - Average has 2 versions
  - Additional lifecycles (including library) => 4 versions
  => 10 * (2 + 4) * 30000 =~ 1.800.000

As you can see, it grows quickly, especially because each (Composite) Content View has to include a full RedHat release, which means 30000 symlinks.
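
The estimate above can be reproduced with a few lines of arithmetic. The variable names are only for illustration; the figures are the ones given in this comment.

```python
SYMLINKS_PER_RELEASE = 30_000  # symlinks for one full RedHat release

# Library: 4 releases (6.x, 6Server, 7.1, 7Server)
library = 4 * SYMLINKS_PER_RELEASE

# Regular CVs: 6 CVs, 5 versions on average plus 4 lifecycle copies
regular = 6 * SYMLINKS_PER_RELEASE * (5 + 4)

# Composite CVs: 10 CVs, 2 versions on average plus 4 lifecycle copies
composite = 10 * (2 + 4) * SYMLINKS_PER_RELEASE

total = library + regular + composite
print(library, regular, composite, total)
```

This gives 120000, 1620000 and 1800000, about 3.5 million symlinks in total, before any directories are counted.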


The inode limit is reached even sooner because the pulp content is not deleted, see BZ1184442.

Comment 5 Peter Vreman 2015-02-27 12:19:12 UTC
Checked, and it is even worse: the previous calculation did not include the required directories.

For RPMs, up to 4 directories are created.

Example of the 6Server-Server repo in the standard Library:

/var/lib/pulp/nodes/published/https/repos/Hilti-Red_Hat_Enterprise_Linux_Server-Red_Hat_Enterprise_Linux_6_Server_RPMs_x86_64_6Server/content/rpm# time find . -type d | wc -l && find . -type l | wc -l
48714
14616

That means about 63.000 inodes in total, for only the 6Server-Server repo in the Library.
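
For anyone who wants to repeat the count from a script, here is a rough Python equivalent of the two `find` commands. It is a sketch: the function name and the small demo tree are made up, and on a real system `root` would be the published repo path.

```python
import os
import tempfile


def count_dirs_and_symlinks(root):
    """Count directories (including root, like `find . -type d`) and symlinks."""
    dirs, links = 1, 0
    # os.walk does not follow symlinks by default, so linked dirs are not entered.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            if os.path.islink(os.path.join(dirpath, name)):
                links += 1  # symlink pointing at a directory
            else:
                dirs += 1
        for name in filenames:
            if os.path.islink(os.path.join(dirpath, name)):
                links += 1
    return dirs, links


# Tiny demo tree: root/a/b/pkg.rpm plus one symlink root/a/link -> pkg.rpm
demo = tempfile.mkdtemp()
os.makedirs(os.path.join(demo, "a", "b"))
target = os.path.join(demo, "a", "b", "pkg.rpm")
open(target, "w").close()
os.symlink(target, os.path.join(demo, "a", "link"))

print(count_dirs_and_symlinks(demo))
```

Each directory and each symlink uses one inode, so the sum of the two numbers is the inode cost of the published tree.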

Comment 7 Peter Vreman 2015-07-14 12:35:07 UTC
Checked with Satellite 6.1. The problem still exists.
We have a RedHat CDN update every day, and the published YUM repositories are recreated every day. That means:
1. create tmp repo (xxxxx inode operations)
2. delete live repo (xxxxx inode operations)
3. rename tmp to live (single atomic operation)

We have 70 such repositories, with an average of 30.000 inodes each.

70 * 30.000 * 2 = 4.200.000 inode operations per RedHat sync.
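
The same arithmetic as a small sketch, with an optional parameter to show what would change if only the repositories with new content were republished. The function and the `changed_repos` value of 5 are hypothetical; only the 70 and 30.000 figures come from this comment.

```python
def inode_ops_per_sync(repo_count, inodes_per_repo, changed_repos=None):
    """Estimate inode operations for one sync.

    Each republished repo costs two passes over its inodes:
    creating the tmp repo and deleting the old live repo.
    If changed_repos is None, every repo is republished.
    """
    republished = repo_count if changed_repos is None else changed_repos
    return republished * inodes_per_repo * 2


print(inode_ops_per_sync(70, 30_000))                   # all repos republished
print(inode_ops_per_sync(70, 30_000, changed_repos=5))  # hypothetical: 5 changed
```

The first call reproduces the 4.200.000 figure; the second is only an illustration of how the cost would scale with the number of changed repos.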

Comment 9 Walden Raines 2015-08-11 13:07:44 UTC
Moving to 6.2.0 since this is an RFE

Comment 16 Andrew Kofink 2016-07-22 15:15:08 UTC
Created redmine issue http://projects.theforeman.org/issues/15798 from this bug

Comment 19 Bryan Kearney 2016-08-04 20:18:23 UTC
Moving 6.2 bugs out to sat-backlog.

Comment 21 Peter Vreman 2017-02-17 15:59:33 UTC
Is this maybe fixed in http://projects.theforeman.org/issues/18032 with https://github.com/Katello/katello/pull/6597 ?

Comment 22 Justin Sherrill 2017-02-17 16:13:50 UTC
Partially, but I don't think fully. I think the heart of the issue that Stephen is getting at is that even if you just want to add one package to a content view, we generate an entirely new version, creating all new repos. This takes a large amount of time.

Comment 24 Peter Vreman 2018-03-05 15:29:29 UTC
Hi Justin,

The bug https://bugzilla.redhat.com/show_bug.cgi?id=1522912 looks similar and implements optimizations. For me it is OK to close this as a duplicate.

Peter

Comment 25 Justin Sherrill 2018-03-05 15:30:53 UTC
I agree! Thanks Peter for pointing that out. Closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1522912 ***