Created attachment 1053020 [details]
inode_usage of pulp nodes published repos

Description of problem:

To make the pulp data available for Capsules (pulp nodes), the pulp repositories are made available in /var/lib/pulp/nodes/published/https/repos. With every repository added (Content Views also create implicit repositories), a new directory is added.

Example 1: Duplication per Content View, that is O(N):
- A Red Hat Server 6.5 repository will generate 60.000 inodes (45.000 directories and 15.000 symlinks). When this repository is used by 10 Content Views, it is created 10 times, which means 600.000 inodes.

Example 2: Red Hat CDN sync, which recreates directories:
- Syncing Red Hat channels always recreates the directories for every Red Hat repository. That means that with a daily sync, the roughly 60.000 inodes of each (both non-EUS and EUS) Server repository need to be created and deleted again.

We are syncing both the non-EUS and EUS Kickstart, Server, Optional and RHSCL channels for the releases 6.5, 6.6, 6Server, 7.1 and 7Server. This totals 71 directories:

/var/lib/pulp/nodes/published/https/repos# ls -1d Hilti-Red* Hilti-Oracle* | wc -l
71

Taking an average of 30.000 inodes per repository, that means 30.000 * 2 (both create and delete) * 71 = 4.260.000 inode IO actions per day.

The shared repository of all RPMs uses only: 228.000 inodes
The pulp nodes published directory contains: 6.352.462 inodes
The pulp nodes published directory contains: 550 directories

See the attached inode_usage.txt for details.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Sync both the non-EUS and EUS Kickstart, Server, Optional and RHSCL channels for the releases 6.5, 6.6, 6Server, 7.1 and 7Server.
2. Sync all Red Hat repositories once.
3. Do an incremental sync of the Red Hat repositories and monitor the number of IO transactions on the filesystem.
4. Create and publish 20 Content Views, each containing at least the Kickstart, Server and Optional repositories.
5. Check the /var/lib/pulp/nodes/published/https/repos directory.
6. Check for duplicate directories.
7. Count the number of inodes in /var/lib/pulp/content/rpm.
8. Count the number of inodes in /var/lib/pulp/nodes/published/https/repos.

Actual results:
- Many inode transactions during a Red Hat sync; in fact the time for an incremental sync is almost the same as for a full sync.
- Inode usage in the pulp nodes published directory is N times higher than in the shared content/rpm.

Expected results:
- An incremental sync should perform at most the IO actions related to the incremental changes.
- Inode usage in the pulp nodes published directory should be at most the number of inodes used by the shared rpm content.

Additional info:
Created attachment 1053021 [details] inode_usage script
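For reference, the kind of counting used in the attached script and in steps 7 and 8 of the description can be done with plain find. The following is only a minimal sketch of such an inode_usage helper, assuming the standard /var/lib/pulp layout; it is not the attached script itself.

#!/bin/bash
# Minimal sketch: report inode usage of the shared content vs. the node-published tree.
# Assumption: standard /var/lib/pulp layout; this is not the attached inode_usage script.
for dir in /var/lib/pulp/content/rpm /var/lib/pulp/nodes/published/https/repos; do
    total=$(find "$dir" | wc -l)           # every file, directory and symlink costs one inode
    dirs=$(find "$dir" -type d | wc -l)    # directories only
    links=$(find "$dir" -type l | wc -l)   # symlinks only
    printf '%s: %s inodes (%s directories, %s symlinks)\n' "$dir" "$total" "$dirs" "$links"
done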
Since this issue was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.
Please verify whether this requires a release note, and if so, please provide suitable doc text. Thanks.
This bug requires modifications to improve the inode usage; however, it shouldn't require a rel note at this time; therefore, removing the sat61-release-notes blocker.
Jeff, please add a link to an upstream issue describing your proposal.
Done.
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.
Stuart, after pulp is updated, the existing links (the inode usage) would be cleaned up on subsequent publishes. If we don't want to wait for publishes, we can include a migration script in the solution. -jeff
Hi Jeff, maybe I'm missing something here, but under what circumstances would an *existing* content view get re-published? Regards, Stuart
Please take Composite Content Views and Environments into account. These refer to dedicated Content View Versions, and re-publishing adds a new Content View Version. The current Sat6 (without automatic latest-selection support) cannot make assumptions about what to do with Lifecycle Environment Promotions or Content View Versions.
My understanding of content view lifecycle is limited. Thanks for the clarification, Peter. Looks like we'll want to include a migration script in the solution to clean up the unwanted symlinks.
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.
The solution makes it possible for the existing symlinks created during node publishing to be blindly deleted; after the upgrade they are no longer used by the child node (capsule). The links are published in /var/lib/pulp/nodes/published, and subsequent publishes will delete them. However, if admins want them deleted prior to the next publish, it can be done in one of two ways:

1. Admins can delete them manually. Example:
   find /var/lib/pulp/nodes/published/https/repos -type l -exec rm -f {} \;
2. The RPM spec file can run #1 during upgrade.

I prefer #1. Thoughts?
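For illustration only, option #2 would amount to a %post scriptlet along these lines; this is a sketch of how an (assumed) pulp-nodes spec file could invoke the manual cleanup during an upgrade, not a change that exists anywhere.

# Sketch only: how option #2 could look in an RPM spec file (package name assumed).
# $1 is greater than 1 when the package is being upgraded rather than freshly installed.
%post
if [ "$1" -gt 1 ] ; then
    find /var/lib/pulp/nodes/published/https/repos -type l -exec rm -f {} \; || :
fi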
Using the RPM %post is not a good idea, as the removal process can take hours. And if a user forgets the manual step in the upgrade process, it will never be done, because later upgrades no longer contain this cleanup step.

Recommendation:
- Have a special pulp background task do the cleanup. Then it is certain that the cleanup will be executed, and done in a controlled way.

Alternative:
- Add it to katello-upgrade, but that is more Satellite specific.

Removing only the symlinks still leaves the directories behind. Is this by design?
Yes. The content/ directory still contains a few real files.
The major inode use is by the directory tree, see Example 1 in the description (https://bugzilla.redhat.com/show_bug.cgi?id=1244130#c0). For RHEL 6.5 EUS:
45.000 directories
15.000 symlinks

Deleting the symlinks is therefore not enough; the empty directories also need to be cleaned up.
Moving to POST since an upstream fix is available.
Understood about the need to purge the directories as well. I still think a reasonable approach is to provide admins with suggested commands to clean things up. Something like:

find /var/lib/pulp/nodes/published -type l -delete
find /var/lib/pulp/nodes/published -type d -empty -delete

We could also provide a shell script that performs this cleanup, as sketched below. A background task in pulp to do this seems like overkill.
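If a wrapper script were provided, it could be as small as the following sketch; it just runs the two find commands above, and the name, messages and error handling are assumptions rather than anything that ships with the product.

#!/bin/bash
# Sketch of a possible cleanup wrapper around the two find commands above.
# Assumption: illustrative only; no such script is shipped at this point.
set -e
PUBLISHED=/var/lib/pulp/nodes/published

echo "Removing obsolete node-publishing symlinks under $PUBLISHED ..."
find "$PUBLISHED" -type l -delete

echo "Removing empty directories left behind ..."
find "$PUBLISHED" -type d -empty -delete

echo "Cleanup done."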
I did not know about the -empty and -delete features of find. I agree that those two simple find commands are good enough to perform the cleanup.
To verify this bz, I installed Sat 6.1.6 along with a capsule to see the real issue. Now I need some clarification before upgrading to Satellite 6.1.7.

1. I'm assuming that, before the upgrade, we need to check and delete the existing published links and directories manually on the Satellite and Capsule servers as per comment 25, right?

   find /var/lib/pulp/nodes/published -type l -delete
   find /var/lib/pulp/nodes/published -type d -empty -delete

2. Then upgrade to 6.1.7 and validate whether published links and directories are still being created:
   a) on the Sat server, by publishing some CVs
   b) by syncing content from the Sat server to the capsule?

Note: I synced the RHEL 6 and 7 Server, Kickstart, Optional and RHSCL repos on the Sat server.

@Jeff: could you please take a look and confirm whether my assumptions are correct based on the bz history?
(In reply to Jeff Ortel from comment #30)
> (In reply to Sachin Ghai from comment #28)
> > To verify this bz, I installed Sat 6.1.6 along with capsule to see the real
> > issue. Now I need some clarification before upgrading to satellite 6.1.7
> >
> > I'm assuming, before upgrade, we need to check and delete existing published
> > links and directories manually on satellite and capsule server as per
> > comment25, right ?
>
> Yes.
>
> > find /var/lib/pulp/nodes/published -type l -delete
> > find /var/lib/pulp/nodes/published -type d -empty -delete

Thanks Jeff. I removed all existing soft links and empty directories before the upgrade:

[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type l -delete
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type l | wc -l
0
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d -empty -delete
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d -empty | wc -l
0

Now proceeding with the upgrade to 6.1.7 and will update the test results here.
Upgrade went fine from Sat 6.1.6 to 6.1.7 along with the external capsule. After the upgrade, I ran a capsule sync and it finished successfully.

--
[root@ibm-x3550 ~]# hammer -u admin -p changeme capsule content synchronize --id=2
[............................................................................................................................................] [100%]
--

Two observations:
-------------------
1) Apache's pulp_node.conf was updated with a new stanza:

<snip>
<Directory /var/www/pulp/nodes/content >
    Options FollowSymLinks Indexes
    SSLRequireSSL
<snip>

2) Metadata is re-generated on every capsule sync:

Feb 9 18:53:32 ibm-x3550m3-06 pulp: pulp.plugins.pulp_rpm.plugins.distributors.yum.metadata.metadata:WARNING: Overwriting existing metadata file [/var/lib/pulp/working/repos/Default_Organization-Dev-cv_rhel7-capsule_rhel7-cap_rhel7_617/distributors/yum_distributor/repodata/repomd.xml]
After the upgrade to 6.1.7 compose1, I performed the following tests:

- re-synced some of the existing Red Hat repos - success
- enabled new repos and synced them - success
- created new CVs with new and existing repos and published them - success
- published an existing CV - success
- re-synced contents from satellite -> capsule - success

I don't find any symlinks after the upgrade, and the inode count for the published repos is reduced to a very low number.

Before upgrade:
----------------
[root@ibm-x3550m3 ~]# find /var/lib/pulp/nodes/published -type l | wc -l
568186
[root@ibm-x3550m3 ~]# find /var/lib/pulp/nodes/published -type d | wc -l
2141711
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published/https/repos -name '*' | wc -l
2710169

After upgrade:
----------------
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type l | wc -l
0
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d -empty | wc -l
1
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d | wc -l
201
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published/https/repos -name '*' | wc -l
499
As per comment 33, moving this bz to verified. Thanks.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2016:0174
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.