Bug 1244130 - Pulp nodes (for capsules) disk usage uses O(N) inodes
Summary: Pulp nodes (for capsules) disk usage uses O(N) inodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Satellite
Classification: Red Hat
Component: Pulp
Version: 6.1.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: Unspecified
Assignee: satellite6-bugs
QA Contact: Sachin Ghai
URL:
Whiteboard:
Depends On:
Blocks: 1122832
 
Reported: 2015-07-17 09:00 UTC by Peter Vreman
Modified: 2022-07-09 07:40 UTC
CC: 22 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
This needs a Release Note in the errata to detail the cleanup procedure
Clone Of:
Environment:
Last Closed: 2016-02-15 15:51:39 UTC
Target Upstream Version:
Embargoed:


Attachments
inode_usage of pulp nodes published repos (83.28 KB, text/plain)
2015-07-17 09:00 UTC, Peter Vreman
no flags Details
inode_usage script (395 bytes, application/x-shellscript)
2015-07-17 09:01 UTC, Peter Vreman
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Pulp Redmine 1337 0 Normal CLOSED - CURRENTRELEASE As a user, I want to reduce the number of inodes used by nodes. 2016-02-16 20:00:30 UTC
Red Hat Knowledge Base (Solution) 2134091 0 None None None 2016-01-21 13:58:57 UTC
Red Hat Product Errata RHSA-2016:0174 0 normal SHIPPED_LIVE Moderate: Satellite 6.1.7 security, bug and enhancement fix update 2016-02-15 20:50:32 UTC

Description Peter Vreman 2015-07-17 09:00:44 UTC
Created attachment 1053020 [details]
inode_usage of pulp nodes published repos

Description of problem:
To make the pulp data available to Capsules (pulp nodes), the pulp repositories are published under /var/lib/pulp/nodes/published/https/repos.

With every repository added (Content Views also create implicit repositories), a new directory is added.

Example 1: Duplication per Content View, which is O(N):
- A Red Hat Server 6.5 repository generates 60,000 inodes (45,000 directories and 15,000 symlinks). When it is used by 10 Content Views, this Red Hat Server repository is also created 10 times, which means 600,000 inodes.

Example 2: Red Hat CDN, which recreates directories:
- Syncing Red Hat channels always recreates the directories for every Red Hat repository. That means that with a daily sync, for each (both non-EUS and EUS) Server repository those 60,000 inodes need to be created and also deleted.
We are syncing both non-EUS and EUS Kickstart, Server, Optional, and RHSCL channels for the following releases: 6.5, 6.6, 6Server, 7.1, 7Server. This totals 71 directories:

/var/lib/pulp/nodes/published/https/repos# ls -1d Hilti-Red* Hilti-Oracle* | wc -l
71

Taking an average of 30,000 inodes per repository, that means:

30,000 * 2 (both create and delete) * 71 = 4,260,000 inode IO actions need to be done per day.


The shared repository of all RPMs uses only: 228,000 inodes
The pulp nodes published directory contains: 6,352,462 inodes
The pulp nodes published directory contains: 550 directories

See the attached inode_usage.txt for details
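
For illustration only (a minimal sketch using plain find and wc, not the attached script), a per-repository breakdown can be collected along these lines:

for d in /var/lib/pulp/nodes/published/https/repos/*/; do
    dirs=$(find "$d" -type d | wc -l)      # directories under this published repo
    links=$(find "$d" -type l | wc -l)     # symlinks under this published repo
    echo "$d: $dirs directories, $links symlinks"
done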



Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Sync both non-EUS and EUS Kickstart, Server, Optional, and RHSCL channels for the following releases: 6.5, 6.6, 6Server, 7.1, 7Server.
2. Sync all RedHat repositories once
3. Do an incremental Sync of RedHat repositories and monitor the number of IO transactions on the filesystem

4. Create and publish 20 Content Views, each with at least the Kickstart, Server, and Optional repositories included
5. Check the /var/lib/pulp/nodes/published/https/repos directory
6. Check for duplicate directories
7. Count number of inodes in /var/lib/pulp/content/rpm
8. Count number of inodes in /var/lib/pulp/nodes/published/https/repos
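
For example, the counts in steps 7 and 8 can be gathered with commands along these lines (paths as used above):

find /var/lib/pulp/content/rpm | wc -l
find /var/lib/pulp/nodes/published/https/repos | wc -l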


Actual results:
- Many inode transactions during Red Hat sync; in fact, the time for an incremental sync is almost the same as for a full sync.
- Inode usage in the pulp nodes published directory is N times higher than in the shared content/rpm


Expected results:
- An incremental sync should at most perform the IO actions related to the incremental changes
- Inode usage in the pulp nodes published directory should be at most the number of inodes used by the shared rpm content


Additional info:

Comment 1 Peter Vreman 2015-07-17 09:01:24 UTC
Created attachment 1053021 [details]
inode_usage script

Comment 2 RHEL Program Management 2015-07-17 09:15:56 UTC
Since this issue was entered in Red Hat Bugzilla, the release flag has been
set to ? to ensure that it is properly evaluated for this release.

Comment 4 David O'Brien 2015-08-20 02:17:43 UTC
Please verify that this requires a release note, and if so please provide some suitable doc text.

thanks

Comment 5 Brad Buckingham 2015-08-20 13:16:47 UTC
This bug requires modifications to improve the inode usage; however, it shouldn't require a rel note at this time; therefore, removing the sat61-release-notes blocker.

Comment 6 Michael Hrivnak 2015-10-27 15:44:42 UTC
Jeff, please add a link to an upstream issue describing your proposal.

Comment 7 Jeff Ortel 2015-10-27 20:10:08 UTC
Done.

Comment 8 pulp-infra@redhat.com 2015-10-27 20:36:21 UTC
The Pulp upstream bug status is at NEW. Updating the external tracker on this bug.

Comment 9 pulp-infra@redhat.com 2015-10-27 20:36:25 UTC
The Pulp upstream bug priority is at Normal. Updating the external tracker on this bug.

Comment 12 Jeff Ortel 2015-11-30 16:03:23 UTC
Stuart,

After pulp is updated, the existing links (inode usage) would be cleaned up on subsequent publishes. If we don't want to wait for publishes, we can include a migration script in the solution.

-jeff

Comment 13 Stuart Auchterlonie 2015-12-01 10:40:23 UTC
Hi Jeff,

Maybe I'm missing something here, but under what circumstances would an
*existing* content view get re-published?


Regards
Stuart

Comment 14 Peter Vreman 2015-12-01 12:21:20 UTC
Please take care of Composite Content Views and Environments. These refer to dedicated Content View Versions. Re-publishing adds a new Content View Version. The current Sat6 (without automatic latest-selection support) cannot make assumptions about what to do with Lifecycle Environment Promotions or Content View Versions.

Comment 15 Jeff Ortel 2015-12-01 15:41:32 UTC
My understanding of content view lifecycle is limited.  Thanks for the clarification, Peter.  Looks like we'll want to include a migration script in the solution to clean up the unwanted symlinks.

Comment 16 pulp-infra@redhat.com 2015-12-09 21:00:20 UTC
The Pulp upstream bug status is at ASSIGNED. Updating the external tracker on this bug.

Comment 17 pulp-infra@redhat.com 2015-12-10 23:00:19 UTC
The Pulp upstream bug status is at POST. Updating the external tracker on this bug.

Comment 18 pulp-infra@redhat.com 2015-12-15 15:30:25 UTC
The Pulp upstream bug status is at MODIFIED. Updating the external tracker on this bug.

Comment 19 Jeff Ortel 2015-12-18 14:28:33 UTC
The solution makes it possible for existing symlinks created during node publishing to be blindly deleted.  After upgrade they are no longer used by the child node (capsule).

The links are published in /var/lib/pulp/nodes/published

Subsequent publishes will delete the links.  However if admins want them deleted prior to the next publish, it can be done in one of two ways:

1. Admins can delete them manually.
   Example: find /var/lib/pulp/nodes/published/https/repos -type l -exec rm -f {} \;

2. The RPM spec file can run #1 during upgrade.
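   For illustration of option #2, a hypothetical scriptlet (not the actual spec file) could look like:

   %post
   # On upgrade only ($1 >= 2): remove the old node-publish symlinks
   if [ "$1" -ge 2 ]; then
       find /var/lib/pulp/nodes/published/https/repos -type l -exec rm -f {} \; || :
   fi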

I prefer #1.  Thoughts?

Comment 20 Peter Vreman 2015-12-18 15:21:20 UTC
Using the RPM %post is not good as the removal process can take hours.

If a user forgets the manual step in the upgrade process, it will never be done, because subsequent upgrades no longer contain this cleanup step.

Recommendation:
- Have a dedicated pulp background task do the cleanup. Then the cleanup is guaranteed to be executed, and it is done in a controlled way.

Alternative:
- add it to katello-upgrade, but that is something more Satellite specific.

Removing only the symlinks still leaves the directories behind. Is this by design?

Comment 21 Jeff Ortel 2015-12-18 20:27:04 UTC
Yes.  The content/ directory still contains a few real files.

Comment 22 Peter Vreman 2015-12-21 08:50:45 UTC
The major inode usage is from the directory tree; see Example 1 in the description https://bugzilla.redhat.com/show_bug.cgi?id=1244130#c0.

For RHEL 6.5 EUS:
45,000 directories
15,000 symlinks

Deleting the symlinks is therefore not enough; the empty directories also need to be cleaned up.

Comment 23 Bryan Kearney 2016-01-04 18:33:34 UTC
Moving to POST since an upstream fix is available.

Comment 25 Jeff Ortel 2016-01-19 16:17:52 UTC
Understood about the need to purge the directories as well. I still think a reasonable approach is to provide admins with suggested commands to clean things up.

Something like:

find /var/lib/pulp/nodes/published -type l -delete
find /var/lib/pulp/nodes/published -type d -empty -delete

We could also provide a shell script that performs this cleanup. A background task in pulp to do this seems like overkill.
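
A minimal sketch of such a cleanup script (illustration only; it just wraps the two find commands above):

#!/bin/sh
# Remove stale node-publish symlinks, then prune the directories left empty.
published=/var/lib/pulp/nodes/published
find "$published" -type l -delete
find "$published" -type d -empty -delete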

Comment 27 Peter Vreman 2016-01-25 12:57:56 UTC
I did not know about the -empty and -delete features of find.
I agree that using those two simple find commands is good enough to perform the cleanup.

Comment 28 Sachin Ghai 2016-02-09 07:16:34 UTC
To verify this bz, I installed Sat 6.1.6 along with a capsule to see the real issue. Now I need some clarification before upgrading to Satellite 6.1.7.

1. I'm assuming that before the upgrade we need to check and delete the existing published links and directories manually on the Satellite and Capsule servers as per comment 25, right?

find /var/lib/pulp/nodes/published -type l -delete
find /var/lib/pulp/nodes/published -type d -empty -delete

2. Then, after upgrading to 6.1.7, validate whether published links and directories are being created:

a) on the Satellite server, by publishing some CVs on the Satellite server
b) by syncing contents from the Satellite server to the capsule?


Note: I synced the RHEL 6 and 7 Server, Kickstart, Optional, and RHSCL repos on the Satellite server.

@Jeff: could you please take a look and confirm whether my assumptions are correct based on the bz history?

Comment 31 Sachin Ghai 2016-02-09 17:04:14 UTC
(In reply to Jeff Ortel from comment #30)
> (In reply to Sachin Ghai from comment #28)
> > To verify this bz, I installed Sat 6.1.6 along with capsule to see the real
> > issue. Now I need some clarification before upgrading to satellite 6.1.7
> > 
> > I'm assuming, before upgrade, we need to check and delete existing published
> > links and directories manually on satellite and capsule server as per
> > comment25, right ?
> 
> Yes.
> 
> > 
> > find /var/lib/pulp/nodes/published -type l -delete
> > find /var/lib/pulp/nodes/published -type d -empty -delete

Thanks Jeff. I removed all existing soft links and empty directories before the upgrade.

[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type l -delete
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type l | wc -l
0
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d -empty -delete
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d -empty | wc -l
0


Now proceeding with upgrade with 6.1.7 and will update the test results here.

Comment 32 Sachin Ghai 2016-02-09 18:03:32 UTC
Upgrade went fine from Sat 6.1.6 to 6.1.7 along with the external capsule. Later, after the upgrade, I ran a capsule sync and it finished successfully.

--
 [root@ibm-x3550 ~]# hammer -u admin -p changeme capsule content synchronize --id=2
[............................................................................................................................................] [100%]

--

Two observations:
-------------------

1) Apache's pulp_node.conf updated with new stanza:

<snip>
<Directory /var/www/pulp/nodes/content >
  Options FollowSymLinks Indexes
  SSLRequireSSL
<snip>

2) Metadata is re-generated on every capsule sync:
Feb  9 18:53:32 ibm-x3550m3-06 pulp: pulp.plugins.pulp_rpm.plugins.distributors.yum.metadata.metadata:WARNING: Overwriting existing metadata file [/var/lib/pulp/working/repos/Default_Organization-Dev-cv_rhel7-capsule_rhel7-cap_rhel7_617/distributors/yum_distributor/repodata/repomd.xml]

Comment 33 Sachin Ghai 2016-02-10 09:41:55 UTC
After the upgrade to 6.1.7 compose 1, I performed the following tests:

- re-synced some of the existing Red Hat repos - success
- enabled new repos and synced them - success
- created new CV with new and existing repos and published them - success
- published existing CV - success
- re-synced contents from satellite -> capsule - success

I don't find any symlinks after the upgrade. Also, the inode count for the published repos is reduced to a very low number.

Before upgrade:
----------------

[root@ibm-x3550m3 ~]# find /var/lib/pulp/nodes/published -type l | wc -l
568186
[root@ibm-x3550m3 ~]# find /var/lib/pulp/nodes/published -type d  | wc -l
2141711
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published/https/repos -name '*' | wc -l
2710169

After upgrade:
----------------

[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type l | wc -l
0
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d -empty | wc -l
1
[root@ibm-x3550 ~]# find /var/lib/pulp/nodes/published -type d  | wc -l
201
[root@ibm-x3550 ~]#  find /var/lib/pulp/nodes/published/https/repos -name '*' | wc -l
499

Comment 34 Sachin Ghai 2016-02-10 09:43:19 UTC
As per comment 33, moving this bz to VERIFIED. Thanks.

Comment 36 errata-xmlrpc 2016-02-15 15:51:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:0174

Comment 39 pulp-infra@redhat.com 2016-02-16 20:00:31 UTC
The Pulp upstream bug status is at CLOSED - CURRENTRELEASE. Updating the external tracker on this bug.

