Bug 495815 - yum metadata generation in /var/cache/rhn can cause extreme server load
Status: CLOSED CURRENTRELEASE
Product: Red Hat Satellite 5
Classification: Red Hat
Component: Server
Version: 520
Hardware: All
OS: Linux
Priority: urgent
Severity: high
Assigned To: Pradeep Kilambi
QA Contact: Jeff Ortel
Depends On:
Blocks: 485807
Reported: 2009-04-14 18:30 EDT by Xixi
Modified: 2009-09-23 11:04 EDT
CC: 5 users

Fixed In Version: sat530
Doc Type: Bug Fix
Clone Of: 495814
Last Closed: 2009-09-10 16:35:34 EDT
Description Xixi 2009-04-14 18:30:48 EDT
Cloning for sat 5.3.0 - even though the code may not be necessary in Satellite 5.3.0, QA should cover this case to make sure the issue is not seen in 5.3.0 (per mmccune and prad on IRC just now).

+++ This bug was initially created as a clone of Bug #495814 +++

If the cache for a given channel needs to be regenerated in /var/cache/rhn, every client request for that new metadata kicks off a process to regenerate the files.

This can cause extreme load on the Satellite server, and each thread is essentially doing the same work.
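
The fix itself is not shown in this report; as an illustration only, a common way to avoid this kind of cache stampede is to serialize regeneration behind an exclusive file lock, so that concurrent requests wait on one regeneration instead of each starting their own. A minimal hypothetical sketch (not the actual Satellite code; regenerate_metadata is an invented placeholder):

import fcntl
import os

CACHE_DIR = "/var/cache/rhn"   # cache path from this report

def regenerate_metadata(channel):
    # Invented placeholder for the expensive repodata generation step.
    pass

def get_metadata(channel):
    cache_file = os.path.join(CACHE_DIR, channel, "repodata", "repomd.xml")
    lock_path = os.path.join(CACHE_DIR, channel + ".lock")
    if not os.path.exists(cache_file):
        lock = open(lock_path, "w")
        # Only the first request regenerates; later requests block here
        # instead of each spawning a duplicate regeneration process.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            # Re-check after acquiring the lock: another request may have
            # already finished the regeneration while we were waiting.
            if not os.path.exists(cache_file):
                regenerate_metadata(channel)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
            lock.close()
    with open(cache_file, "rb") as f:
        return f.read()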

In order to reproduce this issue I wrote a simple multi-threaded Python utility to spawn multiple yum requests to an RHN Satellite server.  This client spins up 10 threads, each doing:

yum clean all && yum search zsh

with separate --installroot parameters to allow simultaneous execution.
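
(The script referenced in the steps below lives on an internal SVN server; the following is a minimal sketch of an equivalent load generator, assuming the 10-thread yum loop described above. The /tmp/yum-load-%d roots and the endless loop are my own choices, not necessarily those of yum-load-test.py.)

import subprocess
import threading

NUM_THREADS = 10   # 10 concurrent yum loops per client, as described above

def yum_loop(thread_id):
    # A separate --installroot per thread lets the yum instances run
    # simultaneously without contending for the same lock and cache.
    root = "/tmp/yum-load-%d" % thread_id
    while True:
        subprocess.call("yum --installroot=%s clean all && "
                        "yum --installroot=%s search zsh" % (root, root),
                        shell=True)

threads = [threading.Thread(target=yum_loop, args=(i,)) for i in range(NUM_THREADS)]
for t in threads:
    t.daemon = True   # so Ctrl-C can stop the load test
    t.start()
for t in threads:
    t.join()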

After setting up 2 RHEL5 clients, each with my load simulator, I was quickly able to get my Satellite to reach a load of *40-80*, with it eventually ceasing to be accessible.

** Steps to reproduce the yum 'metadata storm' on a 5.2 Satellite:

1) Register at least 2 RHEL5 clients to your Satellite

2) Make sure your RHEL5 channel is populated and synced

3) Check out: 
http://svn.rhndev.redhat.com/viewcvs/trunk/eng/scripts/load-testing/yum-load-test.py

4) On each RHEL5 client as root execute: 'python yum-load-test.py'

5) On your RHN Satellite server run: 'rm -rf /var/cache/rhn/'

6) Wait. This will cause each client request to start regeneration of the metadata for the RHEL5 channel.  As these requests pile up, the server is quickly brought to its knees.

The more clients you have, the quicker it will die.
Comment 1 Xixi 2009-04-14 18:42:46 EDT
bug 495814 for sat52maint
bug 495816 for sat51maint
bug 495815 for sat530-triage
Comment 3 Jeff Ortel 2009-06-29 17:27:50 EDT
1) Registered (2) systems and subscribed them to a fully synced RHEL 5 channel.
2) Started the python script http://svn.rhndev.redhat.com/viewcvs/trunk/eng/scripts/load-testing/yum-load-test.py on both systems.
3) Ran rm -rf /var/cache/rhn/ on the satellite.
4) Waited while /var/cache/rhn/repodata/ was being regenerated.
5) Accessed the satellite web UI over the next 20 minutes; the satellite still seemed accessible.

Satellite does not seem to die.
Comment 4 Michael Mráka 2009-08-14 08:07:02 EDT
Verified in stage -> RELEASE_PENDING.

* registered 2 rhel5 clients
* started yum-load-test.py
* removed files from /var/cache/rhn/
* load didn't exceed 1.5
# sar -q 30 10
Linux 2.6.9-89.0.3.ELsmp (dell-pem710-01.rhts.eng.bos.redhat.com)       08/14/2009

07:52:25 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15
07:52:55 AM        12       454      1.39      1.24      0.68
07:53:25 AM         1       454      1.37      1.24      0.70
07:53:55 AM         1       452      1.22      1.22      0.71
07:54:25 AM         0       452      1.13      1.20      0.72
07:54:55 AM         1       450      1.30      1.23      0.74
07:55:25 AM         0       451      1.49      1.28      0.78
07:55:55 AM         0       449      0.98      1.17      0.75
07:56:25 AM         0       451      0.59      1.06      0.73
07:56:55 AM         0       449      0.36      0.96      0.70
07:57:25 AM         0       451      0.28      0.88      0.69
Average:            2       451      1.01      1.15      0.72
Comment 5 Brandon Perkins 2009-09-10 16:35:34 EDT
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-1434.html
