Bug 1028830 - High level of error messages consuming excessive disk space [NEEDINFO]
Status: CLOSED CURRENTRELEASE
Product: Red Hat Hardware Certification Program
Classification: Red Hat
Component: Hardware Catalog
Version: 5.2
Hardware: All Linux
Priority: high
Severity: high
Assigned To: hwcert-catalog
Depends On:
Blocks:
Reported: 2013-11-10 20:01 EST by Mark Keir
Modified: 2013-11-18 21:06 EST (History)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-11-18 21:06:15 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rlandry: needinfo? (djansen)


Attachments: None
Description Mark Keir 2013-11-10 20:01:59 EST
Description of problem:

Since the installation of version 5.2 of HWCert, the rate of disk use has increased above the normal pattern.  This is believed to be due to an error appearing in the Apache error log from HWCert.

Version-Release number of selected component (if applicable):

5.2

How reproducible:

Examine the disk use on the Bugzilla/HWCert web servers, noting that HWCert 5.2 was installed on October 31.

Steps to Reproduce:
1.
2.
3.

Actual results:

The disk use graphs show a steeper slope since the upgrade:
http://munin-soc01.util.phx2.redhat.com/munin/app.bz.hst.phx2.redhat.com/bzweb02.app.bz.hst.phx2.redhat.com/df.html
http://munin-soc01.util.phx2.redhat.com/munin/app.bz.hst.phx2.redhat.com/bzweb01.app.bz.hst.phx2.redhat.com/df.html

The gzipped error log files are much larger after the upgrade: roughly 0.6-1.3 MB per day before it, and roughly 1.8-3.8 MB per day afterwards.

[root@bzweb02 old]# ls -lt error_log.2013-1* |head -n 20
-rw-r--r-- 1 root root 2068657 Nov 10 04:04 error_log.2013-11-09.1.gz
-rw-r--r-- 1 root root 3178676 Nov 10 04:03 error_log.2013-11-08.1.gz
-rw-r--r-- 1 root root 2980400 Nov 10 04:03 error_log.2013-11-07.1.gz
-rw-r--r-- 1 root root 3214002 Nov 10 04:03 error_log.2013-11-06.1.gz
-rw-r--r-- 1 root root 3336792 Nov 10 04:03 error_log.2013-11-05.1.gz
-rw-r--r-- 1 root root 3143947 Nov 10 04:03 error_log.2013-11-04.1.gz
-rw-r--r-- 1 root root 2760439 Nov 10 04:03 error_log.2013-11-03.1.gz
-rw-r--r-- 1 root root 3752246 Nov  3 04:02 error_log.2013-11-02.1.gz
-rw-r--r-- 1 root root 1816546 Nov  3 04:02 error_log.2013-11-01.1.gz
-rw-r--r-- 1 root root 1273733 Nov  3 04:02 error_log.2013-10-30.1.gz
-rw-r--r-- 1 root root  918362 Nov  3 04:02 error_log.2013-10-31.1.gz
-rw-r--r-- 1 root root  832319 Nov  3 04:02 error_log.2013-10-29.1.gz
-rw-r--r-- 1 root root  670710 Nov  3 04:02 error_log.2013-10-27.1.gz
-rw-r--r-- 1 root root  744449 Nov  3 04:02 error_log.2013-10-28.1.gz
-rw-r--r-- 1 root root  627779 Oct 27 04:03 error_log.2013-10-26.1.gz
-rw-r--r-- 1 root root  802046 Oct 27 04:03 error_log.2013-10-24.1.gz
-rw-r--r-- 1 root root  751501 Oct 27 04:03 error_log.2013-10-25.1.gz
-rw-r--r-- 1 root root  838805 Oct 27 04:03 error_log.2013-10-22.1.gz
-rw-r--r-- 1 root root  823818 Oct 27 04:03 error_log.2013-10-23.1.gz
-rw-r--r-- 1 root root  862592 Oct 27 04:03 error_log.2013-10-21.1.gz
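
To put a number on the growth, the listing above can be split at the install date and the sizes averaged on each side. This is only a rough Python sketch: the helper name `split_sizes` is made up for illustration, the cutoff of October 31 is taken from the install date given in this report, and the parsing assumes the nine-column `ls -l` layout shown above.

```python
import re
from datetime import date
from statistics import mean

# Matches the date embedded in names like error_log.2013-11-09.1.gz
NAME_RE = re.compile(r"error_log\.(\d{4})-(\d{2})-(\d{2})\.")

def split_sizes(ls_lines, cutoff):
    """Return (before, after): sizes in bytes of logs dated up to and after cutoff."""
    before, after = [], []
    for line in ls_lines:
        parts = line.split()
        match = NAME_RE.search(line)
        if len(parts) < 9 or match is None:
            continue  # not an `ls -l` entry for a dated error log
        size = int(parts[4])  # fifth column of `ls -l` is the size in bytes
        log_date = date(*map(int, match.groups()))
        (after if log_date > cutoff else before).append(size)
    return before, after

# A few lines copied from the listing above.
listing = [
    "-rw-r--r-- 1 root root 2068657 Nov 10 04:04 error_log.2013-11-09.1.gz",
    "-rw-r--r-- 1 root root 3178676 Nov 10 04:03 error_log.2013-11-08.1.gz",
    "-rw-r--r-- 1 root root  918362 Nov  3 04:02 error_log.2013-10-31.1.gz",
    "-rw-r--r-- 1 root root  832319 Nov  3 04:02 error_log.2013-10-29.1.gz",
]
before, after = split_sizes(listing, date(2013, 10, 31))
print("mean before:", mean(before), "mean after:", mean(after))
```

The log dated October 31 is counted as "before" here because its size matches the earlier pattern; that is a judgment call, not something the report states.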


Expected results:

Increased traffic could be expected to drive up disk use, but there has been no marked increase in traffic.

Additional info:

The error increase can be attributed to a single error line that is repeated many times.

The error line is
------
[Sun Nov 10 02:37:25 2013] [error] [client 10.5.100.29] [Sun Nov 10 02:37:25 2013] list.cgi: Use of uninitialized value in hash element at /var/www/html/hwcert/list.cgi line 185.
------
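
For triage, a line of this shape can be split into its fields with a regular expression. The following is a minimal Python sketch: the pattern is inferred from the single sample line above and is not a general Apache error log grammar, and the function name `parse_perl_warning` and the field names are illustrative assumptions.

```python
import re

# Pattern inferred from the one sample line in this report (an assumption).
WARNING_RE = re.compile(
    r"^\[(?P<logged>[^\]]+)\] \[(?P<level>\w+)\] \[client (?P<client>[^\]]+)\] "
    r"\[(?P<emitted>[^\]]+)\] (?P<script>[^:]+): (?P<message>.+) "
    r"at (?P<path>\S+) line (?P<line>\d+)\.$"
)

def parse_perl_warning(text):
    """Return a dict of fields, or None if the line does not match the pattern."""
    match = WARNING_RE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return fields

sample = (
    "[Sun Nov 10 02:37:25 2013] [error] [client 10.5.100.29] "
    "[Sun Nov 10 02:37:25 2013] list.cgi: Use of uninitialized value "
    "in hash element at /var/www/html/hwcert/list.cgi line 185."
)
parsed = parse_perl_warning(sample)
print(parsed["path"], parsed["line"], parsed["message"])
```

Grouping parsed lines by `path` and `line` would show whether a single source location accounts for most of the log volume.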

The counts below show that this error, and therefore HWCert rather than Bugzilla, is the cause: about 96% of the lines in one day's error log come from hwcert/list.cgi.

[bzweb02.app.bz.hst.phx2.redhat.com] [07:43:48 PM]
[root@bzweb02 httpd]# wc -l error_log.2013-11-10  
1546703 error_log.2013-11-10
[bzweb02.app.bz.hst.phx2.redhat.com] [07:43:59 PM]
[root@bzweb02 httpd]# grep hwcert error_log.2013-11-10  |wc -l
1485686
[bzweb02.app.bz.hst.phx2.redhat.com] [07:44:32 PM]
[root@bzweb02 httpd]# grep "hwcert/list.cgi" error_log.2013-11-10  |wc -l
1485515
[bzweb02.app.bz.hst.phx2.redhat.com] [07:45:08 PM]
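
The same tally can be done in one pass over the log instead of one `grep` per pattern. This is a hedged Python sketch, not the procedure used above: `tally_lines` and `share` are hypothetical helper names, and matching is by plain substring, which is slightly stricter than `grep` because `.` is not treated as a wildcard.

```python
from collections import Counter

def tally_lines(lines, needles):
    """Count all lines, and the lines containing each needle (like grep | wc -l)."""
    counts = Counter()
    total = 0
    for line in lines:
        total += 1
        for needle in needles:
            if needle in line:
                counts[needle] += 1
    return total, counts

def share(part, total):
    """Percentage of total represented by part; 0.0 for an empty log."""
    return 100.0 * part / total if total else 0.0

# Small in-memory example; on the server one would pass an open file instead:
#   with open("error_log.2013-11-10", errors="replace") as fh:
#       total, counts = tally_lines(fh, ["hwcert", "hwcert/list.cgi"])
demo = [
    "list.cgi: Use of uninitialized value at /var/www/html/hwcert/list.cgi line 185.",
    "something from /var/www/html/hwcert/other.cgi",
    "an unrelated Bugzilla message",
]
total, counts = tally_lines(demo, ["hwcert", "hwcert/list.cgi"])
print(total, dict(counts))

# Share of list.cgi lines, using the counts reported above.
print(round(share(1485515, 1546703), 1))
```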
Comment 1 Mark Keir 2013-11-13 22:59:02 EST
Unless corrected in the next 48 hours, the servers will run out of disk space again because of this problem.
Comment 2 Tony Fu 2013-11-14 03:45:44 EST
(In reply to Mark Keir from comment #1)
> Unless corrected in the next 48 hours, the servers will run out of disk
> space again because of this problem.

Hi Mark,

We have built a hotfix package which should fix the issue you reported.  We have requested a partner push [1] before the final live push.

Please let me know if there are any problems.


Thanks,
Tony

[1]  https://engineering.redhat.com/rt/Ticket/Display.html?id=266075
Comment 4 Wei Shen 2013-11-18 21:06:15 EST
This should be fixed in the hotfix.
