Bug 1028830

Summary: High level of error messages consuming excessive disk space
Product: [Retired] Red Hat Hardware Certification Program Reporter: Mark Keir <mkeir>
Component: Hardware Catalog Assignee: hwcert-catalog
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: high Docs Contact:
Priority: high    
Version: 5.2 CC: djansen, hwcert-catalog, rlandry, tfu, wshen
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-11-19 02:06:15 UTC Type: Bug

Description Mark Keir 2013-11-11 01:01:59 UTC
Description of problem:

Since the installation of version 5.2 of HWCert, the rate of disk use has increased above the normal pattern.  This is believed to be due to an error appearing in the Apache error log from HWCert.

Version-Release number of selected component (if applicable):

5.2

How reproducible:

Examine the disk use on the Bugzilla/HWCert web servers, noting that HWCert 5.2 was installed on October 31.

Actual results:

The disk use graphs show a steeper slope since the upgrade:
http://munin-soc01.util.phx2.redhat.com/munin/app.bz.hst.phx2.redhat.com/bzweb02.app.bz.hst.phx2.redhat.com/df.html
http://munin-soc01.util.phx2.redhat.com/munin/app.bz.hst.phx2.redhat.com/bzweb01.app.bz.hst.phx2.redhat.com/df.html

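The gzipped error log files are much larger after the upgrade, growing from roughly 0.6-0.9 MB per day through late October to roughly 2-3.8 MB per day in early November.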

[root@bzweb02 old]# ls -lt error_log.2013-1* |head -n 20
-rw-r--r-- 1 root root 2068657 Nov 10 04:04 error_log.2013-11-09.1.gz
-rw-r--r-- 1 root root 3178676 Nov 10 04:03 error_log.2013-11-08.1.gz
-rw-r--r-- 1 root root 2980400 Nov 10 04:03 error_log.2013-11-07.1.gz
-rw-r--r-- 1 root root 3214002 Nov 10 04:03 error_log.2013-11-06.1.gz
-rw-r--r-- 1 root root 3336792 Nov 10 04:03 error_log.2013-11-05.1.gz
-rw-r--r-- 1 root root 3143947 Nov 10 04:03 error_log.2013-11-04.1.gz
-rw-r--r-- 1 root root 2760439 Nov 10 04:03 error_log.2013-11-03.1.gz
-rw-r--r-- 1 root root 3752246 Nov  3 04:02 error_log.2013-11-02.1.gz
-rw-r--r-- 1 root root 1816546 Nov  3 04:02 error_log.2013-11-01.1.gz
-rw-r--r-- 1 root root 1273733 Nov  3 04:02 error_log.2013-10-30.1.gz
-rw-r--r-- 1 root root  918362 Nov  3 04:02 error_log.2013-10-31.1.gz
-rw-r--r-- 1 root root  832319 Nov  3 04:02 error_log.2013-10-29.1.gz
-rw-r--r-- 1 root root  670710 Nov  3 04:02 error_log.2013-10-27.1.gz
-rw-r--r-- 1 root root  744449 Nov  3 04:02 error_log.2013-10-28.1.gz
-rw-r--r-- 1 root root  627779 Oct 27 04:03 error_log.2013-10-26.1.gz
-rw-r--r-- 1 root root  802046 Oct 27 04:03 error_log.2013-10-24.1.gz
-rw-r--r-- 1 root root  751501 Oct 27 04:03 error_log.2013-10-25.1.gz
-rw-r--r-- 1 root root  838805 Oct 27 04:03 error_log.2013-10-22.1.gz
-rw-r--r-- 1 root root  823818 Oct 27 04:03 error_log.2013-10-23.1.gz
-rw-r--r-- 1 root root  862592 Oct 27 04:03 error_log.2013-10-21.1.gz
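The before/after comparison in the listing above can be summarised with a small helper. This is only a sketch: the function name mean_log_size is hypothetical, and the cutoff of 2013-11-01 (the first full day after the October 31 install) is an assumption about where to split the data. It expects `ls -l` output with the size in the fifth column and file names of the form error_log.YYYY-MM-DD.1.gz.

```shell
# Read `ls -l` lines on stdin and print the mean size in bytes of the
# rotated logs dated before CUTOFF and of those dated on or after it.
# Usage (hypothetical): ls -l error_log.2013-1* | mean_log_size 2013-11-01
mean_log_size() {
    awk -v cutoff="$1" '
        {
            # The last field is the file name, e.g. error_log.2013-11-09.1.gz;
            # its second dot-separated part is the log date.
            split($NF, parts, ".")
            day = parts[2]
            # Appending "" forces a string comparison of the ISO dates.
            if ((day "") < (cutoff "")) { sum_old += $5; n_old++ }
            else                        { sum_new += $5; n_new++ }
        }
        END {
            if (n_old) printf "before %d\n", sum_old / n_old
            if (n_new) printf "after %d\n", sum_new / n_new
        }'
}
```

Comparing the two means gives a single growth factor that is easier to track over time than the raw listing.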


Expected results:

Disk use should continue to grow at its pre-upgrade rate. Higher traffic could drive up disk use, but there has been no marked increase in traffic.

Additional info:

The error increase can be attributed to a single error line that is repeated many times.

The error line is
------
[Sun Nov 10 02:37:25 2013] [error] [client 10.5.100.29] [Sun Nov 10 02:37:25 2013] list.cgi: Use of uninitialized value in hash element at /var/www/html/hwcert/list.cgi line 185.
------
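One way to confirm that a single message dominates the log is to strip the leading bracketed fields (timestamp, level, client) so that repeats group together, then count them. The helper below is a sketch, not the method used for this report; the name top_errors is hypothetical, and it assumes every variable field in a line appears as a leading [...] group, as in the line above.

```shell
# Print the most frequent messages in an Apache error log, ignoring the
# leading bracketed fields so that repeated messages are counted together.
# Usage (hypothetical): top_errors FILE [COUNT]
top_errors() {
    # Remove every leading "[...]" group and the spaces after it, then
    # count identical remaining messages and list the most common first.
    sed -e 's/^\(\[[^]]*\] *\)*//' "$1" | sort | uniq -c | sort -rn | head -n "${2:-5}"
}
```

Each output line is a count followed by the message text, so a runaway warning shows up at the top with a count far above the rest.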

The following line counts show that this error, and therefore HWCert rather than Bugzilla, is the cause.

[bzweb02.app.bz.hst.phx2.redhat.com] [07:43:48 PM]
[root@bzweb02 httpd]# wc -l error_log.2013-11-10  
1546703 error_log.2013-11-10
[bzweb02.app.bz.hst.phx2.redhat.com] [07:43:59 PM]
[root@bzweb02 httpd]# grep hwcert error_log.2013-11-10  |wc -l
1485686
[bzweb02.app.bz.hst.phx2.redhat.com] [07:44:32 PM]
[root@bzweb02 httpd]# grep "hwcert/list.cgi" error_log.2013-11-10  |wc -l
1485515
[bzweb02.app.bz.hst.phx2.redhat.com] [07:45:08 PM]
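From the counts above, 1485515 of 1546703 lines (about 96%) mention hwcert/list.cgi. The same ratio can be computed directly; the following is a minimal sketch, and the function name log_share is hypothetical.

```shell
# Print the percentage of lines in FILE that contain the fixed string PATTERN.
# Usage (hypothetical): log_share "hwcert/list.cgi" error_log.2013-11-10
log_share() {
    pattern=$1
    file=$2
    total=$(wc -l < "$file")
    # grep -c exits non-zero when nothing matches but still prints 0.
    matched=$(grep -cF -- "$pattern" "$file" || true)
    awk -v m="$matched" -v t="$total" 'BEGIN {
        if (t == 0) { print "0.0"; exit }
        printf "%.1f\n", 100 * m / t
    }'
}
```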

Comment 1 Mark Keir 2013-11-14 03:59:02 UTC
Unless corrected in the next 48 hours, the servers will run out of disk space again because of this problem.

Comment 2 Tony Fu 2013-11-14 08:45:44 UTC
(In reply to Mark Keir from comment #1)
> Unless corrected in the next 48 hours, the servers will run out of disk
> space again because of this problem.

Hi Mark,

We have built a hotfix package that should fix the issue you reported. We have now requested a partner push [1], which precedes the final live push.

Please let me know if there are any problems.


Thanks,
Tony

[1]  https://engineering.redhat.com/rt/Ticket/Display.html?id=266075

Comment 4 Wei Shen 2013-11-19 02:06:15 UTC
This should be fixed by the hotfix.

Comment 5 Red Hat Bugzilla 2023-09-14 01:53:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days