Bug 1013161 - improve logconv.pl performance with large access logs
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assigned To: Rich Megginson
QA Contact: Sankar Ramalingam
Depends On:
Blocks: 1013894 1061410
Reported: 2013-09-27 19:34 EDT by Rich Megginson
Modified: 2014-10-14 03:51 EDT

See Also:
Fixed In Version: 389-ds-base-1.2.11.15-34.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1013894
Environment:
Last Closed: 2014-10-14 03:51:22 EDT


Attachments: None

Description Rich Megginson 2013-09-27 19:34:51 EDT
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47387

Analysis of large access logs needs to be much faster.  Some areas for improvement:
* use db files for temp files
Specifically, use tied hashes, where the hashes are tied to database files, using the perl DB_File interface.
 * for simple arrays, use DB_RECNO
 * for hashes where order is not important, use DB_HASH
 * for hashes where order is important, use DB_BTREE
For example, with DB_BTREE:
{{{
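use DB_File;
use Fcntl;    # O_CREAT, O_RDWR

my $dbdir = "/tmp";    # assumption: any writable directory
my %h1;
# DB_BTREE keeps keys sorted, so each() returns them in key order
tie %h1, "DB_File", "$dbdir/h1.db", O_CREAT|O_RDWR, 0666, $DB_BTREE
    or die "cannot tie $dbdir/h1.db: $!";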
$h1{'e'} = 5;
$h1{'d'} = 4;
$h1{'c'} = 3;
$h1{'b'} = 2;
$h1{'a'} = 1;
while (my($k,$v) = each %h1) {
    print "$k = $v\n";
}
}}}
This prints the keys in sorted order, not insertion order:
{{{
a = 1
b = 2
c = 3
d = 4
e = 5
}}}
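
The DB_RECNO and DB_HASH cases might look like the following. This is only a minimal sketch, not the actual logconv.pl code: the file names, the File::Temp scratch directory, and the sample data are assumptions made up for illustration.

```perl
use strict;
use warnings;
use DB_File;
use Fcntl;                       # O_CREAT, O_RDWR
use File::Temp qw(tempdir);

# Hypothetical scratch directory; removed automatically at exit.
my $dbdir = tempdir(CLEANUP => 1);

# Simple array tied to a DB_RECNO file: one record per element.
my @lines;
tie @lines, "DB_File", "$dbdir/lines.db", O_CREAT|O_RDWR, 0666, $DB_RECNO
    or die "cannot tie $dbdir/lines.db: $!";
push @lines, "first", "second", "third";
print "lines[1] = $lines[1]\n";
print "count = ", scalar(@lines), "\n";

# Hash tied to a DB_HASH file: iteration order is unspecified,
# so sort the keys explicitly when order matters for output.
my %counts;
tie %counts, "DB_File", "$dbdir/counts.db", O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "cannot tie $dbdir/counts.db: $!";
$counts{$_}++ for qw(bind search search unbind search);
print "$_ = $counts{$_}\n" for sort keys %counts;
```

In both cases the data lives in the database file rather than in process memory, which is the point of the proposal for very large access logs.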

* not sure what else - perhaps optimize regular expressions?

For CBP
Comment 3 srkrishn@redhat.com 2014-08-20 08:57:41 EDT
This bug has been verified as shown below:

Total Log Lines Analysed:  8103117


----------- Access Log Output ------------

Start of Logs:    31/Jan/2012:00:00:16
End of Logs:      01/Feb/2012:00:00:45

real	5m52.450s
user	5m50.612s
sys	0m0.837s


[root@hp-dl360g4-01 ~]# logconv.pl  access.20120131-000045 | free -m
             total       used       free     shared    buffers     cached
Mem:          1876       1797         78          0        108       1373
-/+ buffers/cache:        315       1560
Swap:         4031          0       4031
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 3239 root      20   0  142m  16m 2720 R 99.9  0.4   0:04.34 perl           
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 3239 root      20   0  142m  16m 2720 R 99.9  0.9   0:05.34 perl          

this was tested on build 1.2.11.15.40
Comment 4 Sankar Ramalingam 2014-08-21 05:07:23 EDT
Based on the previous comment from Sriram, marking the bug as Verified.
Comment 5 errata-xmlrpc 2014-10-14 03:51:22 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1385.html
