Bug 1013161 - improve logconv.pl performance with large access logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: 389-ds-base
Version: 6.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Rich Megginson
QA Contact: Sankar Ramalingam
URL:
Whiteboard:
Depends On:
Blocks: 1013894 1061410
Reported: 2013-09-27 23:34 UTC by Rich Megginson
Modified: 2020-09-13 20:35 UTC

Fixed In Version: 389-ds-base-1.2.11.15-34.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 1013894
Environment:
Last Closed: 2014-10-14 07:51:22 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | 389ds 389-ds-base issues 724 | 0 | None | None | None | 2020-09-13 20:35:42 UTC
Red Hat Product Errata | RHBA-2014:1385 | 0 | normal | SHIPPED_LIVE | 389-ds-base bug fix and enhancement update | 2014-10-14 01:27:42 UTC

Description Rich Megginson 2013-09-27 23:34:51 UTC
This bug is created as a clone of upstream ticket:
https://fedorahosted.org/389/ticket/47387

Analysis of large access logs needs to be much faster.  Some areas for improvement:
* Use db files for temp files.
Specifically, use tied hashes, where each hash is tied to a database file through the Perl DB_File interface:
 * for simple arrays, use DB_RECNO
 * for hashes where order is not important, use DB_HASH
 * for hashes where order is important, use DB_BTREE
For example:
{{{
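use DB_File;   # tie interface to Berkeley DB files
use Fcntl;     # provides O_CREAT and O_RDWR
my $dbdir = "/tmp";   # assumption: any writable scratch directory
my %h1;
tie %h1, "DB_File", "$dbdir/h1.db", O_CREAT|O_RDWR, 0666, $DB_BTREE
    or die "cannot tie $dbdir/h1.db: $!";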
$h1{'e'} = 5;
$h1{'d'} = 4;
$h1{'c'} = 3;
$h1{'b'} = 2;
$h1{'a'} = 1;
while (my($k,$v) = each %h1) {
    print "$k = $v\n";
}
}}}
This prints the keys in sorted order, even though they were inserted in reverse:
{{{
a = 1
b = 2
c = 3
d = 4
e = 5
}}}
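
The other two tie types from the list above can be sketched the same way. This is only an illustration, not code from logconv.pl: the sample log lines, the file names lines.db and counts.db, and the use of File::Temp for the scratch directory are all assumptions.

```perl
use strict;
use warnings;
use DB_File;                     # tie interface to Berkeley DB files
use Fcntl;                       # O_CREAT, O_RDWR
use File::Temp qw(tempdir);

# Assumption: a throwaway scratch directory stands in for $dbdir.
my $dbdir = tempdir(CLEANUP => 1);

# Simple array backed by a record-number file (DB_RECNO).
my @lines;
tie @lines, "DB_File", "$dbdir/lines.db", O_CREAT|O_RDWR, 0666, $DB_RECNO
    or die "cannot tie $dbdir/lines.db: $!";
push @lines, "conn=1 op=0 BIND", "conn=1 op=1 SRCH", "conn=2 op=0 SRCH";

# Hash where key order does not matter (DB_HASH).
my %counts;
tie %counts, "DB_File", "$dbdir/counts.db", O_CREAT|O_RDWR, 0666, $DB_HASH
    or die "cannot tie $dbdir/counts.db: $!";

# Count operations by the last word on each stored line.
for my $line (@lines) {
    my ($op) = $line =~ /(\w+)$/;
    $counts{$op}++;
}

# Copy the results into ordinary variables before releasing the files.
my $nlines = scalar @lines;
my %result = %counts;
untie %counts;
untie @lines;

# DB_HASH gives no ordering guarantee, so sort explicitly for display.
print "$_ = $result{$_}\n" for sort keys %result;
```

Both containers live on disk rather than in memory, which is the point of the proposal; the price is a database access for every element read or written.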

* not sure what else - perhaps optimize regular expressions?
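
One possible shape for the regular-expression idea is sketched below: build the pattern once with qr// and skip non-matching lines with a cheap substring test before running the regex. The sample lines and the pattern are hypothetical and are not taken from logconv.pl. Whether this helps on real logs would have to be measured.

```perl
use strict;
use warnings;

# Hypothetical sample lines in the access-log style; not from the bug report.
my @sample = (
    '[31/Jan/2012:00:00:16 +0000] conn=1 op=0 BIND dn="cn=x" method=128 version=3',
    '[31/Jan/2012:00:00:17 +0000] conn=1 op=1 SRCH base="dc=example,dc=com" scope=2',
    '[31/Jan/2012:00:00:18 +0000] conn=1 op=1 RESULT err=0 tag=101 nentries=1',
);

# Build the pattern once, outside the per-line loop.
my $srch_re = qr/ conn=(\d+) op=(\d+) SRCH base="([^"]*)"/;

my @searches;
for my $line (@sample) {
    # Cheap fixed-string test first; only candidate lines reach the regex.
    next if index($line, ' SRCH ') < 0;
    if ($line =~ $srch_re) {
        push @searches, { conn => $1, op => $2, base => $3 };
    }
}

print scalar(@searches), " search operation(s) found\n";
```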

For CBP

Comment 3 srkrishn@redhat.com 2014-08-20 12:57:41 UTC
This bug has been verified as shown below:

Total Log Lines Analysed:  8103117


----------- Access Log Output ------------

Start of Logs:    31/Jan/2012:00:00:16
End of Logs:      01/Feb/2012:00:00:45

real	5m52.450s
user	5m50.612s
sys	0m0.837s


[root@hp-dl360g4-01 ~]# logconv.pl  access.20120131-000045 | free -m
             total       used       free     shared    buffers     cached
Mem:          1876       1797         78          0        108       1373
-/+ buffers/cache:        315       1560
Swap:         4031          0       4031
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 3239 root      20   0  142m  16m 2720 R 99.9  0.4   0:04.34 perl           
 PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
 3239 root      20   0  142m  16m 2720 R 99.9  0.9   0:05.34 perl          

This was tested on build 1.2.11.15.40.

Comment 4 Sankar Ramalingam 2014-08-21 09:07:23 UTC
Based on previous comment from Sriram, marking the bug as Verified.

Comment 5 errata-xmlrpc 2014-10-14 07:51:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1385.html

