Bug 475338

Summary: LOG: the intenal type of maxlogsize, maxdiskspace and minfreespace should be 64-bit integer
Product: [Retired] 389 Reporter: Noriko Hosoi <nhosoi>
Component: Directory ServerAssignee: Noriko Hosoi <nhosoi>
Status: CLOSED CURRENTRELEASE QA Contact: Chandrasekar Kannan <ckannan>
Severity: medium Docs Contact:
Priority: low    
Version: 1.1.3CC: benl, jgalipea, rmeggins
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: 8.1 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-04-29 23:08:42 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 249650, 493682    
Attachments:
Description Flags
cvs diff ldap/servers/slapd/{slap.h, log.h, log.c}
none
cvs diff ldap/servers/slapd/{log.h, log.c}
none
cvs commit message
none
cvs diff ldapserver/ldap/servers/slapd/log.c
none
revised cvs diff ldapserver/ldap/servers/slapd/log.c
none
cvs commit message
none
revised cvs diff ldapserver/ldap/servers/slapd/log.c
none
cvs commit message none

Description Noriko Hosoi 2008-12-08 21:58:27 UTC
Description of problem:
These config parameters take the value in megabytes.
  nsslapd-*log-logmaxdiskspace
  nsslapd-*log-logminfreediskspace
These values are internally converted to bytes.  The variables to store them had the type "int", which overflows easily.  They should have the type PRInt64.

Comment 1 Noriko Hosoi 2008-12-08 22:45:40 UTC
Created attachment 326221 [details]
cvs diff ldap/servers/slapd/{slap.h, log.h, log.c}

Fix Description: 
Changing the type of the following variables from int to PRInt64.
  slapdFrontendConfig_t
    accesslog_maxdiskspace;
    accesslog_minfreespace;
    errorlog_maxdiskspace;
    errorlog_minfreespace;
    auditlog_maxdiskspace;
    auditlog_minfreespace;
  struct logging_opts
    log_access_maxdiskspace;
    log_access_minfreespace;
    log_error_maxdiskspace;
    log_error_minfreespace;
    log_audit_maxdiskspace;
    log_audit_minfreespace;

Comment 2 Rich Megginson 2008-12-09 00:06:59 UTC
Do the log_*_maxlogsize parameters need to be 64-bit values too?

For converting a string to a 64-bit value, you should use strtoll - I think that's what Nathan did in the dna code which also has to handle explicit 64-bit values.

Comment 3 Noriko Hosoi 2008-12-09 00:31:44 UTC
(In reply to comment #2)
> Do the log_*_maxlogsize parameters need to be 64-bit values too?

I was questioning myself... :)  Should we support log files larger than 2GB?  Perhaps, we should...

Comment 4 Rich Megginson 2008-12-09 15:11:32 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > Do the log_*_maxlogsize parameters need to be 64-bit values too?
> 
> I was questioning myself... :)  Should we support log files larger than 2GB? 
> Perhaps, we should...

I think we should investigate.  What other code changes are required?

Comment 5 Noriko Hosoi 2008-12-09 23:23:51 UTC
Created attachment 326428 [details]
cvs diff ldap/servers/slapd/{log.h, log.c}

Description: In addition to maxdiskspace and minfreespace, allow maxlogsize larger than 2GB.

The type of the following internal variables are now PRInt64:
  struct logging_opts
    log_access_maxlogsize;
    log_access_maxdiskspace;
    log_access_minfreespace;
    log_error_maxlogsize;
    log_error_maxdiskspace;
    log_error_minfreespace;
    log_audit_maxlogsize;
    log_audit_maxdiskspace;
    log_audit_minfreespace;

Comment 6 Noriko Hosoi 2008-12-10 06:24:48 UTC
Created attachment 326453 [details]
cvs commit message

Reviewed by Rich (Thank you!!)

Checked in into CVS HEAD.

Comment 7 Jenny Severance 2009-03-31 13:20:49 UTC
Please add sets to verify these are now 64bit.

Comment 8 Noriko Hosoi 2009-03-31 18:31:33 UTC
I think testing this set would cover the other cases, too.

0. choose a machine with the plenty of disk space (5GB is needed just for the error log...)
1. shutdown the server
2. set these config parameters in dse.ldif
nsslapd-errorlog-maxlogsperdir: 5
nsslapd-errorlog-maxlogsize: 3072
nsslapd-errorlog-logmaxdiskspace: 5120
(The last 2 values are in mega bytes.)
3. restart the server with -d 1 to grow the error log file as fast as possible
4. stress the server with any method, e.g., ldclt.
5. the error log file size should be limited by 3GB (3221225472 bytes).
6. even though the maxlogsperdir is 5, an older error log should be removed when the 3-rd log file is generated since the error log is allowed to consume the disk space up to 5GB.

If 5. and 6. are satisfied, this bug is verified.
   
(Sorry, Jenny.  This test would take a looong time...)

Comment 9 Jenny Severance 2009-04-13 13:58:42 UTC
FYI - I am currently working on verifying this bug.

Comment 10 Jenny Severance 2009-04-15 13:14:04 UTC
Looks like the log did not rotate?

error log configuration:

ldapsearch -x -D "cn=Directory Manager" -w Secret123 -b "cn=config" | grep nsslapd-errorlog
nsslapd-errorlog-level: 0
nsslapd-errorlog-logging-enabled: on
nsslapd-errorlog-mode: 600
nsslapd-errorlog-logexpirationtime: 1
nsslapd-errorlog-logrotationsync-enabled: off
nsslapd-errorlog-logrotationsynchour: 0
nsslapd-errorlog-logrotationsyncmin: 0
nsslapd-errorlog-logrotationtime: 1
nsslapd-errorlog-logmaxdiskspace: 5120
nsslapd-errorlog: /var/log/dirsrv/slapd-jennyv2/errors
nsslapd-errorlog-logexpirationtimeunit: month
nsslapd-errorlog-logrotationtimeunit: week
nsslapd-errorlog-maxlogsize: 3072
nsslapd-errorlog-logminfreediskspace: 5
nsslapd-errorlog-maxlogsperdir: 5
nsslapd-errorlog-list: /var/log/dirsrv/slapd-jennyv2/errors.20090406-121657

log size:

total 4.9G
drwxrwx--- 2 nobody nobody 4.0K Apr 15 09:15 .
drwxr-xr-x 4 root   root   4.0K Apr  6 12:17 ..
-rw------- 1 nobody nobody 1.3K Apr 15 09:16 access
-rw------- 1 nobody root    42M Apr  7 12:16 access.20090406-121657
-rw------- 1 nobody nobody  29M Apr  8 12:16 access.20090407-121704
-rw------- 1 nobody nobody  27M Apr  9 12:04 access.20090409-161539
-rw------- 1 nobody nobody  19M Apr 14 08:46 access.20090413-084651
-rw------- 1 nobody nobody 7.9M Apr 14 22:01 access.20090414-084702
-rw------- 1 nobody root    579 Apr 15 09:15 access.rotationinfo
-rw------- 1 nobody root      0 Apr  6 12:16 audit
-rw------- 1 nobody root     63 Apr  6 12:16 audit.rotationinfo
-rw------- 1 nobody nobody 4.5G Apr 15 09:17 errors
-rw------- 1 nobody root   367M Apr 13 12:16 errors.20090406-121657
-rw------- 1 nobody root    167 Apr 13 12:16 errors.rotationinfo

rotation info:

LOGINFO:Log file created at: 13/Apr/2009:12:16:57 (1239639417)
LOGINFO:Previous Log File:/var/log/dirsrv/slapd-jennyv2/errors.20090406-121657 (1239034617) (384088635)

Comment 11 Noriko Hosoi 2009-04-15 20:50:35 UTC
There was another bug in it. 

The NSPR function to get the file size "PR_GetOpenFileInfo" was just for the file size less than 2GB.  I should have used "PR_GetOpenFileInfo64".  The file size returned from PR_GetOpenFileInfo for the error log file (>= 5GB for now) is wrong.  (see info.size == 23675)

[log.c]
2890    static PRInt64
2891    log__getfilesize(LOGFD fp)
2892    {
2893        PRFileInfo      info;
2894    
2895        if (PR_GetOpenFileInfo (fp, &info) == PR_FAILURE) {
2896            return -1;
2897        }
2898        return (PRInt64)info.size;    /* type of size is off_t */
2899    }

(gdb) p info
$13 = {type = 81613421, size = 23675, creationTime = -5191785467020775488,
  modifyTime = 6957129841574116}

http://www.mozilla.org/projects/nspr/reference/html/priofnc.html#18034
PR_GetOpenFileInfo64
Gets an open file's information. File size is expressed as a 64-bit integer.

Comment 12 Rich Megginson 2009-04-15 21:28:11 UTC
I think we should fix this in 8.1 if it's not going to cause too big a slip to the schedule.  A customer is likely to run into this.

Comment 13 Noriko Hosoi 2009-04-16 00:11:24 UTC
Created attachment 339766 [details]
cvs diff ldapserver/ldap/servers/slapd/log.c

Description: 
1) replaced PR_GetOpenFileInfo with PR_GetOpenFileInfo64 in log_getfilesize [log.c].  PR_GetOpenFileInfo does not return the correct file size if the size is larger than 2GB.
2) when a rotation info file is missing and recreated, the file size stored in the file was not correct.
3) rotated file name is created with the time stamp when rotated.  The reverse conversion function had a problem and the file name in the rotation info and the real one could mismatch.

Tested on jennyv2 using the Jenny's test environment.  (Thanks, Jenny!)

# ls -lt errors*
-rw------- 1 nobody nobody        368 Apr 15 20:00 errors
-rw------- 1 nobody root          168 Apr 15 20:00 errors.rotationinfo
-rw------- 1 nobody nobody 5427396655 Apr 15 16:36 errors.20090415-200025
# cat errors.rotationinfo 
LOGINFO:Log file created at: 15/Apr/2009:20:00:26 (1239840026)
LOGINFO:Previous Log File:/var/log/dirsrv/slapd-jennyv2/errors.20090415-200025 (1239840025) (5427396655)

Comment 14 Rich Megginson 2009-04-16 01:39:06 UTC
Why does mktime return 1 hour ahead?  Do we have a timezone/daylight savings time problem?

Comment 15 Noriko Hosoi 2009-04-16 05:59:33 UTC
I really have no idea why it happens.  This is a sample program to show the problem: the original time string "20090406-231657" is converted to timet by strptime, then the timet is converted back by strftime.  Without "timet -= 60*60;", the converted-back string won't match the original time string.  Do you think I'm doing something wrong?
===========================================
#include <stdio.h>
#include <time.h>

main()
{
	struct tm tm = {0};
	struct tm *tmsp = &tm;
	char tbuf[1024];
	char *str = "20090406-231657";
	time_t timet = 0;

	strptime(str, "%Y%m%d-%H%M%S", &tm);

	timet = mktime(&tm);
	printf("time_t: %d\n", timet);
	timet -= 60*60;

	localtime_r(&timet, &tm);

	strftime (tbuf, 1024, "%Y%m%d-%H%M%S", tmsp);

	printf("%s ==> %s\n", str, tbuf);
}
===========================================
With "timet -= 60*60;"
$ ./strftime 
time_t: 1239088617
20090406-231657 ==> 20090406-231657

Without "timet -= 60*60;"
$ strftime
time_t: 1239088617
20090406-231657 ==> 20090407-001657

Comment 16 Rich Megginson 2009-04-16 13:07:53 UTC
I'm not sure what the problem is.  I just want to make sure that, come November and Daylight Savings Time ends, this code still works correctly.

Comment 17 Noriko Hosoi 2009-04-16 15:48:53 UTC
Is this test good enough to verify it?

I changed the time to 2009/12/31 and timezone EST.  It returns the same time string with "timet -= 60*60;"

[nhosoi@kiki-rhel5 tmp]$ date
Thu Dec 31 15:04:01 EST 2009
[nhosoi@kiki-rhel5 tmp]$ ./strptime
time_t: 1239077817
20090406-231657 ==> 20090406-231657

Comment 18 Rich Megginson 2009-04-16 15:56:07 UTC
Ok.  It's just weird that it returns 1 hour off.

Comment 19 Noriko Hosoi 2009-04-16 16:08:40 UTC
Thanks, Rich!  I'm going to add some more comments to the line.

I also tested on F9, RHEL4, RHEL5, and Solaris9 and got the same result.  (So, it should be a spec, not a bug... :)

Comment 20 Noriko Hosoi 2009-04-16 16:47:43 UTC
Created attachment 339870 [details]
revised cvs diff ldapserver/ldap/servers/slapd/log.c

It turned out that the long format does not require the one hour time adjustment.  The revised code adjusts the time only for the short format.
==========================================
#include <stdio.h>
#include <time.h>

main()
{
    struct tm tm = {0};
    struct tm *tmsp = &tm;
    char tbuf[1024];
    char *str = "20090406-231657";
    char *lstr = "06/Apr/2009:23:16:57";
    time_t timet = 0;

    printf("short format\n");
    strptime(str, "%Y%m%d-%H%M%S", &tm);

    timet = mktime(&tm);
    timet -= 60*60;
    printf("time_t: %d\n", timet);

    localtime_r(&timet, &tm);

    strftime (tbuf, 1024, "%Y%m%d-%H%M%S", tmsp);

    printf("%s ==> %s\n", str, tbuf);

    printf("long format\n");
    strptime(lstr, "%d/%b/%Y:%H:%M:%S", &tm);

    timet = mktime(&tm);
    printf("time_t: %d\n", timet);
    /* timet -= 60*60; */

    localtime_r(&timet, &tm);

    strftime (tbuf, 1024, "%d/%b/%Y:%H:%M:%S", tmsp);

    printf("%s ==> %s\n", lstr, tbuf);
}
==========================================
$ ./strftime
short format
time_t: 1239085017
20090406231657 ==> 20090406231657
long format
time_t: 1239085017
06/Apr/2009:23:16:57 ==> 06/Apr/2009:23:16:57

Comment 21 Noriko Hosoi 2009-04-16 16:59:47 UTC
Steps to reproduce the problem.

Assuming the same configuration parameters in Comment #10 are set.

0. Shutdown the server
   cd /var/log/dirsrv/slapd-ID
1. Remove errors.rotationinfo and prepare a large errors file (> 3072MB; we don't care what are in it.)
ls -l errors*
-rw------- 1 nobody nobody 5427396655 Apr 15 16:36 errors
-rw------- 1 nobody nobody        727 Apr 16 12:55 errors.20090412-100000
-rw------- 1 nobody nobody        727 Apr 16 12:55 errors.20090413-200000
-rw------- 1 nobody nobody        727 Apr 16 12:53 errors.20090414-200020
2. Restart the server
3. ls -l errors*
-rw------- 1 nobody nobody        368 Apr 16 12:56 errors
-rw------- 1 nobody nobody        168 Apr 16 12:56 errors.rotationinfo
-rw------- 1 nobody nobody 5427396655 Apr 15 16:36 errors.20090416-125617
<== older error logs are removed.

# cat errors.rotationinfo 
LOGINFO:Log file created at: 16/Apr/2009:12:56:17 (1239900977)
LOGINFO:Previous Log File:/var/log/dirsrv/slapd-jennyv2/errors.20090416-125617 (1239900977) (5427396655)
<== The file name errors.20090416-125617 matches the one from 'ls'.

Comment 22 Noriko Hosoi 2009-04-16 17:04:57 UTC
Created attachment 339876 [details]
cvs commit message

Reviewed by Rich (Thank you!)

Checked in into CVS HEAD.

Comment 23 Noriko Hosoi 2009-04-16 19:18:43 UTC
Created attachment 339904 [details]
revised cvs diff ldapserver/ldap/servers/slapd/log.c

BIG thanks to Nathan to figure out the heart of the problem.

In log_reverse_convert_time, by initializing "struct tm" with NULLs:
struct tm tm = {0};
tm_isdst is also set to 0, which means no daylight saving.  mktime thinks when converting struct tm to time_t, use the knowledge "the time that the time_t represents is not in the daylight saving period".  Instead, we should have set "tm.tm_isdst = -1;".  That means, we don't have the knowledge, calculate it in mktime.

I also fixed a silly bug in generating a rotated log file name which I introduced in my previous checkin.

Comment 24 Noriko Hosoi 2009-04-16 20:12:14 UTC
Created attachment 339913 [details]
cvs commit message

Reviewed by Rich (Thank you!!)

Checked in into CVS HEAD.

Comment 25 Jenny Severance 2009-04-17 20:26:20 UTC
verified DS 8.1 - RHEL 5

[root@jennyv2 slapd-jennyv2]# cat errors.rotationinfo 
LOGINFO:Log file created at: 17/Apr/2009:16:28:57 (1240000137)
LOGINFO:Previous Log File:/var/log/dirsrv/slapd-jennyv2/errors.20090416-151415 (1239909255) (5427396655)
[root@jennyv2 slapd-jennyv2]# ls -l errors*
-rw------- 1 nobody root         3412 Apr 17 16:28 errors
-rw------- 1 nobody nobody 5427396655 Apr 15 16:36 errors.20090416-151415
-rw------- 1 nobody root          168 Apr 17 16:28 errors.rotationinfo

Comment 26 Fedora Update System 2009-04-28 16:16:37 UTC
fedora-ds-base-1.2.0-4.fc9 has been submitted as an update for Fedora 9.
http://admin.fedoraproject.org/updates/fedora-ds-base-1.2.0-4.fc9

Comment 27 Fedora Update System 2009-04-28 16:18:06 UTC
fedora-ds-base-1.2.0-4.fc10 has been submitted as an update for Fedora 10.
http://admin.fedoraproject.org/updates/fedora-ds-base-1.2.0-4.fc10

Comment 28 Fedora Update System 2009-04-28 16:19:05 UTC
fedora-ds-base-1.2.0-4.fc11 has been submitted as an update for Fedora 11.
http://admin.fedoraproject.org/updates/fedora-ds-base-1.2.0-4.fc11

Comment 29 Chandrasekar Kannan 2009-04-29 23:08:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0455.html

Comment 30 Fedora Update System 2009-05-02 16:22:51 UTC
fedora-ds-base-1.2.0-4.fc9 has been pushed to the Fedora 9 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 31 Fedora Update System 2009-05-02 16:39:15 UTC
fedora-ds-base-1.2.0-4.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 32 Fedora Update System 2009-05-09 04:10:46 UTC
fedora-ds-base-1.2.0-4.fc11 has been pushed to the Fedora 11 stable repository.  If problems still persist, please make note of it in this bug report.