Bug 878111

Summary: ns-slapd segfaults if it cannot rename the logs
Product: Red Hat Enterprise Linux 6 Reporter: Najmuddin Chirammal <nc>
Component: 389-ds-base Assignee: mreynolds
Status: CLOSED ERRATA QA Contact: Sankar Ramalingam <sramling>
Severity: medium Docs Contact:
Priority: high    
Version: 6.4 CC: arubin, jgalipea, mreynolds, nhosoi, nkinder, tlavigne, yjog
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Known Issue
Doc Text:
The ns-slapd utility terminates unexpectedly if it cannot rename the dirsrv-<instance> log files in the /var/log/ directory due to incorrect permissions on the directory.
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-02-21 08:21:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 886216    

Description Najmuddin Chirammal 2012-11-19 16:55:23 UTC
Description of problem: 389-ds segfaults if it cannot rename the logs (due to wrong permission on the parent directory).

/etc/init.d/dirsrv start PKI-IPA
Starting dirsrv: 
ns-slapd: Failed to rename errors log file, Netscape Portable Runtime error -5966 (Access Denied.). Exiting...
ns-slapd: Failed to rename errors log file, Netscape Portable Runtime error -5966 (Access Denied.). Exiting...
rsyslogd-2177: imuxsock begins to drop messages from pid 3554 due to rate-limiting
kernel: ns-slapd[3554]: segfault at 7fffe5205adc ip 0000003aa02725b4 sp 00007fffe5205ac0 error 6 in libslapd.so.0.0.0[3aa0200000+f0000]

Version-Release number of selected component (if applicable): 389-ds-base-1.2.10.2-20.el6_3.x86_64

How reproducible: Always.

Steps to Reproduce:
1. chown root.root /var/log/dirsrv-<instance>
2. try to start slapd. 
  
Actual results: slapd segfaults

Expected results: No segfaults

Additional info: 
As I understand it, slapd fails and segfaults only if log rotation is required; otherwise it starts normally (provided the files under /var/log/dirsrv-<instance> are writable).
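
The failing operation here is a rename, and on Linux a rename needs write and search permission on the parent directory rather than on the log file itself, which is why chowning the directory to root is enough to trigger the error. A minimal sketch of that rule follows. The function name and the uid/gid values in the usage note are hypothetical, and it ignores ACLs, the sticky bit, capabilities, and SELinux.

```python
import stat


def may_rename_in(dir_mode, dir_uid, dir_gid, uid, gids):
    """Return True if a process with this uid/gids may rename entries in a directory.

    Simplified model of the classic permission bits only: rename(2) needs
    write + search (execute) permission on the parent directory.
    """
    if uid == 0:
        return True  # root bypasses the ordinary permission bits
    if uid == dir_uid:
        need = stat.S_IWUSR | stat.S_IXUSR
    elif dir_gid in gids:
        need = stat.S_IWGRP | stat.S_IXGRP
    else:
        need = stat.S_IWOTH | stat.S_IXOTH
    return (dir_mode & need) == need
```

For example, with the log directory owned by root:root in mode 0755 and the server running as a hypothetical uid/gid 99, `may_rename_in(0o755, 0, 0, 99, [99])` is False, so rotation cannot rename the log even if the log files themselves are writable.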

Comment 2 Nathan Kinder 2012-11-26 16:23:28 UTC
Upstream ticket:
https://fedorahosted.org/389/ticket/527

Comment 3 mreynolds 2012-11-27 22:11:32 UTC
I cannot reproduce the crash on 1.2.11.16.  I even ran chown root:root on all the log files, but the server changes them back to being owned by nobody after startup.  I made sure it rotated the access and error logs at startup, and it worked fine.

Are there any other details?  Who does your server run as?  What OS?

Comment 4 Najmuddin Chirammal 2012-11-28 05:59:15 UTC
 > Are there any other details?  Who does your server run as?  What OS?
It's an IPA server (1.2.10.2-20.el6_3, RHEL 6.3). As far as I can tell from the strace, it tries to rotate the logs after doing setuid (to the DS user) and gets permission denied; after many attempts to rename the log, it segfaults.

Comment 5 mreynolds 2012-11-28 17:39:48 UTC
I can not reproduce this on any version of 389.  Here is what I am doing:

[1]  Create a DS instance to run as user "dirman"
[2]  Turn on verbose logging to fill up the error log (above 1 MB)
[3]  Stop DS
[4]  Change the max log size to 1 MB - this will force rotation at startup
[5]  chown root:root /var/log/dirsrv/slapd-localhost
[6]  chown root:root /var/log/dirsrv/slapd-localhost/*
[7]  start ds:  service dirsrv start
[8]  No crash or segfault
[9]  Error log is properly rotated, and ownership is restored to "dirman" for all the files under /var/log/dirsrv/slapd-localhost
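
Steps [2] to [4] work by pushing the error log past its size limit so that rotation is due as soon as the server starts. A rough sketch of that kind of check is below. It is an assumption for illustration, not the server's actual logic: the function name, the parameters, and the size-or-age rule are all hypothetical.

```python
def rotation_due(size_bytes, age_seconds, max_size_mb, max_age_seconds):
    """Hypothetical rotation check: rotate when the log is too big or too old."""
    max_bytes = max_size_mb * 1024 * 1024
    return size_bytes >= max_bytes or age_seconds >= max_age_seconds
```

With a 1 MB limit, a 2 MB error log makes `rotation_due(2 * 1024 * 1024, 0, 1, 7 * 86400)` true, which is the state the steps above try to reach before starting the server.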

Can you try and reproduce this with the latest version of 389?  1.2.11.16

What does "getenforce" return?

Comment 6 Najmuddin Chirammal 2012-11-29 12:33:16 UTC
(In reply to comment #5)
> I can not reproduce this on any version of 389.  Here is what I am doing:

I managed to reproduce it on the same machine again; I'll attach the abrt file (core + sosreport).

> What does "getenforce" return?
# sestatus |grep Current
Current mode:                   permissive

> Can you try and reproduce this with the latest version of 389?  1.2.11.16

I'll try it on my 6.4 test machine and update you if I'm able to reproduce it.

Comment 8 mreynolds 2012-11-29 16:58:38 UTC
Also, can you provide your exact steps on how you reproduce?  What user you are when you start 389, what command do you use to start 389, etc.

Thanks!

Comment 9 Najmuddin Chirammal 2012-11-30 19:03:23 UTC
(In reply to comment #8)
> Also, can you provide your exact steps on how you reproduce?  
I chowned /var/log/dirsrv/slapd-<instance> to root.root and restarted the service (my error log was a week old, so it tried to rotate it during startup).

>What user you are when you start 389, 
Logged in as root (the DS instance is owned by pkiuser, IIRC).

> what command do you use to start 389, etc.

service dirsrv start reports FAIL, and ns-slapd <options> also failed, but it did not print any errors on the terminal; I had to use the -d option to see the error, or look in /var/log/messages.

I tried to reproduce it on a RHEL 6.4 build but was unable to. I'll update you if I'm able to reproduce it on new builds.

Comment 11 Sankar Ramalingam 2012-12-03 18:31:28 UTC
QA acked.

Comment 12 mreynolds 2012-12-06 16:48:20 UTC
Najmuddin,

Have you tried the latest version of 389 on 10.65.211.213?

I've installed the debug info package on your system for 1.2.10.2-20.  If you can still reproduce it I'd like you to get a stack trace:

Before you start the server (which will trigger the log rotation/crash):

# gdb /usr/sbin/ns-slapd
(gdb) set args -D /etc/dirsrv/slapd-INSTANCE -i /var/run/dirsrv/slapd-INSTANCE.pid -w /var/run/dirsrv/slapd-INSTANCE.startpid -d 0
(gdb) run

--> seg fault

(gdb) where

Send me this output, and leave the debugger as is.


Or....

Just get the instance ready to crash, and I will login, and run the debugger.  Just let me know which instance to start.

Thanks!
Mark

Comment 13 mreynolds 2012-12-06 17:46:27 UTC
Actually the instance was ready to crash, and I'm debugging it.  The crash is caused by a stack overflow.  Investigating...
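
The comment above does not say where the overflow comes from. One plausible mechanism, consistent with the repeated "Failed to rename errors log file" messages in the description, is an error path that logs the rename failure through the same logger that is trying to rotate, so each failure re-enters rotation. The sketch below illustrates that pattern and a re-entrancy guard. It is an assumption for illustration, not the confirmed 389-ds code path, and every name in it is hypothetical.

```python
class Logger:
    """Toy logger whose rename always fails, as with a root-owned log directory."""

    def __init__(self, guard):
        self.guard = guard          # enable the re-entrancy guard?
        self.rotating = False       # True while a rotation is in progress
        self.rename_attempts = 0
        self.written = []           # stand-in for the errors log
        self.emergency = []         # stand-in for stderr/syslog

    def rename_log(self):
        self.rename_attempts += 1
        return False                # simulate "Access Denied"

    def rotate(self):
        self.rotating = True
        try:
            if not self.rename_log():
                # Reporting the failure goes back through log_error().
                self.log_error("Failed to rename errors log file")
        finally:
            self.rotating = False

    def log_error(self, msg):
        if self.guard and self.rotating:
            # Already rotating: do not re-enter, use the emergency channel.
            self.emergency.append(msg)
            return
        self.rotate()               # every write first tries to rotate
        self.written.append(msg)
```

Without the guard, `Logger(guard=False).log_error("startup")` recurses until Python raises RecursionError, the interpreter's counterpart of a C stack overflow. With `guard=True` there is a single rename attempt, the failure goes to the emergency channel, and the original message is still written.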

Comment 15 Sankar Ramalingam 2013-01-29 10:36:58 UTC
Please add steps to verify this bug.

Comment 16 mreynolds 2013-01-29 14:28:30 UTC
I was never able to reproduce this on my own.  Najmuddin Chirammal was the only one who could reproduce it (and only on a particular system).  I'm not sure this is something that can be automated easily or consistently.  It is also a corner case, not a common issue.

Comment 17 Sankar Ramalingam 2013-01-30 11:26:37 UTC
Najmuddin,
    Please add steps to reproduce so that this bug can be verified, or mark it as Verified if you cannot reproduce it on your test system. I need your quick attention on this.

Comment 20 Sankar Ramalingam 2013-02-05 12:40:53 UTC
<nc> should I update and test? or you'll do it? 
<sankarr> you do it, plz
<nc> sankarr updated the packages
<nc> and I'm able to start the servers now
<nc> u may mark it as verified
<nc> I do not get verified option on this bz report (maybe due to current status?)
<nc> sankarr ^^
<sankarr> oh, sure
<sankarr> thanks


Hence marking the bug as verified.

Comment 21 errata-xmlrpc 2013-02-21 08:21:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0503.html