Bug 1150225
| Field | Value |
|---|---|
| Summary | dirsrv not running after restraint job restarts it |
| Product | Red Hat Enterprise Linux 7 |
| Component | 389-ds-base |
| Version | 7.0 |
| Status | CLOSED WONTFIX |
| Severity | unspecified |
| Priority | unspecified |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Scott Poore <spoore> |
| Assignee | Noriko Hosoi <nhosoi> |
| QA Contact | Viktor Ashirov <vashirov> |
| CC | bpeck, jgalipea, nkinder, rmeggins |
| Doc Type | Bug Fix |
| Type | Bug |
| Last Closed | 2015-04-17 17:08:19 UTC |

Description
Scott Poore
2014-10-07 17:45:57 UTC
Created attachment 944702: restart test rpm
needed for testing restraint

Created attachment 944704: strace output
This strace output is from a RHEL6.5 test case. I had issues capturing output with strace against restraintd on RHEL6.6.
Hi Scott,

Could you please attach the Directory Server logs (/var/log/dirsrv/slapd-ID/{access,errors}) to the bug?

I'm wondering whether "not running" means 1) the server was not launched, 2) the server was shut down, or 3) the server died?

Thanks,
--noriko

(In reply to Noriko Hosoi from comment #4)
> Hi Scott,
>
> Could you please attach the Directory Server logs
> (/var/log/dirsrv/slapd-ID/{access,errors}) to the bug?
>
> I'm wondering "not running" means 1) the server was not launched, 2) the
> server was shutdown, or 3) the server died?
>
> Thanks,
> --noriko

Never mind... "what stopped dirsrv" already talked about it (and more).

Right, and I don't really see much in the logs other than that it's being stopped:

    [07/Oct/2014:12:35:49 -0500] - slapd shutting down - signaling operation threads
    [07/Oct/2014:12:35:49 -0500] - slapd shutting down - closing down internal subsystems and plugins
    [07/Oct/2014:12:35:49 -0500] - Waiting for 4 database threads to stop
    [07/Oct/2014:12:35:50 -0500] - All database threads now stopped
    [07/Oct/2014:12:35:50 -0500] - slapd stopped.

A summary of findings:

restraintd runs "make run" to execute runtest.sh, which runs the ipactl commands. When that finishes, ns-slapd is hit with a SIGHUP.

The following worked to prevent dirsrv from being stopped when restraint finished the task:

    setsid nohup ipactl start

It should be noted that in the original failure that led here, ipactl start was being run, but only dirsrv was down afterwards. I just removed ipactl from the reproducer to try to simplify things.

FYI, more info and maybe an easier test.

After some troubleshooting with Bill this afternoon, we got a workaround that replaced an exec in one of the restraint plugins with setsid to handle execution. This seemed to detach the process so it didn't get killed when the restraint task ended.

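The effect of the setsid workaround can be checked with a small shell sketch. This is only an illustration under assumptions: start_detached and session_of are hypothetical helper names (not part of restraint, ipactl, or 389-ds-base), "sleep 5" stands in for the real command, and the session ID is read from /proc, so it only works on Linux.

```shell
#!/bin/sh
# Hypothetical helper: start a command in its own session, so that a SIGHUP
# sent to the caller's session when it ends does not reach the command.
start_detached() {
    # setsid: run the command in a new session.
    # nohup:  additionally make the command ignore SIGHUP.
    setsid nohup "$@" >/dev/null 2>&1 </dev/null &
}

# Hypothetical helper: print the session ID of a PID (Linux /proc only).
session_of() {
    stat=$(cat "/proc/$1/stat") || return 1
    rest=${stat##*) }   # drop "pid (comm) "; left: state ppid pgrp session ...
    set -- $rest
    echo "$4"
}

# Stand-in for a long-running service start such as "ipactl start".
start_detached sleep 5
child=$!
sleep 1   # give setsid time to create the new session

echo "shell session: $(session_of $$)"
echo "child session: $(session_of "$child")"

kill "$child" 2>/dev/null   # clean up the stand-in process
```

If the two session IDs differ, the child is no longer part of the caller's session, which matches the observation that the detached process survived the end of the restraint task.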
Now, looking at the execution via restraint, I think a simple reproducer is just to ssh to the host and run:

    service dirsrv stop
    exec service dirsrv start

At this point you're logged out, as expected. But when I log back in, it's not running.

Or, to confirm it's still up before being logged out:

    cat > exectest <<EOF
    service dirsrv start
    service dirsrv status
    EOF
    chmod 755 exectest
    service dirsrv stop
    exec ./exectest

Then log back in and check.

Upstream ticket: https://fedorahosted.org/389/ticket/48014

Per 389-ds-base ticket triage, put to post 1.3.5.

I cannot figure out how to disable setting the target version... This bug is supposed to be set post rhel-7.3.0 (7.4.0?), but that's not allowed. Setting 7.3.0 for now.