Bug 615378
| Field | Value |
|---|---|
| Summary | Ch 19 FAQ |
| Product | Red Hat Enterprise MRG |
| Component | Grid_User_Guide |
| Version | Development |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Target Milestone | 1.3 |
| Hardware | All |
| OS | Linux |
| Doc Type | Bug Fix |
| Reporter | Robert Rati <rrati> |
| Assignee | Lana Brindley <lbrindle> |
| QA Contact | Lubos Trilety <ltrilety> |
| CC | ltrilety, mhideo |
| Last Closed | 2010-10-14 20:12:37 UTC |
Description
Robert Rati
2010-07-16 16:16:00 UTC
(In reply to comment #0)
> Description of problem:
> Change all HOSTALLOW_ -> ALLOW_ && HOSTDENY_ -> DENY_

Done.

> Set USE_PROCD = FALSE in the startd configuration => STARTD.USE_PROCD = FALSE &
> STARTER.USE_PROCD = FALSE in the startd configuration.

<listitem>
  <para>
    Set <command>STARTD.USE_PROCD = FALSE</command> and <command>STARTER.USE_PROCD = FALSE</command> in the startd configuration. This is the most reliable way to handle the situation.
  </para>
</listitem>

> "The startd will always wait the value specified in the killing_timeout
> parameter before hard-killing the job" => The startd will always wait the value
> specified in the killing_timeout parameter before hard-killing the starter
>
> "However, the starter will always wait for the value specified in the
> killing_timeout-1 configuration variable before attempting to hard-kill the
> job" => However, by default the starter will wait killing_timeout-1 before
> attempting to hard-kill the job.

<para>
  When you try to kill a job with a custom signal, it can sometimes cause a race condition to occur between the starter and the startd. This happens when the startd communicates with the starter using <command>procd</command>. The startd will always wait the value specified in the <parameter>killing_timeout</parameter> parameter before hard-killing the starter. However, by default the starter will wait for the value specified in the <parameter>killing_timeout-1</parameter> configuration variable before attempting to hard-kill the job. This means that it is sometimes possible for the startd to be attempting to hard-kill the starter, while the starter is cleaning up and exiting. It causes the starter to stop communicating with the <command>procd</command>, which makes the startd suffer a communication failure, and then crash.
</para>

LKB

No HOSTALLOW/HOSTDENY in grid user guide.
Chapter was correctly changed.
>>> VERIFIED
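
For reference, the workaround described in the revised FAQ text can be sketched as the configuration fragment below. It is a minimal sketch that assumes the settings are placed in the local Condor configuration of the execute node running the startd. Which file that is depends on the installation and is not stated in this bug.

```
# Assumption: this goes in the local configuration of the execute node.
# Disable procd for both the startd and the starter, so the startd does not
# depend on procd while the starter is cleaning up and exiting.
STARTD.USE_PROCD = FALSE
STARTER.USE_PROCD = FALSE
```

Both settings are taken verbatim from the comment thread. The fragment deliberately leaves killing_timeout unchanged, since the thread recommends disabling procd rather than tuning the timeout.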