Bug 615378 - Ch 19 FAQ
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise MRG
Classification: Red Hat
Component: Grid_User_Guide
Version: Development
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: 1.3
Assigned To: Lana Brindley
QA Contact: Lubos Trilety
Reported: 2010-07-16 12:16 EDT by Robert Rati
Modified: 2013-10-23 19:17 EDT
Doc Type: Bug Fix
Last Closed: 2010-10-14 16:12:37 EDT
Description Robert Rati 2010-07-16 12:16:00 EDT
Description of problem:
Change all HOSTALLOW_ -> ALLOW_ && HOSTDENY_ -> DENY_

Set USE_PROCD = FALSE in the startd configuration => STARTD.USE_PROCD = FALSE & STARTER.USE_PROCD = FALSE in the startd configuration.

"The startd will always wait the value specified in the killing_timeout parameter before hard-killing the job" => The startd will always wait the value specified in the killing_timeout parameter before hard-killing the starter

"However, the starter will always wait for the value specified in the killing_timeout-1 configuration variable before attempting to hard-kill the job" => However, by default the starter will wait killing_timeout-1 before attempting to hard-kill the job.

Comment 1 Lana Brindley 2010-07-18 23:33:38 EDT
(In reply to comment #0)
> Description of problem:
> Change all HOSTALLOW_ -> ALLOW_ && HOSTDENY_ -> DENY_

Done.
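
For anyone applying the same rename to an existing configuration file, here is a minimal sketch of how it could be scripted. The function name, the sample settings, and the example.com hostnames are made up for illustration. It only rewrites setting names at the start of a line, so any references inside values would still need checking by hand.

```python
import re

# Match the legacy prefix at the start of a setting name, optionally preceded
# by a subsystem qualifier such as "STARTD." (handling that case is an
# assumption of this sketch, not something the bug report mentions).
_LEGACY = re.compile(r"^(\s*(?:\w+\.)?)HOST(ALLOW|DENY)_", re.MULTILINE)


def rename_legacy_host_settings(config_text: str) -> str:
    """Rewrite HOSTALLOW_* to ALLOW_* and HOSTDENY_* to DENY_*."""
    return _LEGACY.sub(r"\1\2_", config_text)


# Hypothetical sample configuration
sample = """\
HOSTALLOW_WRITE = *.example.com
HOSTDENY_READ = badhost.example.com
ALLOW_READ = *
"""

print(rename_legacy_host_settings(sample))
```

Settings that already use the ALLOW_ or DENY_ prefix are left unchanged, so running the helper more than once should be harmless.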

> 
> Set USE_PROCD = FALSE in the startd configuration => STARTD.USE_PROCD = FALSE &
> STARTER.USE_PROCD = FALSE in the startd configuration.

<listitem>
	<para>
		Set <command>STARTD.USE_PROCD = FALSE</command> and <command>STARTER.USE_PROCD = FALSE</command> in the startd configuration. This is the most reliable way to handle the situation.
	</para>
</listitem>

> 
> "The startd will always wait the value specified in the killing_timeout
> parameter before hard-killing the job" => The startd will always wait the value
> specified in the killing_timeout parameter before hard-killing the starter
> 
> "However, the starter will always wait for the value specified in the
> killing_timeout-1 configuration variable before attempting to hard-kill the
> job" => However, by default the starter will wait killing_timeout-1 before
> attempting to hard-kill the job.

<para>
	Killing a job with a custom signal can sometimes cause a race condition between the starter and the startd. This happens when the startd communicates with the starter using <command>procd</command>. The startd always waits for the time specified in the <parameter>killing_timeout</parameter> parameter before hard-killing the starter. However, by default the starter waits for <parameter>killing_timeout</parameter> minus one second before attempting to hard-kill the job. This means that the startd can sometimes attempt to hard-kill the starter while the starter is still cleaning up and exiting. When that happens, the starter stops communicating with the <command>procd</command>, and the startd suffers a communication failure and then crashes.
</para>
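
To make the timing in that paragraph concrete, here is a rough sketch of the arithmetic. It assumes the timeout is counted in whole seconds from the moment the kill is requested. The function name, the dictionary keys, and the value 30 are illustrative only and are not taken from the guide.

```python
def kill_deadlines(killing_timeout: int) -> dict:
    """Return when each daemon escalates to a hard kill.

    Times are seconds after the kill request (illustrative model only).
    """
    # By default the starter gives up on the job one second before the startd
    # gives up on the starter.
    starter_hard_kills_job = killing_timeout - 1
    startd_hard_kills_starter = killing_timeout
    return {
        "starter_hard_kills_job": starter_hard_kills_job,
        "startd_hard_kills_starter": startd_hard_kills_starter,
        # Time the starter has to clean up and exit before the startd acts.
        "starter_cleanup_window": startd_hard_kills_starter - starter_hard_kills_job,
    }


# Example value only; substitute the killing_timeout from your configuration.
print(kill_deadlines(30))
```

In this model the starter has a single second to finish cleaning up, which is why the two hard-kill attempts can overlap.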


LKB
Comment 2 Lubos Trilety 2010-09-21 04:29:29 EDT
No HOSTALLOW/HOSTDENY entries remain in the Grid User Guide.
Chapter was correctly changed.

>>> VERIFIED
