Bug 703525

Summary:	KILL policy overridden by signal escalation
Product:	Red Hat Enterprise MRG	Reporter:	Robert Rati <rrati>
Component:	condor	Assignee:	Robert Rati <rrati>
Status:	CLOSED ERRATA	QA Contact:	Lubos Trilety <ltrilety>
Severity:	unspecified	Docs Contact:
Priority:	high
Version:	Development	CC:	iboverma, jneedle, ltoscano, ltrilety, matt, willb
Target Milestone:	2.0	Keywords:	Regression
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	condor-7.6.1-0.5	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2011-06-27 14:34:32 UTC	Type:	---

Description Robert Rati 2011-05-10 15:18:06 UTC

Description of problem:
When a job is preempted, the starter sets the signal escalation timer and waits up to 30 seconds before hard-killing the job.  It should not do this.  In a graceful shutdown (such as a preempt), the starter should send the appropriate signal and then leave further handling to outside policy, such as the KILL policy named in the summary.

See gt#2142 for details.
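
The difference between the two behaviours can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the starter's actual code: the function names, the use of SIGTERM as the soft signal, and the helper that spawns a job ignoring that signal are all hypothetical. The first function mirrors the reported behaviour (soft signal, timer, hard kill); the second mirrors the expected behaviour (soft signal only, with any hard kill left to outside policy).

```python
import signal
import subprocess
import sys


def shutdown_with_escalation(proc, soft_signal=signal.SIGTERM, timeout=30.0):
    """Reported behaviour: send the soft signal, then hard-kill on timeout."""
    proc.send_signal(soft_signal)
    try:
        proc.wait(timeout=timeout)
        return "exited"
    except subprocess.TimeoutExpired:
        # The escalation timer fired: the job is killed regardless of policy.
        proc.kill()
        proc.wait()
        return "hard-killed"


def shutdown_graceful(proc, soft_signal=signal.SIGTERM):
    """Expected behaviour: send the soft signal and set no escalation timer."""
    proc.send_signal(soft_signal)
    return "signalled"


def spawn_term_ignoring_child():
    """Hypothetical stand-in for a job that ignores the soft signal (POSIX only)."""
    code = (
        "import signal, time\n"
        "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
        "print('ready', flush=True)\n"
        "time.sleep(60)\n"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", code], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # block until the child has installed its handler
    return proc
```

With a job that ignores the soft signal, the first function ends the job once its timeout expires, while the second leaves it running until something else (in this bug's terms, the configured policy) decides to kill it.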

Comment 1 Robert Rati 2011-05-10 15:18:35 UTC

Fixed upstream on the V7_6-branch and master branches.

Comment 8 Lubos Trilety 2011-05-18 13:59:50 UTC

Successfully reproduced on:
$CondorVersion: 7.6.1 Apr 27 2011 BuildID: RH-7.6.1-0.4.el6 $
$CondorPlatform: I686-RedHat_6.0 $

config:
PREEMPT = $(ActivationTimer) > 60

# echo -e "cmd=/root/test.sh\nargs=600\nqueue" | runuser condor -s /bin/bash -c condor_submit
Submitting job(s).
1 job(s) submitted to cluster 1.

# condor_q
-- Submitter: hostname : <IP:56466> : hostname
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   1.0   condor          5/18 15:53   0+00:01:36 I  0   4.9  test.sh 600       
1 jobs; 1 idle, 0 running, 0 held

# ps -eaf | grep test
#

Comment 9 Lubos Trilety 2011-05-18 14:08:00 UTC

Tested on:
$CondorVersion: 7.6.1 May 17 2011 BuildID: RH-7.6.1-0.5.el5 $
$CondorPlatform: I686-RedHat_5.6 $

$CondorVersion: 7.6.1 May 17 2011 BuildID: RH-7.6.1-0.5.el5 $
$CondorPlatform: X86_64-RedHat_5.6 $

$CondorVersion: 7.6.1 May 17 2011 BuildID: RH-7.6.1-0.5.el6 $
$CondorPlatform: I686-RedHat_6.0 $

$CondorVersion: 7.6.1 May 17 2011 BuildID: RH-7.6.1-0.5.el6 $
$CondorPlatform: X86_64-RedHat_6.0 $


config:
PREEMPT = $(ActivationTimer) > 60

# echo -e "cmd=/root/test.sh\nargs=600\nqueue" | runuser condor -s /bin/bash -c condor_submit
Submitting job(s).
1 job(s) submitted to cluster 1.

# condor_q
-- Submitter: hostname : <IP:55720> : hostname
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD               
   1.0   condor          5/18 16:02   0+00:02:09 R  0   0.0  test.sh 1d        
1 jobs; 0 idle, 1 running, 0 held

# ps -eaf | grep test
condor   13684 13683  0 16:02 ?        00:00:00 /bin/bash /root/test.sh 1d
#

>>> VERIFIED