Bug 1493408 - abort regression jobs after a period of inactivity rather than a hard timeout of 360 minutes
Summary: abort regression jobs after a period of inactivity rather than a hard timeout...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: GlusterFS
Classification: Community
Component: project-infrastructure
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-09-20 06:50 UTC by Milind Changire
Modified: 2018-04-12 15:00 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-12 15:00:10 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Milind Changire 2017-09-20 06:50:44 UTC
Description of problem:
Sometimes regression jobs hang well before the 360-minute timeout is reached.
Sometimes the regressions just run slowly.

Expected results:
It would help to abort a regression job before the 360-minute limit, and reboot the node to make way for other jobs, if there has been 15 minutes of inactivity on STDOUT.

This would abort hung jobs early instead of wasting time until the hard timeout.
Slow-running regression jobs that keep producing output could then continue and complete the run, rather than being aborted and rerun for another 360 minutes.

Comment 1 Nigel Babu 2017-09-20 07:09:17 UTC
We currently use type: absolute for timeout (see http://git.gluster.org/cgit/build-jobs.git/tree/build-gluster-org/jobs/centos6-regression.yml#n60)

There's also a type: no-activity, which aborts the job once there has been no activity for the defined timeout. If we set that to 900 (15 mins), it should potentially work.
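
To make the difference between the two timeout types concrete, here is a minimal Python sketch of an inactivity watchdog. It is only an illustration of the idea, not how Jenkins or the build-jobs configuration implements it. The function name run_with_inactivity_timeout and its parameters idle_limit, hard_limit and poll are hypothetical names chosen for this example.

```python
import subprocess
import threading
import time


def run_with_inactivity_timeout(cmd, idle_limit, hard_limit=None, poll=0.1):
    """Run cmd and kill it if its output stays silent for idle_limit seconds.

    hard_limit (optional) is an absolute cap in seconds, like the existing
    360-minute timeout. Returns (status, lines), where status is one of
    "completed", "idle-timeout" or "hard-timeout".
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    lines = []
    # One-element list so the reader thread can update the timestamp in place.
    last_output = [time.monotonic()]

    def reader():
        # Every line of output counts as activity and resets the idle clock.
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
            last_output[0] = time.monotonic()

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    start = time.monotonic()
    status = "completed"
    while proc.poll() is None:
        now = time.monotonic()
        if now - last_output[0] > idle_limit:
            status = "idle-timeout"  # hung job: no output for too long
            proc.kill()
            break
        if hard_limit is not None and now - start > hard_limit:
            status = "hard-timeout"  # absolute cap reached
            proc.kill()
            break
        time.sleep(poll)

    proc.wait()
    thread.join(timeout=1)
    return status, lines
```

With idle_limit=900 and hard_limit=None, a job that prints nothing for 15 minutes would be stopped, while a slow job that keeps printing would be allowed to finish. Rebooting the node after an abort is outside the scope of this sketch.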

Comment 2 Nigel Babu 2018-04-12 15:00:10 UTC
This is now fixed thanks to Amar's per-patch timeout.

