Bug 429198

Summary: [RFE] Include SLURM in HPC Layered offering
Product: Red Hat Enterprise Linux 6
Component: distribution
Version: 6.0
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Target Milestone: rc
Target Release: ---
Reporter: Issue Tracker <tao>
Assignee: RHEL Program Management <pm-rhel>
QA Contact: Brock Organ <borgan>
Docs Contact:
CC: cdmaestas, james.brown, jwest, notting, tao
Keywords: FutureFeature, Triaged
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-10-19 18:36:01 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 729785

Description Issue Tracker 2008-01-17 21:36:20 UTC
Escalated to Bugzilla from IssueTracker

Comment 1 Issue Tracker 2008-01-17 21:36:22 UTC
Include SLURM in HPC Layered offering
This event sent from IssueTracker by kbaxley  [LLNL (HPC)]
 issue 36716

Comment 2 Issue Tracker 2008-01-17 21:36:23 UTC
LLNL has been developing a GPL batch job scheduler. It is the software
that essentially glues the cluster together. It allows the cluster
administrator to carve up the cluster into separate sections, and to
order and prioritize jobs based upon cluster availability and use
policy. It also allows users to specify job requirements, so that the
administrator knows where and how to schedule each job on the cluster.
The batch scheduler also does the job of actually running the
executable on the various cluster machines, cleaning up after a job has
used up its allocation, and accounting for supercomputer usage.
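
For illustration, here is a minimal sketch of how a user might express
such job requirements when submitting work to SLURM. The script
contents, the resource values, and the ./my_app executable are all
hypothetical; the sketch assumes Python 3 and the standard sbatch
client on PATH.

import subprocess

batch_script = """#!/bin/sh
# Hypothetical requirements: a job name, an admin-defined partition
# (a "section" in the sense described above), a node count, and a
# wall-clock limit after which the allocation is reclaimed.
#SBATCH --job-name=demo
#SBATCH --partition=batch
#SBATCH --nodes=4
#SBATCH --time=00:30:00
srun ./my_app
"""

# sbatch accepts a job script on stdin and prints the new job id.
result = subprocess.run(["sbatch"], input=batch_script,
                        capture_output=True, text=True, check=True)
print(result.stdout.strip())  # e.g. "Submitted batch job 12345"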

There are a handful of batch schedulers out there which could be
incorporated into a RHEL4-based HPC offering. SLURM is the one that
LLNL is developing. It is GPL, which is a distinct advantage over other
schedulers. It is also being actively maintained, another advantage.


This event sent from IssueTracker by kbaxley  [LLNL (HPC)]
 issue 36716

Comment 3 Issue Tracker 2008-01-17 21:36:24 UTC
Date: Wed, 18 Jan 2006 11:37:49 -0800
From: Morris Jette <jette>
Subject: SLURM version 1.0 now available
X-Sender: jette.gov
To: linux-admin.gov
Message-id: <p06020417bff44934c4d2@[134.9.94.94]>
MIME-version: 1.0
Content-type: text/plain; format=flowed; charset=us-ascii
Content-transfer-encoding: 7BIT

SLURM v1.0 is available in the usual place. We plan for it
to be deployed as part of Chaos 3.1.


>We are pleased to announce the release of SLURM version 1.0.
>It finally contains all of the capabilities we had planned
>for at its inception in 2002, plus a few more. The major
>enhancements in version 1.0 include:
>
>* I/O streams for all tasks on a node are transmitted through a single
>   socket instead of distinct sockets for each task. This improves
>   performance and scalability.
>* Task affinity for binding tasks to CPUs.
>* Nodes can be in multiple partitions, providing more flexibility in
>   managing the SLURM partitions as queues.
>* Support for task communication/synchronization primitives (PMI).
>   This provides support for MPICH2 (see the sketch after this comment).
>* E-mail notification option on job state changes.
>* Better control over task distribution (hostfile support).
>* User control of job prioritization via nice option.
>* Web-based configuration file building tool.
>* Support to preempt/resume jobs. This means we can now support
>   gang scheduling.
>* Support for deferred job initiation time specification.
>* Compute node job shepherd is now an exec'ed program, slurmstepd.
>   This eliminates use of SysV shared memory and some problems in
>   support of the Native POSIX Thread Library.
>* Several bug fixes in interface with Maui Scheduler.
>* For Blue Gene systems: support for small bglblocks.
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Morris "Moe" Jette       jette1                 925-423-4856
Integrated Computational Resource Management Group   fax 925-423-6961
Livermore Computing            Lawrence Livermore National Laboratory
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++



This event sent from IssueTracker by kbaxley  [LLNL (HPC)]
 issue 36716
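
An aside on the PMI item in the release notes above: PMI is the process
management interface MPICH2 uses to bootstrap its ranks, which is what
lets SLURM launch MPI programs directly. A minimal sketch, assuming
mpi4py is installed and the script (hypothetical file name
hello_mpi.py) is launched with something like "srun -n 4 python
hello_mpi.py":

# hello_mpi.py -- hypothetical example; each task started by srun
# becomes one MPI rank, wired up through SLURM's PMI support.
from mpi4py import MPI

comm = MPI.COMM_WORLD
print("rank %d of %d" % (comm.Get_rank(), comm.Get_size()))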

Comment 4 Issue Tracker 2008-01-17 21:36:25 UTC
We should make this at least an option for MRG. Practically no one out
there takes Condor seriously for anything but cycle stealing on
workstations. It doesn't scale to large systems, and the latency of job
startup is too high. Furthermore, it doesn't get out of the way of the
job being started as well as SLURM does.




This event sent from IssueTracker by kbaxley  [LLNL (HPC)]
 issue 36716

Comment 9 Christopher D. Maestas 2008-11-12 03:46:57 UTC
Can we consider this as a possibility for EPEL?  I would be willing to work with the SLURM developers and Red Hat on packaging it appropriately.

Comment 13 RHEL Program Management 2011-07-06 00:56:45 UTC
This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated
in the current release, Red Hat is unfortunately unable to
address this request at this time. Red Hat invites you to
ask your support representative to propose this request, if
appropriate and relevant, in the next release of Red Hat
Enterprise Linux. If you would like it considered as an
exception in the current release, please ask your support
representative.