Bug 718273 - Need policy for gridengine mpi jobs
Summary: Need policy for gridengine mpi jobs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: selinux-policy
Version: 6.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Miroslav Grepl
QA Contact: Milos Malik
URL:
Whiteboard:
Duplicates: 729361
Depends On:
Blocks:
 
Reported: 2011-07-01 15:57 UTC by Orion Poplawski
Modified: 2012-06-20 12:24 UTC
CC List: 4 users

Fixed In Version: selinux-policy-3.7.19-138.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-20 12:24:27 UTC
Target Upstream Version:
Embargoed:


Attachments
Initial sge_execd policy (927 bytes, application/x-compressed-tar), 2011-07-25 12:57 UTC, Miroslav Grepl
sge_execd_t denials (16.27 KB, text/plain), 2011-07-26 17:42 UTC, Orion Poplawski
sge (283.32 KB, text/plain), 2011-07-26 18:27 UTC, Orion Poplawski
Updated sge_execd policy (1.44 KB, application/x-compressed-tar), 2011-07-28 11:41 UTC, Miroslav Grepl
sge_execd_t denials (39.98 KB, text/plain), 2011-07-28 15:38 UTC, Orion Poplawski
Updated sge_* policy (1.50 KB, application/x-compressed-tar), 2011-07-29 09:38 UTC, Miroslav Grepl
sge_execd_t denials (137.49 KB, text/plain), 2011-08-01 22:42 UTC, Orion Poplawski
Updated policy (1.66 KB, application/x-compressed-tar), 2011-08-05 09:37 UTC, Miroslav Grepl
sge denials (205.42 KB, text/x-log), 2011-08-10 17:51 UTC, Orion Poplawski
Updated policy (3.51 KB, application/x-compressed-tar), 2011-08-24 08:16 UTC, Miroslav Grepl
sge denials (40.60 KB, text/plain), 2011-08-24 19:50 UTC, Orion Poplawski
sge policy with sge_job_t domain (1.81 KB, application/x-compressed-tar), 2011-09-06 12:20 UTC, Miroslav Grepl
sge denials (36.19 KB, text/plain), 2011-09-06 16:01 UTC, Orion Poplawski
sge policy (2.04 KB, application/x-compressed-tar), 2011-09-08 13:17 UTC, Miroslav Grepl
sge denials (21.27 KB, text/plain), 2011-09-08 20:05 UTC, Orion Poplawski
sge policy (2.03 KB, application/x-compressed-tar), 2011-09-13 11:25 UTC, Miroslav Grepl
sge denials (27.66 KB, text/plain), 2011-09-13 21:16 UTC, Orion Poplawski
redesigned sge policy (2.04 KB, application/x-compressed-tar), 2011-12-08 16:07 UTC, Miroslav Grepl
sge denials (6.54 KB, text/plain), 2011-12-09 21:14 UTC, Orion Poplawski
sge denials (22.68 KB, text/plain), 2012-05-17 16:54 UTC, Orion Poplawski


Links
System ID: Red Hat Product Errata RHBA-2012:0780
Priority: normal
Status: SHIPPED_LIVE
Summary: selinux-policy bug fix and enhancement update
Last Updated: 2012-06-19 20:34:59 UTC

Description Orion Poplawski 2011-07-01 15:57:47 UTC
Description of problem:

The gridengine sge_execd process starts sshd processes, running as the submitting user and listening on a given port, to allow remote execution of MPI jobs.

The process tree on the remote machine looks like:

system_u:system_r:initrc_t:s0   sgeadmin  2197     1  0 09:27 ?        00:00:00 /usr/bin/sge_execd
system_u:system_r:initrc_t:s0   sgeadmin  3052  2197  0 09:43 ?        00:00:00 sge_shepherd-21428 -bg
system_u:system_r:sshd_t:s0-s0:c0.c1023 root 3053 3052  0 09:43 ?      00:00:00 sshd: steph [priv]
system_u:system_r:sshd_t:s0-s0:c0.c1023 steph 3060 3053  0 09:43 ?     00:00:00 sshd: steph@notty
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3061 3060  0 09:43 ? 00:00:00 /usr/share/gridengine/utilbin/lx26-amd64/qrsh_starter /var/spool/gridengine/andrew/active_jobs/21428.1/1.andrew
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3082 3061  0 09:43 ? 00:00:00 tcsh -c  orted -mca ess env -mca orte_ess_jobid 1551040512 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri "1551040512.0;tcp://10.10.40.3:59368;tcp://192.168.1.136:59368"
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3099 3082  0 09:43 ? 00:00:00 orted -mca ess env -mca orte_ess_jobid 1551040512 -mca orte_ess_vpid 1 -mca orte_ess_num_procs 2 --hnp-uri 1551040512.0;tcp://10.10.40.3:59368;tcp://192.168.1.136:59368
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3100 3099 99 09:43 ? 00:10:22 /data/cora5/steph/WRFV3_HDIABATIC/run/wrf.exe
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3101 3099 99 09:43 ? 00:10:22 /data/cora5/steph/WRFV3_HDIABATIC/run/wrf.exe
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3102 3099 99 09:43 ? 00:10:22 /data/cora5/steph/WRFV3_HDIABATIC/run/wrf.exe
unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 steph 3103 3099 99 09:43 ? 00:10:22 /data/cora5/steph/WRFV3_HDIABATIC/run/wrf.exe

This fails with SELinux enforcing, but works in permissive mode.  Audit denials:

type=AVC msg=audit(1309534987.668:52): avc:  denied  { getattr } for  pid=3053 comm="sshd" laddr=10.10.40.4 lport=43289 faddr=10.10.40.3 fport=36754 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket
type=AVC msg=audit(1309534987.668:53): avc:  denied  { setopt } for  pid=3053 comm="sshd" laddr=10.10.40.4 lport=43289 faddr=10.10.40.3 fport=36754 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket
type=AVC msg=audit(1309534987.668:54): avc:  denied  { getopt } for  pid=3053 comm="sshd" laddr=10.10.40.4 lport=43289 faddr=10.10.40.3 fport=36754 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket
type=AVC msg=audit(1309534987.913:59): avc:  denied  { ioctl } for  pid=3053 comm="sshd" path="socket:[22376]" dev=sockfs ino=22376 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket
type=AVC msg=audit(1309534987.938:64): avc:  denied  { getattr } for  pid=3060 comm="sshd" laddr=10.10.40.4 lport=43289 faddr=10.10.40.3 fport=36754 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket
type=AVC msg=audit(1309534987.938:65): avc:  denied  { getopt } for  pid=3060 comm="sshd" laddr=10.10.40.4 lport=43289 faddr=10.10.40.3 fport=36754 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket
type=AVC msg=audit(1309534987.938:66): avc:  denied  { setopt } for  pid=3060 comm="sshd" laddr=10.10.40.4 lport=43289 faddr=10.10.40.3 fport=36754 scontext=system_u:system_r:sshd_t:s0-s0:c0.c1023 tcontext=system_u:system_r:initrc_t:s0 tclass=tcp_socket

Version-Release number of selected component (if applicable):
selinux-policy-3.7.19-54.el6_0.5.noarch

As the gridengine packager, I'd like to get this fixed so we can run our compute nodes with SELinux enabled.
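
For reference, until a proper policy module exists, a site-local module generated from the logged denials can unblock testing; a minimal sketch, assuming auditd logs to the default location (the module name "sgelocal" is arbitrary):

# grep avc /var/log/audit/audit.log | audit2allow -M sgelocal
# semodule -i sgelocal.pp

This only codifies whatever was denied, so it is a stopgap rather than a real fix.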

Comment 2 Miroslav Grepl 2011-07-11 06:52:46 UTC
Orion,
AFAIK we have the same bug in Fedora as well. Can we get this working in Fedora first?

I will send you an initial policy for testing.

Comment 3 Orion Poplawski 2011-07-21 19:30:56 UTC
I'm fine with fixing it in Fedora first. Do you have something to test?

Comment 4 Miroslav Grepl 2011-07-25 12:57:36 UTC
Created attachment 515036 [details]
Initial sge_execd policy

Here is the initial policy.

tar xvf /tmp/sge_policy.tgz
cd /tmp/
sh sge_execd.sh

echo "-w /etc/shadow -p wa" >> /etc/audit/audit.rules
service auditd restart

service sge_execd restart


And start collecting AVCs.
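
A sketch of one way to gather them, assuming auditd is running with its default log location:

# ausearch -m avc -ts recent
# grep avc /var/log/audit/audit.log > /tmp/sge-denials.txt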

Comment 5 Orion Poplawski 2011-07-25 16:29:42 UTC
Thanks.  Do I need to run this on Fedora? Or can I test on EL6?

Comment 6 Daniel Walsh 2011-07-25 18:40:38 UTC
If it will install on RHEL6, you could test it there.

Comment 7 Orion Poplawski 2011-07-26 17:42:21 UTC
Created attachment 515332 [details]
sge_execd_t denials

Okay, running on EL6

Some notes:

type=AVC msg=audit(1311701740.164:40): avc:  denied  { read } for  pid=2912 comm="sge_execd" name="bootstrap" dev=dm-0 ino=24169 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:usr_t:s0 tclass=file
type=AVC msg=audit(1311701740.164:40): avc:  denied  { open } for  pid=2912 comm="sge_execd" name="bootstrap" dev=dm-0 ino=24169 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:usr_t:s0 tclass=file
type=AVC msg=audit(1311701740.164:41): avc:  denied  { getattr } for  pid=2912 comm="sge_execd" path="/usr/share/gridengine/default/common/bootstrap" dev=dm-0 ino=24169 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:usr_t:s0 tclass=file


A number of config files and utility programs are in /usr/share/gridengine.


type=AVC msg=audit(1311701742.173:54): avc:  denied  { write } for  pid=2914 comm="sge_execd" name="/" dev=tmpfs ino=11897 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:tmp_t:s0 tclass=dir
type=AVC msg=audit(1311701742.173:54): avc:  denied  { add_name } for  pid=2914 comm="sge_execd" name="execd_messages.2912" scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:tmp_t:s0 tclass=dir
type=AVC msg=audit(1311701742.173:54): avc:  denied  { create } for  pid=2914 comm="sge_execd" name="execd_messages.2912" scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=unconfined_u:object_r:tmp_t:s0 tclass=file

On startup, it writes a log file to /tmp, before shifting to /var/spool/gridengine/<hostname>/messages.  A number of files are written to /var/spool/gridengine/.
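
A rough sketch of how the /tmp messages file might eventually be handled with a private type in a refpolicy-style module; the type name and interfaces below are illustrative assumptions, not the actual test policy:

type sge_execd_tmp_t;
files_tmp_file(sge_execd_tmp_t)
# let sge_execd manage its own tmp files and create new files in /tmp with the private type
manage_files_pattern(sge_execd_t, sge_execd_tmp_t, sge_execd_tmp_t)
files_tmp_filetrans(sge_execd_t, sge_execd_tmp_t, file)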


type=AVC msg=audit(1311701742.177:63): avc:  denied  { getattr } for  pid=2914 comm="sge_execd" path="/usr/libexec/gridengine/utilbin/cora.sh" dev=dm-0 ino=427746 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:bin_t:s0 tclass=file
type=AVC msg=audit(1311701742.178:64): avc:  denied  { read } for  pid=2920 comm="sge_execd" name="sh" dev=dm-0 ino=131016 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:bin_t:s0 tclass=lnk_file
type=AVC msg=audit(1311701742.178:64): avc:  denied  { execute } for  pid=2920 comm="sge_execd" name="bash" dev=dm-0 ino=131487 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:shell_exec_t:s0 tclass=file
type=AVC msg=audit(1311701742.178:64): avc:  denied  { read open } for  pid=2920 comm="sge_execd" name="bash" dev=dm-0 ino=131487 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:shell_exec_t:s0 tclass=file

This is a custom load sensor script that is started by the daemon.
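
Since the load sensor is a site-local script, one stopgap while the policy is being worked out is to label it as a generic binary (the path is taken from the denial above):

# chcon -t bin_t /usr/libexec/gridengine/utilbin/cora.sh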

Comment 8 Orion Poplawski 2011-07-26 18:27:56 UTC
Created attachment 515340 [details]
sge

Okay, this is the combined log from two machines running an MPI job that spans both machines. It includes all of the previous logs as well.

Comment 9 Miroslav Grepl 2011-07-27 07:37:45 UTC
Thanks. I will update the test policy.

Are you fine with the sge_execd_t domain name, or would you like a different one?

Comment 10 Orion Poplawski 2011-07-27 15:49:42 UTC
I think that name makes as much sense as any.

Comment 11 Miroslav Grepl 2011-07-28 11:41:25 UTC
Created attachment 515700 [details]
Updated sge_execd policy

tar xvf /tmp/sge_execd_policy.tgz
cd /tmp
sh sge_execd.sh

chcon -t bin_t PATHO/cora.sh

and re-test.

Comment 12 Orion Poplawski 2011-07-28 15:38:55 UTC
Created attachment 515752 [details]
sge_execd_t denials

Okay, here are the current denials.  Some more notes:

sge_execd starts an sge_shepherd process for each job it starts/manages on the host; the shepherd is responsible for starting and monitoring the job. I'm not sure whether it should get its own policy or not. It puts the output of the jobs wherever the user requests.

At some point it fires off the user's job, which could do lots of things. I'm not sure this can be adequately handled under an SELinux policy.

Comment 13 Miroslav Grepl 2011-07-28 16:11:16 UTC
Ah, could you also turn on a new boolean

# setsebool sge_execd_use_nfs on

It will remove some of the AVC msgs.

Yes, sge_shepherd could get its own type. I will re-write the policy.
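
For reference, roughly how an NFS boolean like sge_execd_use_nfs is typically wired up in refpolicy-style policy; the interfaces below are an assumption about the approach, not the actual rules in the test policy:

gen_tunable(sge_execd_use_nfs, false)

tunable_policy(`sge_execd_use_nfs',`
	fs_manage_nfs_dirs(sge_execd_t)
	fs_manage_nfs_files(sge_execd_t)
')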

Comment 14 Miroslav Grepl 2011-07-29 09:38:34 UTC
Created attachment 515863 [details]
Updated sge_* policy

tar xvf /tmp/sge_execd.tgz
cd /tmp
sh sge_execd.sh

chcon -t bin_t PATHO/cora.sh
chcon -t sge_shepherd_exec_t PATHO/sge_shepherd
setsebool sge_execd_use_nfs on

Comment 15 Orion Poplawski 2011-08-01 22:42:04 UTC
Created attachment 516220 [details]
sge_execd_t denials

Updated denials.

Comment 16 Miroslav Grepl 2011-08-05 09:37:05 UTC
Created attachment 516857 [details]
Updated policy

tar xvf /tmp/sge_execd.tgz
cd /tmp
sh sge_execd.sh

chcon -t bin_t PATHO/cora.sh
chcon -t sge_shepherd_exec_t PATHO/sge_shepherd
setsebool sge_execd_use_nfs on

Comment 18 Orion Poplawski 2011-08-10 17:51:24 UTC
Created attachment 517666 [details]
sge denials

Here you go.

Comment 19 Miroslav Grepl 2011-08-11 11:15:37 UTC
OK, I am fixing them. I see

allow sge_shepherd_t self:process { execmem setfscreate };

Which context are you setting using setfscreate?

Comment 21 Orion Poplawski 2011-08-11 15:42:59 UTC
It appears to be coming from a mv command that is part of a user's job script:

type=AVC msg=audit(1312907603.742:1405): avc:  denied  { setfscreate } for  pid=5695 comm="mv" scontext=system_u:system_r:sge_shepherd_t:s0 tcontext=system_u:system_r:sge_shepherd_t:s0 tclass=process
type=AVC msg=audit(1312907603.787:1406): avc:  denied  { remove_name } for  pid=5695 comm="mv" name="azimuth.dat" dev=tmpfs ino=352738 scontext=system_u:system_r:sge_shepherd_t:s0 tcontext=system_u:object_r:sge_execd_tmp_t:s0 tclass=dir
type=AVC msg=audit(1312907603.787:1406): avc:  denied  { unlink } for  pid=5695 comm="mv" name="azimuth.dat" dev=tmpfs ino=352738 scontext=system_u:system_r:sge_shepherd_t:s0 tcontext=system_u:object_r:sge_execd_tmp_t:s0 tclass=file

In this particular case the job script is moving a file from /tmp (tmpfs) to an NFS directory.

Comment 22 Miroslav Grepl 2011-08-24 08:16:14 UTC
Created attachment 519573 [details]
Updated policy

I apologize for the delay. I was out.

Install the updated policy and also turn on the sge_execd_use_nfs and allow_ssh_keysign booleans.

# setsebool -P sge_execd_use_nfs 1
# setsebool -P allow_ssh_keysign 1

Comment 23 Orion Poplawski 2011-08-24 19:50:02 UTC
Created attachment 519703 [details]
sge denials

Here are the updated denials. When I started sge_execd this time, there were running jobs, so it tried to reconnect to them. I also ran my MPI job. I see the job still runs as sge_shepherd_t, which still seems wrong to me.

Comment 24 Miroslav Grepl 2011-08-25 20:29:24 UTC
Well, it looks like the sge_execd.te file is corrupted. I need to send you a new one since there were fixes for your AVC msgs.

Also, I thought the sge_shepherd_t domain was the domain for a job. See your comment #12.

Comment 25 Orion Poplawski 2011-08-25 20:38:31 UTC
sge_execd starts sge_shepherd which then starts the user's job.  I'm not sure how you are going to get around having the user's job run in an essentially unconfined domain (as it could do a lot of things), but I may be wrong there.  In any case I would have thought that the shepherd process would have its own domain to cover just its duties.  But we're severely stretching the limits of my SELinux knowledge at this point.

Comment 26 Miroslav Grepl 2011-08-26 15:45:31 UTC
We have

the sge_execd_t domain for sge_execd

and

the sge_shepherd_t domain for sge_shepherd.

So my idea is that users' jobs run in sge_shepherd_t. We also have the sge_shepherd_ssh_t domain to allow for remote execution of MPI jobs.

Could you give me an example of a user's job and its interaction with sge_shepherd?

Comment 27 Orion Poplawski 2011-08-29 16:05:49 UTC
I'm not quite sure what information you are looking for, but here is some basic info. sge_execd will start an sge_shepherd process for each job. sge_shepherd will then execute the user's job script which will be /var/spool/gridengine/<hostname>/job_scripts/<jobid>. The job must be a script of some kind. Here's an example of the sge_shepherd trace file to show what it does:

# cat /var/spool/gridengine/amos/active_jobs/21984.1/trace
08/28/2011 09:26:47 [498:11575]: shepherd called with uid = 0, euid = 498
08/28/2011 09:26:47 [498:11575]: starting up 6.2u5
08/28/2011 09:26:47 [498:11575]: setpgid(11575, 11575) returned 0
08/28/2011 09:26:47 [498:11575]: do_core_binding: "binding" parameter not found in config file
08/28/2011 09:26:47 [498:11575]: no prolog script to start
08/28/2011 09:26:47 [498:11575]: parent: forked "job" with pid 11576
08/28/2011 09:26:47 [498:11575]: parent: job-pid: 11576
08/28/2011 09:26:47 [498:11576]: child: starting son(job, /var/spool/gridengine/amos/job_scripts/21984, 0);
08/28/2011 09:26:47 [498:11576]: pid=11576 pgrp=11576 sid=11576 old pgrp=11575 getlogin()=<no login set>
08/28/2011 09:26:47 [498:11576]: reading passwd information for user 'vasha'
08/28/2011 09:26:47 [498:11576]: setosjobid: uid = 0, euid = 498
08/28/2011 09:26:47 [498:11576]: setting limits
08/28/2011 09:26:47 [498:11576]: RLIMIT_CPU setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: RLIMIT_FSIZE setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: RLIMIT_DATA setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: RLIMIT_STACK setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: RLIMIT_CORE setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: RLIMIT_VMEM/RLIMIT_AS setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: RLIMIT_RSS setting: (soft INFINITY hard INFINITY) resulting: (soft INFINITY hard INFINITY)
08/28/2011 09:26:47 [498:11576]: setting environment
08/28/2011 09:26:47 [498:11576]: Initializing error file
08/28/2011 09:26:47 [498:11576]: switching to intermediate/target user
08/28/2011 09:26:47 [630:11576]: closing all filedescriptors
08/28/2011 09:26:47 [630:11576]: further messages are in "error" and "trace"
08/28/2011 09:26:47 [630:11576]: now running with uid=630, euid=630
08/28/2011 09:26:47 [630:11576]: execvp(/bin/bash, "bash" "/var/spool/gridengine/amos/job_scripts/21984" "0110")

I think the user's job script and subsequent processes should probably run in some kind of sge_job_t domain since the job should be allowed to do a fair amount of things.
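
A rough sketch of what such a transition could look like in refpolicy terms; the type names and interfaces here are assumptions for illustration only. Note that because the job script is executed through an interpreter (bash, tcsh, ...), a transition keyed only on the script file's label will not fire by itself:

type sge_job_t;
domain_type(sge_job_t)

type sge_job_script_t;
files_type(sge_job_script_t)

# transition to sge_job_t when the shepherd executes a labeled job script
domtrans_pattern(sge_shepherd_t, sge_job_script_t, sge_job_t)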

Comment 28 Miroslav Grepl 2011-08-30 07:51:35 UTC
Great, now it is clearer. I was looking for:

>sge_shepherd will then execute the user's job script which will be
>/var/spool/gridengine/<hostname>/job_scripts/<jobid>.  The job must be a script
>of some kind.

Comment 29 Miroslav Grepl 2011-09-06 12:20:10 UTC
Created attachment 521650 [details]
sge policy with sge_job_t domain

Now users' jobs should be running within the sge_job_t domain, and you should see AVC msgs for this domain.

Comment 30 Orion Poplawski 2011-09-06 16:01:18 UTC
Created attachment 521702 [details]
sge denials

Updated denials with the new policy.

Comment 31 Miroslav Grepl 2011-09-08 12:37:21 UTC
OK, the problem is

execvp(/bin/bash, "bash" "/var/spool/gridengine/amos/job_scripts/21984" "0110")

I think I have a solution.

Comment 32 Miroslav Grepl 2011-09-08 13:17:56 UTC
Created attachment 522118 [details]
sge policy

Try this one.

Comment 33 Orion Poplawski 2011-09-08 13:37:05 UTC
(In reply to comment #31)
> OK, the problem is
> 
> execvp(/bin/bash, "bash" "/var/spool/gridengine/amos/job_scripts/21984" "0110")

Note that which script interpreter to use is passed to qsub at job submission. The default is /bin/tcsh, but it could be just about anything.

Comment 34 Orion Poplawski 2011-09-08 20:05:33 UTC
Created attachment 522197 [details]
sge denials

This includes sge_execd startup, re-connecting to some running jobs, and running the MPI jobs. It doesn't seem like the job is transitioning to sge_job_t.

Comment 35 Miroslav Grepl 2011-09-13 11:25:45 UTC
Created attachment 522902 [details]
sge policy

My fault, I used a bad interface in the sge_execd policy, which should now be fixed.

Comment 36 Orion Poplawski 2011-09-13 21:16:40 UTC
Created attachment 523012 [details]
sge denials

Updated denials. This also includes the shutdown and the emailing of status from a currently running job.

Comment 38 Miroslav Grepl 2011-12-08 16:07:05 UTC
Created attachment 542622 [details]
redesigned sge policy

I am finally back on this issue.

Orion, 
I added a redesigned sge policy where all jobs should run in the sge_job_t domain. Also, I removed some rules from sge_shepherd_t because I believe these rules are now needed by sge_job_t instead. So you will probably get some AVC msgs.

Thank you.

Comment 39 Miroslav Grepl 2011-12-09 09:37:48 UTC
*** Bug 729361 has been marked as a duplicate of this bug. ***

Comment 40 Orion Poplawski 2011-12-09 21:14:43 UTC
Created attachment 544698 [details]
sge denials

This isn't the full test case, but there is a fair amount here to work on.

Comment 41 Miroslav Grepl 2011-12-13 09:14:46 UTC
Great, this is much better. Thanks. I will add it to Fedora and then to RHEL6.3.

Comment 44 Orion Poplawski 2011-12-13 21:19:18 UTC
Umm, seems to me that there are still a number of issues to resolve.  I assume you are still working on them?

Comment 45 Miroslav Grepl 2011-12-14 13:16:34 UTC
Yes.

Comment 46 Karel Srot 2012-01-24 08:09:43 UTC
Hi Orion,
could you please provide your testing scenario? I would like to try to reproduce it on my own. Thank you in advance.

Comment 47 Miroslav Grepl 2012-01-24 09:56:07 UTC
That would be great. Also I need to finish the policy asap.

Comment 48 Orion Poplawski 2012-01-24 15:32:29 UTC
There's nothing too special. I'm currently using the gridengine-6.2u5-6 package from EPEL and openmpi. See /usr/share/doc/gridengine-6.2u5/README for gridengine setup. User jobs usually run/access NFS directories.

I then run a simple MPI hello world type program with a job script like:

#$ -S /bin/bash
#$ -pe mpi 4
#$ -cwd
. /etc/profile.d/modules.sh
module load openmpi-x86_64
mpirun $*
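
For completeness, a minimal way to build and submit such a test, assuming the OpenMPI compiler wrapper is available after the module load; the file names are just placeholders:

$ mpicc -o hello_mpi hello_mpi.c
$ qsub mpi_job.sh ./hello_mpi

where mpi_job.sh is the job script above and hello_mpi.c is any trivial MPI hello-world program.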

Comment 51 Miroslav Grepl 2012-02-28 07:08:26 UTC
We have a working policy in Fedora. Time for backporting.

Comment 54 Orion Poplawski 2012-05-17 16:54:32 UTC
Created attachment 585272 [details]
sge denials

I updated to selinux-policy-3.7.19-153.el6.noarch from the dwalsh testing repo. I still see the following on sge_execd startup, running in permissive mode:

type=AVC msg=audit(1337271515.629:1671): avc:  denied  { name_bind } for  pid=24363 comm="sge_execd" src=6445 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:port_t:s0 tclass=tcp_socket
type=AVC msg=audit(1337271515.931:1672): avc:  denied  { name_connect } for  pid=24363 comm="sge_execd" dest=6444 scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=system_u:object_r:port_t:s0 tclass=tcp_socket
type=AVC msg=audit(1337271517.963:1673): avc:  denied  { sys_ptrace } for  pid=24371 comm="ps" capability=19  scontext=unconfined_u:system_r:sge_execd_t:s0 tcontext=unconfined_u:system_r:sge_execd_t:s0 tclass=capability

I've attached the full set of denials from the two machines.

Comment 55 Orion Poplawski 2012-05-17 17:01:09 UTC
I forgot to set sge_use_nfs, so there are a few NFS denials in there.

Comment 56 Miroslav Grepl 2012-05-18 08:37:48 UTC
Orion,
thanks for testing. But I thought this was executed by sge_job_t? Or did you change your test scenario?

I need to make sge_execd and sge_shepherd unconfined domains too.

Comment 57 Orion Poplawski 2012-05-18 14:55:50 UTC
I'm not sure what you refer to by "this".  The denials listed in comment 54 are from the sge_execd daemon binding to the sge_execd socket and connecting to the qmaster, plus running ps.

You're probably going to want to label these ports (unless you don't label ports above 1024?):
sge_qmaster     6444/tcp                # Grid Engine Qmaster Service
sge_execd       6445/tcp                # Grid Engine Execution Service
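
If the policy does not define these ports yet, they can be labeled locally with semanage; the target type name sge_port_t is an assumption about what the policy will call it:

# semanage port -a -t sge_port_t -p tcp 6444
# semanage port -a -t sge_port_t -p tcp 6445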

I'm not sure sge_execd_t needs to be unconfined. The other denials from sge_execd_t are from sending email. The shepherd maybe, but it still has a more limited set of tasks.

Comment 58 Miroslav Grepl 2012-05-19 06:19:50 UTC
I thought the AVC messages were related to sshd.

Comment 59 Orion Poplawski 2012-05-21 16:07:35 UTC
(In reply to comment #58)
> I thought the AVC messages were related to sshd.

The shepherd needs to log in to the other nodes in the parallel job to set up the environment, I believe.

Comment 60 Miroslav Grepl 2012-05-21 21:03:07 UTC
(In reply to comment #59)
> (In reply to comment #58)
> > I thought the AVC messages were related to sshd.
> 
> The shepherd needs to log in to the other nodes in the parallel job to set up
> the environment, I believe.

Then I need to add some of the rules we have for sge_job_t to sge_shepherd_t as well.

Comment 61 Miroslav Grepl 2012-05-22 08:04:53 UTC
(In reply to comment #60)
> (In reply to comment #59)
> > (In reply to comment #58)
> > > I thought the AVC messages were related to sshd.
> > 
> > The shepherd needs to log in to the other nodes in the parallel job to set up
> > the environment, I believe.
> 
> Then I need to add some of the rules we have for sge_job_t to sge_shepherd_t as
> well.

I added fixes to Fedora. Backporting to RHEL 6.3.

Comment 62 errata-xmlrpc 2012-06-20 12:24:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0780.html

