Bug 1089044

Summary: running mpirun as root trashes / by removing /bin /lib /lib64 /sbin etc
Product: [Fedora] Fedora
Component: openmpi
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Status: CLOSED RAWHIDE
Severity: unspecified
Priority: unspecified
Reporter: Jay Fenlason <fenlason>
Assignee: Doug Ledford <dledford>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: dakingun, dledford, fenlason, jfeeney, jsquyres, orion
Target Milestone: ---
Target Release: ---
Fixed In Version: openmpi-1.8.1-1.fc21
Doc Type: Bug Fix
Type: Bug
Last Closed: 2014-04-24 02:13:26 UTC
Attachments:
Strace of mpirun, showing the unlink commands on / (flags: none)

Description Jay Fenlason 2014-04-17 18:40:54 UTC
Created attachment 887244
Strace of mpirun, showing the unlink commands on /

Description of problem:
mpirun contains a bug which causes it to remove all symlinks in / if it is run as root.  This renders the system unusable.

Version-Release number of selected component (if applicable):
1.8-1

How reproducible:
always

Steps to Reproduce:
1. module load mpi/openmpi-x86_64
2. mpirun -np 12 -hostfile hosts.openmpi foo

Actual results:
All shell commands fail because their executables refer to /lib64, which no longer exists.

Expected results:
mpirun either runs the command or reports an error.

Additional info:
When you test this, it might be useful to have this script running in a different window first:

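#!/usr/bin/python3
# Recovery helper: start this BEFORE running mpirun so the interpreter is
# already loaded, then press Enter to recreate the top-level symlinks.
import os

input('Press enter after running mpirun and confirming damage to /')
# Recreate the symlinks in / that mpirun removed.
os.symlink('/usr/bin', '/bin')
os.symlink('/usr/lib', '/lib')
os.symlink('/usr/lib64', '/lib64')
os.symlink('/usr/sbin', '/sbin')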
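
To confirm the damage before pressing enter, a small check along these lines may help. This is only a sketch: the helper name missing_links and the EXPECTED_LINKS table are hypothetical, and it assumes the same four symlinks that the script above recreates. Like the recovery script, it would need to be started before mpirun is run.

```python
import os

# Hypothetical table: the four top-level symlinks the recovery script recreates.
EXPECTED_LINKS = {
    "bin": "/usr/bin",
    "lib": "/usr/lib",
    "lib64": "/usr/lib64",
    "sbin": "/usr/sbin",
}


def missing_links(root="/", expected=EXPECTED_LINKS):
    """Return the sorted names under `root` that are not symlinks any more."""
    return sorted(
        name for name in expected
        if not os.path.islink(os.path.join(root, name))
    )


if __name__ == "__main__":
    gone = missing_links()
    print("missing symlinks:", ", ".join(gone) if gone else "none")
```

Passing a different root (for example a scratch directory) lets the check be tried without touching the real /.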

Comment 1 Orion Poplawski 2014-04-17 20:49:04 UTC
Yowza.

(gdb) bt
#0  unlink () at ../sysdeps/unix/syscall-template.S:81
#1  0x00007ffff791ee58 in opal_os_dirpath_destroy (path=0x5555557f21f0 "/", recursive=false,
    cbfunc=0x7ffff7b81510 <orte_dir_check_file>) at os_dirpath.c:273
#2  0x00007ffff7b820d7 in orte_session_dir_cleanup (jobid=4294967294)
    at util/session_dir.c:550
#3  0x00007ffff3ddbce2 in rte_init () at ess_hnp_module.c:308
#4  0x00007ffff7b73ec8 in orte_init (pargc=0x5555557ebe00, pargv=0x555555864a9c, flags=0)
    at runtime/orte_init.c:148
#5  0x0000555555559fd2 in orterun () at orterun.c:830
#6  0x00005555555593a5 in main (argc=3, argv=0x7fffffffdd38) at main.c:13

Filed https://svn.open-mpi.org/trac/ompi/ticket/4534

Comment 2 Orion Poplawski 2014-04-17 20:50:03 UTC
To be precise, it removes all files in /.  Hopefully nobody runs mpirun as root...
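
The failure mode suggested by the backtrace (a non-recursive directory cleanup handed "/" as its path) can be illustrated with a rough Python sketch. This is not the Open MPI code: cleanup_dir and its keep callback are hypothetical stand-ins, and the guard at the top is just one possible safeguard, not necessarily the upstream fix. The demo runs only inside a temporary scratch directory.

```python
import os
import tempfile


def cleanup_dir(path, keep=lambda full_path: False):
    """Non-recursive cleanup: unlink every non-directory entry in `path`.

    Hypothetical illustration only; `keep` is an assumed filter callback.
    """
    # Guard: refuse an empty path or the filesystem root.
    if not path or os.path.realpath(path) == os.path.realpath(os.sep):
        raise ValueError("refusing to clean up %r" % path)
    removed = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        # Real directories are skipped; symlinks to directories are not.
        if os.path.isdir(full) and not os.path.islink(full):
            continue
        if keep(full):
            continue
        os.unlink(full)
        removed.append(name)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        os.mkdir(os.path.join(scratch, "usr"))
        os.symlink("usr", os.path.join(scratch, "bin"))
        open(os.path.join(scratch, "file.txt"), "w").close()
        print(cleanup_dir(scratch))  # ['bin', 'file.txt']
        print(os.listdir(scratch))   # ['usr']
```

In the scratch tree the real directory survives while the symlink and the regular file are unlinked, which matches the symptom reported here: /usr stays, but /bin, /lib, /lib64 and /sbin disappear.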

Comment 3 Jeff Squyres 2014-04-17 21:02:49 UTC
We had literally just found this same code path.

Looking into it...

Comment 4 Jeff Squyres 2014-04-17 22:13:30 UTC
Ticket filed upstream: https://svn.open-mpi.org/trac/ompi/ticket/4534