Red Hat Bugzilla – Bug 615510
Job hooks environment does not contain _CONDOR_SCRATCH_DIR and the like
Last modified: 2010-10-14 12:14:00 EDT
Description of problem:

Related to Bug 615504. The job hooks run from the starter should be provided the same environment as the job, such as inclusion of _CONDOR_SCRATCH_DIR, TEMP, TMPDIR, etc.

cmd=/bin/env, output=out

out:
_CONDOR_ANCESTOR_1600=2871:1278810651:3740169993
_CONDOR_ANCESTOR_32469=32470:1279315307:2943521024
_CONDOR_ANCESTOR_2871=32469:1279315307:3155706936
TMP=/var/lib/condor/execute/dir_32469
_CONDOR_SLOT=1
TEMP=/var/lib/condor/execute/dir_32469
_CONDOR_MACHINE_AD=/var/lib/condor/execute/dir_32469/.machine.ad
TMPDIR=/var/lib/condor/execute/dir_32469
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_32469
_CONDOR_JOB_AD=/var/lib/condor/execute/dir_32469/.job.ad

env from within hook_update_job_info:
TERM=xterm
_CONDOR_ANCESTOR_30501=32553:1279315437:144623873
_CONDOR_ANCESTOR_1600=2871:1278810651:3740169993
CONDOR_PARENT_ID=woods:30501:1279312222
CONDOR_PROCD_ADDRESS_BASE=/var/run/condor/procd_pipe
PATH=/sbin:/usr/sbin:/bin:/usr/bin
PWD=/var/lib/condor/execute/dir_30501
LANG=en_US.UTF-8
_CONDOR_EXECUTE=/var/lib/condor/execute
SHLVL=3
CONDOR_INHERIT=30501 <127.0.0.1:55599> 0 0
_CONDOR_ANCESTOR_2871=30501:1279312222:3155706935
CONDOR_PROCD_ADDRESS=/var/run/condor/procd_pipe.STARTD
_=/bin/env
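The comparison above (job environment versus hook environment) can be automated. The following is a minimal Python sketch, not part of Condor: the helper names `parse_env` and `missing_from_hook` are hypothetical, and the sample strings are shortened excerpts of the two dumps in this report. It parses line by line because values such as CONDOR_INHERIT contain spaces.

```python
def parse_env(text):
    """Parse `env`-style output (one NAME=value per line) into a dict."""
    env = {}
    for line in text.splitlines():
        # Split on the first '=' only; values may contain '=' or spaces.
        name, sep, value = line.partition("=")
        if sep and name:
            env[name] = value
    return env


def missing_from_hook(job_env_text, hook_env_text):
    """Return sorted variable names set for the job but absent from the hook."""
    job = parse_env(job_env_text)
    hook = parse_env(hook_env_text)
    return sorted(set(job) - set(hook))


# Shortened excerpts of the two dumps quoted in this report.
job_out = """\
TMP=/var/lib/condor/execute/dir_32469
_CONDOR_SLOT=1
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_32469
"""
hook_out = """\
TERM=xterm
CONDOR_INHERIT=30501 <127.0.0.1:55599> 0 0
_CONDOR_EXECUTE=/var/lib/condor/execute
"""

print(missing_from_hook(job_out, hook_out))
# ['TMP', '_CONDOR_SCRATCH_DIR', '_CONDOR_SLOT']
```

When run on full dumps, the _CONDOR_ANCESTOR_* entries will also show up as differences, since they embed process IDs that vary per invocation; they can be filtered out before comparing.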
Additionally, information about the running job should be included, like that from ssh_to_job -- _CONDOR_JOB_PIDS, _CONDOR_JOB_IWD, _CONDOR_SLOT_NAME

$ condor_ssh_to_job 528.0
Welcome to slot1@robin.local!
Your condor job is running with pid(s) 15514.
$ env | grep _CONDOR
_CONDOR_JOB_PIDS=15514
_CONDOR_ANCESTOR_14643=15043:1279320227:664620355
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_15513
_CONDOR_ANCESTOR_15513=15517:1279321180:451937601
_CONDOR_ANCESTOR_15043=15513:1279321169:81967136
_CONDOR_SLOT=1
_CONDOR_EXECUTE=/var/lib/condor/execute
_CONDOR_JOB_IWD=/tmp
_CONDOR_SLOT_NAME=slot1@robin.local
_CONDOR_SHELL=/bin/zsh
Min env: _CONDOR_SCRATCH_DIR, _CONDOR_JOB_PIDS
Useful env: _CONDOR_MACHINE_AD, _CONDOR_JOB_AD, _CONDOR_SLOT, _CONDOR_JOB_IWD, _CONDOR_SLOT_NAME
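A hook could check for these two groups of variables itself. The sketch below is only an illustration under stated assumptions: the function name `check_hook_env` and the sample dictionary are hypothetical, and the variable lists are copied from the "Min env" and "Useful env" lists in this comment.

```python
import os

# Variable lists taken from the "Min env" / "Useful env" lists in this report.
MIN_ENV = ("_CONDOR_SCRATCH_DIR", "_CONDOR_JOB_PIDS")
USEFUL_ENV = (
    "_CONDOR_MACHINE_AD",
    "_CONDOR_JOB_AD",
    "_CONDOR_SLOT",
    "_CONDOR_JOB_IWD",
    "_CONDOR_SLOT_NAME",
)


def check_hook_env(env=None):
    """Report which minimum and useful variables are absent from env.

    A variable that is present but empty (for example _CONDOR_JOB_PIDS=)
    counts as present.
    """
    env = os.environ if env is None else env
    return {
        "missing_min": [name for name in MIN_ENV if name not in env],
        "missing_useful": [name for name in USEFUL_ENV if name not in env],
    }


# Hypothetical sample: only a few variables are set.
sample = {
    "_CONDOR_SCRATCH_DIR": "/var/lib/condor/execute/dir_29028",
    "_CONDOR_JOB_PIDS": "",
    "_CONDOR_SLOT": "1",
    "_CONDOR_JOB_IWD": "/tmp",
}
print(check_hook_env(sample))
```

Called with no argument inside a hook, `check_hook_env()` inspects the hook's own `os.environ`.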
Good...

/opt/hook-privs/hook_update_job_info.sh -uid=500(matt) gid=500(matt) groups=500(matt) context=user_u:system_r:unconfined_execmem_t
_CONDOR_JOB_PIDS=29031
TERM=xterm
TMPDIR=/var/lib/condor/execute/dir_29028
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_29028
_CONDOR_ANCESTOR_1600=2871:1278810651:3740169993
CONDOR_PARENT_ID=woods:29028:1279344914
TEMP=/var/lib/condor/execute/dir_29028
_CONDOR_ANCESTOR_29028=29978:1279346442:2931823716
CONDOR_PROCD_ADDRESS_BASE=/var/run/condor/procd_pipe
PATH=/sbin:/usr/sbin:/bin:/usr/bin
PWD=/var/lib/condor/execute/dir_29028
LANG=en_US.UTF-8
_CONDOR_SLOT=1
_CONDOR_EXECUTE=/var/lib/condor/execute
SHLVL=3
CONDOR_INHERIT=29028 <127.0.0.1:41805> 0 0
TMP=/var/lib/condor/execute/dir_29028
_CONDOR_ANCESTOR_2871=29028:1279344914:3155706939
_CONDOR_JOB_IWD=/tmp
CONDOR_PROCD_ADDRESS=/var/run/condor/procd_pipe.STARTD
_=/bin/env

/opt/hook-privs/hook_job_exit.sh -uid=500(matt) gid=500(matt) groups=500(matt) context=user_u:system_r:unconfined_execmem_t
_CONDOR_JOB_PIDS=
TERM=xterm
TMPDIR=/var/lib/condor/execute/dir_29028
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_29028
_CONDOR_ANCESTOR_1600=2871:1278810651:3740169993
CONDOR_PARENT_ID=woods:29028:1279344914
TEMP=/var/lib/condor/execute/dir_29028
_CONDOR_ANCESTOR_29028=30072:1279346569:2931823725
CONDOR_PROCD_ADDRESS_BASE=/var/run/condor/procd_pipe
PATH=/sbin:/usr/sbin:/bin:/usr/bin
PWD=/var/lib/condor/execute/dir_29028
LANG=en_US.UTF-8
_CONDOR_SLOT=1
_CONDOR_EXECUTE=/var/lib/condor/execute
SHLVL=3
CONDOR_INHERIT=29028 <127.0.0.1:41805> 0 0
TMP=/var/lib/condor/execute/dir_29028
_CONDOR_MAINJOB_EXIT_SIGNAL=9
_CONDOR_ANCESTOR_2871=29028:1279344914:3155706939
_CONDOR_JOB_IWD=/tmp
CONDOR_PROCD_ADDRESS=/var/run/condor/procd_pipe.STARTD
_=/bin/env
Created attachment 432543: Patch adding env to exit, update info and prepare hooks
Also good...

/opt/hook-privs/hook_prepare.sh -uid=500(matt) gid=500(matt) groups=500(matt) context=user_u:system_r:unconfined_execmem_t
_CONDOR_JOB_PIDS=
TERM=xterm
TMPDIR=/var/lib/condor/execute/dir_30136
_CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_30136
_CONDOR_ANCESTOR_1600=2871:1278810651:3740169993
_CONDOR_ANCESTOR_30136=30137:1279346797:2316498432
CONDOR_PARENT_ID=woods:30136:1279346797
TEMP=/var/lib/condor/execute/dir_30136
CONDOR_PROCD_ADDRESS_BASE=/var/run/condor/procd_pipe
PATH=/sbin:/usr/sbin:/bin:/usr/bin
PWD=/var/lib/condor/execute/dir_30136
LANG=en_US.UTF-8
_CONDOR_SLOT=1
_CONDOR_EXECUTE=/var/lib/condor/execute
SHLVL=3
CONDOR_INHERIT=30136 <127.0.0.1:32899> 0 0
TMP=/var/lib/condor/execute/dir_30136
_CONDOR_ANCESTOR_2871=30136:1279346797:3155706940
_CONDOR_JOB_IWD=/tmp
CONDOR_PROCD_ADDRESS=/var/run/condor/procd_pipe.STARTD
_=/bin/env
Note: there are no PIDs in the prepare or exit hooks, because they are invoked before the job starts and after it exits, respectively.
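Because _CONDOR_JOB_PIDS is set but empty in the prepare and exit hooks, a hook that reads it should tolerate the empty case. The following Python sketch shows one way to do that. The function name `job_pids` is hypothetical, and the separator for multiple PIDs is an assumption (whitespace or commas), since this report only shows single-PID and empty values.

```python
import os
import re


def job_pids(env=None):
    """Return the job PIDs from _CONDOR_JOB_PIDS as a list of ints.

    Returns an empty list when the variable is unset or empty, as in the
    prepare and exit hooks. ASSUMPTION: multiple PIDs, if present, are
    separated by whitespace or commas; the report does not show this case.
    """
    env = os.environ if env is None else env
    raw = env.get("_CONDOR_JOB_PIDS", "")
    return [int(tok) for tok in re.split(r"[,\s]+", raw.strip()) if tok]


# Values as seen in the update-info hook and in the prepare/exit hooks.
print(job_pids({"_CONDOR_JOB_PIDS": "29031"}))  # [29031]
print(job_pids({"_CONDOR_JOB_PIDS": ""}))       # []
```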
https://condor-wiki.cs.wisc.edu/index.cgi/tktview?tn=1512 Build post 7.4.4-0.4
The following environment variables are now exported into the exit, update info and prepare hooks: _CONDOR_SCRATCH_DIR, _CONDOR_JOB_PIDS, _CONDOR_SLOT, _CONDOR_JOB_IWD. The most important variables have been exported. Only three job hooks can see them, but they are the most critical hooks according to the developers. Thus I'm going to verify this bug and move the remaining minor issues to Bug 640125. Verified on RHEL 4.8/5.5, i386/x86_64, condor-7.4.4-0.16.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
With this update, job hooks (external programs or scripts invoked by Condor) run from the starter are provided with the same environment as the job, including variables such as '_CONDOR_SCRATCH_DIR', 'TEMP', and 'TMPDIR'.
An advisory has been issued which should help with the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0773.html