Bug 647234

Summary: Zipped input files with no attribute information cause all-zeros file permissions
Product: Red Hat Enterprise MRG Reporter: Dominic Cleal <dcleal>
Component: condor-low-latency Assignee: Robert Rati <rrati>
Status: CLOSED ERRATA QA Contact: Martin Kudlej <mkudlej>
Severity: medium Docs Contact:
Priority: low    
Version: 1.3 CC: matt, mkudlej
Target Milestone: 1.3.2   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: python-condorutils-1.4-6 Doc Type: Bug Fix
Doc Text:
When a ZIP archive that did not preserve permission information was attached to a low-latency grid job, its content was incorrectly extracted without the read, write, or execute permissions, rendering any non-root process unable to access it. With this update, such files are now extracted with permissions determined by the umask of the user running the job.
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-02-15 12:16:56 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 673991    
Bug Blocks:    
Attachments:
Description                                                          Flags
Example zip archive without permission information                   none
Simple Java program to generate zip files without permission info    none
Python snippet to show permissions seen by extract script            none

Description Dominic Cleal 2010-10-27 16:17:29 UTC
Created attachment 456016 [details]
Example zip archive without permission information

Description of problem:
When submitting a low-latency grid job with a zip archive in the message body, if no file attribute information is stored in the zip file headers the files are unpacked and permissions set to 000.

This is the case when submitting zip archives generated by the standard OpenJDK libraries, which don't provide a method to specify file attributes when constructing an archive.

Version-Release number of selected component (if applicable):
python-condorutils-1.4-5.el5
MRG Grid 1.3

How reproducible:
Always.

Steps to Reproduce:
1. Submit job over AMQP with attached test_java.zip file, running "ls -l" as the command
  
Actual results:
See "----------" as permissions for a.file

Expected results:
Default permissions as per the umask

Additional info:
See the attached JavaZip.java and resulting test_java.zip archive.  Attached is list.py which shows the permissions as seen by the Python libraries.

The offending code is in /usr/lib/python2.4/site-packages/condorutils/osutil.py, in the zip_extract method where it chmods the file without checking the permissions are sane.  It would be sufficient to check the value is non-zero before performing the chmod.

A workaround appears to be to use the Apache Commons Compress library instead, which appears to permit setting Unix file permissions on ZipEntry objects as they're generated.
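
For comparison, here is how an archive producer can record Unix permissions per entry. This sketch uses Python's zipfile module as an analogue only; it does not show the Apache Commons Compress API, and the helper name add_with_mode is hypothetical.

```python
import zipfile


def add_with_mode(archive, name, data, mode):
    """Write one regular-file entry whose Unix permission bits are stored."""
    info = zipfile.ZipInfo(name)
    info.create_system = 3  # 3 = Unix, so readers interpret the mode bits
    # 0o100000 marks a regular file; the low bits are the permissions.
    info.external_attr = (0o100000 | mode) << 16
    archive.writestr(info, data)
```

An entry written this way, for example with mode 0o755, carries its permissions in the zip headers, so an extractor that honours stored modes can restore the execute bit.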

Comment 1 Dominic Cleal 2010-10-27 16:18:17 UTC
Created attachment 456018 [details]
Simple Java program to generate zip files without permission info

Comment 2 Dominic Cleal 2010-10-27 16:18:44 UTC
Created attachment 456019 [details]
Python snippet to show permissions seen by extract script

Comment 3 Robert Rati 2010-11-09 19:32:05 UTC
Using the perms derived from the umask may not provide a useful alternative.  The zip from the body of the amqp message is written into the user's execute directory by the carod daemon, which is running as root.  Most systems don't allow files created with the umask to contain the execute bit, so while files will be extracted with non-zero perms none of them will be executable so the low-latency job will still fail due to lack of execute perms.  The carod is running as root primarily because it needs to be able to write into the temporary execute directory that is owned by the job's owner, and until recently the hooks from the starter were running as user condor (thus not allowing universal access to all execute directories).

Will using the umask solve your issue?

If not, I see two potential actions:
1) Zip archives containing files w/o permissions use a sane default of 755
2) Restructure the file handling such that the zip file is written to the temp execute directory by the prepare hook as opposed to the carod daemon.  This provides a few benefits:
  A) Zip file will be owned by the user instead of root
  B) The carod daemon should be able to run as a non-root user

However, 2 does not solve the issue with the execute permissions on the extracted files.  At present, I don't see a clean way to allow execution of a job from a zip w/o permissions except to do option 1.

Comment 4 Robert Rati 2010-11-09 19:36:12 UTC
For clarity, the zip file is written to the temporary execute dir by carod but is extracted by the prepare hook.  The result is the zip from the carod is owned by root, and the files extracted from the zip are owned by the job owner.

Comment 5 Dominic Cleal 2010-11-09 22:33:51 UTC
(In reply to comment #3)
> Using the perms derived from the umask may not provide a useful alternative. 
> The zip from the body of the amqp message is written into the user's execute
> directory by the carod daemon, which is running as root.  Most systems don't
> allow files created with the umask to contain the execute bit, so while files
> will be extracted with non-zero perms none of them will be executable so the
> low-latency job will still fail due to lack of execute perms.
[snip]
> 
> Will using the umask solve your issue?

Good point Robert, I hadn't considered that.  In this case, it would solve the issue as the binary being executed isn't transferred as part of the archive, it's known to be installed on the execution nodes.  The key part is having the transferred files readable.

Similarly, jobs running Java classes or other interpreted code could still work by running the interpreter directly, pointing to the non-executable but readable input files.

> If not, I see two potential actions:
> 1) Zip archives containing files w/o permissions use a sane default of 755

That feels like a slightly extreme solution, but would certainly work.  It would help and make it easier to bundle scripts into the job archives.  Even if it's an option that's disabled by default, it would be useful (though I suppose it's a setting of the hook script, not carod).

Comment 6 Robert Rati 2010-11-10 15:45:36 UTC
zip archives w/o permissions will create files based upon the job owner's umask on the execute node.  If the executable is included in that zip file, the solution atm is to use a zip program that stores permission information.

fixed in:
python-condorutils-1.4-6

Comment 7 Robert Rati 2010-11-29 20:48:00 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
C: Attaching a zip archive without permission information for files contained in the archive.
C: Files in the zip archive end up with permissions of 0000 when extracted.  This will prevent the files from being accessed by condor or any non-root process.
F: Files extracted from the zip archive will only be given the permissions in the zip archive if the archive reports valid permissions.  Otherwise they are given the permissions determined by the umask of the user running the job.
R: Zip archives without valid permissions information are usable with low-latency.

Comment 9 Martin Kudlej 2011-01-24 11:13:21 UTC
Tested on RHEL 5.6 x i386/x86_64 with:
condor-7.4.4-0.17
condor-job-hooks-1.4-5
python-condorutils-1.4-5
condor-low-latency-1.1-0.2

and it doesn't work.

Tested on RHEL 5.6 x i386/x86_64 with:
condor-low-latency-1.1-0.2.el5
condor-job-hooks-1.4-6.el5
python-condorutils-1.4-6.el5
condor-7.4.5-0.7.el5

and it still doesn't work. -->ASSIGNED

I used the Java program from the attachment to generate a zip file containing a test file without permissions, and ran it via our low-latency test client.

Here is the result for a zip file containing a file WITH proper permissions:
-rwxr-xr-x 1 nobody nobody 193 Jan 24 11:14 /var/lib/condor/execute/dir_7152/././test_run.sh
drwxr-xr-x 2 nobody nobody 4096 Jan 24 11:14 .                                
umask of user nobody = 0022

Permissions of test_run.sh before zipping with Python:
-rwxr-xr-x 1 root root 193 Jan 24 11:14 test_run.sh

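Python code to zip test_run.sh WITH permissions (test.zip):
import zipfile

# ZipFile.write() records the file's mode bits from the filesystem
archive = zipfile.ZipFile('test.zip', 'w')
archive.write('test_run.sh')
archive.close()
# the rest of the code is the same as for the file without permissions (test_java.zip)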

StarterLog(ALL_DEBUG = D_ALL):
....
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> ~~~~~~
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 0, when = 1295864995, period = 120, handler_descrip=<check_parent>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 7, when = 1295865040, period = 0, handler_descrip=<dc_touch_log_file>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 3, when = 1295865280, period = 300, handler_descrip=<check_session_cache>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 6, when = 1295866150, period = 1170, handler_descrip=<DaemonCore::SendAliveToParent>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 4, when = 1295866781, period = 1801, handler_descrip=<handle_cookie_refresh>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 8, when = 1295893780, period = 0, handler_descrip=<dc_touch_lock_files>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 5, when = 1295894363, period = 29383, handler_descrip=<DaemonCore::refreshDNS()>
01/24/11 11:29:41 (fd:10) (pid:8025)
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore Timeout() Complete, returning 14
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 resetting
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 adding fd 13 (socket:[1507575])
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 adding fd 15 (socket:[1507576])
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 adding fd 12 (pipe:[1507650])
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 adding fd 17 (pipe:[1507651])
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 adding fd 6 (pipe:[1507608])
01/24/11 11:29:41 (fd:10) (pid:8025) PERF: entering select
01/24/11 11:29:41 (fd:10) (pid:8025) Entering thread safe start [select] in selector.cpp:313 unknown() 
01/24/11 11:29:41 (fd:10) (pid:8025) Leaving thread safe start [select] in selector.cpp:313 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) Entering thread safe stop [select] in selector.cpp:319 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) Leaving thread safe stop [select] in selector.cpp:319 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) PERF: leaving select
01/24/11 11:29:41 (fd:10) (pid:8025) State = FDS_READY
01/24/11 11:29:41 (fd:10) (pid:8025) max_fd = 17
01/24/11 11:29:41 (fd:10) (pid:8025) Selection FD's
01/24/11 11:29:41 (fd:10) (pid:8025)    Read {6 12 13 15 17 } = 5 
01/24/11 11:29:41 (fd:10) (pid:8025)    Write {} = 0
01/24/11 11:29:41 (fd:10) (pid:8025)    Except {} = 0 
01/24/11 11:29:41 (fd:10) (pid:8025) Ready FD's
01/24/11 11:29:41 (fd:10) (pid:8025)    Read {12 17 } = 2
01/24/11 11:29:41 (fd:10) (pid:8025)    Write {} = 0
01/24/11 11:29:41 (fd:10) (pid:8025)    Except {} = 0 
01/24/11 11:29:41 (fd:10) (pid:8025) Timeout = 0.000000 seconds
01/24/11 11:29:41 (fd:10) (pid:8025) Calling pipe Handler <DC stdout pipe handler> for Pipe end=65538 <DC stdout pipe>
01/24/11 11:29:41 (fd:10) (pid:8025) Return from pipe Handler
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 resetting
01/24/11 11:29:41 (fd:10) (pid:8025) selector 0x7fffb8029420 adding fd 17 (pipe:[1507651])
01/24/11 11:29:41 (fd:10) (pid:8025) Entering thread safe start [select] in selector.cpp:313 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) Leaving thread safe start [select] in selector.cpp:313 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) Entering thread safe stop [select] in selector.cpp:319 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) Leaving thread safe stop [select] in selector.cpp:319 unknown()
01/24/11 11:29:41 (fd:10) (pid:8025) Calling pipe Handler <DC stderr pipe handler> for Pipe end=65540 <DC stderr pipe>
01/24/11 11:29:41 (fd:10) (pid:8025) Return from pipe Handler
01/24/11 11:29:41 (fd:10) (pid:8025) Calling Handler <HandleDC_SERVICEWAITPIDS()> for Signal 60009 <DC_SERVICEWAITPIDS> 
01/24/11 11:29:41 (fd:10) (pid:8025) Cancel_Pipe: cancelled pipe end 65538 <DC stdout pipe> (entry=0)
01/24/11 11:29:41 (fd:10) (pid:8025) Close_Pipe(pipe_end=65538) succeeded
01/24/11 11:29:41 (fd:10) (pid:8025) Cancel_Pipe: cancelled pipe end 65540 <DC stderr pipe> (entry=0)
01/24/11 11:29:41 (fd:10) (pid:8025) Close_Pipe(pipe_end=65540) succeeded
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore: pid 8031 exited with status 0, invoking reaper 2 <HookClientMgr Output Reaper>
01/24/11 11:29:41 (fd:10) (pid:8025) About to kill family with root process 8031 using the ProcD
01/24/11 11:29:41 (fd:10) (pid:8025) Result of "signal_family" operation from ProcD: SUCCES
01/24/11 11:29:41 (fd:10) (pid:8025) HookClient /usr/libexec/condor/hooks/hook_prepare_job.py (pid 8031) exited with status 0
01/24/11 11:29:41 (fd:10) (pid:8025) Prepare hook output classad
MyType = ""
TargetType = ""
01/24/11 11:29:41 (fd:10) (pid:8025) After Prepare hook: merged job classad:
MyType = ""
TargetType = ""
MyType = ""
TargetType = ""
AMQPID = "63613437-6634-3365-2d61-3839332d3939"
WF_REQ_SLOT = "1"
IsFeatched = TRUE
TransferOutput = "output,output2,output3,output4,/etc/shadow"
Iwd = "."
JobUniverse = 5
Owner = "nobody"
HookKeyword = "LOW_LATENCY_JOB"
OrigCmd = "test_run.sh"
OrigIwd = "."
Cmd = "./test_run.sh"
01/24/11 11:29:41 (fd:10) (pid:8025) in DaemonCore NewTimer()
01/24/11 11:29:41 (fd:10) (pid:8025)
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> Timers
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> ~~~~~~
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 9, when = 1295864981, period = 0, handler_descrip=<deferred job start>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 0, when = 1295864995, period = 120, handler_descrip=<check_parent>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 7, when = 1295865040, period = 0, handler_descrip=<dc_touch_log_file>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 3, when = 1295865280, period = 300, handler_descrip=<check_session_cache>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 6, when = 1295866150, period = 1170, handler_descrip=<DaemonCore::SendAliveToParent>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 4, when = 1295866781, period = 1801, handler_descrip=<handle_cookie_refresh>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 8, when = 1295893780, period = 0, handler_descrip=<dc_touch_lock_files>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 5, when = 1295894363, period = 29383, handler_descrip=<DaemonCore::refreshDNS()>
01/24/11 11:29:41 (fd:10) (pid:8025)
01/24/11 11:29:41 (fd:10) (pid:8025) leaving DaemonCore NewTimer, id=9
01/24/11 11:29:41 (fd:10) (pid:8025) Job 1.0 set to execute immediately
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore: return from reaper for pid 8031
01/24/11 11:29:41 (fd:10) (pid:8025) About to unregister family with root 8031 from the ProcD
01/24/11 11:29:41 (fd:10) (pid:8025) Result of "unregister_family" operation from ProcD: SUCCESS
01/24/11 11:29:41 (fd:10) (pid:8025) In DaemonCore Timeout()
01/24/11 11:29:41 (fd:10) (pid:8025)
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> Timers
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> ~~~~~~
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 9, when = 1295864981, period = 0, handler_descrip=<deferred job start>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 0, when = 1295864995, period = 120, handler_descrip=<check_parent>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 7, when = 1295865040, period = 0, handler_descrip=<dc_touch_log_file>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 3, when = 1295865280, period = 300, handler_descrip=<check_session_cache>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 6, when = 1295866150, period = 1170, handler_descrip=<DaemonCore::SendAliveToParent>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 4, when = 1295866781, period = 1801, handler_descrip=<handle_cookie_refresh>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 8, when = 1295893780, period = 0, handler_descrip=<dc_touch_lock_files>
01/24/11 11:29:41 (fd:10) (pid:8025) DaemonCore--> id = 5, when = 1295894363, period = 29383, handler_descrip=<DaemonCore::refreshDNS()>
01/24/11 11:29:41 (fd:10) (pid:8025)
01/24/11 11:29:41 (fd:10) (pid:8025) Calling Timer handler 9 (deferred job start)
01/24/11 11:29:41 (fd:10) (pid:8025) Starting a VANILLA universe job with ID: 1.0
01/24/11 11:29:41 (fd:10) (pid:8025) In OsProc::OsProc()
01/24/11 11:29:41 (fd:10) (pid:8025) Main job KillSignal: 15 (SIGTERM)
01/24/11 11:29:41 (fd:10) (pid:8025) Main job RmKillSignal: 15 (SIGTERM)
01/24/11 11:29:41 (fd:10) (pid:8025) Main job HoldKillSignal: 15 (SIGTERM)
01/24/11 11:29:41 (fd:10) (pid:8025) in VanillaProc::StartJob()
01/24/11 11:29:41 (fd:10) (pid:8025) PID_SNAPSHOT_INTERVAL is undefined, using default value of 15
01/24/11 11:29:41 (fd:10) (pid:8025) USE_GID_PROCESS_TRACKING is undefined, using default value of False
01/24/11 11:29:41 (fd:10) (pid:8025) in OsProc::StartJob()
01/24/11 11:29:41 (fd:10) (pid:8025) IWD: .
01/24/11 11:29:41 (fd:10) (pid:8025) Config 'STARTER_LOG': no prefix ==> '$(LOG)/StarterLog'
01/24/11 11:29:41 (fd:10) (pid:8025) PRIV_CONDOR --> PRIV_USER at os_proc.cpp:211
01/24/11 11:29:41 (fd:11) (pid:8025) Input file: /dev/null
01/24/11 11:29:41 (fd:12) (pid:8025) Output file: /dev/null
01/24/11 11:29:41 (fd:16) (pid:8025) Error file: /dev/null
01/24/11 11:29:41 (fd:16) (pid:8025) About to exec ././test_run.sh
01/24/11 11:29:41 (fd:16) (pid:8025) Env = TMP=/var/lib/condor/execute/dir_8025 _CONDOR_JOB_IWD=. _CONDOR_SLOT=1 _CONDOR_MACHINE_AD=/var/lib/condor/execute/dir_8025/.machine.ad TEMP=/var/lib/condor/execute/dir_8025 TMPDIR=/var/lib/condor/execute/dir_8025 _CONDOR_SCRATCH_DIR=/var/lib/condor/execute/dir_8025 _CONDOR_JOB_AD=/var/lib/condor/execute/dir_8025/.job.ad _CONDOR_JOB_PIDS=
01/24/11 11:29:41 (fd:16) (pid:8025) JOB_INHERITS_STARTER_ENVIRONMENT is undefined, using default value of False
01/24/11 11:29:41 (fd:16) (pid:8025) Config 'STARTER_LOG': no prefix ==> '$(LOG)/StarterLog'
01/24/11 11:29:41 (fd:16) (pid:8025) ENFORCE_CPU_AFFINITY is undefined, using default value of False
01/24/11 11:29:41 (fd:16) (pid:8025) ENFORCE_CPU_AFFINITY not true, not setting affinity
01/24/11 11:29:41 (fd:16) (pid:8025) PRIV_USER --> PRIV_CONDOR at os_proc.cpp:437
01/24/11 11:29:41 (fd:16) (pid:8025) In DaemonCore::Create_Process(././test_run.sh,...)
01/24/11 11:29:41 (fd:16) (pid:8025) PRIV_CONDOR --> PRIV_USER at daemon_core.cpp:7556
01/24/11 11:29:41 (fd:16) (pid:8025) Create_Process: Cannot access specified executable "././test_run.sh": errno = 13 (Permission denied)
01/24/11 11:29:41 (fd:16) (pid:8025) PRIV_USER --> PRIV_CONDOR at daemon_core.cpp:7571
01/24/11 11:29:41 (fd:10) (pid:8025) ERROR "Create_Process(././test_run.sh,, ...) failed: (errno=13: 'Permission denied')" at line 542 in file os_proc.cpp
01/24/11 11:29:41 (fd:10) (pid:8025) ShutdownFast all jobs.
01/24/11 11:29:41 (fd:10) (pid:8025) Got ShutdownFast when no jobs running.
01/24/11 11:29:41 (fd:10) (pid:8025) Config 'STARTER_LOG': no prefix ==> '$(LOG)/StarterLog'
01/24/11 11:29:41 (fd:10) (pid:8025) PID_SNAPSHOT_INTERVAL is undefined, using default value of 15
01/24/11 11:29:41 (fd:10) (pid:8025) In DaemonCore::Create_Process(/usr/libexec/condor/hooks/hook_job_exit.py,...)
01/24/11 11:29:41 (fd:10) (pid:8025) Entering Create_Pipe()
01/24/11 11:29:41 (fd:12) (pid:8025) Create_Pipe() success read_handle=65536 write_handle=65537
01/24/11 11:29:41 (fd:12) (pid:8025) Entering Create_Pipe()
01/24/11 11:29:41 (fd:17) (pid:8025) Create_Pipe() success read_handle=65538 write_handle=65539
01/24/11 11:29:41 (fd:17) (pid:8025) Entering Create_Pipe()
01/24/11 11:29:41 (fd:19) (pid:8025) Create_Pipe() success read_handle=65540 write_handle=65541
01/24/11 11:29:41 (fd:19) (pid:8025) PRIV_CONDOR --> PRIV_USER at daemon_core.cpp:7556
01/24/11 11:29:41 (fd:19) (pid:8025) PRIV_USER --> PRIV_CONDOR at daemon_core.cpp:7598
01/24/11 11:29:41 (fd:21) (pid:8025) Create_Process: using fast clone() to create child process.
01/24/11 11:29:41 (fd:19) (pid:8035) Create_Process: Arg: /usr/libexec/condor/hooks/hook_job_exit.py evict
01/24/11 11:29:41 (fd:19) (pid:8035) USE_PROCESS_GROUPS is undefined, using default value of True
01/24/11 11:29:41 (fd:19) (pid:8035) About to register family for PID 8035 with the ProcD
01/24/11 11:29:41 (fd:19) (pid:8035) Result of "register_subfamily" operation from ProcD: SUCCESS
01/24/11 11:29:41 (fd:19) (pid:8035) About to tell ProcD to track family with root 8035 via environment
01/24/11 11:29:41 (fd:19) (pid:8035) Result of "track_family_via_environment" operation from ProcD: SUCCESS
01/24/11 11:29:41 (fd:19) (pid:8035) Re-mapping std(in|out|err) in child.
01/24/11 11:29:41 (fd:19) (pid:8035) Printing fds to inherit:
01/24/11 11:29:41 (fd:19) (pid:8035) About to exec "/usr/libexec/condor/hooks/hook_job_exit.py"
01/24/11 11:29:41 (fd:10) (pid:8025) Close_Pipe(pipe_end=65536) succeeded
01/24/11 11:29:41 (fd:10) (pid:8025) Close_Pipe(pipe_end=65539) succeeded
01/24/11 11:29:41 (fd:10) (pid:8025) Close_Pipe(pipe_end=65541) succeeded
01/24/11 11:29:41 (fd:10) (pid:8025) Child Process: pid 8035 at
01/24/11 11:29:41 (fd:10) (pid:8025) HOOK_JOB_EXIT (/usr/libexec/condor/hooks/hook_job_exit.py) invoked with reason: "evict"
01/24/11 11:29:41 (fd:10) (pid:8025) PRIV_CONDOR --> PRIV_ROOT at directory.cpp:175
01/24/11 11:29:41 (fd:16) (pid:8025) PRIV_ROOT --> PRIV_CONDOR at directory.cpp:187
01/24/11 11:29:41 (fd:16) (pid:8025) Removing /var/lib/condor/execute/dir_8025
01/24/11 11:29:41 (fd:16) (pid:8025) PRIV_CONDOR --> PRIV_ROOT at directory.cpp:481
01/24/11 11:29:41 (fd:16) (pid:8025) Attempting to remove /var/lib/condor/execute/dir_8025 as SuperUser (root)
01/24/11 11:29:41 (fd:16) (pid:8025) PRIV_ROOT --> PRIV_CONDOR at directory.cpp:527
01/24/11 11:29:41 (fd:10) (pid:8025) Deleting the StarterHookMgr

Comment 10 Robert Rati 2011-01-24 13:55:18 UTC
Comment #6 clearly states this will not work for a zip file containing the executable that the job is to run, which is what you tested.  A user's umask will never allow creation of a file with executable bits.  The main concern was making sure files are accessible to the executable.
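
The reason extracted files never gain the execute bit can be illustrated with a small probe. This is a sketch assuming Python 3 on a POSIX system, and the function name mode_of_new_file is hypothetical: a regular file is created with a requested mode of 0666, and the umask can only clear bits from that request, never add them.

```python
import os
import stat
import tempfile


def mode_of_new_file(umask):
    """Return the permission bits a plain open() yields under the given umask."""
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "probe")
        previous = os.umask(umask)
        try:
            # open() requests 0o666; the kernel then clears the umask bits.
            with open(path, "w"):
                pass
        finally:
            # Restore the caller's umask whatever happens.
            os.umask(previous)
        return stat.S_IMODE(os.stat(path).st_mode)
```

Under the umask of 0022 reported for user nobody, the probe file comes out as 0644: readable by everyone, executable by no one.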

Comment 12 Martin Kudlej 2011-02-01 09:25:16 UTC
Tested on RHEL 5.6/4.9 x i386/x86_64 with:
condor-job-hooks-1.4-6
condor-low-latency-1.1-2
condor-7.4.5-0.7

and it works. -->VERIFIED

Comment 13 Jaromir Hradilek 2011-02-09 19:48:48 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -1,4 +1 @@
-C: Attaching a zip archive without permission information for files contained in the archive.
+When a ZIP archive that did not preserve permission information was attached to a low-latency grid job, its content was incorrectly extracted without the read, write, or execute permissions, rendering any non-root process unable to access it. With this update, such files are now extracted with permissions determined by the umask of the user running the job.
-C: Files in the zip archive end up with permissions of 0000 when extracted.  This will prevent the files from being accessed by condor or any non-root process.
-F: Files extracted from the zip archive will only be given the permissions in the zip archive if the archive reports valid permissions.  Otherwise they are given the permissions determined by the umask of the user running the job.
-R: Zip archives without valid permissions information are usable with low-latency.

Comment 14 errata-xmlrpc 2011-02-15 12:16:56 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0217.html