Description of problem:
EC2/E jobs are sometimes executed locally instead of being correctly routed.

According to developers:
<rsquared> [...] What is happening makes sense, and from the logs looks to be a race between the JR and the Negotiator since the job has real requirements.

Version-Release number of selected component (if applicable):
Found on 2.0rc, all supported architectures (i386/x86_64, RHEL5.6/6.1):
condor-7.6.1-0.8
condor-classads-7.6.1-0.8
condor-ec2-enhanced-hooks-1.2-2
python-condorec2e-1.2-2
python-condorutils-1.5-3
but most probably it is a pre-existing issue. Most likely it does not depend on EC2 jobs, but it could be related to the interaction between the Negotiator and the JobRouter.

How reproducible:
Configure a personal condor on an i686 system to support EC2 jobs and submit many instances (10 should be enough) of something like:
-----------
universe = vanilla
executable = /bin/sleep
arguments = 600
output = /tmp/hostname32.$ENV(USER).$(cluster).out
error = /tmp/hostname32.$ENV(USER).$(cluster).err
log = /tmp/ulog.$ENV(USER).$(cluster).log
requirements = Arch == "INTEL"
should_transfer_files = yes
when_to_transfer_output = on_exit
transfer_executable = false
+WantAWS = True
+WantArch = "INTEL"
+WantCPUs = 1
+EC2RunAttempts = 1
queue
-----------
(On 64-bit the issue is triggered when a 64-bit AMI is required, so replace INTEL with X86_64 in requirements and WantArch; see the sketch below.)

A few instances of the jobs will be executed locally.
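For reference, the 64-bit variant of the reproducer would differ only in these two lines (a sketch; the rest of the submit file stays as above):
-----------
requirements = Arch == "X86_64"
+WantArch = "X86_64"
-----------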
It's not a bug but an omission in the job submission file: the requirements expression is missing a WantAWS =!= True clause.
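Following that comment, a corrected requirements line for the reproducer would look like the sketch below (only this line changes; the rest of the submit file is as above, and the unqualified WantAWS reference is assumed to resolve against the job ad):
-----------
# Keep the original vanilla job from ever matching a local slot;
# only the copy routed by the JobRouter should actually run.
requirements = Arch == "INTEL" && WantAWS =!= True
-----------
Since the job ad sets +WantAWS = True, the added clause evaluates to False during local matchmaking, so the Negotiator can no longer win the race against the JobRouter and start the job locally.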