Bug 489006
Summary: | Cannot distinguish between completion and other termination of AMQP submitted work | ||
---|---|---|---|
Product: | Red Hat Enterprise MRG | Reporter: | Matthew Farrellee <matt> |
Component: | grid | Assignee: | Robert Rati <rrati> |
Status: | CLOSED ERRATA | QA Contact: | Jan Sarenik <jsarenik> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 1.1 | CC: | jsarenik, mkudlej |
Target Milestone: | 1.1.1 | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2009-04-21 16:17:27 UTC | Type: | --- |
Bug Depends On: | 459615 | ||
Bug Blocks: |
Description
Matthew Farrellee
2009-03-06 17:37:42 UTC
Should I just verify that condor-low-latency-1.0-10 and higher return the JobStatus as mentioned above? And the JobState?

You may want to test a situation where Condor runs the job without interruption, and one where it runs with a restart and maybe a kill -9 interruption, including of the carod (service condor-low-latency) process.

Jobs submitted via AMQP do not get run. Condor's StartLog says:
Slot requirements not satisfied.
Job requirements not satisfied.

When I put the dump into a job.submit file, change Cmd to Executable and '5' to 'vanilla', and add Queue at the end, the job runs flawlessly with condor_submit (with just a few lines of WARNINGs, because I include the really full dump, including parameters that are probably unknown to condor_submit). This condor runs all the vanilla jobs via condor_submit with no problems.

Low-latency is configured by adding these lines to /etc/condor/condor_config
--------------------------------------------------------------------------
LOW_LATENCY_HOOK_FETCH_WORK = $(LIBEXEC)/hooks/hook_fetch_work.py
LOW_LATENCY_HOOK_REPLY_FETCH = $(LIBEXEC)/hooks/hook_reply_fetch.py
# Starter hooks
LOW_LATENCY_JOB_HOOK_PREPARE_JOB = $(LIBEXEC)/hooks/hook_prepare_job.py
LOW_LATENCY_JOB_HOOK_UPDATE_JOB_INFO = $(LIBEXEC)/hooks/hook_update_job_status.py
LOW_LATENCY_JOB_HOOK_JOB_EXIT = $(LIBEXEC)/hooks/hook_job_exit.py
STARTD_JOB_HOOK_KEYWORD = LOW_LATENCY
FetchWorkDelay = 10 * (Activity == "Idle")
STARTER_UPDATE_INTERVAL = 30
--------------------------------------------------------------------------

Installed packages:
condor-7.2.2-0.9.el5
condor-job-hooks-1.0-5.el5
condor-job-hooks-common-1.0-5.el5
condor-low-latency-1.0-12.el5

I was mainly using the cmd_args.py test from the low-latency branch of the mrg-grid.git repo. Can you enlighten me, please?

Works as expected.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA.
For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHEA-2009-0434.html
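
For anyone repeating the condor_submit cross-check described in the comments, the three manual edits (Cmd to Executable, '5' to 'vanilla', Queue appended) could be scripted roughly as below. This is only a sketch: the function name classad_dump_to_submit is made up for illustration, the dump is assumed to be plain "Name = value" lines, and the JobUniverse attribute name is an assumption about where the '5' appears in the dump. It mirrors the edits as described and does not check that condor_submit accepts every resulting line.

```python
def classad_dump_to_submit(dump: str) -> str:
    """Apply the three manual edits from the comment to a job dump.

    Hypothetical helper: assumes one 'Name = value' pair per line.
    """
    out = []
    for raw in dump.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        name, sep, value = line.partition("=")
        if not sep:
            out.append(line)  # not an attribute line, keep unchanged
            continue
        name, value = name.strip(), value.strip()
        if name.lower() == "cmd":
            # Edit 1: Cmd becomes Executable
            name = "Executable"
        elif name.lower() == "jobuniverse" and value == "5":
            # Edit 2: '5' becomes 'vanilla' (attribute name is an assumption)
            value = "vanilla"
        out.append(f"{name} = {value}")
    # Edit 3: add Queue at the end
    out.append("Queue")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'Cmd = "/bin/echo"\nJobUniverse = 5\nArgs = "hello"\n'
    print(classad_dump_to_submit(sample), end="")
```

The output would be written to job.submit and passed to condor_submit by hand; attributes unknown to condor_submit are left in place, so the WARNINGs mentioned in the comment are to be expected.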