| Summary: | Ensure that duplicate entries of cluster.proc in history can be detected across submissions | | |
|---|---|---|---|
| Product: | Red Hat Enterprise MRG | Reporter: | Pete MacKinnon <pmackinn> |
| Component: | condor-qmf | Assignee: | grid-maint-list <grid-maint-list> |
| Status: | CLOSED WONTFIX | QA Contact: | MRG Quality Engineering <mrgqe-bugs> |
| Severity: | low | Docs Contact: | |
| Priority: | low | | |
| Version: | Development | CC: | iboverma, ltrilety, matt, mkudlej, tstclair |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-05-26 20:14:18 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Description
Pete MacKinnon
2011-08-25 19:40:01 UTC
Issue for the aviary query server also. Possibly related to Bug 732452. You get other strange behavior with a shortened SCHEDD_CLUSTER_MAXIMUM_VALUE, like jobs that appear to be stuck running in the queue after a schedd restart:

    -- Submitter: pmackinn.redhat.com : <192.168.1.131:48084> : milo.usersys.redhat.com
     ID      OWNER          SUBMITTED     RUN_TIME ST PRI SIZE CMD
       1.0   pmackinn      10/24 17:40   0+00:12:43 R  0   0.0  sleep 120
       2.0   pmackinn      10/24 17:40   0+00:12:43 R  0   0.0  sleep 120

    2 jobs; 0 idle, 2 running, 0 held

And if you try to go back from SCHEDD_CLUSTER_MAXIMUM_VALUE=3, the schedd gets jammed unless the queue is cleaned out:

    10/24/11 18:13:44 (pid:13483) ERROR "JOB QUEUE DAMAGED; header ad NEXT_CLUSTER_NUM invalid" at line 1088 in file /home/pmackinn/repos/uw/condor/CONDOR_SRC/src/condor_schedd.V6/qmgmt.cpp

Using the global job id as a key in the jobs map solves this, but that would break the current Aviary get* API, since the user would have to provide the GJID instead of the simpler cluster.proc (i.e., regexp on some part of 'scheduler#c.p' and return all matches...?). A multimap approach could be used to address this in the implementation, but again there is potential Aviary API impact.

MRG-Grid is in maintenance and only customer escalations will be considered. This issue can be reopened if a customer escalation associated with it occurs.
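For illustration only, here is a minimal C++ sketch of the multimap idea mentioned in the comment, assuming a hypothetical JobRecord type and made-up GlobalJobId strings; it is not the condor-qmf or Aviary implementation. Callers keep passing the simple cluster.proc key, but entries that collide after a cluster-number wraparound are preserved and returned together instead of being silently overwritten.

```cpp
// Sketch of the multimap approach; JobRecord and the example GlobalJobId
// values are hypothetical, not taken from the condor-qmf sources.
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct JobRecord {
    std::string globalJobId;  // e.g. "schedd_name#cluster.proc#submit_time" (assumed layout)
    std::string cmd;
};

// Key on "cluster.proc", but allow duplicates so that entries which reuse a
// cluster number after a wraparound coexist instead of overwriting each other.
using JobHistoryMap = std::multimap<std::string, JobRecord>;

// Aviary-style lookup: the caller still supplies "cluster.proc"; every entry
// sharing that key (i.e. duplicates across submissions) is returned.
std::vector<JobRecord> findByClusterProc(const JobHistoryMap& jobs,
                                         const std::string& clusterProc) {
    std::vector<JobRecord> matches;
    auto range = jobs.equal_range(clusterProc);
    for (auto it = range.first; it != range.second; ++it) {
        matches.push_back(it->second);
    }
    return matches;
}

int main() {
    JobHistoryMap jobs;
    // Two distinct submissions that ended up with the same cluster.proc
    // after the cluster counter wrapped (GJID strings are invented).
    jobs.emplace("2.0", JobRecord{"milo.example#2.0#1319490000", "sleep 120"});
    jobs.emplace("2.0", JobRecord{"milo.example#2.0#1319493600", "sleep 120"});

    for (const auto& job : findByClusterProc(jobs, "2.0")) {
        std::cout << job.globalJobId << " " << job.cmd << "\n";
    }
    return 0;  // both records are printed, so the duplicate is detectable rather than lost
}
```

A lookup of this shape keeps the existing get* call pattern (cluster.proc in, job records out) while still exposing duplicates; the alternative raised in the comment, keying the map directly on the GJID and matching on its cluster.proc portion, would instead rely on a format-dependent regexp over the GJID string.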