Customers and field staff should be aware of a current limitation in 1.2 for DAGMan. The schedd does not account for the total number of running jobs, that is, a root condor_dagman job plus its node jobs. Because a DAG is launched in a deferred way, the potential number of jobs is hidden until the condor_dagman process parses the DAG submission file. DAGs that include splices and nested DAGs could exacerbate this. Matt and I discussed where this type of info should be classified: perhaps not the User Guide, but maybe a Planning section in an Admin guide?
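For planning purposes, a rough sketch of how someone might estimate the job count up front, before condor_dagman parses anything. This is not a shipped tool. It assumes the usual DAG file keywords (JOB, SPLICE, SUBDAG EXTERNAL), ignores everything else, and the function name, the `load` callback and the example file names are all made up for illustration.

```python
from typing import Callable, Dict


def count_dag_nodes(dag_name: str, load: Callable[[str], str]) -> int:
    """Estimate how many jobs a DAG file would eventually submit.

    `load` is a hypothetical callback that returns the text of a DAG file
    given its name (for real use it could open and read the file).
    """
    total = 0
    for raw in load(dag_name).splitlines():
        parts = raw.split()
        if not parts or parts[0].startswith("#"):
            continue  # skip blank lines and comments
        keyword = parts[0].upper()
        if keyword == "JOB" and len(parts) >= 3:
            # JOB <name> <submit file>: one node job
            total += 1
        elif keyword == "SPLICE" and len(parts) >= 3:
            # SPLICE <name> <dag file>: nodes are merged into this DAG
            total += count_dag_nodes(parts[2], load)
        elif (keyword == "SUBDAG" and len(parts) >= 4
              and parts[1].upper() == "EXTERNAL"):
            # SUBDAG EXTERNAL <name> <dag file>: a nested DAG gets its own
            # condor_dagman job, plus whatever nodes it contains
            total += 1 + count_dag_nodes(parts[3], load)
    return total


# Made-up example files, kept in memory so the sketch is self-contained.
EXAMPLE_FILES: Dict[str, str] = {
    "root.dag": (
        "# example only\n"
        "JOB A a.sub\n"
        "JOB B b.sub\n"
        "SPLICE S inner.dag\n"
        "SUBDAG EXTERNAL N nested.dag\n"
        "PARENT A CHILD B\n"
    ),
    "inner.dag": "JOB X x.sub\nJOB Y y.sub\n",
    "nested.dag": "JOB Z z.sub\n",
}

if __name__ == "__main__":
    nodes = count_dag_nodes("root.dag", EXAMPLE_FILES.__getitem__)
    # Add 1 for the root condor_dagman job itself.
    print("estimated jobs including root dagman:", nodes + 1)
```

The count is only an upper bound on what could be queued at once, since DAG dependencies normally keep some nodes from running at the same time.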
Knocking this back to the "documentation" (generic) component until we decide where to put it. LKB
Note: in 7.4, MAX_JOBS_RUNNING is now an expression and is independent of the START_SCHEDULER_UNIVERSE expression.
Can someone please advise whether this change is relevant to 1.3 and, if so, where the information needs to be included? Thanks, Lana
We are trying to get a KB article written for this. Awaiting feedback from Mike Cressman or Jon Thomas (or someone with KB experience). Any pointers?
GSS is in charge of KB, but I have it on good authority that ECS can create articles if we need to. They remain internal until they're approved by GSS. The problem with that is I'm on leave until 7 June. If you want it done earlier than that, you might be better off contacting dpowles directly. LKB
KB article http://kbase.redhat.com/faq/docs/DOC-33345 submitted to SME jthomas for tech review