Description of problem:
Without a stable agent id, Cumin cannot cleanly garbage collect stale agents in all cases. Slots can appear duplicated.

Version-Release number of selected component (if applicable):
7.4.4-0.13

How reproducible:
100%

Steps to Reproduce:
1. run console, view slot screen
2. service cumin stop
3. service condor restart
4. service cumin start

Actual results:
Duplicate entries in UI

Expected results:
No duplicate entries in UI

Additional info:
ManagementAgent::setName's third parameter (optional) is a stable name.

src/management $ grep setName *
MgmtCollectorPlugin.cpp: agent->setName("com.redhat.grid","collector",collName.c_str());
MgmtMasterPlugin.cpp: agent->setName("com.redhat.grid","master", default_name);
MgmtNegotiatorPlugin.cpp: agent->setName("com.redhat.grid","negotiator", mmName.c_str());
MgmtScheddPlugin.cpp: agent->setName("com.redhat.grid","scheduler", schedd_name.c_str());
MgmtStartdPlugin.cpp: agent->setName("com.redhat.grid","slot");

We set it in all QMF plugins except the Startd. We should set it in the Startd, using the extern char * Name from startd_main.cpp.
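
A minimal sketch of why the third argument matters. FakeAgent and startSlotAgent are hypothetical stand-ins written for this illustration, not the real QMF ManagementAgent or the Startd plugin code. The "vendor:product:instance" name layout and the "fresh id on every start when no instance is given" behavior are assumptions made for the sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the QMF agent; NOT the real ManagementAgent class.
class FakeAgent {
public:
    // Same call shape as in the plugins: vendor, product, optional instance.
    void setName(const std::string& vendor,
                 const std::string& product,
                 const std::string& instance = "") {
        assert(!vendor.empty() && !product.empty());
        // Assumption: without an instance, a new id is generated on each start,
        // so a console cannot match the restarted agent to the old one.
        name_ = vendor + ":" + product + ":" +
                (instance.empty() ? freshId() : instance);
    }

    const std::string& name() const { return name_; }

private:
    // Produces a different id on every call to imitate an unstable agent id.
    static std::string freshId() {
        static unsigned long counter = 0;
        return "generated-" + std::to_string(++counter);
    }

    std::string name_;
};

// Simulates one startd start-up and returns the agent name a console would see.
// Pass nullptr to model the current two-argument call in MgmtStartdPlugin.cpp.
std::string startSlotAgent(const char* startdName) {
    FakeAgent agent;
    if (startdName != nullptr) {
        agent.setName("com.redhat.grid", "slot", startdName);  // proposed call shape
    } else {
        agent.setName("com.redhat.grid", "slot");               // current call shape
    }
    return agent.name();
}
```

Under these assumptions, two start-ups with the same startd name yield the same agent name, while two start-ups without a name yield different ones, which is the situation that leaves stale slot agents behind.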
FH sha 849ae64

no name specified (default)...
slot = 0-0-1-com.redhat.grid:slot:pmackinn@localhost.localdomain
slot = 0-0-1-com.redhat.grid:slot:pmackinn@localhost.localdomain

STARTD_NAME = whiteford...
slot = 0-0-1-com.redhat.grid:slot:whiteford@whiteford
slot = 0-0-1-com.redhat.grid:slot:whiteford@whiteford

condor_startd -t -f -name petey...
slot = 0-0-1-com.redhat.grid:slot:petey@petey
slot = 0-0-1-com.redhat.grid:slot:petey@petey
Reproduced on RHEL5 x86_64 with packages:
condor-qmf-7.4.4-0.13.el5
condor-7.4.4-0.13.el5
cumin-0.1.4369-1.el5
...and their dependencies...

Verified with
condor-7.4.4-0.16.el5
condor-qmf-7.4.4-0.16.el5
Also verified with
condor-7.4.4-0.16.el4
condor-qmf-7.4.4-0.16.el4
on RHEL4 (both i386 and x86_64).
And finally with condor-qmf-7.4.4-0.16.el5.i386.rpm on a RHEL5 i386 box.