If the Plumage view server plugin cannot connect to the negotiator to obtain userprio data, it crashes with a stack like the following:

01/23/13 15:58:06 Accumulating data: Time=1358953086
01/23/13 15:58:07 Can't find address for negotiator
01/23/13 15:58:07 ODSAccountant: Can't connect negotiator for Accountant ad!
01/23/13 15:58:07 ODSAccountant: failed to send GET_PRIORITY command to negotiator!
Stack dump for process 18775 at timestamp 1358953087 (10 frames)
/usr/lib64/libcondor_utils_7_8_7.so(dprintf_dump_stack+0x131)[0x389633eca1]
/usr/lib64/libcondor_utils_7_8_7.so[0x3896385a22]
/lib64/libpthread.so.0[0x3632c0f500]
/usr/lib64/condor/plugins/PlumageCollectorPlugin-plugin.so(_ZN7plumage3etl13ODSAccountant7fetchAdEv+0x64)[0x7f0d78074934]
/usr/lib64/condor/plugins/PlumageCollectorPlugin-plugin.so(_ZN22PlumageCollectorPlugin18recordAccountantAdEv+0x50)[0x7f0d780747e0]
/usr/lib64/libcondor_utils_7_8_7.so(_ZN12TimerManager7TimeoutEPiPd+0x1a1)[0x38964478c1]
/usr/lib64/libcondor_utils_7_8_7.so(_ZN10DaemonCore6DriverEv+0x763)[0x3896455083]
/usr/lib64/libcondor_utils_7_8_7.so(_Z7dc_mainiPPc+0xf50)[0x38964445d0]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x363241ecdd]
condor_collector[0x40d9f9]

The code needs to return from the fetchAd method instead of going ahead and trying to decode from the NULL socket.
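The early return described above might look roughly like the following. This is only an illustrative sketch, not the actual patch: FakeSock, startCommand, and the bool parameter are hypothetical stand-ins for the real socket type and negotiator connection call, and the real fetchAd signature may differ.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <optional>
#include <string>

// Hypothetical stand-in for a connected command socket to the negotiator.
struct FakeSock {
    std::string payload;
    std::string decode() const { return payload; }
};

// Hypothetical stand-in for opening the command socket.
// Returns nullptr when the negotiator cannot be reached.
std::unique_ptr<FakeSock> startCommand(bool negotiatorReachable) {
    if (!negotiatorReachable) {
        return nullptr;
    }
    return std::make_unique<FakeSock>(FakeSock{"priorities"});
}

// Sketch of fetchAd: bail out as soon as the socket is known to be null
// instead of falling through to decode().
std::optional<std::string> fetchAd(bool negotiatorReachable) {
    std::unique_ptr<FakeSock> sock = startCommand(negotiatorReachable);
    if (!sock) {
        std::cerr << "failed to send GET_PRIORITY command to negotiator!\n";
        return std::nullopt;  // early return: never touch the null socket
    }
    return sock->decode();
}
```

With this shape, the caller (recordAccountantAd in the stack above) would check for an empty result and skip recording for that interval rather than taking down the collector.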
I found this bug in our long-running condor instance. When the problem occurs, it is not possible to contact the collector with condor_status.
I have checked that the patch is part of the condor-7.8.9-0.7.el6 source RPM. Our test suite using condor-plumage and MongoDB works as before with this change. No issues found. --> VERIFIED
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHSA-2014-0440.html