+++ This bug was initially created as a clone of Bug #453600 +++

Description of problem:
The SNMPD plugin for clustersuite uses the ClusterMonitoring::ClusterMonitor::get_cluster() method to retrieve the cluster information. This in turn calls ClientSocket::recv() -> read_restart(). The read_restart function is designed to fill a buffer with all data currently buffered on the socket and to return when the underlying read() returns with EAGAIN. This will only work if the socket has O_NONBLOCK set. Using this method on a blocking socket will cause the thread calling get_cluster() to block indefinitely waiting for additional data to arrive on the socket.

Version-Release number of selected component (if applicable):
0.10.0-5.el5 contains the defect but it is masked by bug 441947; rebuilding the package to avoid the dlopen problem or using a later package (e.g. 0.12.0-7.el5) allows the bug to be triggered.

How reproducible:
100%

Steps to Reproduce:
1. Configure a cluster with snmpd enabled on the nodes
2. Enable cluster-snmp
3. Try to access a REDHAT-CLUSTER-MIB MIB, e.g. REDHAT-CLUSTER-MIB::rhcMIBVersion.0

Actual results:
$ cat /etc/snmp/snmpd.conf
dlmod RedHatCluster /usr/lib/cluster-snmp/libClusterMonitorSnmp.so
rocommunity public 127.0.0.1
$ snmpwalk -v2c -c public localhost
[tons of output, works fine but doesn't show REDHAT-CLUSTER-MIB::RedHatCluster]
$ snmpwalk -v2c -c public localhost REDHAT-CLUSTER-MIB::RedHatCluster
REDHAT-CLUSTER-MIB::rhcMIBVersion.0 = INTEGER: 1
Timeout: No Response from localhost
$ snmpwalk -v2c -c public localhost
Timeout: No Response from localhost

After this snmpd can only be interrupted by SIGKILL.

Expected results:
MIB output correctly, no hang of snmpd.
Additional info:
Analysis & proposed patch from Adrien Kunysz

-- Additional comment from bmr on 2008-07-01 10:41 EST --
Created an attachment (id=310677)
Set sock.nonblocking(true) in ClusterMonitor::get_cluster()

-- Additional comment from pm-rhel on 2008-07-01 11:27 EST --
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

-- Additional comment from rmccabe on 2008-07-03 10:56 EST --
Thanks for the patch. Applied to the current CVS trees.
*** Bug 484880 has been marked as a duplicate of this bug. ***
Fix verified in clustermon-0.11.2-1.el4. I can run the snmpwalk commands above and do not see any hangs or timeouts.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-1064.html