This is just a copy of the bug description against RHEL5. The same patch works against RHEL4 Clustersuite. The modified .spec file and patch are attached.

Description of problem:
The SNMPD plugin for clustersuite uses the ClusterMonitoring::ClusterMonitor::get_cluster() method to retrieve the cluster information. This in turn calls ClientSocket::recv() -> read_restart(). The read_restart function is designed to fill a buffer with all data currently buffered on the socket and to return when the underlying read() returns with EAGAIN. This will only work if the socket has O_NONBLOCK set. Using this method on a blocking socket will cause the thread calling get_cluster() to block indefinitely waiting for additional data to arrive on the socket.

Version-Release number of selected component (if applicable):
0.10.0-5.el5 contains the defect but it is masked by bug 441947; rebuilding the package to avoid the dlopen problem or using a later package (e.g. 0.12.0-7.el5) allows the bug to be triggered.

How reproducible:
100%

Steps to Reproduce:
1. Configure a cluster with snmpd enabled on the nodes
2. Enable cluster-snmp
3. Try to access a REDHAT-CLUSTER-MIB MIB, e.g. REDHAT-CLUSTER-MIB::rhcMIBVersion.0

Actual results:
$ cat /etc/snmp/snmpd.conf
dlmod RedHatCluster /usr/lib/cluster-snmp/libClusterMonitorSnmp.so
rocommunity public 127.0.0.1
$ snmpwalk -v2c -c public localhost
[tons of output, works fine but doesn't show REDHAT-CLUSTER-MIB::RedHatCluster]
$ snmpwalk -v2c -c public localhost REDHAT-CLUSTER-MIB::RedHatCluster
REDHAT-CLUSTER-MIB::rhcMIBVersion.0 = INTEGER: 1
Timeout: No Response from localhost
$ snmpwalk -v2c -c public localhost
Timeout: No Response from localhost

After this, snmpd can only be interrupted by SIGKILL.

Expected results:
MIB output correctly, no hang of snmpd.

Additional info:
Analysis & proposed patch from Adrien Kunysz
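
Illustration of the failure mode (not the clustersuite code, and not the attached patch): the "read until EAGAIN" pattern described above can be sketched in Python with a local socket pair. The names drain_available and demo are hypothetical and exist only for this sketch. The point is that the loop terminates only because the socket is non-blocking; on a blocking socket the final recv() would wait for more data instead of failing with EAGAIN, which matches the hang seen in snmpd.

```python
import socket


def drain_available(sock, chunk=4096):
    """Read everything currently buffered on sock, stopping at EAGAIN.

    Hypothetical helper for illustration. It refuses blocking sockets,
    because on those the last recv() would block instead of raising.
    """
    if sock.getblocking():
        raise ValueError("socket must be non-blocking (O_NONBLOCK set)")
    data = bytearray()
    while True:
        try:
            piece = sock.recv(chunk)
        except BlockingIOError:
            # EAGAIN / EWOULDBLOCK: nothing more is buffered right now.
            break
        if not piece:
            # Empty read: the peer closed the connection.
            break
        data += piece
    return bytes(data)


def demo():
    """Send a short message over a local socket pair and drain it."""
    writer, reader = socket.socketpair()
    try:
        writer.sendall(b"cluster-status")
        reader.setblocking(False)  # equivalent to setting O_NONBLOCK
        return drain_available(reader)
    finally:
        writer.close()
        reader.close()


if __name__ == "__main__":
    print(demo())
```

Leaving out the setblocking(False) call and the guard in drain_available would reproduce the problem in miniature: the first recv() returns the buffered bytes, and the second never returns because the peer stays connected and sends nothing more.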
Created attachment 331423 patch from RHEL5
Created attachment 331424 modified spec file
*** This bug has been marked as a duplicate of bug 453961 ***