Bug 484880 - cluster-snmp deadlocks snmpd
Status: CLOSED DUPLICATE of bug 453961
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: clustermon
Version: 4
Hardware: i386 Linux
Priority: low
Severity: high
Assigned To: Ryan McCabe
QA Contact: Cluster QE
 
Reported: 2009-02-10 08:55 EST by Moritz Baumann
Modified: 2009-04-16 16:37 EDT
CC List: 5 users

Doc Type: Bug Fix
Clone Of: 453600
Last Closed: 2009-02-17 15:33:26 EST

Attachments:
patch from RHEL5 (575 bytes, patch), 2009-02-10 08:56 EST, Moritz Baumann
modified spec file (7.67 KB, text/plain), 2009-02-10 08:57 EST, Moritz Baumann

Description Moritz Baumann 2009-02-10 08:55:52 EST
This is a copy of the bug description filed against RHEL5.

The same patch works against the RHEL4 Cluster Suite.

The modified .spec file and the patch are attached.

Description of problem:
The SNMPD plugin for clustersuite uses the
ClusterMonitoring::ClusterMonitor::get_cluster() method to retrieve the cluster
information.

This in turn calls ClientSocket::recv() -> read_restart().

The read_restart function is designed to fill a buffer with all data currently
buffered on the socket and to return when the underlying read() returns with EAGAIN.
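
For illustration, the drain-until-EAGAIN pattern described above could look roughly like the sketch below. This is not the clustermon source; drain_fd and its parameters are hypothetical names, and the real read_restart may differ in details.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not the clustermon code): read everything that is
 * currently buffered on fd into buf, up to cap bytes.
 * Returns the number of bytes read, or -1 on a real error.
 * *eof is set to 1 if the peer closed the connection, otherwise 0. */
ssize_t drain_fd(int fd, char *buf, size_t cap, int *eof)
{
    size_t total = 0;

    assert(buf != NULL && eof != NULL);
    *eof = 0;

    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);
        if (n > 0) {
            total += (size_t)n;      /* got data, try for more */
        } else if (n == 0) {
            *eof = 1;                /* peer closed the connection */
            break;
        } else if (errno == EINTR) {
            continue;                /* interrupted by a signal: restart */
        } else if (errno == EAGAIN || errno == EWOULDBLOCK) {
            break;                   /* nothing more buffered right now */
        } else {
            return -1;               /* real I/O error */
        }
    }
    return (ssize_t)total;
}
```

The loop only terminates on EAGAIN if read() is able to return EAGAIN at all. On a blocking descriptor with an open peer and no pending data, the read() call simply does not return.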

This will only work if the socket has O_NONBLOCK set. Using this method on a
blocking socket will cause the thread calling get_cluster() to block
indefinitely waiting for additional data to arrive on the socket.
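
One way to satisfy the O_NONBLOCK precondition is to set the flag with fcntl() before the socket is handed to the read loop. The sketch below is only an assumption about how that could be done; set_nonblocking and is_nonblocking are hypothetical helper names, and the attached patch may take a different approach.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: report whether fd has O_NONBLOCK set.
 * Returns 1 if set, 0 if not, -1 if the flags cannot be read. */
int is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Hypothetical helper: turn on O_NONBLOCK for fd, preserving the other
 * file status flags. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    if (flags & O_NONBLOCK)
        return 0;                    /* already non-blocking */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}
```

With the flag set, a read() on a descriptor with no pending data fails immediately with EAGAIN (or EWOULDBLOCK) instead of blocking the calling thread.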

Version-Release number of selected component (if applicable):
0.10.0-5.el5 contains the defect but it is masked by bug 441947; rebuilding the
package to avoid the dlopen problem or using a later package (e.g. 0.12.0-7.el5)
allows the bug to be triggered.

How reproducible:
100%

Steps to Reproduce:
1. Configure a cluster with snmpd enabled on the nodes
2. Enable cluster-snmp
3. Try to query a REDHAT-CLUSTER-MIB object, e.g. REDHAT-CLUSTER-MIB::rhcMIBVersion.0

Actual results:
$ cat /etc/snmp/snmpd.conf
dlmod RedHatCluster     /usr/lib/cluster-snmp/libClusterMonitorSnmp.so
rocommunity public 127.0.0.1
$ snmpwalk -v2c -c public localhost
[tons of output, works fine but doesn't show REDHAT-CLUSTER-MIB::RedHatCluster]
$ snmpwalk -v2c -c public localhost REDHAT-CLUSTER-MIB::RedHatCluster
REDHAT-CLUSTER-MIB::rhcMIBVersion.0 = INTEGER: 1
Timeout: No Response from localhost
$ snmpwalk -v2c -c public localhost
Timeout: No Response from localhost

After this, snmpd no longer responds and can only be terminated with SIGKILL.

Expected results:
The MIB is output correctly and snmpd does not hang.

Additional info:
Analysis & proposed patch from Adrien Kunysz
Comment 1 Moritz Baumann 2009-02-10 08:56:40 EST
Created attachment 331423
patch from RHEL5
Comment 2 Moritz Baumann 2009-02-10 08:57:15 EST
Created attachment 331424
modified spec file
Comment 3 Ryan McCabe 2009-02-17 15:33:26 EST

*** This bug has been marked as a duplicate of bug 453961 ***
