Bug 447947
| Summary: | SNMPd does not respond on cluster service IP | ||||||
|---|---|---|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 4 | Reporter: | Jesse Gonzalez <jesse.gonzalez> | ||||
| Component: | net-snmp | Assignee: | Jan Safranek <jsafrane> | ||||
| Status: | CLOSED ERRATA | QA Contact: | |||||
| Severity: | high | Docs Contact: | |||||
| Priority: | low | ||||||
| Version: | 4.6 | CC: | dmair, louis.savage, mkoci, rvokal, tao | ||||
| Target Milestone: | rc | ||||||
| Target Release: | --- | ||||||
| Hardware: | i386 | ||||||
| OS: | Linux | ||||||
| Whiteboard: | |||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: |
* the method previously used by snmpd to process UDP did not work well in clustered environments. Queries against an IP configured as a resource of a cluster service would time out and fail unless first performed against a non-cluster resource IP. Net-snmp for Red Hat Enterprise Linux 4 now includes improved UDP handling. This allows snmpd queries to work reliably in a clustered environment.
|
Story Points: | --- | ||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2009-05-18 20:19:17 UTC | Type: | --- | ||||
| Attachments: |
|
||||||
Eventually the SNMP query will begin to fail after 40+ minutes.

The issue has been corrected in net-snmp release 5.3.2. The following patch is available: https://sourceforge.net/tracker/index.php?func=detail&aid=1553447&group_id=12694&atid=312694

Created attachment 306945: backported patch from RHEL-5

Could you please test the experimental build at http://people.redhat.com/jsafrane/bugs/447947/ and report results? The parts affected by the patch are slightly different in the old net-snmp-5.1.2, which is distributed in Red Hat Enterprise Linux 4. Although I did a review and some testing, I'd like to be sure that it works as expected. Thanks in advance.

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

Jesse, have you tried to test the build mentioned in #4? I want to be sure I have a real fix before I release it in the next RHEL4 update.

Sorry Jan, I have not had a chance to test that fix. Instead I am using TCP as the protocol to make the connection:

snmpget -v2c -c public tcp:XX.XX.XX.XX OID

I have encountered problems with cluster-snmp (no bugzilla) and the latest net-snmp packages (462016), as submitted by my colleague Mr. Savage. It will take me *quite* some time to test your release posted in #4. Have you tested your build against a RHEL cluster?

I didn't test it on a cluster - I do not have one on my table*. I tried to simulate the problem on a (virtual) machine with two interfaces facing the same network, which was enough to reproduce the bug (net-snmp receiving request on 192.168.0.X and sending response from 192.168.0.Y) and test the fix. The patch above touches the very heart of UDP processing, so I'm trying to test it as much as possible, and every additional piece of feedback would help.

*: Of course, if this is going to be fixed in an update, our QA should try it on a real cluster.

I'll try to test your build over the weekend.

The hotfix we received corrected the issue. Cluster VIPs respond continuously as expected.

Great, thanks for the feedback.

Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
* the method previously used by snmpd to process UDP did not work well in clustered environments. Queries against an IP configured as a resource of a cluster service would time out and fail unless first performed against a non-cluster resource IP. Net-snmp for Red Hat Enterprise Linux 4 now includes improved UDP handling. This allows snmpd queries to work reliably in a clustered environment.

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0984.html |
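
The comments above describe the failure as snmpd receiving a request on one address and sending the response from another. The bug report does not say how the patch changes the UDP code, so the following is only an illustration of one general Linux technique for replying from the address a datagram was sent to: the `IP_PKTINFO` socket option. It is a minimal Python sketch, not the net-snmp patch. The helper names (`pack_pktinfo`, `unpack_pktinfo`, `make_server`, `serve_once`) are hypothetical, the struct layout assumes Linux's `struct in_pktinfo`, and the fallback value 8 for `IP_PKTINFO` is the Linux constant, used only if the `socket` module does not export it.

```python
import socket
import struct

# Assumption: Linux. Fall back to the Linux value if this Python build
# does not export the constant.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)

# struct in_pktinfo { unsigned int ipi_ifindex;
#                     struct in_addr ipi_spec_dst;
#                     struct in_addr ipi_addr; }
PKTINFO_FMT = "@I4s4s"
PKTINFO_SIZE = struct.calcsize(PKTINFO_FMT)


def unpack_pktinfo(data):
    """Return (ifindex, spec_dst, addr) from raw in_pktinfo bytes."""
    ifindex, spec_dst, addr = struct.unpack(PKTINFO_FMT, data[:PKTINFO_SIZE])
    return ifindex, socket.inet_ntoa(spec_dst), socket.inet_ntoa(addr)


def pack_pktinfo(source_ip, ifindex=0):
    """Build in_pktinfo bytes asking the kernel to send from source_ip."""
    return struct.pack(
        PKTINFO_FMT,
        ifindex,
        socket.inet_aton(source_ip),   # ipi_spec_dst: source for the reply
        socket.inet_aton("0.0.0.0"),   # ipi_addr: not used when sending
    )


def make_server(port):
    """Wildcard-bound UDP socket that reports each datagram's destination."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    sock.bind(("0.0.0.0", port))
    return sock


def serve_once(sock, handler):
    """Answer one datagram from the address it was originally sent to."""
    data, ancdata, _flags, peer = sock.recvmsg(
        65535, socket.CMSG_SPACE(PKTINFO_SIZE)
    )
    queried_ip = None
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            # ipi_addr is the destination address from the packet header.
            _ifindex, _spec_dst, queried_ip = unpack_pktinfo(cdata)
    reply = handler(data)
    if queried_ip is None:
        sock.sendto(reply, peer)  # no pktinfo: kernel picks the source
    else:
        anc = [(socket.IPPROTO_IP, IP_PKTINFO, pack_pktinfo(queried_ip))]
        sock.sendmsg([reply], anc, 0, peer)
    return queried_ip
```

With a socket from `make_server`, a loop over `serve_once(sock, handler)` would answer a request sent to a secondary address such as a cluster IP from that same address, rather than from whatever primary address the kernel would otherwise choose.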

Description of problem:
When performing an SNMP query against an IP that is configured as a resource of a cluster service, snmpd does not respond on the cluster IP until you first perform an SNMP query against a non-cluster resource IP of the device.

Version-Release number of selected component (if applicable):
net-snmp-5.1.2-11.el4_6.11.2

How reproducible:
1. Configure a cluster with an IP resource as part of a cluster service.
2. Perform an SNMP query against the IP configured in the cluster from a remote machine. The SNMP query will time out.
3. Next perform the SNMP query against an IP assigned to the device, and the SNMP query will succeed.
4. Finally perform the SNMP query against the cluster IP, and the SNMP query will succeed.

Expected results:
The SNMP query succeeds the first time it is performed against the cluster IP.

Additional info:
Using the cluster-snmp package, SNMP queries are performed against cluster resource IPs to check status and service changes. While tracing the network communication, tcpdump showed the following behavior. 172.16.172.8 is the cluster resource IP, and 172.16.172.5 is the IP of bond0.

Initial SNMP request against the cluster resource IP:

[root@somehost ~]# tcpdump -n port 161
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on bond0, link-type EN10MB (Ethernet), capture size 96 bytes
08:43:06.827695 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:06.834306 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
08:43:07.764710 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:07.771230 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
08:43:08.743697 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:08.750027 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
08:43:09.812005 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:09.818415 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
08:43:10.756348 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:10.762577 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
08:43:11.817856 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:11.823829 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]

SNMP request against bond0 IP address of device:

08:43:57.935595 IP XX.XX.XX.XX.32832 > 172.16.172.5.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:57.942354 IP 172.16.172.5.snmp > XX.XX.XX.XX.32832: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]

Second SNMP request against the cluster resource IP:

08:44:16.359526 IP XX.XX.XX.XX.32832 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:44:16.365848 IP 172.16.172.5.snmp > XX.XX.XX.XX.32832: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]

[root@somehost ~]# ip addr list bond0
2: bond0: <BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue
    link/ether 00:19:b9:cf:89:17 brd ff:ff:ff:ff:ff:ff
    inet 172.16.172.5/24 brd 172.16.172.255 scope global bond0
    inet 172.16.172.8/32 scope global bond0
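
In the capture above, requests sent to 172.16.172.8 are answered from 172.16.172.5. To spot that pattern in a longer trace, the check can be automated. The following Python sketch pairs each GetRequest with the next GetResponse to the same client address and port, and reports replies whose source differs from the address that was queried. The function name `find_mismatched_replies` and the regular expression are hypothetical; they assume the `tcpdump -n` line format shown in this report and nothing more general.

```python
import re

# Matches lines such as:
# 08:43:06.827695 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) ...
LINE_RE = re.compile(
    r"^(?P<ts>\S+) IP (?P<src>\S+)\.(?P<sport>\w+) > "
    r"(?P<dst>\S+)\.(?P<dport>\w+): .*?(?P<kind>GetRequest|GetResponse)"
)


def find_mismatched_replies(lines):
    """Return (timestamp, queried_ip, replying_ip) for each mismatched reply."""
    pending = {}      # (client_ip, client_port) -> IP the request was sent to
    mismatches = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # banner lines, prompts, blank lines
        if m["kind"] == "GetRequest":
            pending[(m["src"], m["sport"])] = m["dst"]
        else:
            queried = pending.pop((m["dst"], m["dport"]), None)
            if queried is not None and queried != m["src"]:
                mismatches.append((m["ts"], queried, m["src"]))
    return mismatches


# Two request/response pairs copied from the capture in this report.
capture = """\
08:43:06.827695 IP XX.XX.XX.XX.32830 > 172.16.172.8.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:06.834306 IP 172.16.172.5.snmp > XX.XX.XX.XX.32830: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
08:43:57.935595 IP XX.XX.XX.XX.32832 > 172.16.172.5.snmp: C=community GetRequest(30) .1.3.6.1.4.1.2312.8.2.2.0[|snmp]
08:43:57.942354 IP 172.16.172.5.snmp > XX.XX.XX.XX.32832: C=community GetResponse(28) .1.3.6.1.4.1.2312.8.2.2=[|snmp]
"""

for ts, queried, replied in find_mismatched_replies(capture.splitlines()):
    print(f"{ts}: request to {queried} answered from {replied}")
```

On the sample lines, only the first pair is reported, since the request to the bond0 address is answered from the same address.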