600319 – net-snmp-5.3.2.2-broadcast-response.patch broke answering to non local host on some interfaces

Bug 600319 - net-snmp-5.3.2.2-broadcast-response.patch broke answering to non local host on some interfaces

Summary: net-snmp-5.3.2.2-broadcast-response.patch broke answering to non local host o...

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	net-snmp
Sub Component:
Version:	5.5
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Jan Safranek
QA Contact:	BaseOS QE Security Team
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2010-06-04 12:45 UTC by Michal Žejdl
Modified:	2011-07-21 12:22 UTC (History)
CC List:	2 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-07-21 09:10:48 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2011:1076	0	normal	SHIPPED_LIVE	net-snmp bug fix and enhancement update	2011-07-21 09:07:58 UTC

Description Michal Žejdl 2010-06-04 12:45:13 UTC

Description of problem:

Using

cmsg.ipi.ipi_ifindex = if_index

instead of

cmsg.ipi.ipi_ifindex = 0

in mentioned patch (netsnmp_udp_sendto()) pushes kernel to send answer locally (not via gateway) on some interfaces.

Upstream SVN uses cmsg.ipi.ipi_ifindex = 0.

I suggest this change to preserve broadcast answering issue:

84c84
< +    cmsg.ipi.ipi_ifindex = if_index;
---
> +    cmsg.ipi.ipi_ifindex = 0;
102a103
> +        cmsg.ipi.ipi_ifindex = if_index;


This thread can describe it better:

http://www.cs.helsinki.fi/linux/linux-kernel/2003-07/0711.html



Version-Release number of selected component (if applicable):

net-snmp-5.3.2.2-9.el5
net-snmp-5.3.2.2-9.el5_5.1



How reproducible:

always (on some particular machines - seems to depends on concrete route table)



Steps to Reproduce:

?


 
Actual results:

$ snmpget -v 1 -c public -r 0 10.107.1.1 sysUpTime.0
Timeout: No Response from 10.107.1.1.

14:22:51.155776 IP 10.1.220.105.38585 > 10.107.1.1.snmp:  GetRequest(28)  .1.3.6.1.2.1.1.3.0
14:22:51.157855 arp who-has 10.1.220.105 tell 10.107.1.1
14:22:52.157253 arp who-has 10.1.220.105 tell 10.107.1.1
14:22:53.157907 arp who-has 10.1.220.105 tell 10.107.1.1




Expected results:

(net-snmp with ipi_ifindex = 0)

$ snmpget -v 1 -c public -r 0 10.107.1.1 sysUpTime.0
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (891) 0:00:08.91

14:24:15.671491 IP 10.1.220.105.40785 > 10.107.1.1.snmp:  GetRequest(28)  .1.3.6.1.2.1.1.3.0
14:24:15.671957 IP 10.107.1.1.snmp > 10.1.220.105.40785:  GetResponse(30)  .1.3.6.1.2.1.1.3.0=891



Additional info:

multi-interface machine, quagga, ospfd, all interfaces responds to SNMP except one on _some_ machines

Comment 1 Michal Žejdl 2010-06-07 06:13:15 UTC

to be more exact: kernel sends datagram to local link every time the interface chosen via ipi_ifindex differs from interface used for routing to destination - it is unwanted when destination is not on local broadcast domain

Comment 2 Jan Safranek 2010-06-08 14:40:38 UTC

I haven't been able to reproduce this bug in a simple virtual test environment with three machines connected together and MASTER rule them all (via ssh):

TEST1 -------+
 | |         |      
  X          |
  |          |
ROUTER ------X--MASTER
  |          |
  X          |
 | |         |
TEST2 -------+

Where 'X' is a bridge, i.e. there are two cables from TEST1 connected to (virtual) bridge where ROUTER is connected. The same for TEST2. TEST1 and TEST2 are in different networks (192.168.101.0/24, 192.168.102.0/24) with ROUTER routing between them (with obvious routing table). All snmpgets from TEST1 to both IP addresses of TEST2 succeeded as expected,. I don't see any weird ARP queries on TEST2 trying to send responses to TEST1 on local link, all responses are correctly routed via ROUTER.

I understand, my environment is 1) very simple and 2) virtual. While I think the second does not matter - the packets look real to the kernel - I need to know how does your environment look like. Would you be able to describe it in full detail? I.e. simplify it to the smallest set of active elements and show me your interface configurations (ifconfig) and routing tables on all of them.


> Upstream SVN uses cmsg.ipi.ipi_ifindex = 0.

No, current SVN trunk uses my patches with ipi_ifindex = if_index/

> This thread can describe it better:
> 
> http://www.cs.helsinki.fi/linux/linux-kernel/2003-07/0711.html

The thread is not much useful there, just that man 7 ip shows wrong information. It was corrected (but it's still a bit misleading - full behaviour, especially re broadcast packets is not described anywhere).

Comment 3 Michal Žejdl 2010-06-09 07:29:18 UTC

R2 (10.1.0.2) ---- (10.1.0.1) R1 (10.3.0.1) ---- (10.3.0.2) PC
R2 (10.2.0.2) ---- (10.2.0.1) R1

nets are /16, R2 def gw is 10.1.0.1 - PC can't get SNMP response from 10.2.0.2


more configuration info (cut):

PC
ip a
inet 10.3.0.2/16 brd 10.3.255.255 scope global eth0
ip r
10.3.0.0/16 dev eth0  proto kernel  scope link  src 10.3.0.2 
default via 10.3.0.1 dev eth0 

R1
ip a
inet 10.1.0.1/16 brd 10.1.255.255 scope global eth1
inet 10.2.0.1/16 brd 10.2.255.255 scope global eth2
inet 10.3.0.1/16 brd 10.3.255.255 scope global eth3
ip r
10.1.0.0/16 dev eth1  proto kernel  scope link  src 10.1.0.1 
10.2.0.0/16 dev eth2  proto kernel  scope link  src 10.2.0.1 
10.3.0.0/16 dev eth3  proto kernel  scope link  src 10.3.0.1 

R2
ip a
inet 10.1.0.2/16 brd 10.1.255.255 scope global eth1
inet 10.2.0.2/16 brd 10.2.255.255 scope global eth2
ip r
10.1.0.0/16 dev eth1  proto kernel  scope link  src 10.1.0.2
10.2.0.0/16 dev eth2  proto kernel  scope link  src 10.2.0.2 
default via 10.1.0.1 dev eth1 


results:

PC
$ snmpget -v1 -cpublic -r0 10.1.0.2 sysUpTime.0
DISMAN-EVENT-MIB::sysUpTimeInstance = Timeticks: (16588) 0:02:45.88
$ snmpget -v1 -cpublic -r0 10.2.0.2 sysUpTime.0
Timeout: No Response from 10.2.0.2.

R2
tshark -i any, :c8 - eth1, :c9 - eth2
  0.000000     10.3.0.2 -> 10.1.0.2     SNMP get-request SNMPv2-MIB::sysUpTime.0
  0.001073 Intel_e1:68:c8 ->              ARP Who has 10.1.0.1?  Tell 10.1.0.2
  0.001159 Intel_d4:f5:ef ->              ARP 10.1.0.1 is at 00:04:23:d4:f5:ef
  0.001169     10.1.0.2 -> 10.3.0.2     SNMP get-response SNMPv2-MIB::sysUpTime.0
  2.738342     10.3.0.2 -> 10.2.0.2     SNMP get-request SNMPv2-MIB::sysUpTime.0
  2.740060 Intel_e1:68:c9 ->              ARP Who has 10.3.0.2?  Tell 10.2.0.2
  3.739052 Intel_e1:68:c9 ->              ARP Who has 10.3.0.2?  Tell 10.2.0.2
  4.739049 Intel_e1:68:c9 ->              ARP Who has 10.3.0.2?  Tell 10.2.0.2
  5.739082     10.2.0.2 -> 10.2.0.2     ICMP Destination unreachable (Host unreachable)



> No, current SVN trunk uses my patches with ipi_ifindex = if_index/

ok, I was wrong (checked against branches/V5-4-patches)


> The thread is not much useful there, just that man 7 ip shows wrong
> information. It was corrected (but it's still a bit misleading - full
> behaviour, especially re broadcast packets is not described anywhere).    

the thread explains the same experience with kernel - see "particular behaviour"

another zeroing ipi_ifindex patch:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=181701

Comment 4 Jan Safranek 2010-06-09 11:03:27 UTC

I reproduced it internally with the exact network setup as in comment #3.

sendmsg() with both ipi_spec_dst and ipi_ifindex nozero sends the message only when there is route defined from the ipi_ifindex interface and the packet destination. I'll investigate what can be done, clearing ipi_ifindex is one of the possibilities (but I think I get into troubles responding to broadcast requests, which is required by other customers...)

You can add new default route to R2 as temporary workaround: default via 10.2.0.1 dev eth2

Comment 7 Jan Safranek 2011-01-11 13:35:54 UTC

I've checked a fix to upstream SVN, http://net-snmp.svn.sourceforge.net/viewvc/net-snmp?view=revision&revision=19846

Comment 12 errata-xmlrpc 2011-07-21 09:10:48 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1076.html

Comment 13 errata-xmlrpc 2011-07-21 12:22:27 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1076.html

Note You need to log in before you can comment on or make changes to this bug.