221911 – snmpd errors under xen

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	net-snmp
Sub Component:
Version:	6
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	---
Assignee:	Radek Vokál
QA Contact:
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	253902

Reported:	2007-01-08 21:05 UTC by Curtis Doty
Modified:	2007-12-04 20:40 UTC
CC List:	2 users
Fixed In Version:	5.4-13
Clone Of:
Environment:
Last Closed:	2007-03-12 14:53:50 UTC
Type:	---
Embargoed:
Dependent Products:

Description Curtis Doty 2007-01-08 21:05:37 UTC

On a xen dom0 (tested just now with 2.6.18-1.2869.fc6xen.x86_64), when snmpd is
running, it barfs the following two lines into syslog every 30 seconds.

Jan  8 12:47:00 barleycorn snmpd[2230]: netsnmp_assert index == tmp failed
if-mib/data_access/interface.c:467 _access_interface_entry_save_name() 
Jan  8 12:47:00 barleycorn snmpd[2230]: netsnmp_assert __extension__ ({ size_t
__s1_len, __s2_len; (__builtin_constant_p (rowreq_ctx->data.ifentry->name) &&
__builtin_constant_p (ifentry->name) && (__s1_len = strlen
(rowreq_ctx->data.ifentry->name), __s2_len = strlen (ifentry->name),
(!((size_t)(const void *)((rowreq_ctx->data.ifentry->name) + 1) - (size_t)(const
void *)(rowreq_ctx->data.ifentry->name) == 1) || __s1_len >= 4) &&
(!((size_t)(const void *)((ifentry->name) + 1) - (size_t)(const void
*)(ifentry->name) == 1) || __s2_len >= 4)) ? __builtin_strcmp
(rowreq_ctx->data.ifentry->name, ifentry->name) : (__builtin_constant_p
(rowreq_ctx->data.ifentry->name) && ((size_t)(const void
*)((rowreq_ctx->data.ifentry->name) + 1) - (size_t)(const void
*)(rowreq_ctx->data.ifentry->name) == 1) && (__s1_len = strlen
(rowreq_ctx->data.ifentry->name), __s1_len < 4) ? (__builtin_constant_p
(ifentry->name) && ((size_t)(const void *)((ifentry->name) + 1) - (size_t)(const
void *)(ifentry->name) == 1) ? __builtin_strcmp (rowreq_c

Wadda mess! It totally ruins the S/N ratio in /var/log/messages.
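
To put a number on the noise, a minimal sketch along these lines counts the assertion lines snmpd has logged. The function name count_snmpd_asserts is made up for illustration, and the default path assumes syslog writes to /var/log/messages as it does on this box.

```shell
# Hypothetical helper: count snmpd netsnmp_assert lines in a syslog file.
# $1 is the log file; defaults to /var/log/messages (an assumption).
count_snmpd_asserts() {
    # grep -c prints the number of matching lines (0 if none match).
    grep -c 'snmpd\[[0-9]*]: netsnmp_assert' "${1:-/var/log/messages}"
}
```

Calling count_snmpd_asserts with no argument reads the default log; pass a rotated log file as the first argument to check older entries.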

And just FYI, here is a diff of the interface names/changes with and without xen
kernel.

--- ip-addr.2007-01-08  2007-01-08 12:13:33.000000000 -0800
+++ -   2007-01-08 12:44:00.105480000 -0800
@@ -1,10 +1,28 @@
 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue 
     link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
     inet 127.0.0.1/8 scope host lo
-2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
-    link/ether 00:04:23:d0:53:cc brd ff:ff:ff:ff:ff:ff
-    inet 10.10.10.202/29 brd 10.10.10.207 scope global eth0
+2: peth0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
+    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
 3: eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
     link/ether 00:04:23:d0:53:cd brd ff:ff:ff:ff:ff:ff
 4: eth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop qlen 1000
     link/ether 00:13:72:fe:3c:56 brd ff:ff:ff:ff:ff:ff
+5: vif0.0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue 
+    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
+6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue 
+    link/ether 00:04:23:d0:53:cc brd ff:ff:ff:ff:ff:ff
+    inet 10.10.10.202/29 brd 10.10.10.207 scope global eth0
+7: vif0.1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
+    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
+8: veth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
+    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
+9: vif0.2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
+    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
+10: veth2: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
+    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
+11: vif0.3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
+    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff
+12: veth3: <BROADCAST,MULTICAST> mtu 1500 qdisc noop 
+    link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff
+13: xenbr0: <BROADCAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue 
+    link/ether fe:ff:ff:ff:ff:ff brd ff:ff:ff:ff:ff:ff

Comment 1 Radek Vokál 2007-01-12 08:07:40 UTC

Can you please try to reproduce this issue with the latest rawhide net-snmp? I
can't reproduce the above assertion on my Xen test box.

Comment 2 Curtis Doty 2007-01-12 17:26:50 UTC

Not reproducible with 5.4-5.fc7.x86_64. This bug appears in fc6 i.e. net-snmp 5.3.1.

Comment 3 Curtis Doty 2007-01-12 20:06:59 UTC

Awwww foo! I rolled back to the fc6 package and now the bug is gone. However...

# chkconfig snmpd on
# shutdown -r now

It's baaaack.

Furthermore, after upgrading to 5.4 (this time my mirror had the rawhide package
you built last night) and rebooting, the problem persists.

So it is reproducible with both 5.3 and 5.4, but only when snmpd is started at
boot. Is there a race here? Or something screwy with my environment?

Comment 4 Michal Marciniszyn 2007-01-16 14:53:57 UTC

Although my messages in /var/log/messages were quite different, the fact is that
snmpd logs weird things. The difference may be configuration related, but
it is a bug anyway.

The bug is reproducible.

Comment 5 Robert Story 2007-02-05 15:13:00 UTC

Fixes checked in upstream for 5.3.x:

-fix (reported elsewhere) xen+x86_64 crash:
http://net-snmp.cvs.sourceforge.net/net-snmp/net-snmp/snmplib/snmp_logging.c?r1=5.34.2.5&r2=5.34.2.1&pathrev=V5-3-patches

-fix for overly-verbose log message:
http://net-snmp.cvs.sourceforge.net/net-snmp/net-snmp/agent/mibgroup/if-mib/ifTable/ifTable_data_access.c?r1=1.18.2.3&r2=1.18.2.4

Comment 6 Radek Vokál 2007-03-12 14:53:50 UTC

Both patches are in the latest rawhide version; I believe this issue is fixed in
net-snmp-5.4-13.

Comment 7 Peter Bieringer 2007-12-04 20:40:30 UTC

It still appears in F8

net-snmp-5.4.1-5.fc8

Dec  4 21:15:12 *** snmpd[32738]: netsnmp_assert index == tmp failed
if-mib/data_access/interface.c:469 _access_interface_entry_save_name()

@owner: please change version to 8 and reopen this bug.

I got this message after a while, kill -HUP doesn't help:

Dec  4 21:21:32 * snmpd[32738]: Reconfiguring daemon
Dec  4 21:21:32 * snmpd[32738]: NET-SNMP version 5.4.1 restarted
Dec  4 21:21:42 * snmpd[32738]: netsnmp_assert index == tmp failed
if-mib/data_access/interface.c:469 _access_interface_entry_save_name()
Dec  4 21:22:15 * snmpd[32738]:last message repeated 2 times
Dec  4 21:22:15 * snmpd[32738]: Received TERM or STOP signal...  shutting down...
Dec  4 21:22:16 * snmpd[14238]: netsnmp_assert !"registration != duplicate"
failed agent_registry.c:535 netsnmp_subtree_load()
Dec  4 21:22:17 * snmpd[14238]:last message repeated 2 times
Dec  4 21:22:17 * snmpd[14238]: error finding row index in
_ifXTable_container_row_restore
Dec  4 21:22:17 * snmpd[14238]: error finding row index in
_ifXTable_container_row_restore
Dec  4 21:22:17 * snmpd[14238]: NET-SNMP version 5.4.1

Finally I found the reason:

Nov 26 12:39:03 * pppd[15672]: primary   DNS address ***
Nov 26 12:39:03 * pppd[15672]: secondary DNS address ***
Nov 26 12:39:04 * snmpd[11638]: netsnmp_assert index == tmp failed
if-mib/data_access/interface.c:469 _access_interface_entry_save_name()
Nov 26 12:39:19 * snmpd[11638]: netsnmp_assert index == tmp failed
if-mib/data_access/interface.c:469 _access_interface_entry_save_name()

The problem is caused by the ppp interface changing after a hangup and re-dial.

This also breaks mrtg, which can no longer retrieve any interface data (I've
already tried \ppp@...)

I believe the underlying cause is that the ppp interface gets a new index number;
this confuses net-snmp and, as a consequence, mrtg no longer gets any data.
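
One way to check that theory is to note the interface index before the hangup and again after the re-dial. A minimal sketch, assuming a kernel that exposes /sys/class/net/<iface>/ifindex; the function name ifindex_of and the SYSFS_NET override are made up for illustration:

```shell
# Hypothetical helper: print the kernel interface index of interface $1.
# SYSFS_NET may point elsewhere for testing; defaults to /sys/class/net.
ifindex_of() {
    f="${SYSFS_NET:-/sys/class/net}/$1/ifindex"
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo "no such interface: $1" >&2
        return 1
    fi
}
```

Running ifindex_of ppp0 before the hangup and again once the link is back should print two different numbers if the index really does change.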


Workaround: 

Append to /etc/ppp/ip-up.local:

/sbin/service snmpd restart >/dev/null
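
A slightly more defensive variant of that hook, as a sketch only: it restarts snmpd just for ppp links. It assumes ip-up.local is handed the interface name as its first argument, the way pppd hands it to ip-up; the helper name is_ppp_iface is made up for illustration.

```shell
#!/bin/sh
# Sketch of /etc/ppp/ip-up.local. Assumption: $1 is the interface name.

# Hypothetical helper: succeed only for ppp interfaces (ppp0, ppp1, ...).
is_ppp_iface() {
    case "$1" in
        ppp*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Same restart as the one-line workaround, limited to ppp links.
if is_ppp_iface "${1:-}"; then
    /sbin/service snmpd restart >/dev/null
fi
```

If the script is called without arguments it does nothing, so it is safe to run by hand when checking for syntax errors.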
