Bug 243316 - getbulk returns wrong oid's on 64-bit machine, but 32-bit machine is ok
Summary: getbulk returns wrong oid's on 64-bit machine, but 32-bit machine is ok
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: net-snmp
Version: 6
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Jan Safranek
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-06-08 15:13 UTC by Steve Falco
Modified: 2007-11-30 22:12 UTC

Fixed In Version: net-snmp-5.3.1-15.fc6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-07-16 09:15:37 UTC
Type: ---
Embargoed:


Attachments:
Script to invoke snmpbulkget (326 bytes, text/plain)
2007-06-08 15:13 UTC, Steve Falco
List of machines, architectures, test runs (4.80 KB, text/plain)
2007-06-26 14:33 UTC, Steve Falco
Module to report 64-bit cpu stats (18.13 KB, application/octet-stream)
2007-06-26 14:34 UTC, Steve Falco

Description Steve Falco 2007-06-08 15:13:27 UTC
Description of problem:  I have a 64-bit machine (called cvsdb) running FC6. 
When I do an snmpbulkget, the wrong oids are returned.  If I do the same thing
on a 32-bit machine (called localhost), the correct oids are returned. 


Version-Release number of selected component (if applicable):
net-snmp-5.3.1-14.fc6.x86_64


How reproducible: 100%


Steps to Reproduce:
1. Run the attached script, passing the name of a 64-bit machine as the only
command argument.
  
Actual results: On the 64-bit "cvsdb" machine, the result returned is from
tcpConnState, but it should be from tcpActiveOpens.0.

./bulk cvsdb
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpInErrs.0 = Counter32: 0
tcpOutRsts.0 = Counter32: 261293


Expected results: On the 32-bit "localhost" machine, the correct oids are returned:

./bulk localhost
tcpActiveOpens.0 = Counter32: 5331
tcpPassiveOpens.0 = Counter32: 582
tcpAttemptFails.0 = Counter32: 3
tcpEstabResets.0 = Counter32: 12
tcpCurrEstab.0 = Gauge32: 18
tcpInSegs.0 = Counter32: 2291661
tcpOutSegs.0 = Counter32: 1837545
tcpRetransSegs.0 = Counter32: 152
tcpInErrs.0 = Counter32: 2
tcpOutRsts.0 = Counter32: 924
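
For anyone who wants to check a run mechanically rather than by eye, here is a
minimal Python sketch that compares the returned names against the requested
objects. The request list is an assumption inferred from the expected output
above (the attached script is not reproduced in this report), and the function
name is hypothetical.

```python
def mismatched_varbinds(requested, output_lines):
    """Return (requested_object, returned_name) pairs whose names disagree.

    Assumes one output line per requested object, printed with -Os,
    e.g. "tcpInErrs.0 = Counter32: 0".
    """
    mismatches = []
    for wanted, line in zip(requested, output_lines):
        returned = line.split(" = ", 1)[0].strip()
        # The object name is the part before the first dot of the instance.
        if returned.split(".", 1)[0] != wanted:
            mismatches.append((wanted, returned))
    return mismatches


# ASSUMPTION: the objects the script asks for, inferred from the good run.
REQUESTED = [
    "tcpActiveOpens", "tcpPassiveOpens", "tcpAttemptFails", "tcpEstabResets",
    "tcpCurrEstab", "tcpInSegs", "tcpOutSegs", "tcpRetransSegs",
    "tcpInErrs", "tcpOutRsts",
]

# Output lines copied from the two runs in this report.
CVSDB = ["tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)"] * 8 + [
    "tcpInErrs.0 = Counter32: 0",
    "tcpOutRsts.0 = Counter32: 261293",
]
LOCALHOST = [
    "tcpActiveOpens.0 = Counter32: 5331",
    "tcpPassiveOpens.0 = Counter32: 582",
    "tcpAttemptFails.0 = Counter32: 3",
    "tcpEstabResets.0 = Counter32: 12",
    "tcpCurrEstab.0 = Gauge32: 18",
    "tcpInSegs.0 = Counter32: 2291661",
    "tcpOutSegs.0 = Counter32: 1837545",
    "tcpRetransSegs.0 = Counter32: 152",
    "tcpInErrs.0 = Counter32: 2",
    "tcpOutRsts.0 = Counter32: 924",
]

for wanted, returned in mismatched_varbinds(REQUESTED, CVSDB):
    print(f"asked for {wanted}, got {returned}")
```

Under that assumption, the cvsdb run has mismatches in the first eight
positions only, and the localhost run has none.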


Additional info:

Comment 1 Steve Falco 2007-06-08 15:13:27 UTC
Created attachment 156581
Script to invoke snmpbulkget

Comment 2 Steve Falco 2007-06-08 15:24:53 UTC
Interestingly, if I request an individual oid, rather than using getbulk, then I
get the expected oid from the 64-bit machine:

$ snmpget -v2c -Os -c public cvsdb .1.3.6.1.2.1.6.5.0
tcpActiveOpens.0 = Counter32: 1311986

Unfortunately, I have to use getbulk, because the actual invoking program is
OpenNMS, which always uses getbulk with v2c machines.  (And I have to use v2c so
that counter64 oids can be read.)
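
To make the two kinds of request easy to compare side by side, here is a rough
Python sketch that builds both command lines. It is a guess at the shape of the
requests, not the attached script: the helper names are hypothetical, and the
choice of -Cn0 -Cr1 (no non-repeaters, one repetition per object) is an
assumption based on the ten-line output shown in the description.

```python
import shlex


def bulk_argv(host, objects, community="public"):
    """Build one snmpbulkget call covering every object in `objects`."""
    # ASSUMPTION: -Cn0 -Cr1 yields one varbind per requested object.
    return ["snmpbulkget", "-v2c", "-Os", "-c", community,
            "-Cn0", "-Cr1", host, *objects]


def get_argv(host, obj, community="public"):
    """Build one snmpget call for the scalar instance `obj`.0."""
    return ["snmpget", "-v2c", "-Os", "-c", community, host, obj + ".0"]


objects = ["tcpActiveOpens", "tcpPassiveOpens"]
print(shlex.join(bulk_argv("cvsdb", objects)))
for obj in objects:
    print(shlex.join(get_argv("cvsdb", obj)))
```

The returned lists can be passed to subprocess.run on a machine that has the
net-snmp tools installed. Comparing the name in each snmpget answer with the
name at the same position in the bulk answer shows whether only getbulk is
affected.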

Comment 3 Jan Safranek 2007-06-26 08:42:22 UTC
I have tried your script with various configurations (i386 client querying
x86_64 server, i386->i386, x86_64->i386 and x86_64->x86_64), all with the same
result:

tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpConnState.0.0.0.0.111.0.0.0.0.0 = INTEGER: listen(2)
tcpInErrs.0 = Counter32: 0
tcpOutRsts.0 = Counter32: 312

Could you please check for differences in /etc/snmpd.conf and /usr/share/snmp/mibs/
on your cvsdb and localhost? Have you set the MIBS environment variable on any of
the hosts? Do you have the same version installed? Did you build it on your own?

In the meantime I'll try to dig into the sources to see why the counters are not returned...

Comment 4 Steve Falco 2007-06-26 14:32:11 UTC
I need to revise this bug report.  I have just tried a
half dozen different machines from FC3 through F7.  Here
is a summary of the behavior (I'll attach more details
separately): 

cvsdb,   FC6, x86_64, returns wrong oid
maytag,  FC3, i386,   returns correct oid
saf,     F7,  i386,   returns correct oid
slipper, FC6, i386,   returns wrong oid
sw1,     FC4, x86_64, returns wrong oid
sw2,     FC4, x86_64, returns wrong oid

At first, I thought this was an i386/x86_64 issue, but
now it looks like it is not, because "slipper" is an
i386 and it has the problem.  So perhaps there is something
broken in FC4 and FC6 that is working in FC3 and F7.
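
The same reasoning can be checked with a few lines of Python. This is only a
sketch over the six rows listed above; the tuple layout and the helper name
are my own.

```python
from collections import defaultdict

# (machine, release, arch, returned_correct_oid), copied from the summary.
RESULTS = [
    ("cvsdb",   "FC6", "x86_64", False),
    ("maytag",  "FC3", "i386",   True),
    ("saf",     "F7",  "i386",   True),
    ("slipper", "FC6", "i386",   False),
    ("sw1",     "FC4", "x86_64", False),
    ("sw2",     "FC4", "x86_64", False),
]


def outcomes_by(column):
    """Map each value in `column` (1=release, 2=arch) to the set of outcomes seen."""
    groups = defaultdict(set)
    for row in RESULTS:
        groups[row[column]].add(row[3])
    return dict(groups)


by_arch = outcomes_by(2)
by_release = outcomes_by(1)

# A key predicts the outcome only if every machine sharing it behaves the same.
print("arch predicts outcome:   ", all(len(v) == 1 for v in by_arch.values()))
print("release predicts outcome:", all(len(v) == 1 for v in by_release.values()))
```

In this sample i386 shows both outcomes, while every release shows exactly
one, which is why the release looks like the better explanation.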

Unfortunately, I cannot upgrade these machines to a newer
Fedora release, because they are production machines.
However, I can build a newer version of snmpd for them.

Answers to your detailed questions follow:

> Could you please check for differences in /etc/snmpd.conf

There is one difference between the various config files,
because I have a home-grown module for the oids
.1.3.6.1.4.1.2021.253.1 through .7, and I had to compile it
separately for each machine.  I installed this module because
our machines typically have uptimes of a year or more, and so
I wanted 64-bit counters for the cpu times.  I'll add a copy
of the module source in case you wish to look at it or build
it.  Here is the difference in the config files:

localhost: dlmod nstAgentPluginObject /usr/local/lib64/nstAgentPluginObject.fc6.so
cvsdb:     dlmod nstAgentPluginObject /usr/local/lib/nstAgentPluginObject.f7.so

> and /usr/share/snmp/mibs/ on your cvsdb and localhost?

There are differences in the mibs, because these machines
are not all running the same release of Fedora.  However, in
all cases the mibs are "stock" - I have not edited them.

> Have you set the MIBS environment variable on any of the hosts?

No, I use the standard rc script to start snmpd.

> Do you have the same version installed?

No, again because of the different Fedora releases.  The attached
list shows the versions on each machine.

> Did you build it on your own?

No, I installed snmpd from yum, so it should be a standard rpm.

Comment 5 Steve Falco 2007-06-26 14:33:19 UTC
Created attachment 157902
List of machines, architectures, test runs

Comment 6 Steve Falco 2007-06-26 14:34:01 UTC
Created attachment 157905
Module to report 64-bit cpu stats

Comment 7 Jan Safranek 2007-06-26 14:49:43 UTC
Hey man, thanks for the overwhelming answer. I have found an upstream bug dealing
with the issue:
http://sourceforge.net/tracker/index.php?func=detail&aid=1554989&group_id=12694&atid=112694

It's fixed in net-snmp 5.4 (=Fedora 7), I'll port the fix back to Fedora 6.

Comment 8 Steve Falco 2007-06-26 14:56:11 UTC
I personally hate tracking problems where someone leaves out a clue that would
have helped me - so I tried for "full disclosure". :-)

Anyway - it sounds like you found the root cause.  I'll test the fix whenever
you are ready.

Comment 9 Jan Safranek 2007-07-16 09:15:37 UTC
net-snmp-5.3.1-15.fc6 has been pushed for FC6, which should resolve this issue.
If these problems are still present in this version, then please make note of it
in this bug report. 

