Bug 170875 - perl script using net-snmp hangs on newer kernels
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: net-snmp
Hardware: i386 Linux
Priority: medium  Severity: medium
Assigned To: Radek Vokal
Reported: 2005-10-14 16:48 EDT by Marc Wiartrowski
Modified: 2007-11-30 17:07 EST
Fixed In Version: net-snmp-5.0.9-2.30E.18
Doc Type: Bug Fix
Last Closed: 2006-11-08 10:09:52 EST

Attachments
text tcpdump of last few snmp gets (41.28 KB, text/plain)
2005-10-14 16:51 EDT, Marc Wiartrowski
binary tcpdump of that last few snmp gets (2.22 KB, application/octet-stream)
2005-10-14 16:53 EDT, Marc Wiartrowski
Last debug lines from perl script (62.60 KB, text/plain)
2005-10-14 17:01 EDT, Marc Wiartrowski
snmp perl script (4.23 KB, application/octet-stream)
2005-10-19 09:33 EDT, Marc Wiartrowski

Description Marc Wiartrowski 2005-10-14 16:48:46 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.12) Gecko/20050915 Firefox/1.0.7

Description of problem:
We have been running a piece of perl code that loops through thousands of
IP addresses and does a few snmp get queries on each.  It works fine with the
Update 2 kernel '2.4.21-15.EL' and a locally compiled version of net-snmp 5.2.1.

Upon upgrading to Update 5 with kernel 2.4.21-32.0.1.EL, the script will
run fine for many IPs, but then it appears that upon receiving a response
with an incorrect UDP checksum the perl script just hangs.
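
For context, the Linux select(2) man page notes in its BUGS section that select() may report a socket as ready for reading while a subsequent blocking read still blocks, for example when a datagram has arrived but is discarded because of a wrong checksum. Whether that is what happens inside net-snmp here is an assumption; nothing in this report confirms it. The sketch below only illustrates the usual defence on a plain UDP socket: put the socket in non-blocking mode so that a recv after select cannot hang. The helper name try_recv and the loopback socket are illustrative and not part of the script.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use IO::Select;
use Errno qw(EAGAIN EWOULDBLOCK);

# Illustrative helper (hypothetical name): wait up to $wait seconds for a
# datagram, then read it without ever blocking in recv.
sub try_recv {
    my ($sock, $wait) = @_;
    my $sel = IO::Select->new($sock);
    # "Ready" is only a hint: the kernel may still drop the datagram,
    # so the recv below must not be a blocking call.
    $sel->can_read($wait);
    my $buf  = '';
    my $peer = $sock->recv($buf, 65535);
    return $buf if defined $peer;
    # Nothing to read after all: report "no data" instead of hanging.
    return undef if $!{EAGAIN} || $!{EWOULDBLOCK};
    die "recv failed: $!";
}

# A UDP socket on loopback stands in for the SNMP session's socket.
my $sock = IO::Socket::INET->new(
    Proto     => 'udp',
    LocalAddr => '127.0.0.1',
    LocalPort => 0,
) or die "socket: $@";
$sock->blocking(0);    # the key step: recv returns EAGAIN instead of blocking
```

With the socket left in blocking mode, the same recv would wait indefinitely whenever select reported a datagram that the kernel then dropped.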

In testing, if we use the Update 5 kernel with the Red Hat rpm of
net-snmp-*-5.0.9-2.30E.19, the script eventually still hangs.  BUT if
we go back to the Update 2 kernel and keep the rpm net-snmp, it works
just fine every time.  So it appears to be something that changed between
the 2 kernels.

It's not the same IP address each time either.

tcpdump and net-snmp debug attachments to follow.

Snippet of the perl code:

use SNMP;

while (my ($mac, $data) = each %{$device}) {
   print "Polling: $mac ($data->{ip})\n";
   $SNMP::debugging = 3;
   # Timeout is in microseconds (100000 = 0.1 s)
   my $sess2 = SNMP::Session->new(
      DestHost       => $data->{ip},
      Community      => 'public',
      UseSprintValue => 1,
      UseLongNames   => 1,
      UseNumeric     => 1,
      Timeout        => 100000,
      Retries        => 2,
   );
   print "Made Connection\n";
   my $vars = SNMP::VarList->new(['.', 0], ['.', 0], ['.', 0], ['.', 0]);
   print "Set OIDs\n";
   my @val = $sess2->get($vars);
   print "Did get\n";
   if ($sess2->{ErrorStr}) {
      print ". . . SNMP Error: $sess2->{ErrorStr}\n";
   } else {
      # Do stuff with returned results
   }
}

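
Since the hang occurs inside a single get that never returns, one possible way to keep a long run moving while the cause is investigated is to do each poll in a child process and kill it after a wall-clock limit. This is a minimal sketch under that assumption, not something the original script does; run_with_timeout is a hypothetical helper name, and the sketch only reports the child's exit status, so real results would still have to be passed back, for example through a pipe or a file.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);
use Time::HiRes qw(sleep);

# Hypothetical helper: run $code in a child process and kill the child if it
# has not exited within roughly $limit seconds.
# Returns the child's exit status (0 = success, 1 = $code died),
# or undef if the child had to be killed.
sub run_with_timeout {
    my ($limit, $code) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: do the work and report the outcome via the exit status.
        my $ok = eval { $code->(); 1 };
        POSIX::_exit($ok ? 0 : 1);
    }
    # Parent: poll for the child's exit until the deadline passes.
    my $deadline = time() + $limit;
    while (time() < $deadline) {
        my $done = waitpid($pid, WNOHANG);
        return $? >> 8 if $done == $pid;
        sleep(0.05);
    }
    # Deadline passed: the child is presumed hung, so kill and reap it.
    kill 'KILL', $pid;
    waitpid($pid, 0);
    return undef;
}
```

In the polling loop above, the body for one device could be wrapped in a sub and passed to run_with_timeout with a limit of a few seconds, logging the IP address whenever undef comes back.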
Version-Release number of selected component (if applicable):
kernel-2.4.21-32.0.1.EL net-snmp-5.0.9-2.30E.19

How reproducible:

Steps to Reproduce:
1. Run the perl net-snmp script on the Update 5 kernel with any net-snmp version

Additional info:
Comment 1 Marc Wiartrowski 2005-10-14 16:51:25 EDT
Created attachment 119999 [details]
text tcpdump of last few snmp gets
Comment 2 Marc Wiartrowski 2005-10-14 16:53:06 EDT
Created attachment 120001 [details]
binary tcpdump of that last few snmp gets
Comment 3 Marc Wiartrowski 2005-10-14 17:01:12 EDT
Created attachment 120003 [details]
Last debug lines from perl script
Comment 4 Suzanne Hillman 2005-10-17 10:59:54 EDT
Could you check if this still happens on the Update 6 kernel, please?
Comment 5 Marc Wiartrowski 2005-10-17 13:36:04 EDT
Just tried kernel 2.4.21-37.EL and it still locked up.
Comment 6 Radek Vokal 2005-10-18 07:40:07 EDT
May I have the whole perl script so I can rerun the test? I didn't reproduce the
hang on the -32 kernel, but I think it only appears after quite a few runs.
Comment 7 Marc Wiartrowski 2005-10-19 09:33:00 EDT
Created attachment 120165 [details]
snmp perl script

OK.  This is a trimmed-down version of the script, but I just ran it
on 2.4.21-37.EL and it locked up.  It basically cycles through ~250k
IPs doing the snmp query.

I believe it broke somewhere between the Update 2 and Update 4 kernels.
We can try to narrow it down if you like.
Comment 8 Radek Vokal 2005-10-21 08:47:41 EDT
Should I pass some arguments to your script to make it run? Or how do you test
it? (I thought I could load it in snmpd.conf with `perl do`, but this doesn't work.)
Comment 9 Marc Wiartrowski 2005-10-21 09:20:21 EDT
It's not a script that runs through snmpd; it's a command-line script, actually
run through cron once a day, that does snmp gets.

The script takes 2 parameters: --comm for the snmp community string and
--cust for the customer name.  The customer name tells the script which
database table to connect to in order to get the 250k+ IP addresses that it
then queries via snmp.

I am not sure how you would run the script without modifying it, as it needs
to connect to a database to get the IP addresses.  And even if I could send
you the database, you wouldn't be able to query the IP addresses, as they
are on a private network.

The script takes many hours to run through the 250,000+ IP addresses when
it works on the Update 2 kernel.  With a newer kernel it will run for an
hour or so and make it through several thousand IP addresses before it hangs.

If there is something I can help with or run, please let me know.
Comment 10 Radek Vokal 2006-11-08 08:58:59 EST
Sorry, I only got back to this issue now. Is this still a problem? I've tried your
script, but I probably need to probe more machines.
Comment 11 Marc Wiartrowski 2006-11-08 09:22:04 EST
In the time frame of Update 5 we went back to the Update 2 kernel.
Currently our problem appears to have been fixed in Updates 6 and 7, as we
are running them just fine.  (We have not gone to Update 8 anywhere.)
Comment 12 Radek Vokal 2006-11-08 10:09:52 EST
Thanks for testing. 
