Bug 85190

Summary: UDP packets lost on their way to samba
Product: [Retired] Red Hat Public Beta Reporter: Daniel Resare <noa-bugzilla-redhat>
Component: samba Assignee: Jay Fenlason <fenlason>
Status: CLOSED NOTABUG QA Contact: David Lawrence <dkl>
Severity: medium Docs Contact:
Priority: medium    
Version: phoebe CC: jfeeney
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2003-02-26 23:27:38 UTC Type: ---

Description Daniel Resare 2003-02-26 14:52:47 UTC
Description of problem:
SMB connections from a Windows 2000 Server box to a samba server on phoebe3 beta
hang about every 10 minutes or so. Playing around with ethereal I found that
NBNS (NetBIOS Name Service) UDP packets get lost on their way to the nmbd process.

Reconfiguring the ethernet interface with '/sbin/ifconfig eth0 down;
/sbin/ifconfig eth0 up' fixes the problem with the disappearing UDP packets and
after a few seconds the SMB mount starts working again. 


Version-Release number of selected component (if applicable):
kernel-2.4.20-2.54 (from rawhide, i686)
samba-2.2.7a-5
[root@ulysses noa]# grep Ethernet /proc/pci
    Ethernet controller: 3Com Corporation 3c905B 100BaseTX [Cyclone] (rev 48).
The client: Windows 2000 Server SP3 with all Hotfixes applied


How reproducible:
Haven't seen it on any other networks, but here it is totally reproducible. I'm
willing to put in the time needed to try out test setups and such to nail this
one down.

Steps to Reproduce:
1. The simplest test case is doing an explicit name table lookup for the samba
server using the Windows command-line utility nbtstat. "nbtstat -a ulysses"
should give a table of names that have different properties. Looking at the
lookup from ethereal on the server gives the following pattern:

  6.803884 213.114.26.12 -> 213.114.26.96 NBNS Name query NBSTAT ULYSSES<00>
  6.804272 213.114.26.96 -> 213.114.26.12 NBNS Name query response NBSTAT

Doing "strace -p `/sbin/pidof nmbd` -eselect" shows that select(2) returns
when the UDP packet arrives.

2. Mount a volume from the samba server serving some music, and listen to it.
When the music abruptly stops, the server is in "fail mode".

3. Trying "nbtstat -a ulysses" again produces 3 UDP packets in ethereal on the
samba server but no answers, like this:

  1.551568 213.114.26.12 -> 213.114.26.96 NBNS Name query NBSTAT ULYSSES<00>
  3.050753 213.114.26.12 -> 213.114.26.96 NBNS Name query NBSTAT ULYSSES<00>
  4.550720 213.114.26.12 -> 213.114.26.96 NBNS Name query NBSTAT ULYSSES<00>

The really interesting part though is the strace line from above. The select(2)
call doesn't return when the packets hit the machine.

4. The easiest way to restore UDP delivery to the nmbd process without waiting
about 10 minutes is to run '/sbin/ifconfig eth0 down;
/sbin/ifconfig eth0 up'.

Additional info:
- No firewall is configured, /sbin/iptables-save returns without printing anything.

Comment 1 Daniel Resare 2003-02-26 15:12:42 UTC
I think you were a little too quick to class this as a samba problem, as I can
trivially reproduce it with netcat as well.

When the server is in failure mode, if I shut down samba and instead run 'nc -u
-l -p 137' I don't get any traffic from the network. Restarting the interface
makes the UDP packets get delivered to netcat (and displayed as chunks of
strange ASCII).

If this is indeed not a kernel bug I would be most interested in finding out
what valid reasons the kernel has for not forwarding incoming UDP packets to
userspace, or at least in being pointed in a direction where I can RTFM a bit :)
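
The netcat check above can also be done with a few lines of Python, which
prints the sender and a hex dump instead of raw bytes. This is only a sketch:
the helper names are made up for illustration, and binding to port 137 needs
root and requires that nmbd is stopped first.

```python
import socket


def open_udp_listener(host: str = "0.0.0.0", port: int = 137) -> socket.socket:
    """Bind a UDP socket, roughly what 'nc -u -l -p 137' does."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def describe_datagram(data: bytes, addr) -> str:
    """Format one datagram as 'ip:port N bytes HEX'."""
    return f"{addr[0]}:{addr[1]} {len(data)} bytes {data.hex()}"


def receive_one(sock: socket.socket, timeout: float = 5.0):
    """Wait for a single datagram; return its description or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, addr = sock.recvfrom(65535)
    except socket.timeout:
        return None
    return describe_datagram(data, addr)
```

If receive_one() keeps returning None while ethereal shows queries arriving on
the wire, the datagrams are being discarded somewhere below the socket layer.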

Comment 2 Jay Fenlason 2003-02-26 20:16:08 UTC
Is the box particularly loaded when packets are being dropped?  One of the other
developers here would like to know what "cat /proc/net/snmp" shows when it's
dropping packets.  If the kernel is dropping packets, that'll show why.

I'd also suggest trying a different (kind of) nic in the machine.  If it only
fails when you're using the 3c905B, it'll be easier for me to say "this is a
kernel bug".
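
To compare the /proc/net/snmp counters before and after a failed lookup, a
small parser along these lines may help. It assumes the usual layout of
alternating header and value lines per protocol; the function names are
hypothetical and the numbers in any test input are made up, not taken from
this report.

```python
def parse_proc_net_snmp(text: str) -> dict:
    """Turn the header/value line pairs of /proc/net/snmp into nested dicts."""
    tables = {}
    lines = [line for line in text.splitlines() if line.strip()]
    # Lines come in pairs: 'Udp: name name ...' then 'Udp: value value ...'.
    for header, values in zip(lines[0::2], lines[1::2]):
        proto, names = header.split(":", 1)
        value_proto, numbers = values.split(":", 1)
        if proto != value_proto:
            continue  # unexpected layout, skip rather than guess
        tables[proto] = dict(zip(names.split(), map(int, numbers.split())))
    return tables


def counter_deltas(before: dict, after: dict, proto: str = "Udp") -> dict:
    """Return only the counters of one protocol that changed between snapshots."""
    old, new = before.get(proto, {}), after.get(proto, {})
    return {k: new[k] - old.get(k, 0) for k in new if new[k] != old.get(k, 0)}
```

Reading the file once before and once after running "nbtstat -a ulysses" and
passing both results to counter_deltas() shows whether counters such as Udp
InErrors or NoPorts moved. If nothing under Udp changes at all, the datagrams
probably never reached the UDP layer.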

Comment 3 Daniel Resare 2003-02-26 23:27:38 UTC
After some more hours of debugging I found the problem. It turns out that my ISP
has a habit of spamming my network with ARP responses. This creates a race
condition between the samba server and the routers. When the routers win they
get the opportunity to block certain UDP packets.

This was hard to track down because the routers only drop certain classes of
packets.

A piece of advice to anyone tracking down similar problems in the future is to
have a quick look at the Ethernet headers of the packets that seem to disappear.
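
As a rough illustration of that check, the sketch below pulls the destination
and source MAC addresses out of a captured frame and tests whether the frame
was addressed to a given interface. It assumes untagged Ethernet II frames
and that the symptom is a destination MAC other than the server's own, which
is my interpretation of the ARP race rather than something verified here. The
function names are made up for this example.

```python
import struct

BROADCAST = "ff:ff:ff:ff:ff:ff"


def parse_ethernet_header(frame: bytes):
    """Return (dst_mac, src_mac, ethertype) from the first 14 bytes of a frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])

    def fmt(mac: bytes) -> str:
        return ":".join(f"{b:02x}" for b in mac)

    return fmt(dst), fmt(src), ethertype


def is_addressed_to(frame: bytes, interface_mac: str) -> bool:
    """True if the frame's destination is the given MAC or the broadcast address."""
    dst, _, _ = parse_ethernet_header(frame)
    return dst in (interface_mac.lower(), BROADCAST)
```

A capture taken in promiscuous mode shows frames regardless of their
destination MAC, so a query can be visible in ethereal and still never be
handed to the socket. Running is_addressed_to() over the "lost" frames with
the MAC of eth0 makes that case easy to spot.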