Red Hat Bugzilla – Bug 426853
Missing multicast messages with realtime beta kernel
Last modified: 2014-08-14 21:43:03 EDT
Description of problem:
Customer's multicast message stream test harness (using Tibco Smart Sockets)
reports missing messages with realtime beta kernel. No problems running same
test on stock rhel5.1 kernel.
Version-Release number of selected component (if applicable):
How reproducible:
Problem happens consistently, almost immediately after the customer's test starts.
Working with their 29West engineer, the customer can reproduce the problem using
29West's multicast tools msend and mdump. A tarball containing these is attached,
though they are also available, including source, from
http://188.8.131.52/docs/TestNet/testnet.html#INITIAL-FOUR. The 29West engineer
modified mdump to display missing message sequence numbers; it is this version
that is attached.
Steps to Reproduce:
1. Build msend & mdump; "make msend" & "make mdump" is sufficient.
2. The test requires 2 separate multicast streams for the problem to manifest
itself (the customer's application test suite uses 6 streams). These can all
be run on a single box provided they use different multicast addresses. The
tests need to be bound to one of the system's addresses. We found that running
against loopback reproduced the issue more quickly than against an external
interface. Initially start 2 "mdump" listener processes in separate windows:
./mdump -v -q 184.108.40.206 4400 127.0.0.1
./mdump -v -q 220.127.116.11 4400 127.0.0.1
There should be no output from either.
3. In another window run "msend" against one of the multicast addresses above.
./msend -b400 -m8192 -n500000 -p5 -q -s2000 -S8388608 18.104.22.168 4400 2 127.0.0.1
There should be no output from this, nor from either of the mdump processes.
4. Now start a second "msend" against the other multicast address.
./msend -b400 -m8192 -n500000 -p5 -q -s2000 -S8388608 22.214.171.124 4400 2 127.0.0.1
Actual results:
After a very short time we see something like the following error messages in
both mdump windows ...
Expected seq 1137f, got 11df0
Expected seq 11380, got 11df1
Expected seq 11381, got 11df2
Expected seq 11382, got 11df3
Expected results:
No output from either mdump process.
Hardware is 4-way dual core HP DL585 with 8GB memory.
Created attachment 290452 [details]
tarball containing source for 29West test tools msend.c & mdump.c
The problem happens with 2.6.18-53.1.4.el5 as well, checking now with
Problem also present on:
[root@mica barcap]# uname -a
Linux mica.ghostprotocols.net 2.6.21-57.el5rtvanilla #1 SMP PREEMPT Fri Nov 30
10:53:20 EST 2007 x86_64 x86_64 x86_64 GNU/Linux
Tried now with:
[root@mica barcap]# uname -r
And the problem happens even with just one msend instance and two mdump instances.
Please see the attached patch I'm using to see how many packets are lost each
time the packet sequence number differs from the expected one, when we resync,
and then try to correlate this with the SNMP UDP MIB variables; using it, the
mdump output becomes:
Expected seq 4a6b8a, got 4a6b90
6 packets lost, resync
Expected seq 4a6bad, got 4a6cd3
294 packets lost, resync
Expected seq 4a6cd4, got 4a6cd5
1 packets lost, resync
And we get output only when we detect packet loss, and it correlates with what
is reported in the UDP "packet receive errors" line in the netstat -s output.
Now looking at the situations where this MIB variable is bumped in the udp_rcv
routine in the kernel sources.
Created attachment 290672 [details]
show how many packets were lost and resync the sequence number, to avoid useless continuous printfs
mdump is running out of receive buffer space; if one bumps rmem_max as described
in the 29West docs, the problem goes away.
To reproduce this it's not even necessary to run multiple instances of msend and
mdump: just a client + server pair plus some other heavy system activity will
produce the same results, i.e. packet loss because mdump is not being scheduled
fast enough to consume what msend is producing.
So please bump rmem_max, make mdump (and other multicast receivers) use
setsockopt(SO_RCVBUF), and also consider making the receiver run at a higher
priority so that it can process the incoming packets faster and thus reduce the
possibility that it runs out of receive buffer space.
In fact mdump.c already has a -r parameter where the user can set SO_RCVBUF, so
it's just a matter of using '-r $((8192 * 1024))' + 'sysctl -w
net.core.rmem_max=$((8192 * 1024))' to make it extremely unlikely that packets
will be lost with the tests mentioned in this bug ticket.
The buffer limit tests were all performed on non-rt kernels from RHEL5 and
RHEL5.1, as this problem is not RT specific.
rmem_max was already set to 20971520.
We were not seeing any packet errors reported by "netstat -us".
We finally tracked down the source of the out-of-sequence messages when running
mdump/msend against the loopback interface: the MTU had been set to 7700.
Returning it to the default 16436 resolved that issue.
Running the same tests against one of the system's external interfaces, when
mdump & msend were given an rtprio greater than that of the softirq thread for
that network interface, we also saw similar out-of-sequence messages. With a
lower rtprio, no problems.
Running mdump & msend in the normal TS (timesharing) scheduling class, during a
brief test we saw no out-of-sequence packets (though the customer did see these
when running the test over a much longer period).
UDP is unreliable, and the loss perceived was due to the way the system was set
up (MTU, etc.), so I'm closing this ticket after talking with Graham.