Bug 424581

Summary: /proc/net/dev sometimes contains bogus values
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.4
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: medium
Priority: low
Reporter: Chuck Berg <cberg>
Assignee: Andy Gospodarek <agospoda>
QA Contact: Martin Jenner <mjenner>
CC: benlu, hanfang, mbozzore, peterm, tao, vgoyal
Keywords: Reopened
Doc Type: Bug Fix
Clone Of:
: 515274 (view as bug list)
Last Closed: 2011-08-19 01:02:50 UTC

Attachments:
  repeatedly reads /proc/net/dev and complains about bad data (flags: none)
  results of proc_net_dev_watcher on many machines (flags: none)

Description Chuck Berg 2007-12-14 03:21:32 UTC
/proc/net/dev occasionally gives bogus data. I have seen this on 2.4.21-37 and
2.6.9-42 (I have not tested much on other kernels). These are all 64-bit systems.

Here is an example, which was produced in one read() call:
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:8358204734 7281043    0    0    0     0          0         0 8358204734 7281043    0    0    0     0       0          0
 bond0:2509896022455090019 2721742904916  118    0    0   118          0   5672718 301653043605 3139935252    0    0    0     0       0          0
  eth0:2509896020739034695 2721736433935  118    0    0   118          0   5672718 301653043605 3139935252    0    0    0     0       0          0
  eth1:1716055324 6470981    0    0    0     0          0         0        0       0    0    0    0     0       0          0

This is from /proc/net/dev being read every 100ms; the samples immediately
before and after this one were reasonable.

I first noticed this when seeing bogus data in sar output. On many machines, I
am running this every day:
sadc 1 86400 /tmp/sarlog

Then, when I read the data with
sar -n DEV /tmp/sarlog

I occasionally see insanely large numbers (on about one in 10 machines when
saving a full day's worth of sar data at 1-second intervals). This makes
reporting very difficult.

The attached proc_net_dev_watcher.c will read /proc/net/dev every 100ms (in one
read() call), and complain about deltas over 10^10. The output is a time stamp,
the previous read, and the bad one. I ran this on 577 machines for about 24
hours, and 313 produced at least one error.
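
For reference, here is a minimal sketch of what the watcher does (the attached
proc_net_dev_watcher.c is the authoritative version; this simplified stand-in
assumes a stable interface order, ignores counter wrap, and only checks rx_bytes):

/* delta_check.c - simplified illustration of the attached watcher, not the
 * attachment itself: read /proc/net/dev in a single read() every 100 ms and
 * warn when rx_bytes jumps by more than 10^10 between samples. */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    unsigned long long prev_rx[64];
    int have_prev = 0;

    for (;;) {
        char buf[65536];
        int fd = open("/proc/net/dev", O_RDONLY);
        ssize_t n = read(fd, buf, sizeof(buf) - 1);  /* whole file in one read() */
        close(fd);
        if (n <= 0)
            return 1;
        buf[n] = '\0';

        int i = 0;
        char *save, *line;
        for (line = strtok_r(buf, "\n", &save); line && i < 64;
             line = strtok_r(NULL, "\n", &save)) {
            char name[32];
            unsigned long long rx;
            if (!strchr(line, ':'))
                continue;                            /* skip the two header lines */
            if (sscanf(line, " %31[^:]:%llu", name, &rx) != 2)
                continue;
            if (have_prev && rx - prev_rx[i] > 10000000000ULL)
                printf("warn: rx_bytes on %s delta was %llu\n",
                       name, rx - prev_rx[i]);
            prev_rx[i++] = rx;
        }
        have_prev = 1;
        usleep(100000);                              /* sample every 100 ms */
    }
}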

The attached proc_net_dev_watcher_results.tgz contains this output from the 313
affected machines (one file per host).

Here is a typical uname -a (from 95% of the affected machines):
Linux host1 2.6.9-42.ELsmp #1 SMP Tue Aug 15 10:35:26 BST 2006 x86_64 x86_64
x86_64 GNU/Linux

Comment 1 Chuck Berg 2007-12-14 03:21:32 UTC
Created attachment 288331 [details]
repeatedly reads /proc/net/dev and complains about bad data

Comment 2 Chuck Berg 2007-12-14 03:22:40 UTC
Created attachment 288351 [details]
results of proc_net_dev_watcher on many machines

Comment 3 Chuck Berg 2008-02-05 20:33:04 UTC
I have verified that the problem occurs on kernel 2.6.9-67.ELsmp as well.

To repeat the problem more quickly, you may wish to remove the sleep from
proc_net_dev_watcher.c, and run it on a system with a significant amount of
network traffic.

This bug makes sar and net-snmp return bogus data. I can't do performance
monitoring correctly, nor can I alert based on network error counters. I am
being laughed at by Windows admins, whose performance counters work correctly.

Comment 4 Chuck Berg 2008-02-14 04:14:16 UTC
I have repeated this on hundreds of 64-bit systems running 2.6.9-42 and 2.6.9-67.

I beg others who have 64-bit systems running RedHat ES4 to compile and run my
simple test case. (Other platforms are of course welcome, as they may help narrow
things down.) The only catch is that it may take hours (on average) for a single
machine to reproduce the error.

Download the proc_net_dev_watcher.c attachment.
gcc -Wall -O2 -o proc_net_dev_watcher proc_net_dev_watcher.c
nohup ./proc_net_dev_watcher < /dev/null >& /tmp/proc_net_dev_watcher.log &

If it produces output in /tmp/proc_net_dev_watcher.log, you have repeated the
bug. The logged contents of /proc/net/dev, read in one read() call, are the proof.

You may wish to remove the "usleep(100000);" line and generate some network
traffic to increase the odds that it will happen.

Comment 5 Andy Gospodarek 2008-05-01 14:24:39 UTC
This will be fixed in the upcoming RHEL4 update.  Please verify with my test
kernels located here:

http://people.redhat.com/agospoda/

or any beta 4.7 kernels that exist.



*** This bug has been marked as a duplicate of 430576 ***

Comment 6 Chuck Berg 2008-05-03 20:20:52 UTC
The problem still exists with 2.6.9-69.EL.gtest.43smp. This is a separate issue.

Here are two consecutive /proc/net/dev reads, from the attached proc_net_dev_watcher:

warn: counter rx_bytes on bond1 delta was 172752163853975
1209833008.751634
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:84448910  331481    0    0    0     0          0         0 84448910 331481    0    0    0     0       0          0
 bond0:713534965513 698989252    0    0    0     0          0         0 713484683532 698545505    0    0    0     0       0          0
 bond1:21444908   84386    0    0    0     0          0         0 21508295   84452    0    0    0     0       0          0
  eth3:10725747   42193    0    0    0     0          0         0 10746131   42226    0    0    0     0       0          0
  eth2:34757343  200233    0    0    0     0          0         0        0      0    0    0    0     0       0          0
  eth0:713500208170 698789019    0    0    0     0          0         0 713484683532 698545505    0    0    0     0       0          0
  eth1:10719161   42193    0    0    0     0          0         0 10762164   42226    0    0    0     0       0          0

1209833008.853638
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:84448910  331481    0    0    0     0          0         0 84448910 331481    0    0    0     0       0          0
 bond0:713534965513 698989252    0    0    0     0          0         0 713484683532 698545505    0    0    0     0       0          0
 bond1:172752185298883 10830517    0    0    0     0          0         0 21487911   84452    0    0    0     0       0          0
  eth3:172752174579722 10788324    0    0    0     0          0         0 10725747   42226    0    0    0     0       0          0
  eth2:34757343  200233    0    0    0     0          0         0        0      0    0    0    0     0       0          0
  eth0:713500208170 698789019    0    0    0     0          0         0 713484683532 698545505    0    0    0     0       0          0
  eth1:10719161   42193    0    0    0     0          0         0 10762164   42226    0    0    0     0       0          0


Comment 7 Andy Gospodarek 2008-05-05 15:15:38 UTC
Sorry I closed this one too quickly on you.  We fixed a bug in rhel4 and rhel5
that dealt with totals from the bonding driver being invalid, but this appears
to be specifically from the NIC (at least in the example you've got above).  It
appears that:

bond0 contains eth0 and eth2
bond1 contains eth1 and eth3

The totals from the bonds look fine, but it seems like eth3's stats changed
drastically:

first run:
eth3:         10725747     42193
second run:
eth3:  172752174579722  10788324

This seems to account for the difference in your bonding stats.

Can you tell me more about your system and what type of NICs you are using (at
least an lspci from the system would be helpful)?

It doesn't seem realistic to expect approx 135Tb of traffic in 100ms on those
interfaces, so I'm guessing we've got a driver or hardware stats bug. :-)

I don't want to assume too much, but looking at a few of the attached log files,
this appears to be mostly an ethernet interface statistics bug rather than one
where the bonding stats are summed incorrectly. If you happen to notice bonding
totals that are invalid when running any of my gtest kernels, please let me know,
but I'm going to examine the individual ethernet devices and drivers first.

Comment 8 Chuck Berg 2008-06-23 21:27:08 UTC
All systems are running the tg3 or bnx2 driver (all are HP DL380 or DL360
G3/G4/G5, HP DL385 G1 or DL585 G1). I didn't check that the bonding totals add
up, since I don't think that was suspected to be the problem here, but I'll
modify my test program to add any checks that might be useful.

32-bit systems (e.g. the HP DL380 G3s) do not seem to be affected; only 64-bit
systems do. But I have a smaller population of 32-bit systems, and they
typically carry less network traffic, so it could be a coincidence.

Here's an example lspci -v output for one of these interfaces.
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit
Ethernet (rev 12)
        Subsystem: Hewlett-Packard Company: Unknown device 7038
        Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 74
        Memory at f8000000 (64-bit, non-prefetchable) [size=32M]
        Capabilities: [40] PCI-X non-bridge device.
        Capabilities: [48] Power Management version 2
        Capabilities: [50] Vital Product Data
        Capabilities: [58] Message Signalled Interrupts: 64bit+ Queue=0/0 Enable+




Comment 9 Andy Gospodarek 2008-06-24 14:36:44 UTC
Chuck, thanks for the feedback.  I'll take a look at those drivers (tg3 and
bnx2) and see if there is anything that might be causing this.

Comment 10 Terry Hanks 2008-11-05 07:09:00 UTC
This bug has been bothering me for a long time.
I find that the bogus values come from the bnx2_get_stats() call.

Comment 11 Andy Gospodarek 2008-11-05 16:36:46 UTC
So you see invalid stats on bnx2-based cards sometimes?  I haven't personally seen it, but looking at the code I can see how it might happen.  Can you describe what you see when you are seeing what appear to be invalid stats?

Comment 12 Terry Hanks 2008-11-06 04:11:40 UTC
(In reply to comment #11)
> So you see invalid stats on bnx2-based cards sometimes?  I haven't personally
> seen it, but looking at the code I can see how it might happen.  Can you
> describe what you see when you are seeing what appear to be invalid stats?

It has been fixed.
I applied this patch to kernel 2.6.9-34 and it seems OK for me.
The upcoming RHEL 4.x update should apply this patch.

See the linux-2.6 tree:
git log -p 02537b0676930b1bd9aff2139e0e645c79986931


commit 02537b0676930b1bd9aff2139e0e645c79986931
Author: Michael Chan <mchan>
Date:   Mon Jun 4 21:24:07 2007 -0700

    [BNX2]: Fix occasional counter corruption on 5708.
    
    The statistics block DMA on 5708 can be messed up occasionally on the
    average of about once per hour.  If the user is reading the counters
    within one second after the corruption, the counters will be all
    messed up.  One second later, the counters will be ok again until the
    next corruption occurs.
    
    The workaround is to disable the periodic statistics DMA.  Instead,
    we manually trigger the DMA once a second in bnx2_timer().  This
    manual trigger of the DMA avoids the problem.
    
    As a consequence, we can only allow 0 or 1 second settings for
    ethtool -C statistics block.
    
    Thanks to Jean-Daniel Pauget <jd> and
    CaT <cat.au> for reporting this rare problem.
    
    Signed-off-by: Michael Chan <mchan>
    Acked-by: Jeff Garzik <jeff>
    Signed-off-by: David S. Miller <davem>

Comment 13 Andy Gospodarek 2008-11-07 14:09:52 UTC
Terry, that patch is already in our upcoming RHEL5 update and it will get pulled into 4.8 as part of bug 452883.  Feel free to add yourself to the CC list of that bug if you like.

Comment 14 Chuck Berg 2009-01-22 21:38:32 UTC
I also see this problem on the loopback interface, with kernel 2.6.18-92.

Here's some example output from proc_net_dev_watcher (source attached). Note the two timestamps 100ms apart, each followed by the full /proc/net/dev contents with a gigantic delta for all of the counters on lo.

I had to run this program for about an hour on 100 machines to catch 10 occurrences of this.

If attempting to repeat this, it may happen faster if you take the usleep() out of proc_net_dev_watcher.c and generate some localhost traffic. (The 100 systems in my test group were production machines with a nontrivial amount of localhost traffic.)

warn: counter rx_bytes on lo delta was 1620963998645
1232656157.416785
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:2550068943552 3527004034    0    0    0     0          0         0 2550068943552 3527004034    0    0    0     0       0          0
  eth0:3458672361860 3760436791    0    0    0     0          0 1215017822 1633785213814 1864773066    0    0    0     0       0          0
  eth1:1308425220 9393943    0    0    0     0          0        20     4435      20    0    0    0     0       0          0
  sit0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
 bond0:3459980787080 3769830734    0    0    0     0          0 1215017842 1633785218249 1864773086    0    0    0     0       0          0

1232656157.517763
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:4171032942197 6763066679    0    0    0     0          0         0 4926815958089 6763066679    0    0    0     0       0          0
  eth0:3458672361860 3760436791    0    0    0     0          0 1215017822 1633785213814 1864773066    0    0    0     0       0          0
  eth1:1308425592 9393946    0    0    0     0          0        20     4435      20    0    0    0     0       0          0
  sit0:       0       0    0    0    0     0          0         0        0       0    0    0    0     0       0          0
 bond0:3459980787452 3769830737    0    0    0     0          0 1215017842 1633785218249 1864773086    0    0    0     0       0          0

Comment 15 Andy Gospodarek 2009-03-02 20:22:19 UTC
Chuck, I'm trying to wrap up my RHEL4 bugs, and it sounds like the problems you have with bnx2 are going to be fixed in our upcoming RHEL4.8 kernels.  Do you still see problems with tg3?  If so, can you test with my test kernels to see if the latest update fixes it?  You can download them here:

http://people.redhat.com/agospoda/#rhel4

I suspect these kernels will resolve both your bnx2 and tg3 issues since they contain the patch in comment #12 (to address the bnx2 issue) as well as the following patch (to address the tg3 issue):

commit 816f8b86621000fc82e16ae5f172164de761d5eb
Author: Stefan Buehler <stbuehler>
Date:   Fri Aug 15 14:10:54 2008 -0700

    tg3: fix 64 bit counter for ethtool stats
    
    Ethtool stats are 64-bits in length.  net_device_stats members are
    unsigned long types.  When gathering information for
    a get_ethtool_stats call, the driver will call a driver-private,
    inlined get_stat64() function, which returns an unsigned long value.
    This call will inadvertently mask off the upper 32-bits of a stat on
    32-bit machines.
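
For anyone unfamiliar with the failure mode described there, here is a small
standalone illustration (not tg3 code; the function and variable names are made
up for the example): a 64-bit counter funneled through an unsigned long loses
its upper 32 bits when unsigned long is only 32 bits wide.

/* truncate_demo.c - standalone illustration of the masking described in the
 * commit message above; not driver code. */
#include <stdio.h>
#include <stdint.h>

/* Stand-in for a driver-private get_stat64()-style helper that returns
 * unsigned long: on an ILP32 build this silently drops the high word. */
static unsigned long get_stat(uint64_t hw_counter)
{
    return hw_counter;   /* implicit truncation when unsigned long is 32-bit */
}

int main(void)
{
    uint64_t hw = 0x00000002DEADBEEFULL;    /* counter already past 4 GiB */
    uint64_t reported = get_stat(hw);       /* widened back into a 64-bit slot */

    printf("hardware counter : %llu\n", (unsigned long long) hw);
    printf("reported counter : %llu\n", (unsigned long long) reported);
    /* Compiled for x86_64 both lines match; compiled with gcc -m32 the
     * reported value is truncated to 0xDEADBEEF (3735928559). */
    return 0;
}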

Thanks for your help testing this.

Comment 16 Andy Gospodarek 2011-08-19 01:02:50 UTC
The patch in comment #12 that resolves this is included in RHEL4.8.  Closing as a dup of the bug used for that update.

*** This bug has been marked as a duplicate of bug 452883 ***