679407 – [5.7] niu: Fix races between up/down and get_stats.

Bug 679407 - [5.7] niu: Fix races between up/down and get_stats.

Summary: [5.7] niu: Fix races between up/down and get_stats.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.5
Hardware:	Unspecified
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Jiri Pirko
QA Contact:	Weibing Zhang
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	683069
TreeView+	depends on / blocked

Reported:	2011-02-22 14:12 UTC by Flavio Leitner
Modified:	2015-05-05 01:22 UTC (History)
CC List:	11 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	683069 (view as bug list)
Environment:
Last Closed:	2011-07-21 10:10:31 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:1065	0	normal	SHIPPED_LIVE	Important: Red Hat Enterprise Linux 5.7 kernel security and bug fix update	2011-07-21 09:21:37 UTC

Description Flavio Leitner 2011-02-22 14:12:59 UTC

Description of problem:

There is no synchronization to protect NIU's get_stats method from
seeing a NULL pointer in either np->rx_rings or np->tx_rings.  In fact,
as far as ->ndo_get_stats is concerned, these values are set completely
asynchronously.

The lack of synchronization causes this patch below:
Unable to handle kernel NULL pointer dereference at 0000000000000058 RIP: 
 [<ffffffff8827f10c>] :niu:niu_get_stats+0x2f/0xa4
PGD 3fbc60067 PUD 3fbc57067 PMD 0 
Oops: 0000 [1] SMP 
last sysfs file: /class/net/eth0/address
CPU 5 
Modules linked in: 8021q iptable_mangle ip_tables x_tables bonding autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec dell_wmi wmi button battery asus_acpi acpi_memhotplug ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport tpm_infineon joydev sr_mod cdrom i5000_edac tpm_tis i2c_i801 tpm edac_mc tpm_bios i2c_core niu e1000e pcspkr sg serio_raw dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod usb_storage ahci ata_piix libata shpchp mptsas mptscsih mptbase scsi_transport_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 6532, comm: ifconfig Not tainted 2.6.18-238.el5.0.DEBUG00413622 #1
RIP: 0010:[<ffffffff8827f10c>]  [<ffffffff8827f10c>] :niu:niu_get_stats+0x2f/0xa4
RSP: 0018:ffff8103fbcf3e90  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff810418a84000 RCX: 0000000000000001
RDX: ffff810418a84500 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff810418851740 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000004 R11: 0000000000000000 R12: 000000000000068c
R13: 0000000000001000 R14: 0000000000000000 R15: 00002b7b99cc0000
FS:  00002b7b99cce5e0(0000) GS:ffff81041fc7b4c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000058 CR3: 00000003fbc66000 CR4: 00000000000006e0
Process ifconfig (pid: 6532, threadinfo ffff8103fbcf2000, task ffff8103fc873820)
Stack:  ffffffff802335ed ffff810418851740 ffff810418851740 ffff810418a84000
 ffffffff8003fa30 ffff8103fbcf3f50 ffff8103fce17a80 ffff810418851770
 000000000000000d 000000000000000c ffff8103fce17a80 0000000000001000
Call Trace:
 [<ffffffff802335ed>] dev_seq_show+0x35/0xd0
 [<ffffffff8003fa30>] seq_read+0x1b8/0x28c
 [<ffffffff8000b787>] vfs_read+0xcb/0x171
 [<ffffffff80011c9a>] sys_read+0x45/0x6e
 [<ffffffff8005e28d>] tracesys+0xd5/0xe0


Code: 48 03 78 58 48 03 70 60 4c 03 48 68 4c 03 40 70 44 39 d1 7c 
RIP  [<ffffffff8827f10c>] :niu:niu_get_stats+0x2f/0xa4
 RSP <ffff8103fbcf3e90>

Version-Release number of selected component (if applicable):
kernel-2.6.18-238.el5

How reproducible:
Frequently

Steps to Reproduce:
1. run 'cat /proc/net/dev' in a busy loop to see the stats
2. then run ifup <niu_iface>
  
Actual results:
kernel panic

Additional info:
This has been worked upstream in the following threads:
http://www.spinics.net/lists/netdev/msg154375.html
http://www.spinics.net/lists/netdev/msg154405.html

The result patch is merged upstream and works out for the customer.

commit 9690c636ac118b6662f28308bee817343d9932d8
Author: David S. Miller <davem>
Date:   Thu Feb 3 16:12:50 2011 -0800

    niu: Fix races between up/down and get_stats.
    
    As reported by Flavio Leitner, there is no synchronization to protect
    NIU's get_stats method from seeing a NULL pointer in either
    np->rx_rings or np->tx_rings.  In fact, as far as ->ndo_get_stats
    is concerned, these values are set completely asynchronously.
    
    Flavio attempted to fix this using a RW semaphore, which in fact
    works most of the time.  However, dev_get_stats() can be invoked
    from non-sleepable contexts in some cases, so this fix doesn't
    work in all cases.
    
    So instead, control the visibility of the np->{rx,tx}_ring pointers
    when the device is being brough up, and use properties of the device
    down sequence to our advantage.
    
    In niu_get_stats(), return immediately if netif_running() is false.
    The device shutdown sequence first marks the device as not running (by
    clearing the __LINK_STATE_START bit), then it performans a
    synchronize_rcu() (in dev_deactive_many()), and then finally it
    invokes the driver ->ndo_stop() method.
    
    This guarentees that all invocations of niu_get_stats() either see
    netif_running() as false, or they see the channel pointers before
    ->ndo_stop() clears them out.
    
    If netif_running() is true, protect against startup races by loading
    the np->{rx,tx}_rings pointer into a local variable, and punting if
    it is NULL.  Use ACCESS_ONCE to prevent the compiler from reloading
    the pointer on us.
    
    Also, during open, control the order in which the pointers and the
    ring counts become visible globally using SMP write memory barriers.
    We make sure the np->num_{rx,tx}_rings value is stable and visible
    before np->{rx,tx}_rings is.
    
    Such visibility control is not necessary on the niu_free_channels()
    side because of the RCU sequencing that happens during device down as
    described above.  We are always guarenteed that all niu_get_stats
    calls are finished, or will see netif_running() false, by the time
    ->ndo_stop is invoked.
    
    Reported-by: Flavio Leitner <fleitner>
    Signed-off-by: David S. Miller <davem>

Comment 2 Flavio Leitner 2011-02-22 14:26:52 UTC

Processes racing:
PID: 8348   TASK: ffff8103eaffe7e0  CPU: 3   COMMAND: "ifenslave"
 #0 [ffff81041fcbbf20] crash_nmi_callback at ffffffff8007bce5
 #1 [ffff81041fcbbf40] do_nmi at ffffffff800658c5
 #2 [ffff81041fcbbf50] nmi at ffffffff80064eaf
    [exception RIP: niu_init_hw+2199]
    RIP: ffffffff8825c700  RSP: ffff8103ea48bc88  RFLAGS: 00000086
    RAX: ffffc20013600160  RBX: 0000000000000000  RCX: 000000000000000c
    RDX: 0000000000000000  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff81041df74500   R8: 0000000000000001   R9: 0000000000000002
    R10: 0000e00800000000  R11: 0000000000000006  R12: ffff81041df7565c
    R13: ffff81041df755d4  R14: 0000000000000160  R15: 000000000000002c
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
 #3 [ffff8103ea48bc88] niu_init_hw at ffffffff8825c700
 #4 [ffff8103ea48bcc0] niu_open at ffffffff88263704
 #5 [ffff8103ea48bcf0] dev_open at ffffffff80234b54
 #6 [ffff8103ea48bd00] bond_enslave at ffffffff885df6a5
 #7 [ffff8103ea48bd80] bond_do_ioctl at ffffffff885e165c
 #8 [ffff8103ea48bde0] dev_ioctl at ffffffff802347a5
 #9 [ffff8103ea48be90] sock_ioctl at ffffffff8022a58c
#10 [ffff8103ea48beb0] do_ioctl at ffffffff8004226a
#11 [ffff8103ea48bed0] vfs_ioctl at ffffffff8003026e
#12 [ffff8103ea48bf40] sys_ioctl at ffffffff8004c73b
#13 [ffff8103ea48bf80] tracesys at ffffffff8005d28d (via system_call)
    RIP: 0000003c2e6cc647  RSP: 00007fffac465908  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: ffffffff8005d28d  RCX: ffffffffffffffff
    RDX: 00007fffac465b00  RSI: 0000000000008990  RDI: 000000000000000b
    RBP: 00007fffac4670c3   R8: 00007fffac465b15   R9: 0000000000000003
    R10: 0000000000000000  R11: 0000000000000246  R12: 0000000000000000
    R13: 0000000000000000  R14: 00007fffac4670c9  R15: 00007fffac465b00
    ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b

PID: 8444   TASK: ffff8103eafa9100  CPU: 4   COMMAND: "ifconfig"
 #0 [ffff8103ea34dbf0] crash_kexec at ffffffff800af85a
 #1 [ffff8103ea34dcb0] __die at ffffffff80065117
 #2 [ffff8103ea34dcf0] do_page_fault at ffffffff8006748d
 #3 [ffff8103ea34dde0] error_exit at ffffffff8005dde9
    [exception RIP: niu_get_stats+121]
    RIP: ffffffff88262156  RSP: ffff8103ea34de90  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff81041df74000  RCX: 0000000000000001
    RDX: ffff81041df74500  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff81041d702e40   R8: 0000000000000000   R9: 0000000000000006
    R10: 0000000000000004  R11: 0000000000000000  R12: 00000000000002b4
    R13: 0000000000001000  R14: 0000000000000000  R15: 00002b3ff19f4000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #4 [ffff8103ea34de90] dev_seq_show at ffffffff8023256d
 #5 [ffff8103ea34deb0] seq_read at ffffffff8003f858
 #6 [ffff8103ea34df10] vfs_read at ffffffff8000b787
 #7 [ffff8103ea34df40] sys_read at ffffffff80011c5c
 #8 [ffff8103ea34df80] tracesys at ffffffff8005d28d (via system_call)
    RIP: 0000003c2e6c5ff0  RSP: 00007fff078ab918  RFLAGS: 00000246
    RAX: ffffffffffffffda  RBX: ffffffff8005d28d  RCX: ffffffffffffffff
    RDX: 0000000000001000  RSI: 00002b3ff19f4000  RDI: 0000000000000005
    RBP: 000000001e6005a0   R8: 00000000ffffffff   R9: 0000000000000000
    R10: 0000000000000022  R11: 0000000000000246  R12: 0000000000000000
    R13: 0000000000000000  R14: 000000001e6005a0  R15: 00000000078ab9a8
    ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b

Comment 6 RHEL Program Management 2011-02-25 18:20:02 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 10 Andy Gospodarek 2011-03-02 15:48:31 UTC

Adding Stefan to the CC list as he is the niu maintainer and should have been assigned this bug.

Comment 14 Jarod Wilson 2011-03-03 20:34:58 UTC

in kernel-2.6.18-246.el5
You can download this test kernel (or newer) from http://people.redhat.com/jwilson/el5

Detailed testing feedback is always welcomed.

Comment 18 Weibing Zhang 2011-06-23 08:11:31 UTC

Reproduced on kernel-2.6.18-238.el5, execute "while true; do cat /proc/net/dev; done", kernel panic was hit after several ifup and ifdown on niu NIC. 
On  kernel-2.6.18-246.el5, running "while true; do cat /proc/net/dev; done" and 
"while true; do ifdown eth5; sleep 3; ifup eth5; sleep 3 ; done" in two ssh sessions, no panic is triggered in test test for about an hour.

Set Verified.

Comment 20 errata-xmlrpc 2011-07-21 10:10:31 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-1065.html

Note You need to log in before you can comment on or make changes to this bug.