Bug 1292902

Summary: rt: netpoll: live lock with NAPI polling and busy polling on realtime kernel
Product: Red Hat Enterprise Linux 7 Reporter: Clark Williams <williams>
Component: kernel-rt    Assignee: Clark Williams <williams>
kernel-rt sub component: Misc QA Contact: Zhang Kexin <kzhang>
Status: CLOSED ERRATA Docs Contact:
Severity: high    
Priority: high CC: bhu, daolivei, kzhang, lgoncalv, zshi
Version: 7.3    Keywords: ZStream
Target Milestone: rc   
Target Release: 7.3   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1293230 (view as bug list) Environment:
Last Closed: 2016-11-03 19:38:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1274397, 1282922, 1293230, 1295884, 1313485    
Attachments:
Description                                                             Flags
netpoll: Always take poll_lock when doing polling                       none
Revert "ixgbevf: Prevent livelock spinning grabbing ixgbevf_qv_lock"    none
revert "ixgbe: Prevent livelock spinning grabbing ixgbe_qv_lock"        none

Description Clark Williams 2015-12-18 16:55:55 UTC
A "live lock" has been seen in the NAPI polling capable NICs, such as ixgbe and sfc. 

Synchronization between NAPI polling and busy polling is done by looping on the NAPI_STATE_SCHED bit. This works fine on a non-RT kernel because a softirq cannot be preempted, and the poll routine is called with local_bh_disable(), which prevents softirqs from running and preempting it. On an RT kernel, however, this code can be preempted. Thus, a task may be preempted while it owns the NAPI_STATE_SCHED bit, opening a window for a livelock: another poller can then spin on the bit indefinitely, because the preempted owner never gets to run and clear it.
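
For illustration, the sketch below is a minimal user-space analogue of that pattern. It is not kernel code: all names (napi_state, owner_thread, busy_poller, STATE_SCHED) are invented for the example, the SCHED_FIFO priorities and CPU pinning are only described in comments, and the spin is bounded so the program terminates instead of actually livelocking. A lower-priority thread claims a "scheduled" bit and is then preempted; a busy poller loops trying to claim the same bit.

/*
 * Minimal user-space analogue of the spin-on-NAPI_STATE_SCHED pattern.
 * Illustrative only; names are local to this example.
 *
 * Build: gcc -O2 -pthread livelock-sketch.c -o livelock-sketch
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <unistd.h>

#define STATE_SCHED  0x1UL                 /* "poll in progress" bit */
#define SPIN_LIMIT   (100UL * 1000 * 1000)

static atomic_ulong napi_state;

/* Lower-priority "softirq/NAPI" side: owns the bit, then gets preempted. */
static void *owner_thread(void *arg)
{
	atomic_fetch_or(&napi_state, STATE_SCHED);   /* claim the bit */
	sleep(2);                                    /* simulated preemption */
	atomic_fetch_and(&napi_state, ~STATE_SCHED); /* finally release it */
	return NULL;
}

/* Higher-priority "busy poll" side: loops until it can claim the bit. */
static void *busy_poller(void *arg)
{
	unsigned long spins = 0;

	while (atomic_fetch_or(&napi_state, STATE_SCHED) & STATE_SCHED) {
		/*
		 * On an RT kernel, with this thread at a higher SCHED_FIFO
		 * priority and pinned to the CPU of the preempted owner,
		 * this loop would never exit: the owner cannot run to clear
		 * the bit.  That is the livelock.
		 */
		if (++spins == SPIN_LIMIT) {
			printf("still spinning after %lu iterations\n", spins);
			return NULL;
		}
	}
	atomic_fetch_and(&napi_state, ~STATE_SCHED); /* claimed it; release */
	return NULL;
}

int main(void)
{
	pthread_t owner, poller;

	pthread_create(&owner, NULL, owner_thread, NULL);
	usleep(100 * 1000);            /* let the owner claim the bit first */
	pthread_create(&poller, NULL, busy_poller, NULL);

	pthread_join(owner, NULL);
	pthread_join(poller, NULL);
	return 0;
}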

Comment 1 Clark Williams 2015-12-18 18:47:55 UTC
Created attachment 1107309 [details]
netpoll: Always take poll_lock when doing polling

Patch to synchronize NAPI polling and busy-polling to prevent live-lock.
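
As a rough sketch of why serializing on an actual lock helps: on an RT kernel a spinlock_t such as the per-NAPI poll_lock becomes a sleeping lock with priority inheritance, so a higher-priority busy poller that contends on it blocks and boosts the preempted owner instead of spinning on the state bit. The user-space analogue below (continuing the example in the description) models that with a PTHREAD_PRIO_INHERIT mutex; it only illustrates the shape of the fix and is not the attached patch, and the names (napi_side, busy_poll_side, poll_lock) are invented for the example.

/*
 * Sketch of the fix's shape: serialize the two pollers on a
 * priority-inheritance mutex standing in for the RT poll_lock.
 *
 * Build: gcc -O2 -pthread poll-lock-sketch.c -o poll-lock-sketch
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t poll_lock;

/* "softirq/NAPI" side: takes the lock, then gets "preempted" mid-poll. */
static void *napi_side(void *arg)
{
	pthread_mutex_lock(&poll_lock);
	sleep(2);                       /* simulated preemption while polling */
	pthread_mutex_unlock(&poll_lock);
	return NULL;
}

/* "busy poll" side: blocks on the lock instead of spinning on a bit. */
static void *busy_poll_side(void *arg)
{
	/*
	 * With priority inheritance, blocking here boosts the lock owner to
	 * the waiter's priority, so the owner can finish and release the
	 * lock; no livelock.
	 */
	pthread_mutex_lock(&poll_lock);
	printf("busy poller got the poll lock\n");
	pthread_mutex_unlock(&poll_lock);
	return NULL;
}

int main(void)
{
	pthread_mutexattr_t attr;
	pthread_t a, b;

	pthread_mutexattr_init(&attr);
	pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	pthread_mutex_init(&poll_lock, &attr);

	pthread_create(&a, NULL, napi_side, NULL);
	usleep(100 * 1000);            /* let the NAPI side take the lock */
	pthread_create(&b, NULL, busy_poll_side, NULL);

	pthread_join(a, NULL);
	pthread_join(b, NULL);
	return 0;
}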

Comment 2 Clark Williams 2015-12-18 18:49:34 UTC
Note: the RT engineering team originally thought this was a problem in the ixgbe driver code, but further BZs revealed that it is a consequence of how RT preemption interacts with the NAPI polling and busy-polling code in the network driver framework.

Comment 6 Luis Claudio R. Goncalves 2016-01-07 02:37:31 UTC
Created attachment 1112320 [details]
Revert "ixgbevf: Prevent livelock spinning grabbing ixgbevf_qv_lock"

Comment 7 Luis Claudio R. Goncalves 2016-01-07 02:38:26 UTC
Created attachment 1112321 [details]
revert "ixgbe: Prevent livelock spinning grabbing ixgbe_qv_lock"

Comment 9 Zhenjie Chen 2016-06-06 05:14:54 UTC
QE update:

Reproduced on 3.10.0-327.rt56.204.el7.x86_64 with a test like the one in https://bugzilla.redhat.com/show_bug.cgi?id=1293230#c14:

[ 1112.876788] INFO: rcu_preempt self-detected stall on CPU { 13}  (t=60000 jiffies g=4995 c=4994 q=0)          
[ 1112.876789] sending NMI to all CPUs:
[ 1112.876793] NMI backtrace for cpu 0
[ 1112.876796] CPU: 0 PID: 788 Comm: irq/86-0000:07: Not tainted 3.10.0-327.rt56.204.el7.x86_64 #1              
[ 1112.876797] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 12/14/2012
[ 1112.876799] task: ffff880416031780 ti: ffff880416040000 task.ti: ffff880416040000
[ 1112.876807] RIP: 0010:[<ffffffff810a9f8f>]  [<ffffffff810a9f8f>] migrate_disable+0xf/0xf0
[ 1112.876808] RSP: 0018:ffff880416043b38  EFLAGS: 00000203
[ 1112.876808] RAX: ffff880416043fd8 RBX: ffff88042f613680 RCX: 0000000000000020
[ 1112.876809] RDX: 0000000000000000 RSI: 0000000000000020 RDI: 0000000000000200
[ 1112.876810] RBP: ffff880416043b78 R08: 000000000000003c R09: 0000000000000001
[ 1112.876810] R10: ffff880419a1368e R11: ffff880416efc980 R12: 0000000000013680
[ 1112.876811] R13: 0000000000000200 R14: 0000000000000020 R15: ffff880416031780
[ 1112.876812] FS:  0000000000000000(0000) GS:ffff88042f600000(0000) knlGS:0000000000000000
[ 1112.876813] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1112.876813] CR2: 00000000006eb0f8 CR3: 00000000bb4b9000 CR4: 00000000000407f0
[ 1112.876814] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1112.876815] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1112.876824] Stack:
[ 1112.876827]  ffff880416043b78 ffffffff81501dd4 ffff8800bc82dca0 0000000000000200
[ 1112.876828]  ffff8804165fb000 ffff8800bc82dcb8 ffff880416efc980 0000000000000001
[ 1112.876830]  ffff880416043b98 ffffffff815024b1 ffff880416efc000 ffff8804165fb000
[ 1112.876831] Call Trace:
[ 1112.876836]  [<ffffffff81501dd4>] ? __netdev_alloc_frag+0x54/0xe0
[ 1112.876838]  [<ffffffff815024b1>] __alloc_rx_skb+0x51/0xb0
[ 1112.876840]  [<ffffffff8150252b>] __netdev_alloc_skb+0x1b/0x40
[ 1112.876869]  [<ffffffffa04c423f>] __efx_rx_packet+0xff/0x5f0 [sfc]
[ 1112.876877]  [<ffffffffa04c49d9>] efx_rx_packet+0x2a9/0x3f0 [sfc]
[ 1112.876884]  [<ffffffffa04be90b>] efx_ef10_ev_process+0x3bb/0x6b0 [sfc]
[ 1112.876887]  [<ffffffff81512ef9>] ? netif_receive_skb+0x89/0xe0
[ 1112.876893]  [<ffffffffa04a8469>] efx_process_channel+0x99/0x1b0 [sfc]
[ 1112.876898]  [<ffffffffa04a8760>] efx_poll+0xb0/0x230 [sfc]
[ 1112.876900]  [<ffffffff81513f5b>] net_rx_action+0x1fb/0x360
[ 1112.876903]  [<ffffffff81077558>] do_current_softirqs+0x1d8/0x3c0
[ 1112.876906]  [<ffffffff8110bfc0>] ? irq_thread_fn+0x50/0x50
[ 1112.876908]  [<ffffffff810777b4>] local_bh_enable+0x74/0xa0
[ 1112.876909]  [<ffffffff8110c001>] irq_forced_thread_fn+0x41/0x70
[ 1112.876911]  [<ffffffff8110c49f>] irq_thread+0x12f/0x180
[ 1112.876912]  [<ffffffff8110c080>] ? wake_threads_waitq+0x50/0x50
[ 1112.876914]  [<ffffffff8110c370>] ? irq_thread_check_affinity+0x30/0x30
[ 1112.876917]  [<ffffffff81099e41>] kthread+0xc1/0xd0
[ 1112.876919]  [<ffffffff81099d80>] ? kthread_worker_fn+0x170/0x170
[ 1112.876922]  [<ffffffff81631558>] ret_from_fork+0x58/0x90
[ 1112.876923]  [<ffffffff81099d80>] ? kthread_worker_fn+0x170/0x170
[ 1112.876934] Code: 75 08 48 83 87 88 07 00 00 01 e8 ed b1 ff ff 5d c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 65 48 8b 04 25 78 c0 00 00 <48> 89 e5 41 55 41 54 53 65 48 8b 1c 25 80 c0 00 00 f7 80 44 c0


Verified on 3.10.0-415.rt56.298.el7.x86_64.
Ran the reproducer for several hours; no problem found.

Comment 10 Beth Uptagrafft 2016-07-29 14:06:19 UTC
*** Bug 1273264 has been marked as a duplicate of this bug. ***

Comment 13 errata-xmlrpc 2016-11-03 19:38:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2016-2584.html