Description of problem:
The latest rawhide 4.11rc1 kernels are throwing

[ 196.472194] Unbalanced enable for IRQ 33
[ 196.476119] ------------[ cut here ]------------
[ 196.480731] WARNING: CPU: 5 PID: 40 at kernel/irq/manage.c:529 __enable_irq+0x74/0x84
[ 196.488553] Modules linked in: fuse rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables vfat fat ipmi_si ipmi_devintf crc32_ce crct10dif_ce amd_xgbe ghash_ce ptp spi_pl022 pps_core ccp i2c_designware_platform ipmi_msghandler i2c_designware_core nfsd auth_rpcgss nfs_acl lockd grace sunrpc xfs libcrc32c
[ 196.553414]
[ 196.554897] CPU: 5 PID: 40 Comm: ksoftirqd/5 Tainted: G W I 4.11.0-rc #263
[ 196.563327] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
[ 196.571931] task: ffff8003dc220a00 task.stack: ffff8003dc1e4000
[ 196.577841] PC is at __enable_irq+0x74/0x84
[ 196.582015] LR is at __enable_irq+0x74/0x84
[ 196.586189] pc : [<ffff000008132d04>] lr : [<ffff000008132d04>] pstate: 600001c5
[ 196.593576] sp : ffff8003dc1e7c40
[ 196.596881] x29: ffff8003dc1e7c40 x28: 0000000000000040
[ 196.602185] x27: ffff8003de92cce0 x26: 0000000000000000
[ 196.607489] x25: ffff000008df0000 x24: 0000000000000000
[ 196.612793] x23: ffff8003d02c0900 x22: 0000000000000040
[ 196.618096] x21: ffff8003de92cc80 x20: 0000000000000021
[ 196.623400] x19: ffff8003dc4f8a00 x18: 00000000fd515fca
[ 196.628703] x17: 00000000a619fadc x16: 00000000701097db
[ 196.634007] x15: 00000000a6b67fff x14: 0000000000000001
[ 196.639310] x13: 0000b68000003cdc x12: 0000000000000b68
[ 196.644614] x11: 0000000000000000 x10: 0000000000000006
[ 196.649917] x9 : 0000000000000006 x8 : 000000000000353a
[ 196.655221] x7 : 65206465636e616c x6 : ffff000008fd9e20
[ 196.660524] x5 : 0000000000000000 x4 : 0000000000000000
[ 196.665827] x3 : 0000000000000000 x2 : ffff8003fef787b0
[ 196.671131] x1 : 00008003f61d0000 x0 : 000000000000001c
[ 196.676434]
[ 196.677916] ---[ end trace 09dac66497f8a0d6 ]---
[ 196.682524] Call trace:
[ 196.684961] Exception stack(0xffff8003dc1e7a70 to 0xffff8003dc1e7ba0)
[ 196.691392] 7a60: ffff8003dc4f8a00 0001000000000000
[ 196.699215] 7a80: ffff8003dc1e7c40 ffff000008132d04 ffff8003dc1e7c40 ffff8003dc1e7c40
[ 196.707037] 7aa0: ffff8003dc1e7c00 00000000ffffffc8 ffff8003dc1e7b00 ffff00000812fdb8
[ 196.714859] 7ac0: ffff8003dc1e7bc0 ffff00000878aaa4 ffff8003c8ab4800 ffff000008b3b398
[ 196.722682] 7ae0: ffff8003dc1e7c40 ffff8003dc1e7c40 ffff8003dc1e7c00 00000000ffffffc8
[ 196.730504] 7b00: ffff8003dc1e7ba0 ffff0000081ebc80 000000000000001c 00008003f61d0000
[ 196.738326] 7b20: ffff8003fef787b0 0000000000000000 0000000000000000 0000000000000000
[ 196.746149] 7b40: ffff000008fd9e20 65206465636e616c 000000000000353a 0000000000000006
[ 196.753971] 7b60: 0000000000000006 0000000000000000 0000000000000b68 0000b68000003cdc
[ 196.761793] 7b80: 0000000000000001 00000000a6b67fff 00000000701097db 00000000a619fadc
[ 196.769618] [<ffff000008132d04>] __enable_irq+0x74/0x84
[ 196.774835] [<ffff000008132d54>] enable_irq+0x40/0x74
[ 196.779900] [<ffff0000017f3f94>] xgbe_one_poll+0x80/0xbc [amd_xgbe]
[ 196.786160] [<ffff00000878c06c>] net_rx_action+0x2fc/0x3f8
[ 196.791637] [<ffff0000080d0f5c>] __do_softirq+0x13c/0x2fc
[ 196.797026] [<ffff0000080d1164>] run_ksoftirqd+0x48/0x70
[ 196.802330] [<ffff0000080f02ac>] smpboot_thread_fn+0x16c/0x1ac
[ 196.808154] [<ffff0000080eb7fc>] kthread+0x100/0x12c
[ 196.813110] [<ffff000008082b70>] ret_from_fork+0x10/0x20
[ 196.818519] Unbalanced enable for IRQ 30
[ 196.822439] ------------[ cut here ]------------

warnings as fast as possible when under network load.

Version-Release number of selected component (if applicable):
4.11rc1

How reproducible:
100% of the time when running vnc/nfs/etc and loading the network adapter. There are a couple likely looking commits that may be causing this problem.

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
A fix is being merged: https://www.mail-archive.com/netdev@vger.kernel.org/msg157242.html
This landed in 4.11rc3:

commit d7aba644ffdebf756e51e26a2229055211838e89
Author: Lendacky, Thomas <Thomas.Lendacky>
Date:   Thu Mar 9 17:48:23 2017 -0600

    amd-xgbe: Enable IRQs only if napi_complete_done() is true

    Depending on the hardware, the amd-xgbe driver may use disable_irq_nosync()
    and enable_irq() when an interrupt is received to process Rx packets. If
    the napi_complete_done() return value isn't checked an unbalanced enable
    for the IRQ could result, generating a warning stack trace.

    Update the driver to only enable interrupts if napi_complete_done() returns
    true.

    Reported-by: Jeremy Linton <jeremy.linton>
    Signed-off-by: Tom Lendacky <thomas.lendacky>
    Signed-off-by: David S. Miller <davem>
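For anyone trying to follow why the missing return-value check produces the warning, here is a rough userspace model of the disable/enable bookkeeping. It is only a sketch, not the driver code: `toy_irq`, `toy_disable_irq`, `toy_enable_irq`, `toy_poll` and `toy_run` are all made-up names, and the scenario assumes the poll routine runs a second time after napi_complete_done() returned false, without the interrupt handler having disabled the IRQ again in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for the per-IRQ disable depth; NOT the kernel's struct irq_desc. */
struct toy_irq {
    int depth;     /* 0 = enabled, >0 = disabled that many times */
    int warnings;  /* number of unbalanced enables seen */
};

/* Models disable_irq_nosync(): every disable nests one level deeper. */
static void toy_disable_irq(struct toy_irq *irq)
{
    irq->depth++;
}

/* Models enable_irq(): enabling an already-enabled IRQ is "unbalanced". */
static void toy_enable_irq(struct toy_irq *irq)
{
    assert(irq->depth >= 0);
    if (irq->depth == 0) {
        irq->warnings++;
        fprintf(stderr, "Unbalanced enable for toy IRQ\n");
        return;
    }
    irq->depth--;
}

/*
 * One poll pass. complete_done stands in for the return value of
 * napi_complete_done(); check_return selects the fixed behaviour
 * (enable only when it is true) or the old one (always enable).
 */
static void toy_poll(struct toy_irq *irq, bool complete_done, bool check_return)
{
    if (!check_return || complete_done)
        toy_enable_irq(irq);
}

/* Assumed scenario: one disable from the handler, then two poll passes. */
static int toy_run(bool check_return)
{
    struct toy_irq irq = { 0, 0 };

    toy_disable_irq(&irq);                /* interrupt handler ran once */
    toy_poll(&irq, false, check_return);  /* completion refused, poll will rerun */
    toy_poll(&irq, true, check_return);   /* completion accepted */
    return irq.warnings;
}
```

In this model toy_run(false) counts one unbalanced enable, because the IRQ is enabled on both passes but was disabled only once, while toy_run(true) counts none, since the enable happens only on the pass where completion is accepted.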