Bug 581620 - [e1000e] spams userspace with netlink events
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-04-12 19:47 UTC by Marc Sauton
Modified: 2010-12-03 15:59 UTC (History)

Doc Type: Bug Fix
Last Closed: 2010-12-03 15:59:30 UTC

Description Marc Sauton 2010-04-12 19:47:55 UTC
Description of problem:

F12 x86_64, using iwl3945 for wireless and a VPN to the Red Hat office.

One core of my T61 is permanently busy, in the 20-100% CPU usage range, e.g.:
    9 root      20   0     0    0    0 R 100.7  0.0  12:24.35 events/0

I just found out that if I stop NetworkManager, the system load goes back to normal.


Version-Release number of selected component (if applicable):

Note: This laptop was upgraded to F12 in place; it is not a clean install.

It was OK for a few weeks, maybe a month.
I just did a yum update to get the latest packages, just in case, but nothing changed. It now has:

Linux testms 2.6.32.11-99.fc12.x86_64 #1 SMP Mon Apr 5 19:59:38 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

NetworkManager-0.7.998-2.git20100106.fc12.x86_64
NetworkManager-glib-0.7.998-2.git20100106.fc12.x86_64
NetworkManager-gnome-0.7.998-2.git20100106.fc12.x86_64
NetworkManager-vpnc-0.7.996-4.git20090921.fc12.x86_64


How reproducible:
always


Steps to Reproduce:
1. yum update
2. reboot
3. connect over wireless and VPN

  
Actual results:
"slow" laptop


Expected results:


Additional info:

Comment 1 Jirka Klimes 2010-04-21 10:43:53 UTC
It looks like a driver problem (possibly tied to particular hardware). I've found a number of posts indicating that.

Does it appear only with VPN, or does wireless alone cause the problem?
Is wired Ethernet OK? E.g. this thread reports problems in e1000e: http://www.mail-archive.com/e1000-devel@lists.sourceforge.net/msg02239.html

Could you try the 'network' service instead of NetworkManager to find out whether NM triggers it?

Some references:
http://bbs.archlinux.org/viewtopic.php?id=88781
http://www.linuxquestions.org/questions/linux-kernel-70/events-0-process-usage-441837/
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/537396

Comment 2 Dan Williams 2010-04-25 07:24:36 UTC
One thing to do (while still running NM) is:

rmmod iwl3945
modprobe iwl3945

or

rmmod e1000e
modprobe e1000e

If the problem goes away, then it's definitely a driver issue. If so, we'll punt it over to the kernel. Can you also grab the output of 'lspci' and paste it in here? Thanks!
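
To put a number on the before/after comparison, here is a minimal sketch (not from this thread) that samples the CPU share of a single kernel thread from /proc. It assumes the busy events/0 thread has PID 9, as in the top output in the description; the function names are illustrative only.

```python
import os
import time


def cpu_ticks(stat_line: str) -> int:
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split after the LAST ')' instead of on
    # whitespace from the start of the line.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(pid: int, interval: float = 5.0) -> float:
    """Sample a task twice and return its CPU usage as a percentage of one core."""
    path = f"/proc/{pid}/stat"
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    with open(path) as f:
        before = cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = cpu_ticks(f.read())
    return 100.0 * (after - before) / (hz * interval)
```

Running `print(cpu_percent(9))` once before and once after the rmmod/modprobe cycle should show whether reloading the driver makes a difference.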

Comment 3 sho.shimauchi 2010-05-03 01:27:32 UTC
I'm facing the same problem.
It probably started a few days ago, but I only noticed it today.

Description of problem:

PC: ThinkPad X61
Distribution: F12 x86_64

The events process eats my CPU (30-100%, varying periodically).

---
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
    9 root      20   0     0    0    0 R 59.7  0.0  13:32.10 events/0
---

If I stop NetworkManager, system load goes back to normal.


Version-Release number of selected component (if applicable):

uname -pr 
2.6.32.11-105.fc12.i686.PAE i686

NetworkManager-0.8.0-8.git20100426.fc12.i686
NetworkManager-pptp-0.8.0-1.git20100411.fc12.i686
NetworkManager-glib-0.8.0-8.git20100426.fc12.i686
NetworkManager-gnome-0.8.0-8.git20100426.fc12.i686
NetworkManager-openconnect-0.8.0-1.git20100411.fc12.i686
NetworkManager-vpnc-0.8.0-1.git20100411.fc12.i686
NetworkManager-openvpn-0.8-2.git20100411.fc12.i686



How reproducible:
always


Steps to Reproduce:
Boot the PC, and the problem occurs.


Actual results:
"slow" laptop

Comment 4 sho.shimauchi 2010-05-03 02:40:17 UTC
I unloaded e1000e, and the CPU exhaustion stopped.

# rmmod e1000e

I think the cause may be a network card problem.

Comment 5 sho.shimauchi 2010-05-03 12:56:07 UTC
I also ran 'rmmod e1000e; modprobe e1000e', and the problem did not occur afterwards.

This is my 'lspci' output.

---
# lspci
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:19.0 Ethernet controller: Intel Corporation 82566MM Gigabit Network Connection (rev 03)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
00:1f.0 ISA bridge: Intel Corporation 82801HBM (ICH8M-E) LPC Interface Controller (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) SATA AHCI Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 03)
03:00.0 Network controller: Intel Corporation PRO/Wireless 4965 AG or AGN [Kedron] Network Connection (rev 61)
05:00.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev ba)
05:00.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 04)
05:00.2 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 21)

---

Comment 6 Dan Williams 2010-05-04 21:48:16 UTC
Ok, over to kernel it is then.  NM has some netlink throttling code which could be broken, but the driver shouldn't be spamming userspace with netlink events either...

kernel team: I'm not entirely sure how to get it back in this state; NM doesn't do anything particularly interesting to the device except setting IFF_UP, asking for ethtool/MII information for carrier state support, and getting the MAC address.  Internally NM registers for a variety of netlink events like interface flags, newaddr/deladdr, newprefix, nduseropt, ipv6 info, etc.  We've had some issues with netlink spammage before:

rh bug #459205
novell #443429
lp #284507
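
The interface flags and carrier state mentioned above can also be read from sysfs, independently of NM, to see whether they change while the CPU load is high. A minimal sketch, assuming the usual /sys/class/net/<iface>/flags and carrier files; the interface name eth0 and the function names are assumptions for illustration, and NM itself uses netlink and ethtool/MII rather than sysfs:

```python
# Interface flag bits from <linux/if.h> (only a few common ones).
IFF_FLAGS = {
    0x1: "UP",
    0x2: "BROADCAST",
    0x8: "LOOPBACK",
    0x40: "RUNNING",
    0x1000: "MULTICAST",
}


def decode_flags(text):
    """Turn the hex string from /sys/class/net/<iface>/flags into flag names."""
    value = int(text.strip(), 16)
    return [name for bit, name in sorted(IFF_FLAGS.items()) if value & bit]


def read_link_state(iface, root="/sys/class/net"):
    """Return (flag names, carrier) for one interface; carrier may be None."""
    with open(f"{root}/{iface}/flags") as f:
        flags = decode_flags(f.read())
    try:
        with open(f"{root}/{iface}/carrier") as f:
            carrier = f.read().strip() == "1"
    except OSError:
        # Reading carrier fails while the interface is administratively down.
        carrier = None
    return flags, carrier
```

Calling `read_link_state("eth0")` in a loop with a short sleep, and printing only when the result changes, gives a rough picture of whether the link state is stable.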

Comment 7 Kirill Kolyshkin 2010-07-24 17:47:46 UTC
I have the very same problem with F12 on an X61s laptop. I will retry on the latest F12 kernel and report here soon.

Comment 8 Kirill Kolyshkin 2010-07-24 18:41:40 UTC
OK, latest F12 kernel, same problem.

Excerpt from top:


top - 22:39:47 up 5 min,  3 users,  load average: 1.43, 1.10, 0.52
Tasks: 185 total,   4 running, 181 sleeping,   0 stopped,   0 zombie
Cpu(s): 18.8%us, 62.5%sy,  0.0%ni, 12.5%id,  6.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   2020520k total,   492988k used,  1527532k free,    36640k buffers
Swap:  4063224k total,        0k used,  4063224k free,   252068k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
   10 root      20   0     0    0    0 R 103.9  0.0   2:42.59 events/1

$ lspci
00:00.0 Host bridge: Intel Corporation Mobile PM965/GM965/GL960 Memory Controller Hub (rev 0c)
00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:02.1 Display controller: Intel Corporation Mobile GM965/GL960 Integrated Graphics Controller (rev 0c)
00:19.0 Ethernet controller: Intel Corporation 82566MM Gigabit Network Connection (rev 03)
00:1a.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #4 (rev 03)
00:1a.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #5 (rev 03)
00:1a.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #2 (rev 03)
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)
00:1c.0 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 1 (rev 03)
00:1c.1 PCI bridge: Intel Corporation 82801H (ICH8 Family) PCI Express Port 2 (rev 03)
00:1d.0 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #1 (rev 03)
00:1d.1 USB Controller: Intel Corporation 82801H (ICH8 Family) USB UHCI Controller #2 (rev 03)
00:1d.7 USB Controller: Intel Corporation 82801H (ICH8 Family) USB2 EHCI Controller #1 (rev 03)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev f3)
00:1f.0 ISA bridge: Intel Corporation 82801HBM (ICH8M-E) LPC Interface Controller (rev 03)
00:1f.1 IDE interface: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) IDE Controller (rev 03)
00:1f.2 SATA controller: Intel Corporation 82801HBM/HEM (ICH8M/ICH8M-E) SATA AHCI Controller (rev 03)
00:1f.3 SMBus: Intel Corporation 82801H (ICH8 Family) SMBus Controller (rev 03)
03:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG [Golan] Network Connection (rev 02)
05:00.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev ba)
05:00.1 FireWire (IEEE 1394): Ricoh Co Ltd R5C832 IEEE 1394 Controller (rev 04)
05:00.2 SD Host controller: Ricoh Co Ltd R5C822 SD/SDIO/MMC/MS/MSPro Host Adapter (rev 21)

$ uname -a
Linux sus 2.6.32.16-141.fc12.i686 #1 SMP Wed Jul 7 04:47:25 UTC 2010 i686 i686 i386 GNU/Linux

Unloading e1000e helps.

Comment 9 Kirill Kolyshkin 2010-07-24 18:53:38 UTC
Same with 2.6.33 kernel from F13:
$ uname -a
Linux sus 2.6.33.6-147.fc13.i686 #1 SMP Tue Jul 6 22:30:55 UTC 2010 i686 i686 i386 GNU/Linux

$ top

top - 22:52:00 up 1 min,  2 users,  load average: 3.33, 1.05, 0.36
Tasks: 184 total,   2 running, 182 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.3%us, 31.7%sy,  0.0%ni, 61.0%id,  6.8%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   2020656k total,   363644k used,  1657012k free,    33808k buffers
Swap:  4063224k total,        0k used,  4063224k free,   171344k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            
    9 root      20   0     0    0    0 S 62.2  0.0   0:32.03 events/0

Comment 10 Chuck Ebbert 2010-07-25 20:43:05 UTC
(In reply to comment #6)
> Ok, over to kernel it is then.  NM has some netlink throttling code which could
> be broken, but the driver shouldn't be spamming userspace with netlink events
> either...
> 
> kernel team: I'm not entirely sure how to get it back in this state; NM doesn't
> do anything particularly interesting to the device except setting IFF_UP,
> asking for ethtool/MII information for carrier state support, and getting the
> MAC address.  Internally NM registers for a variety of netlink events like
> interface flags, newaddr/deladdr, newprefix, nduseropt, ipv6 info, etc.  We've
> had some issues with netlink spammage before:
> 
> rh bug #459205
> novell #443429
> lp #284507    

Can we get a dump of what events it's sending?

Comment 11 Manuel Bejarano 2010-08-10 19:28:16 UTC
Same issue here too:

Thinkpad x61s running Fedora 13:

Linux helena 2.6.33.6-147.2.4.fc13.x86_64 #1 SMP Fri Jul 23 17:14:44 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux

Unloading and loading back the e1000e module helps.

Comment 12 Chuck Ebbert 2010-08-11 12:17:09 UTC
Without a dump of exactly what events the adapter is sending I'll probably just have to close this bug.

Comment 13 Kirill Kolyshkin 2010-08-11 13:02:44 UTC
(In reply to comment #12)
> Without a dump of exactly what events the adapter is sending I'll probably just
> have to close this bug.    

Chuck,

You probably assume that every user who commented "me too" here knows how to obtain what you request. That assumption is wrong, so it would be extremely helpful if you could provide the command we need to run to collect the dump. I quickly googled for it, but with no luck. There is nothing in the system logs, etc.

I am currently on vacation and that notebook is not with me, but other people can do it, or I can after the 20th of Aug.
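
For reference, one possible way to capture such a dump is to open an rtnetlink socket, join the link and address multicast groups, and print every message that arrives. The following is only a sketch under that assumption, not a command anyone in this thread provided; the function names are illustrative, and only the four message types listed are decoded. The `ip monitor` command from iproute2 prints similar information.

```python
import socket
import struct
import time

NETLINK_ROUTE = 0  # rtnetlink protocol number
# Multicast group bitmasks from <linux/rtnetlink.h>
RTMGRP_LINK = 0x1
RTMGRP_IPV4_IFADDR = 0x10
RTMGRP_IPV6_IFADDR = 0x100

RTM_NAMES = {16: "RTM_NEWLINK", 17: "RTM_DELLINK",
             20: "RTM_NEWADDR", 21: "RTM_DELADDR"}

# struct nlmsghdr: length, type, flags, sequence number, sender port id
NLMSG_HDR = struct.Struct("=IHHII")


def parse_messages(data):
    """Return a list of (type name, ifindex or None, length) for one datagram."""
    events = []
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if length < NLMSG_HDR.size or offset + length > len(data):
            break  # truncated or malformed message; stop parsing
        name = RTM_NAMES.get(msg_type, f"type {msg_type}")
        ifindex = None
        # Both ifinfomsg (link) and ifaddrmsg (address) payloads carry the
        # interface index as a 4-byte integer at payload offset 4.
        if msg_type in RTM_NAMES and length >= NLMSG_HDR.size + 8:
            (ifindex,) = struct.unpack_from("=i", data, offset + NLMSG_HDR.size + 4)
        events.append((name, ifindex, length))
        offset += (length + 3) & ~3  # messages are aligned to 4 bytes
    return events


def listen():
    """Print one timestamped line per rtnetlink event until interrupted."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_ROUTE)
    # Port id 0 lets the kernel pick one; the second item is the group mask.
    sock.bind((0, RTMGRP_LINK | RTMGRP_IPV4_IFADDR | RTMGRP_IPV6_IFADDR))
    while True:
        for name, ifindex, length in parse_messages(sock.recv(65535)):
            print(f"{time.time():.3f} {name} ifindex={ifindex} len={length}")
```

Calling `listen()` while the events thread is busy and redirecting the output to a file would show which message types arrive and how often; the ifindex can be matched against the numbers printed by `ip link`.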

Comment 14 Bug Zapper 2010-11-03 17:17:26 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Bug Zapper 2010-12-03 15:59:30 UTC
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

