Bug 541476

Summary: NetworkManager dies upon return from suspend
Product: [Fedora] Fedora
Reporter: Sean Waite <swaite>
Component: NetworkManager
Assignee: Dan Williams <dcbw>
Status: CLOSED DUPLICATE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium
Priority: low
Version: 12
CC: dcbw, ivan.mironov, jnicolet, torvalds
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2010-02-09 02:03:08 UTC

Description Sean Waite 2009-11-26 01:12:00 UTC
Description of problem:

When the laptop resumes from suspend, NetworkManager segfaults and dies.


Version-Release number of selected component (if applicable):

Fedora 12 x86_64 running on a Lenovo T61.

NetworkManager-0.7.996-6.git20091021.fc12.x86_64

How reproducible:

Intermittent. It happens every once in a while, not on every resume.

Steps to Reproduce:
1. Suspend the laptop
2. Resume the laptop
  
Actual results:

NetworkManager is no longer running when the OS resumes from suspend.

Expected results:

NetworkManager comes back up, retrieves a DHCP lease, etc.

Additional info:


From /var/log/messages:


Nov 25 19:50:08 erebus NetworkManager: <info>  Waking up...
Nov 25 19:50:08 erebus NetworkManager: <info>  (eth0): now managed
Nov 25 19:50:08 erebus NetworkManager: <info>  (eth0): device state change: 1 -> 2 (reason 2)
Nov 25 19:50:08 erebus NetworkManager: <info>  (eth0): bringing up device.
Nov 25 19:50:08 erebus kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Nov 25 19:50:08 erebus NetworkManager: <info>  (eth0): preparing device.
Nov 25 19:50:08 erebus NetworkManager: <info>  (eth0): deactivating device (reason: 2).
Nov 25 19:50:08 erebus NetworkManager: <info>  (wlan0): now managed
Nov 25 19:50:08 erebus NetworkManager: <info>  (wlan0): device state change: 1 -> 2 (reason 2)
Nov 25 19:50:08 erebus NetworkManager: <info>  (wlan0): bringing up device.
Nov 25 19:50:08 erebus kernel: Registered led device: iwl-phy0::radio
Nov 25 19:50:08 erebus kernel: Registered led device: iwl-phy0::assoc
Nov 25 19:50:08 erebus kernel: Registered led device: iwl-phy0::RX
Nov 25 19:50:08 erebus kernel: Registered led device: iwl-phy0::TX
Nov 25 19:50:08 erebus NetworkManager: <info>  (wlan0): preparing device.
Nov 25 19:50:08 erebus kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready
Nov 25 19:50:08 erebus NetworkManager: <info>  (wlan0): deactivating device (reason: 2).
Nov 25 19:50:08 erebus kernel: wpa_supplicant[24195]: segfault at 18 ip 00000038c66746b4 sp 00007fff16396610 error 4 in libc-2.11.so[38c6600000+16f000]
Nov 25 19:50:08 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.
Nov 25 19:50:08 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.
Nov 25 19:50:08 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.

.........

Nov 25 19:50:10 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.
Nov 25 19:50:10 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.
Nov 25 19:50:10 erebus abrtd: Directory 'ccpp-1259196610-1389' creation detected
Nov 25 19:50:10 erebus abrtd: Lock file '/var/cache/abrt/ccpp-1259196610-1389.lock' is locked by process 25534
Nov 25 19:50:10 erebus abrt: saved core dump of pid 1389 to /var/cache/abrt/ccpp-1259196610-1389/coredump
Nov 25 19:50:10 erebus abrtd: Getting local universal unique identification...
Nov 25 19:50:10 erebus abrtd: Crash is in database already
Nov 25 19:50:10 erebus abrtd: Already saved crash, deleting...




The "nm_call_store_remove()" warning repeats about 41000 times before NetworkManager gives up and crashes; I only truncated the middle of the log.
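
For anyone triaging a similar log, here is a rough sketch of how the excerpt above could be summarized instead of read line by line: it counts the nm_call_store_remove() warnings per timestamp and picks out any kernel segfault report that precedes them. The regular expression and the `summarize` helper are assumptions made for illustration, based only on the line shapes visible in this excerpt; they are not part of NetworkManager or abrt.

```python
import re
from collections import Counter

# Syslog line shape seen in the excerpt:
#   "Nov 25 19:50:08 erebus NetworkManager: <WARN>  message"
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) (\S+) ([^:\[\s]+)(?:\[\d+\])?: (.*)$"
)

def summarize(lines):
    """Count nm_call_store_remove() warnings per timestamp and collect
    kernel segfault reports, in the order they appear."""
    warnings = Counter()
    segfaults = []
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None:
            continue  # not a syslog-style line (e.g. the "........." elision)
        stamp, _host, prog, msg = m.groups()
        if prog == "NetworkManager" and "nm_call_store_remove()" in msg:
            warnings[stamp] += 1
        elif prog == "kernel" and "segfault at" in msg:
            segfaults.append((stamp, msg))
    return warnings, segfaults

# A few lines copied from the excerpt above (hypothetical test input).
SAMPLE = [
    "Nov 25 19:50:08 erebus kernel: wpa_supplicant[24195]: segfault at 18 ip 00000038c66746b4 sp 00007fff16396610 error 4 in libc-2.11.so[38c6600000+16f000]",
    "Nov 25 19:50:08 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.",
    "Nov 25 19:50:08 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.",
    "Nov 25 19:50:08 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.",
    ".........",
    "Nov 25 19:50:10 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.",
    "Nov 25 19:50:10 erebus NetworkManager: <WARN>  nm_call_store_remove(): Trying to remove a non-existant call id.",
    "Nov 25 19:50:10 erebus abrtd: Directory 'ccpp-1259196610-1389' creation detected",
]

warnings, segfaults = summarize(SAMPLE)
for stamp, msg in segfaults:
    print("segfault:", stamp, msg.split(":")[0])
for stamp, count in sorted(warnings.items()):
    print("warnings:", stamp, count)
```

To run it against a real log, pass an open file instead of SAMPLE, e.g. summarize(open("/var/log/messages")); reading that file normally requires root.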

Comment 1 Linus Torvalds 2009-12-05 00:36:15 UTC
My daughter just reported this same bug. And since I don't want to give her the root password, she's kind of screwed when this happens - no network at school.

Same thing: a long stream of

   nm_call_store_remove(): Trying to remove a non-existant call id.

in the logs, followed by a core dump (except core dumps are disabled, so it just says "core ulimit 0").

Very annoying. And I think it's recent; I don't think I saw this when I was testing that laptop before giving it to her.

Comment 2 Dan Williams 2009-12-05 01:31:52 UTC
Yes, I believe this is caused by a wpa_supplicant segfault.  NM should certainly be more resilient here.
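
As a reading aid for the wpa_supplicant segfault line in the description, the following sketch decodes the fields the kernel prints: the faulting address, the x86 page-fault error code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode), and the instruction pointer's offset within the reported libc mapping. The `decode_segfault` helper and its regular expression are hypothetical and written only against the one line in this report; mapping the offset to a function name would additionally need the matching libc debuginfo, which is not included here.

```python
import re

# Shape of the kernel message in the description:
#   "prog[pid]: segfault at ADDR ip IP sp SP error CODE in OBJ[BASE+SIZE]"
SEGFAULT_RE = re.compile(
    r"(?P<prog>[^\s\[]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\s\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)

def decode_segfault(msg):
    """Return a dict describing a kernel segfault message, or None."""
    m = SEGFAULT_RE.search(msg)
    if m is None:
        return None
    err = int(m["err"], 16)  # the kernel prints the error code in hex
    info = {
        "program": m["prog"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "cause": "protection violation" if err & 1 else "page not present",
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
        "object": m["obj"],
        "offset": None,
    }
    if m["base"] is not None:
        # Offset of the faulting instruction inside the reported mapping.
        info["offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info

LINE = ("wpa_supplicant[24195]: segfault at 18 ip 00000038c66746b4 "
        "sp 00007fff16396610 error 4 in libc-2.11.so[38c6600000+16f000]")

info = decode_segfault(LINE)
print(info["program"], info["mode"], info["access"], info["cause"])
print("fault address:", hex(info["fault_addr"]))
print("offset in", info["object"] + ":", hex(info["offset"]))
```

For the line in this report, that is a user-mode read of an unmapped page at address 0x18, with the faulting instruction at offset 0x746b4 into the libc-2.11.so mapping. A fault address that small is typical of reading a field through a NULL pointer, though the log alone does not prove that.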

Comment 3 Dan Williams 2010-02-09 02:03:08 UTC

*** This bug has been marked as a duplicate of bug 538717 ***