Created attachment 370288
crash backtrace

Description of problem:
When resuming from suspend, NetworkManager will sometimes crash. The following will appear in /var/log/messages somewhere around 100k times:

Nov 18 10:30:57 vagabond NetworkManager: <WARN> nm_call_store_remove(): Trying to remove a non-existant call id.

The backtrace appears to show that NetworkManager enters a recursive function call until it crashes. I haven't gone through things closely enough to be sure...

Version-Release number of selected component (if applicable):
NetworkManager-0.7.996-6.git20091021.fc12.x86_64

As far as I know, this problem affects all of the NetworkManager releases under F11 and F12.

How reproducible:
Frequent.

Steps to Reproduce:
1. Suspend
2. Resume
3. Repeat

Actual results:
Eventually NetworkManager will crash, and the nm-applet will no longer appear in the Gnome panel. Restarting the NetworkManager service will cause the icon to appear once again.

More info:
I've attached a backtrace from gdb which has about 41MB of repetition removed. All of the lines removed are exactly the same as the repeating lines which appear in the backtrace. If it would help, I have the full backtrace and the core file (41MB and 26MB, respectively).
I've also seen this error a few times since upgrading from F11 to F12. I never saw it under F11, but I also switched from x86 to x86_64 when I upgraded, so it might be that it only affects 64-bit systems. There was no change in hardware, though; see my Smolt profile at <http://www.smolts.org/show?uuid=pub_42d1d513-8f1c-4746-9d6b-7301d0ba27aa>.

I see that the system log reports a crash in wpa_supplicant. I have no idea whether this is a consequence of the NM crash or its cause.

Tore
Created attachment 371261
Syslog output from when my laptop was resumed

The log is filtered through "uniq -c" to reduce its size and to make it more readable.
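For anyone who wants to post-process a similar log without the shell, here is a minimal Python sketch of the same kind of collapsing that "uniq -c" does: runs of identical adjacent lines become a single line prefixed with a count. The function names and the placeholder line are made up for illustration, the warning line is the one quoted in the original report, and the 7-character count column only roughly imitates uniq's layout.

```python
from itertools import groupby


def collapse_repeats(lines):
    """Collapse runs of identical adjacent lines into (count, line) pairs.

    Like `uniq -c`, only *adjacent* duplicates are merged; a line that
    reappears later starts a new run.
    """
    return [(sum(1 for _ in run), line) for line, run in groupby(lines)]


def format_collapsed(pairs):
    """Render (count, line) pairs with a right-aligned count column."""
    return [f"{count:7d} {line}" for count, line in pairs]


if __name__ == "__main__":
    # The warning is quoted from the report above; the other line is a
    # hypothetical placeholder, not real log output.
    warning = ("Nov 18 10:30:57 vagabond NetworkManager: <WARN> "
               "nm_call_store_remove(): Trying to remove a non-existant call id.")
    sample = ["(some other log line)"] + [warning] * 5
    for row in format_collapsed(collapse_repeats(sample)):
        print(row)
```

Because syslog lines carry a timestamp, only lines logged within the same second collapse together, which is still enough to shrink a flood like this one considerably.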
Thanks, Tore. I'd completely missed seeing that in the messages log. Now that you point it out, I see the same thing. abrt also saved the core file from wpa_supplicant. I've opened bug 539438 to track that one.
Looks like unexpected recursion in nm_supplicant_info_destroy?
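To illustrate the general failure mode being suggested here (a teardown routine that is re-entered from a callback it triggers itself), the following is a toy Python sketch. It is not NetworkManager code and not the upstream fix; the class, its fields, and the callback are all hypothetical, and the assumption that cancelling a pending call invokes a callback which requests destruction again is made purely for the sake of the example.

```python
class SupplicantInfo:
    """Hypothetical stand-in for an object with pending calls (not NM's real type)."""

    def __init__(self, guarded):
        self.guarded = guarded          # whether destroy() protects against re-entry
        self.pending_calls = [1, 2, 3]  # made-up outstanding call ids
        self.destroying = False

    def _on_call_cancelled(self, call_id):
        # Assumption for illustration: the cancellation callback asks for
        # the object to be destroyed again.
        self.destroy()

    def destroy(self):
        if self.guarded:
            if self.destroying:
                return                  # already tearing down: break the cycle
            self.destroying = True
        while self.pending_calls:
            call_id = self.pending_calls[0]
            self._on_call_cancelled(call_id)    # may re-enter destroy()
            self.pending_calls.remove(call_id)  # reached only after the callback returns


def recursion_overflows(info):
    """Return True if destroying `info` recurses until Python gives up."""
    try:
        info.destroy()
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print("unguarded overflows:", recursion_overflows(SupplicantInfo(guarded=False)))
    print("guarded overflows:  ", recursion_overflows(SupplicantInfo(guarded=True)))
```

In the unguarded case the call is still in the list when the callback re-enters destroy(), so each level sees the same pending call and recurses again; a C program in that situation would exhaust its stack rather than raise an exception.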
*** Bug 541476 has been marked as a duplicate of this bug. ***
Created attachment 405876
Backtrace as generated by abrt

I've seen this as well, and can confirm, like the others, that the trigger seems to be wpa_supplicant dying (due to a separate wpa_supplicant bug). NM should be able to cope with that but doesn't.

And like the others, /var/log/messages has many lines of the following form:

Apr 12 01:47:51 lert NetworkManager: <WARN> nm_call_store_remove(): Trying to remove a non-existant call id.

And for reference, here's what /var/log/messages said as it was bringing up the connection (before wpa_supplicant died), which shows state changes similar to Tore's:

Apr 12 01:47:49 lert NetworkManager: <info> Waking up...
Apr 12 01:47:49 lert NetworkManager: <info> (eth1): now managed
Apr 12 01:47:49 lert NetworkManager: <info> (eth1): device state change: 1 -> 2 (reason 2)
Apr 12 01:47:49 lert NetworkManager: <info> (eth1): bringing up device.
Apr 12 01:47:49 lert NetworkManager: <info> (eth1): preparing device.
Apr 12 01:47:49 lert NetworkManager: <info> (eth1): deactivating device (reason: 2).
Apr 12 01:47:49 lert NetworkManager: <info> (eth0): now managed
Apr 12 01:47:49 lert NetworkManager: <info> (eth0): device state change: 1 -> 2 (reason 2)
Apr 12 01:47:49 lert NetworkManager: <info> (eth0): bringing up device.
Apr 12 01:47:49 lert NetworkManager: <info> (eth0): preparing device.
Apr 12 01:47:49 lert NetworkManager: <info> (eth0): deactivating device (reason: 2).
Apr 12 01:47:49 lert kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Apr 12 01:47:49 lert kernel: wpa_supplicant[1469]: segfault at 18 ip 00007fa1a3a70b94 sp 00007fffccdb5390 error 4 in libc-2.11.1.so[7fa1a39fc000+16f000]
Apr 12 01:47:50 lert abrt[3804]: saved core dump of pid 1469 (/usr/sbin/wpa_supplicant) to /var/cache/abrt/ccpp-1271033269-1469.new/coredump (897024 bytes)
Apr 12 01:47:50 lert abrtd: Directory 'ccpp-1271033269-1469' creation detected
Apr 12 01:47:50 lert NetworkManager: <WARN> nm_call_store_remove(): Trying to remove a non-existant call id.
[ repeat times lots ]

In normal operation, I would have expected it instead to have output:

NetworkManager: <info> (eth1): supplicant interface state: starting -> ready

Also in case it helps at all, I am attaching the backtrace generated by abrt. This has the slight advantage of including some of the local variables, although admittedly to my untutored eye, this may not be useful in practice. But I've attached it anyway, just in case.
Upstream fix is 5a01a0b39e634e2cf3c378deb73f15b16645b76e.
NetworkManager-0.8.0-7.git20100422.fc13 has been submitted as an update for Fedora 13. http://admin.fedoraproject.org/updates/NetworkManager-0.8.0-7.git20100422.fc13
NetworkManager-0.8.0-7.git20100422.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.
*** Bug 579821 has been marked as a duplicate of this bug. ***
NetworkManager-0.7.2.997-1.fc11 has been submitted as an update for Fedora 11. http://admin.fedoraproject.org/updates/NetworkManager-0.7.2.997-1.fc11
Hmm, since both F13 and F11 have been fixed, has F12 been forgotten? I ask because it has just happened yet again (NetworkManager-0.8.0-6.git20100408.fc12.x86_64). I can send an abrt log if it's meant to have been fixed in this version.
NetworkManager-0.7.2.997-1.fc11 has been pushed to the Fedora 11 stable repository. If problems still persist, please make note of it in this bug report.