Description of problem: Checking /var/log/messages after a reboot, I notice a NM shutdown 'crash'. messages attached. Grepping old message files shows NM crashes starting on or about 9 March. Not sure if they are all the same. System is Thinkpad X60 with 3945 wifi. Version-Release number of selected component (if applicable): NetworkManager-0.7.0-0.9.1.svn3440.fc9.i386 How reproducible: Steps to Reproduce: 1. 2. 3. Actual results: Expected results: Additional info:
Created attachment 298789 [details] /var/log/messages snippet showing NM shutdown crash
Ah ... just noticed this: * Tue Mar 18 2008 Dan Williams <dcbw> - 1:0.7.0-0.9.1.svn3472 - Fix crashes on suspend/resume and exit (rh #437426) - Ensure there's always an option to chose the wired device - Never set default route via an IPv4 link-local addressed device (rh #437338) I'll update and retest, and close if this goes away.
Created attachment 298797 [details] complete /var/log/messages showing NM crash on shutdown I still seem to get a crash on shutdown. I attach the entire /var/log/messages
Created attachment 298888 [details] /var/log/messages showing possible circ. lockdep + NM crash I am getting these crashes every shutdown. Last time, I got the following (not sure its related or a different problem) right before the above type crash. Full '/var/log/messages' attached. Mar 23 20:36:32 localhost kernel: [ INFO: possible circular locking dependency detected ] Mar 23 20:36:32 localhost kernel: 2.6.25-0.139.rc6.git5.fc9 #1 Mar 23 20:36:32 localhost kernel: ------------------------------------------------------- Mar 23 20:36:32 localhost kernel: NetworkManager/2477 is trying to acquire lock: Mar 23 20:36:32 localhost kernel: ((name)){--..}, at: [flush_workqueue+0/133] flush_workqueue+0x0/0x85 Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: but task is already holding lock: Mar 23 20:36:32 localhost kernel: (rtnl_mutex){--..}, at: [rtnetlink_rcv+18/38] rtnetlink_rcv+0x12/0x26 Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: which lock already depends on the new lock. Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: the existing dependency chain (in reverse order) is: Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: -> #2 (rtnl_mutex){--..}: Mar 23 20:36:32 localhost kernel: [__lock_acquire+2713/3089] __lock_acquire+0xa99/0xc11 Mar 23 20:36:32 localhost kernel: [lock_acquire+106/144] lock_acquire+0x6a/0x90 Mar 23 20:36:32 localhost kernel: [mutex_lock_nested+219/625] mutex_lock_nested+0xdb/0x271 Mar 23 20:36:32 localhost kernel: [rtnl_lock+15/17] rtnl_lock+0xf/0x11 Mar 23 20:36:32 localhost kernel: [<f8e53628>] ieee80211_associated+0x15c/0x19b [mac80211] Mar 23 20:36:32 localhost kernel: [<f8e55eaa>] ieee80211_sta_work+0x15ad/0x172a [mac80211] Mar 23 20:36:32 localhost kernel: [run_workqueue+211/417] run_workqueue+0xd3/0x1a1 Mar 23 20:36:32 localhost kernel: [worker_thread+182/194] worker_thread+0xb6/0xc2 Mar 23 20:36:32 localhost kernel: [kthread+59/97] kthread+0x3b/0x61 Mar 23 20:36:32 localhost kernel: [kernel_thread_helper+7/16] kernel_thread_helper+0x7/0x10 Mar 23 20:36:32 localhost kernel: [<ffffffff>] 0xffffffff Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: -> #1 (&ifsta->work){--..}: Mar 23 20:36:32 localhost kernel: [__lock_acquire+2713/3089] __lock_acquire+0xa99/0xc11 Mar 23 20:36:32 localhost kernel: [run_workqueue+205/417] run_workqueue+0xcd/0x1a1 Mar 23 20:36:32 localhost kernel: [worker_thread+182/194] worker_thread+0xb6/0xc2 Mar 23 20:36:32 localhost kernel: [kthread+59/97] kthread+0x3b/0x61 Mar 23 20:36:32 localhost kernel: [kernel_thread_helper+7/16] kernel_thread_helper+0x7/0x10 Mar 23 20:36:32 localhost kernel: [<ffffffff>] 0xffffffff Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: -> #0 ((name)){--..}: Mar 23 20:36:32 localhost kernel: [__lock_acquire+2488/3089] __lock_acquire+0x9b8/0xc11 Mar 23 20:36:32 localhost kernel: [lock_acquire+106/144] lock_acquire+0x6a/0x90 Mar 23 20:36:32 localhost kernel: [flush_workqueue+68/133] flush_workqueue+0x44/0x85 Mar 23 20:36:32 localhost kernel: [<f8e4b421>] ieee80211_stop+0x28d/0x348 [mac80211] Mar 23 20:36:32 localhost kernel: [dev_close+82/111] dev_close+0x52/0x6f Mar 23 20:36:32 localhost kernel: [dev_change_flags+159/338] dev_change_flags+0x9f/0x152 Mar 23 20:36:32 localhost kernel: [do_setlink+586/764] do_setlink+0x24a/0x2fc Mar 23 20:36:32 localhost kernel: [rtnetlink_rcv_msg+418/444] rtnetlink_rcv_msg+0x1a2/0x1bc Mar 23 20:36:32 localhost kernel: [netlink_rcv_skb+48/134] netlink_rcv_skb+0x30/0x86 Mar 23 20:36:32 localhost kernel: [rtnetlink_rcv+30/38] rtnetlink_rcv+0x1e/0x26 Mar 23 20:36:32 localhost kernel: [netlink_unicast+439/533] netlink_unicast+0x1b7/0x215 Mar 23 20:36:32 localhost kernel: [netlink_sendmsg+600/613] netlink_sendmsg+0x258/0x265 Mar 23 20:36:32 localhost kernel: [sock_sendmsg+222/249] sock_sendmsg+0xde/0xf9 Mar 23 20:36:32 localhost kernel: [sys_sendmsg+319/402] sys_sendmsg+0x13f/0x192 Mar 23 20:36:32 localhost kernel: [sys_socketcall+363/392] sys_socketcall+0x16b/0x188 Mar 23 20:36:32 localhost kernel: [syscall_call+7/11] syscall_call+0x7/0xb Mar 23 20:36:32 localhost kernel: [<ffffffff>] 0xffffffff Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: other info that might help us debug this: Mar 23 20:36:32 localhost kernel: Mar 23 20:36:32 localhost kernel: 1 lock held by NetworkManager/2477: Mar 23 20:36:32 localhost kernel: stack backtrace: Mar 23 20:36:32 localhost kernel: Pid: 2477, comm: NetworkManager Not tainted 2.6.25-0.139.rc6.git5.fc9 #1 Mar 23 20:36:32 localhost kernel: [print_circular_bug_tail+91/102] print_circular_bug_tail+0x5b/0x66 Mar 23 20:36:32 localhost kernel: [print_circular_bug_entry+57/67] ? print_circular_bug_entry+0x39/0x43 Mar 23 20:36:32 localhost kernel: [__lock_acquire+2488/3089] __lock_acquire+0x9b8/0xc11 Mar 23 20:36:32 localhost kernel: [lock_acquire+106/144] lock_acquire+0x6a/0x90 Mar 23 20:36:32 localhost kernel: [flush_workqueue+0/133] ? flush_workqueue+0x0/0x85 Mar 23 20:36:32 localhost kernel: [flush_workqueue+68/133] flush_workqueue+0x44/0x85 Mar 23 20:36:32 localhost kernel: [flush_workqueue+0/133] ? flush_workqueue+0x0/0x85 Mar 23 20:36:32 localhost kernel: [<f8e4b421>] ieee80211_stop+0x28d/0x348 [mac80211] Mar 23 20:36:32 localhost kernel: [_spin_unlock_bh+37/40] ? _spin_unlock_bh+0x25/0x28 Mar 23 20:36:32 localhost kernel: [dev_close+82/111] dev_close+0x52/0x6f Mar 23 20:36:32 localhost kernel: [dev_change_flags+159/338] dev_change_flags+0x9f/0x152 Mar 23 20:36:32 localhost kernel: [do_setlink+586/764] do_setlink+0x24a/0x2fc Mar 23 20:36:32 localhost kernel: [_read_unlock+29/32] ? _read_unlock+0x1d/0x20 Mar 23 20:36:32 localhost kernel: [rtnl_setlink+226/230] rtnl_setlink+0xe2/0xe6 Mar 23 20:36:32 localhost kernel: [rtnl_setlink+0/230] ? rtnl_setlink+0x0/0xe6 Mar 23 20:36:32 localhost kernel: [rtnetlink_rcv_msg+418/444] rtnetlink_rcv_msg+0x1a2/0x1bc Mar 23 20:36:32 localhost kernel: [rtnetlink_rcv_msg+0/444] ? rtnetlink_rcv_msg+0x0/0x1bc Mar 23 20:36:32 localhost kernel: [netlink_rcv_skb+48/134] netlink_rcv_skb+0x30/0x86 Mar 23 20:36:32 localhost kernel: [rtnetlink_rcv+30/38] rtnetlink_rcv+0x1e/0x26 Mar 23 20:36:32 localhost kernel: [netlink_unicast+439/533] netlink_unicast+0x1b7/0x215 Mar 23 20:36:32 localhost kernel: [netlink_sendmsg+600/613] netlink_sendmsg+0x258/0x265 Mar 23 20:36:32 localhost kernel: [sock_sendmsg+222/249] sock_sendmsg+0xde/0xf9 Mar 23 20:36:32 localhost kernel: [autoremove_wake_function+0/51] ? autoremove_wake_function+0x0/0x33 Mar 23 20:36:32 localhost kernel: [native_sched_clock+181/209] ? native_sched_clock+0xb5/0xd1 Mar 23 20:36:32 localhost kernel: [sched_clock+8/11] ? sched_clock+0x8/0xb Mar 23 20:36:32 localhost kernel: [lock_release_holdtime+26/277] ? lock_release_holdtime+0x1a/0x115 Mar 23 20:36:32 localhost kernel: [fget_light+142/185] ? fget_light+0x8e/0xb9 Mar 23 20:36:32 localhost kernel: [copy_from_user+57/289] ? copy_from_user+0x39/0x121 Mar 23 20:36:32 localhost kernel: [verify_iovec+64/111] ? verify_iovec+0x40/0x6f Mar 23 20:36:32 localhost kernel: [sys_sendmsg+319/402] sys_sendmsg+0x13f/0x192 Mar 23 20:36:32 localhost kernel: [sys_recvmsg+366/379] ? sys_recvmsg+0x16e/0x17b Mar 23 20:36:32 localhost kernel: [sys_sendto+181/195] ? sys_sendto+0xb5/0xc3 Mar 23 20:36:32 localhost kernel: [dput+156/239] ? dput+0x9c/0xef Mar 23 20:36:32 localhost kernel: [copy_to_user+59/266] ? copy_to_user+0x3b/0x10a Mar 23 20:36:32 localhost kernel: [cp_new_stat64+228/246] ? cp_new_stat64+0xe4/0xf6 Mar 23 20:36:32 localhost kernel: [sys_socketcall+363/392] sys_socketcall+0x16b/0x188 Mar 23 20:36:32 localhost kernel: [syscall_call+7/11] syscall_call+0x7/0xb Mar 23 20:36:32 localhost kernel: =======================
Ooops.... previous if from kernel-2.6.25-0.139.rc6.git5.fc9.i686, Thinkpad X60, full Rawhide.
Hmm; looks like dbus-glib is having issues when shutting stuff down; perhaps dbus has already been shut down? I'll have to investigate.
Created attachment 299656 [details] st-NetworkManager-stop-erasedpackage.txt I see this also, although I didn't notice it on shutdown... I accidentally removed NetworkManager while using it to manage the wired network configurations on the system. The network shutdown to remove the package caused the traces in the attachment. The tulip issue as mentioned there is just another current issue with the machine that was triggered by this network shutdown.
I haven't seen one of these since 8 April. Anyone still seeing this?
I have seen NM fail to stop but I haven't noticed this crash specifically. Tom could you try duplicating my 'remove NM' and see whether it triggers this bug before you close it? I'll try it again as well. Maybe normal use doesn't show it anymore but that might.
Sure, but how? ============================================================================= Package Arch Version Repository Size ============================================================================= Removing: NetworkManager i386 1:0.7.0-0.9.1.svn3549.fc9 installed 2.5 M Removing for dependencies: NetworkManager-glib i386 1:0.7.0-0.9.1.svn3549.fc9 installed 127 k NetworkManager-gnome i386 1:0.7.0-0.9.1.svn3549.fc9 installed 826 k NetworkManager-vpnc i386 1:0.7.0-0.7.7.svn3549.fc9 installed 329 k PackageKit i386 0.1.11-1.fc9 installed 1.4 M PackageKit-libs i386 0.1.11-1.fc9 installed 253 k bigboard i386 0.5.31-2.fc9 installed 2.1 M evolution i386 2.22.1-1.fc9 installed 39 M evolution-exchange i386 2.22.1-1.fc9 installed 2.4 M finch i386 2.4.1-2.fc9 installed 534 k gnome-packagekit i386 0.1.11-3.fc9 installed 1.7 M krb5-auth-dialog i386 0.7-7.fc9 installed 52 k libpurple i386 2.4.1-2.fc9 installed 20 M libpurple-perl i386 2.4.1-2.fc9 installed 1.0 M libpurple-tcl i386 2.4.1-2.fc9 installed 59 k nautilus-sendto i386 0.14.0-1.fc9 installed 348 k online-desktop i386 0.2.25-1.fc9 installed 271 k pidgin i386 2.4.1-2.fc9 installed 2.2 M pidgin-debuginfo i386 2.4.1-2.fc9 installed 24 M pidgin-libnotify i386 0.13-3.fc9 installed 51 k pidgin-otr i386 3.1.0-3.fc9 installed 156 k system-config-printer i386 0.7.82.2-1.fc9 installed 1.3 M yum-packagekit i386 0.1.11-1.fc9 installed 2.5 k Transaction Summary ============================================================================= Install 0 Package(s) Update 0 Package(s) Remove 23 Package(s) Is this ok [y/N]: n Exiting on user Command Complete! [root@localhost ~]# and [root@localhost ~]# rpm -e NetworkManager error: Failed dependencies: libnm-util.so.0 is needed by (installed) NetworkManager-glib-0.7.0-0.9.1.svn3549.fc9.i386 libnm-util.so.0 is needed by (installed) NetworkManager-vpnc-0.7.0-0.7.7.svn3549.fc9.i386 libnm-util.so.0 is needed by (installed) NetworkManager-gnome-0.7.0-0.9.1.svn3549.fc9.i386 NetworkManager >= 1:0.7.0-0.9.1.svn3549 is needed by (installed) NetworkManager-vpnc-0.7.0-0.7.7.svn3549.fc9.i386 NetworkManager = 1:0.7.0-0.9.1.svn3549.fc9 is needed by (installed) NetworkManager-gnome-0.7.0-0.9.1.svn3549.fc9.i386 [root@localhost ~]#
Well when I caused it to happen thats basically the remove operation I did, then put them all back. If you do try it you might want to have the packages in yum cache in case you have network problems getting it back up. I removed PackageKit which pulled out NM and most of those other packages with it. It was probably NM shutting down before removal that caused the error, so it may not be necessary to do anything except try: rpm -e --nodeps NetworkManager then put it back.
Yuck..... OK. I did 'rpm -e --nodeps NetworkManager' and rebooted. There was no "NM crash at shutdown". "rpm -ivh NetworkManager....." completed without error. All appears to function as expected.
I tried this and can't make the crash happen this way.
OK. Since no visible opposition, I'm closing this.