Bug 440648 - system/kernel lockup on service NetworkManager start
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel   
Version: rawhide
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL: http://www.smolts.org/show?uuid=pub_7...
Reported: 2008-04-04 12:05 UTC by Andrew Farris
Modified: 2008-04-30 02:29 UTC (History)

Doc Type: Bug Fix
Last Closed: 2008-04-10 23:27:30 UTC


Description Andrew Farris 2008-04-04 12:05:33 UTC
Description of problem:
NetworkManager makes my system lock up immediately when I start the service.
The system does not respond to any input or network traffic.

I had previously switched to using legacy configuration for my network devices
(two wired, one unused wireless) on this desktop machine.  I tried to return to
NM config but this lockup is making it impossible to test/workaround, and the
machine must be hard-reset each time (a plus is I'm giving my luks ext4
filesystem a good lashing, it recovers the journal each time).

Version-Release number of selected component (if applicable):

I've tried all these kernels:

How reproducible:
A sure bet.

Additional info:

I've got kernel debuginfo installed, but absolutely nothing gets printed to
terminal after NetworkManager starts (I get the new root terminal prompt, then
the cursor stops blinking and that's it...).  I have NM turned off by chkconfig
and am manually requesting start after the system has fully booted.

I have the configuration scripts set up like this:
$-> cat /etc/sysconfig/network-scripts/ifcfg-eth0
# ADMtek NC100 Network Everywhere Fast Ethernet 10/100

$-> cat /etc/sysconfig/network-scripts/ifcfg-eth1
# 3Com Corporation 3c905C-TX/TX-M [Tornado]

The tulip (Network Everywhere) card is turned off for NM because I know it has an
ifup/ifdown bug #431038 which causes a kernel stacktrace but never hard-locks
the system like this by itself.  Even with that not NM controlled the system
locks up.  I can use that device with legacy config but it sometimes causes the

Any clues where to attack this one?  If I leave service network running, and
just ifup eth0 or ifup eth1 the network works just fine.  If I stop network, and
then start NM I get a deadlocked kernel (same if I leave network running and
start NM).

I've tried getting NM started up without any configs in place and same thing

Comment 1 Dan Williams 2008-04-04 18:32:20 UTC
Is the capslock light flashing to indicate a kernel panic?

Can you turn NM off with chkconfig, then boot up and switch to a VT, then from
there start NetworkManager to capture some of the panic?  If the machine
hardlocks, it's almost certainly a kernel driver bug.
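
The steps requested above could be sketched roughly as follows. This is only an illustration, not something posted in this bug: `run`, `DRY_RUN`, and the two function names are hypothetical helpers, and raising the console log level with `dmesg -n 8` is an added assumption meant to make any kernel output visible on the VT.

```shell
# Sketch only: run as root from a text console (e.g. Ctrl+Alt+F2), not from X.
# run() and DRY_RUN are hypothetical helpers so the steps can be previewed safely.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1 (before rebooting): keep NetworkManager from starting at boot.
disable_nm_at_boot() {
    run chkconfig NetworkManager off
}

# Step 2 (after rebooting, on the VT): start NetworkManager by hand.
start_nm_on_console() {
    run dmesg -n 8                      # print all kernel messages on the console
    run service NetworkManager start    # any panic output should appear on this VT
}
```

Calling `DRY_RUN=1 start_nm_on_console` only prints the commands, which is a safe way to check the sequence before trying it on a machine that is expected to hang.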

Comment 2 Andrew Farris 2008-04-04 21:03:00 UTC
Capslock was not flashing any of the times it has happened so far (none of the LEDs were).

I have NM off via chkconfig already, and was doing the service start manually.  No output shows up other 
than the 'Starting NetworkManager..[ok]' and my next terminal prompt (I use a complicated custom $PS1, 
it finishes doing that).  Then, and only after the bash prompt.. the cursor stops blinking.

I've tried in runlevel 3 and 5.  Right now I don't know if it's actually a kernel lock or not because I have no 
network response to check (the machine has no active interface).

Comment 3 Andrew Farris 2008-04-04 21:06:56 UTC
BTW, until I turned off NM it was partially working at one time in rawhide, but it was not correctly 
configuring the wired interfaces then.  It was at least able to start the service, and it would use DHCP on 
both interfaces but without any DNS.

I know it was working with kernel-2.6.25-0.121.rc5.git4.fc9.i686 because that's the reason I've still got it 
installed.  It now fails with that kernel, so I'm not sure how to decide if it's a kernel bug or not.

Comment 4 Fernando Atrio 2008-04-05 15:04:52 UTC
I see the exact same bug on a Fedora 8 machine running the latest kernel update,
Everything works fine if I go back to kernel-
This is my smolt profile:

Comment 5 Andrew Farris 2008-04-05 15:30:25 UTC
That's interesting, Fernando.  It looks like we have a somewhat similar wireless adapter, although yours is 
using the rt61 driver and mine is using the rt2500pci driver.

Dan, I may be able to test using only the two wired adapters.  Other than me removing my wireless 
adapter, or blacklisting the driver, how do I turn it off or prevent NM from trying to configure it when the 
service starts?

Right now I do not have anything configured for that adapter in system-config-network at all.  There are 
no config scripts in /etc/sysconfig/network* for it either.  Would NM take any action trying to auto-
configure that adapter in this case?

Comment 6 Andrew Farris 2008-04-08 02:57:49 UTC
I have narrowed this down a little bit.  I used system-config-network to create
a configuration for ifcfg-wlan0 using the rt2500pci driver and setting the
adapter to be NM controlled.  The system now allows NetworkManager to start, and
control all the adapters.

The problem seems to be related to some attempt at auto configuration when the
kernel driver for the wireless device was loaded but no config was present yet.
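
For readers unfamiliar with these files, a minimal ifcfg file that hands a wireless device to NetworkManager might look like the sketch below. The reporter's actual ifcfg-wlan0 was not posted, so every value here is an assumption; only the file name, the device, and the fact that it was set to be NM controlled come from the comment above.

```shell
# Illustrative /etc/sysconfig/network-scripts/ifcfg-wlan0.
# All values are assumptions, not the reporter's actual file.
DEVICE=wlan0
TYPE=Wireless
BOOTPROTO=dhcp
ONBOOT=no
# Let NetworkManager manage this device instead of the legacy network service.
NM_CONTROLLED=yes
```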

Comment 7 Dan Williams 2008-04-10 16:16:47 UTC
Any chance you can revert to the broken setup and try sysrq on the box to see
where it's hanging?  Would be good to track down the hang even if it's not a
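
A rough sketch of the sysrq approach suggested above, assuming the standard procfs interface. The function names and the optional path arguments are hypothetical and exist only so the helpers can be tried against scratch files. On a machine that is already hung, the keyboard combination Alt+SysRq+T is the realistic way to request the dump, and even that may not respond if the kernel is fully deadlocked.

```shell
# Sketch only: run as root BEFORE reproducing the hang.
# The path arguments are hypothetical overrides; the defaults are the
# standard procfs files.
enable_sysrq() {
    # Writing 1 enables all SysRq functions until the next reboot.
    echo 1 > "${1:-/proc/sys/kernel/sysrq}"
}

dump_task_states() {
    # Same request as Alt+SysRq+T: log the state of every task to the kernel log.
    echo t > "${1:-/proc/sysrq-trigger}"
}
```

With SysRq enabled beforehand, the task dump (if the kernel still responds) should show up on the console or in the kernel log and indicate where things are stuck.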


Comment 8 Andrew Farris 2008-04-10 23:27:30 UTC
I removed the configuration for my wlan0 in /etc/sysconfig/network-scripts and
/etc/sysconfig/networking/{default,profiles/default}, then attempted to
reproduce the problem and I cannot.

When starting from a fresh boot, no configuration scripts in place for wlan,
ifconfig shows an unconfigured adapter and NM starts up fine.

I had to blacklist the tulip module to test this because of my tulip device problems,
but once I did, I could no longer get any hang (only tested the .204 kernel).  Going to
close until it can be reproduced.  I'll try with a fresh Preview install.
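
Blacklisting a module as described above could be done along these lines. This is a sketch under assumptions: the helper name and the default file name `blacklist-local.conf` are hypothetical, and the comment does not say which file the reporter actually edited.

```shell
# Sketch only: append a "blacklist <module>" line to a modprobe config file.
# The default file name is a hypothetical choice; pass another path as $2.
blacklist_module() {
    module=$1
    conf=${2:-/etc/modprobe.d/blacklist-local.conf}
    # Add the line only if it is not already present.
    grep -qxF "blacklist $module" "$conf" 2>/dev/null ||
        echo "blacklist $module" >> "$conf"
}
```

For example, `blacklist_module tulip` (as root) would keep the tulip driver from being loaded automatically on the next boot, while still allowing a manual `modprobe tulip`.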

Comment 9 Fernando Atrio 2008-04-30 02:29:44 UTC
Everything works fine here, running kernel- 
