Bug 413281 - NetworkManager after an update to F8 first promptly crashes and later refuses to connect (DHCP, wired)
Summary: NetworkManager after an update to F8 first promptly crashes and later refuses...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: NetworkManager
Version: 8
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: ---
Assignee: Dan Williams
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-12-06 01:32 UTC by Michal Jaegermann
Modified: 2008-10-20 17:46 UTC
CC List: 2 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2008-10-20 15:33:26 UTC
Type: ---
Embargoed:


Attachments
bug-buddy spillage for nm-applet (291.74 KB, text/plain)
2007-12-06 01:32 UTC, Michal Jaegermann

Description Michal Jaegermann 2007-12-06 01:32:59 UTC
Description of problem:

On the first login to a GNOME session after upgrading the system
to F8 (maybe this was a bad move as, among other things, the
laptop now takes ages to start a session) nm-applet promptly
crashed.  A bug-buddy report was saved and is attached (the new
format is completely unreadable to me, while the old format at
least gave me clues about what was happening).

On subsequent logins NetworkManager did not crash but it
stubbornly claims that the network is disconnected.  As a matter
of fact this network is connected, and it is a wired network
with a DHCP server running on it.  It does not matter whether eth0
was already brought up manually or is down; NetworkManager
is adamant that it is not there.  The old version of NetworkManager
from FC6 never had this problem, so this is a sharp regression.

I do not know at this moment what will happen on wireless but
this is bad enough even without NetworkManager issues on
networks with fixed IPs.

Version-Release number of selected component (if applicable):
NetworkManager-0.7.0-0.6.6.svn3109.fc8

How reproducible:
it just does not work

Comment 1 Michal Jaegermann 2007-12-06 01:32:59 UTC
Created attachment 278971
bug-buddy spillage for nm-applet

Comment 2 Michal Jaegermann 2007-12-06 17:17:58 UTC
It occurred to me that on the older installation unplugging and
replugging the ethernet cable was immediately reflected in the
status shown by nm-applet, and the connection switched back and
forth between wired and wireless when both were active.  Currently
unplugging and replugging that cable has no effect.

Comment 3 Warren Togami 2007-12-06 18:15:37 UTC
In response to Comment #2:
If you have e1000, you could be hitting a known kernel bug.


Comment 4 Michal Jaegermann 2007-12-06 19:25:48 UTC
> If you have e1000 .....

No such luck, I am afraid.

+-1e.0-[0000:02-06]--+-05.0  Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+

and in dmesg:

8139too Fast Ethernet driver 0.9.28
ACPI: PCI Interrupt 0000:02:05.0[A] -> GSI 19 (level, low) -> IRQ 19
eth0: RealTek RTL8139 at 0xd0184000, 00:00:e2:9d:4f:fb, IRQ 19
eth0:  Identified 8139 chip type 'RTL-8100B/8139D'

OTOH I just noticed that _something_ adds the line
alias eth1 tulip
to /etc/modprobe.conf, and the 'tulip' module is loaded although it
is not detected or used in any manner.  No idea where this is coming
from.  There is surely only one ethernet interface on this laptop
(Acer Travelmate 230).

The wireless card happens to use an Atheros chip (and I have not yet
tried to insert it since the switch to F8).

Comment 5 Michal Jaegermann 2007-12-14 00:00:18 UTC
I had a chance to look at the problem again and the picture is a bit
more complicated.

There are two user accounts on this laptop.  On one the situation
is as described in this report.  On the other one NetworkManager
does recognize both wired and wireless networks.  Before the switch
to F8 this worked as expected on both.

Killing the nm-applet process on the account where this does not work
and running 'nm-applet --sm-disable' under strace is not very
illuminating.  I see access and stat64 calls on /usr/bin/bug-buddy but
bug-buddy does not start.  There was the initial report attached in
comment #1 and after that I did not see anything of that sort.  I see
a bunch of rt_sigaction calls after the last bug-buddy occurrence
so I gather this is a "preparatory move".

Any ideas what is really going on?

Dropping an existing network connection on logout from a user account,
even if it was brought up outside of NetworkManager and nm-applet
steadfastly claims "No network connection", is a diabolical concept
with a pretty negative impact on attempts to debug this.  That part
works too well even when everything else fails.

Comment 6 Gustavo Maciel Dias Vieira 2008-01-05 21:25:05 UTC
I have exactly the same problem, but with a fresh install of F8 instead of an
upgrade. I installed F8 over F7 but preserved the /home partition. For at least
one account moved over from F7 the problem manifests itself; for a newly created
account it works perfectly.

For the record: the first time I started my session nm-applet crashed; after
that it refuses to connect to the wired network, even when told to.

Network device: Broadcom Corporation BCM4401 100Base-T
Network driver: b44
NetworkManager-0.7.0-0.6.6.svn3109.fc8

This bug might be related to bug #385821

Finally, if someone knows how to reset only the NM properties to that of a
freshly created account, this would be a nice workaround. :)


Comment 7 Gustavo Maciel Dias Vieira 2008-01-21 21:54:37 UTC
NetworkManager-0.7.0-0.6.6.svn3138.fc8 solved the problem for me. I can now
switch from the wired to the wireless network, and plugging/unplugging the
network cable connects/disconnects the wired network as before.


Comment 8 Dan Williams 2008-02-12 03:49:03 UTC
Can you please test with latest F8 NM updates and report if this is still a
problem?  Thanks!

Comment 9 Michal Jaegermann 2008-02-19 02:58:38 UTC
Apologies for a delay.  

I just updated the laptop in question to NetworkManager-0.7.0-0.6.7.svn3235.fc8
and nothing changed.  That means that on one account nm-applet is
finding networks.  On the other one, on which it crashed as described
in the original report, I have not managed since then to get it to
connect to any network.

This is not "critical" with this laptop at the moment, so I am trying
to refrain from drastic steps - like a complete rebuild of that account -
in the hope that one day I will figure out what is happening.  I tried,
among other things, to strace 'nm' and I do not see where it even
_tries_ to raise any network connections.  I may be missing something
as all that stuff is very far from clear.  Maybe one day I will find
some time to walk through the sources with a debugger.

It may happen that at some point I will be forced to destroy
all the evidence.  I would not mind some hints on where to look.

A missing network when nobody is logged in is really a PITA.  It means
that cron jobs which require network connections will not run
when that machine is merely powered up.  Not a show stopper
but quite annoying.

Comment 10 Michal Jaegermann 2008-02-20 07:13:51 UTC
I still do not know how nm-applet is doing its job.  It looks
like it is multithreaded and very quickly one is in a maze.
But it appears that it reads through
.gconf/system/networking/connections/ for a given account.

Looking there for the working one, i.e. where nm-applet is
able to find both wired and wireless connections, I see the
following files (some of these empty):

connections/1/802-3-ethernet/%gconf.xml
connections/1/connection/%gconf.xml
connections/1/%gconf.xml
connections/2/802-11-wireless/%gconf.xml
connections/2/802-11-wireless-security/%gconf.xml
connections/2/connection/%gconf.xml
connections/2/%gconf.xml
connections/%gconf.xml

On the account where the reported crash happened only
those below are currently present.  No idea what dropped
the missing ones.  nm-applet?  There were no problems of that sort
before the crash.

connections/1/802-11-wireless/%gconf.xml
connections/1/802-11-wireless-security/%gconf.xml
connections/1/connection/%gconf.xml
connections/1/%gconf.xml
connections/%gconf.xml

Once in that state things seem to be stuck and nothing restores
the missing subdirectory of 'connections'.  I have not yet tried
whether removing 'connections' would force a re-creation.  Even if
this would induce a recovery it does not look like an "obvious"
thing to do.

Sure enough, toggling the status of the wireless card makes nm-applet
connect and disconnect through it, on any account, while the wired
link is stubbornly ignored on that "broken" one.
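
The comparison between the two accounts made by hand above can be
sketched as a small script.  This is only a sketch, assuming the
on-disk layout shown in this comment
(connections/<n>/<setting>/%gconf.xml).  The function names
list_connection_settings and lacks_wired are made up for
illustration and are not part of NetworkManager or GConf.

```python
import os


def list_connection_settings(connections_dir):
    """Map each connection subdirectory (e.g. '1', '2') to the set of
    setting directories found in it (e.g. '802-3-ethernet').

    Assumes the layout seen in this report:
    connections/<n>/<setting>/%gconf.xml
    """
    result = {}
    for entry in sorted(os.listdir(connections_dir)):
        conn_path = os.path.join(connections_dir, entry)
        if not os.path.isdir(conn_path):
            continue  # skips the top-level %gconf.xml file
        settings = set()
        for sub in os.listdir(conn_path):
            # A setting counts only if it is a directory holding %gconf.xml.
            if os.path.isfile(os.path.join(conn_path, sub, "%gconf.xml")):
                settings.add(sub)
        result[entry] = settings
    return result


def lacks_wired(settings_by_connection):
    """True if no stored connection carries an '802-3-ethernet' setting."""
    return not any(
        "802-3-ethernet" in settings
        for settings in settings_by_connection.values()
    )
```

Called on ~/.gconf/system/networking/connections of each account,
lacks_wired() would return True for the listing of the "broken"
account above and False for the working one.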

Comment 11 Michal Jaegermann 2008-03-17 00:58:37 UTC
I am in the dark as to why this happened but I think that
I eventually "solved" the problem (or rather found some magic
allowing me to work around it).

I noticed that after my failed attempts a month ago I ended up with
the following files in ~/.gconf/system/networking/connections on
the account in question (files only and sorted by time):

    0 2007-12-05 17:50 ./%gconf.xml
    0 2007-12-05 17:50 ./1/%gconf.xml
  557 2007-12-05 17:50 ./1/connection/%gconf.xml
 1338 2007-12-05 17:50 ./1/802-11-wireless-security/%gconf.xml
 2401 2007-12-05 17:50 ./1/802-11-wireless/%gconf.xml
    0 2008-02-19 18:46 ./2/%gconf.xml
  540 2008-02-19 19:15 ./2/connection/%gconf.xml
  393 2008-02-19 19:15 ./2/802-11-wireless-security/%gconf.xml
 1378 2008-02-19 19:15 ./2/802-11-wireless/%gconf.xml

Those below ./1/ are from the time of NetworkManager crash and
although something was added in ./2/ NetworkManager was steadfastly
refusing to recognize a wired interface and I could not find
a way to convince it otherwise.

I decided to remove ~/.gconf/system/networking/ and try again.
That actually made the wired interface active (it took quite a while
to recognize it but it managed). Following that I had an
exceedingly hard time getting NetworkManager to connect on
wireless.  I ended up with multiple entries of this sort in
/var/log/wpa_supplicant.log:
Trying to associate with 00:13:10:1b:6f:68 (SSID='fermat' freq=2437 MHz)
Authentication with 00:00:00:00:00:00 timed out.

despite the fact that iwlist and iwconfig from a command line
indicated no problems and a high signal strength.
After various tries I suddenly got:

Trying to associate with 00:13:10:1b:6f:68 (SSID='fermat' freq=2437 MHz)
Associated with 00:13:10:1b:6f:68
CTRL-EVENT-CONNECTED - Connection to 00:13:10:1b:6f:68 completed (reauth) [id=0 id_str=]

and so far this seems to be working.  Hopefully this behaviour will
not be habitual.
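
To see how many tries it took before the association succeeded, the
log entries quoted above can be counted with a short script.  This
is a rough sketch that assumes the log contains lines worded exactly
like the ones quoted in this comment.  The name summarize_wpa_log is
made up for illustration and is not part of wpa_supplicant.

```python
import re

# Patterns taken from the log lines quoted in this comment.
PATTERNS = {
    "associate_attempts": re.compile(r"Trying to associate with "),
    "auth_timeouts": re.compile(r"Authentication with \S+ timed out\."),
    "connected": re.compile(r"CTRL-EVENT-CONNECTED "),
}


def summarize_wpa_log(lines):
    """Count association attempts, authentication timeouts and
    successful connections in an iterable of log lines."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

It could be fed an open file object for /var/log/wpa_supplicant.log
(reading that file usually needs root).  Many timeouts for each
"connected" entry would match the behaviour described here.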

After all these adventures in a newly recreated 
~/.gconf/system/networking/connections there are the following files:

    0 2008-03-16 17:18 ./%gconf.xml
    0 2008-03-16 17:18 ./1/%gconf.xml
    0 2008-03-16 17:23 ./2/%gconf.xml
  541 2008-03-16 18:06 ./1/connection/%gconf.xml
  308 2008-03-16 18:06 ./1/802-3-ethernet/%gconf.xml
  540 2008-03-16 18:06 ./2/connection/%gconf.xml
  393 2008-03-16 18:06 ./2/802-11-wireless-security/%gconf.xml
 1376 2008-03-16 18:06 ./2/802-11-wireless/%gconf.xml

Quite a bit different results under ./1/.
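
For anybody who wants to try the same workaround without destroying
the evidence, the directory can be renamed instead of removed.  What
follows is a minimal sketch of that idea and not something tested in
this report.  The function name set_aside_networking and the
".bak-<timestamp>" suffix are made up for illustration.  It is also
an assumption that this is best done while logged out of the desktop
session, since a running gconfd may still hold the old values.

```python
import os
import shutil
import time


def set_aside_networking(home, stamp=None):
    """Rename <home>/.gconf/system/networking to a timestamped backup.

    Returns the backup path, or None if the directory does not exist.
    The old settings stay on disk for later comparison.
    """
    src = os.path.join(home, ".gconf", "system", "networking")
    if not os.path.isdir(src):
        return None
    if stamp is None:
        stamp = time.strftime("%Y%m%d-%H%M%S")
    dst = src + ".bak-" + stamp
    if os.path.exists(dst):
        # Refuse to overwrite an earlier backup.
        raise FileExistsError(dst)
    shutil.move(src, dst)
    return dst
```

Calling set_aside_networking(os.path.expanduser("~")) leaves the old
'connections' tree under the backup name, so it can still be compared
with whatever gets re-created on the next login.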

Comment 12 Dan Williams 2008-10-20 15:33:26 UTC
If this is still an issue with latest NetworkManager from F8 updates (svn4022.4 or later) please re-open!  A number of fixes have gone in since March that might affect this problem.

Comment 13 Michal Jaegermann 2008-10-20 17:46:55 UTC
> If this is still an issue ...
AFAIK the current NM from F8 does not suffer from that bug.
It may ask for a keyring password on a wired connection
where no passwords exist but that looks minor.

