Bug 590202 - NM spontaneously reconnects, seemingly for no reason
Status: CLOSED CURRENTRELEASE
Product: Fedora
Classification: Fedora
Component: NetworkManager (Show other bugs)
Version: 12
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Dan Williams
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Reported: 2010-05-08 00:20 EDT by Scott Schmit
Modified: 2013-01-10 03:08 EST (History)
Doc Type: Bug Fix
Last Closed: 2012-01-04 15:47:02 EST


Attachments
syslog extract during reconnect (9.20 KB, application/octet-stream)
2010-05-08 00:20 EDT, Scott Schmit
syslog extract during reconnects (5.23 MB, text/plain)
2010-05-13 00:16 EDT, Scott Schmit

Description Scott Schmit 2010-05-08 00:20:27 EDT
Created attachment 412479
syslog extract during reconnect

Description of problem:
Every now and again, my wireless connection will disconnect and immediately reconnect on its own.

Version-Release number of selected component (if applicable):
NetworkManager-0.8.0-12.git20100504.fc12.x86_64

How reproducible:
I'm not sure what triggers this. I'm hoping the log offers some clue.

Actual results:
Occasionally the network will reconnect for no apparent reason.

Expected results:
I would expect NM to stay connected, considering that I generally have a strong wireless connection (98% right now).

Additional info:
SELinux:  77 classes, 164035 rules
wlan0: deauthenticating from xx:xx:xx:xx:xx:xx by local choice (reason=3)
wlan0: deauthenticating from xx:xx:xx:xx:xx:xx by local choice (reason=3)
wlan0: direct probe to AP xx:xx:xx:xx:xx:xx (try 1)
wlan0: direct probe to AP xx:xx:xx:xx:xx:xx (try 2)
wlan0: direct probe responded
wlan0: authenticate with AP xx:xx:xx:xx:xx:xx (try 1)
wlan0: authenticated
wlan0: associate with AP xx:xx:xx:xx:xx:xx (try 1)
wlan0: RX AssocResp from xx:xx:xx:xx:xx:xx (capab=0x411 status=0 aid=1)
wlan0: associated
wlan0: deauthenticating from xx:xx:xx:xx:xx:xx by local choice (reason=3)
wlan0: deauthenticating from xx:xx:xx:xx:xx:xx by local choice (reason=3)
wlan0: direct probe to AP xx:xx:xx:xx:xx:xx (try 1)
wlan0: direct probe responded
wlan0: authenticate with AP xx:xx:xx:xx:xx:xx (try 1)
wlan0: authenticated
wlan0: associate with AP xx:xx:xx:xx:xx:xx (try 1)
wlan0: RX AssocResp from xx:xx:xx:xx:xx:xx (capab=0x411 status=0 aid=1)
wlan0: associated
...
Comment 1 Dan Williams 2010-05-10 18:03:08 EDT
Looks like IPv6 autoconf nameservers are timing out because you don't get a router advertisement before their expiration, which could be due to a number of things.

What package version of NM do you have installed?  'rpm -q NetworkManager' should tell you.

Second, you could set the IPv6 method to "ignored" in the connection editor for this connection to work around the problem until we figure out what's going on.
Comment 2 Scott Schmit 2010-05-10 19:54:12 EDT
As I put in the original report:

Version-Release number of selected component (if applicable):
NetworkManager-0.8.0-12.git20100504.fc12.x86_64

On the server end, I have:
radvd-1.5-2.fc12.i686

My /etc/radvd.conf:
interface eth1
{
	AdvSendAdvert on;
	AdvLinkMTU 1280;
	MaxRtrAdvInterval 30;

	prefix fdxx:xxxx:xxxx::/64
	{
		AdvOnLink on;
		AdvAutonomous on;
		AdvRouterAddr on;
	};

	prefix 0:0:0:1::/64
	{
		Base6to4Interface eth0;
		AdvPreferredLifetime 120;
		AdvValidLifetime 300;
	};

	RDNSS fdxx:xxxx:xxxx:0:yyyy:yyyy:yyyy:yyyy fdxx:xxxx:xxxx:0:zzzz:zzzz:zzzz:zzzz {
	};
};
Comment 3 Dan Williams 2010-05-12 15:08:55 EDT
Try this to get more debug information out of IPv6:

1) service NetworkManager stop
2) /usr/sbin/NetworkManager --log-level=debug
3) try to reproduce the issue

Then let's look at the logs.  You should see something like:

(eth0): found RA-provided nameserver fdxx:xxxx:xxxx:0:yyyy:yyyy:yyyy:yyyy
 (expires in xxx seconds)
(eth0): IPv6 RDNSS information expired
(eth0): removing expired RA-provided nameserver fdxx:xxxx:xxxx:0:yyyy:yyyy:yyyy:yyyy

or something like that.
Comment 4 Scott Schmit 2010-05-13 00:16:01 EDT
Created attachment 413618
syslog extract during reconnects

This gives you the high-level context:
# Started NetworkManager --log-level=debug (output in attachment)
# Start script to track output of STATE in "nmcli nm":
May 12 21:59:23.313422733  -> connected       
May 12 22:26:50.558083825 connected        -> disconnected    
May 12 22:26:53.485197991 disconnected     -> connecting      
May 12 22:26:58.902270479 connecting       -> connected       
May 12 23:04:30.532938662 connected        -> disconnected    
May 12 23:04:33.468890671 disconnected     -> connecting      
May 12 23:04:37.840516138 connecting       -> connected       
May 12 23:52:54.546409659 connected        -> disconnected    
May 12 23:52:57.460776132 disconnected     -> connecting      
May 12 23:53:06.879834666 connecting       -> connected       
# End capture
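
To put numbers on how far apart the reconnects are, the capture above can be fed through a short script. This is only a sketch: the year is not in the log, so 2010 is assumed from the date of this comment, and the function names are made up for illustration.

```python
from datetime import datetime

# State transitions copied verbatim from the capture above.
LOG = """\
May 12 21:59:23.313422733  -> connected
May 12 22:26:50.558083825 connected        -> disconnected
May 12 22:26:53.485197991 disconnected     -> connecting
May 12 22:26:58.902270479 connecting       -> connected
May 12 23:04:30.532938662 connected        -> disconnected
May 12 23:04:33.468890671 disconnected     -> connecting
May 12 23:04:37.840516138 connecting       -> connected
May 12 23:52:54.546409659 connected        -> disconnected
May 12 23:52:57.460776132 disconnected     -> connecting
May 12 23:53:06.879834666 connecting       -> connected
"""


def parse_timestamp(line):
    """Parse the leading 'Mon DD HH:MM:SS.nnnnnnnnn' fields of a log line."""
    month, day, clock = line.split()[:3]
    # strptime's %f accepts at most six fractional digits, so the
    # nanosecond field is truncated to microseconds.
    clock = clock[:15]
    # Assumption: the year is not logged; 2010 is taken from the comment date.
    return datetime.strptime(f"2010 {month} {day} {clock}", "%Y %b %d %H:%M:%S.%f")


def disconnect_intervals(log):
    """Return the seconds elapsed between consecutive disconnect events."""
    times = [
        parse_timestamp(line)
        for line in log.splitlines()
        if line.rstrip().endswith("-> disconnected")
    ]
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


intervals = disconnect_intervals(LOG)
for seconds in intervals:
    print(f"{seconds / 60:.1f} minutes between disconnects")
```

For this capture the gaps come out at roughly 38 and 48 minutes, so the drops do not look like they follow a fixed period.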
Comment 5 Dan Williams 2010-05-14 12:49:39 EDT
Yeah, so that "MaxRtrAdvInterval" causes the DNS servers in the RA to be valid for only 30 seconds.  If another advertisement is not received within 30 seconds, the nameservers are no longer valid and must be removed from resolv.conf.  At this time, if nameservers are sent in the RA, NetworkManager will fail the IPv6 connection when the nameservers become invalid.

I've padded the expiry by 10 seconds upstream to give a little slack just in case the router advertisement doesn't filter up exactly at the timeout.  Let's see if this helps; otherwise we can re-open and decide how to proceed.

Upstream commit is 0b8ee13ee0ae6b6a344473e674849c05aca3ca08
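
As a rough illustration of why the padding should help, here is a toy model of the timing. It is not NetworkManager's code: the function name and the RA arrival times are invented for the example, and only the 30-second lifetime and the 10-second padding come from the explanation above.

```python
RDNSS_LIFETIME = 30   # seconds; the validity described above for MaxRtrAdvInterval 30
EXPIRY_PADDING = 10   # seconds of slack, as described for the upstream change


def nameserver_survives(ra_times, lifetime=RDNSS_LIFETIME, padding=0):
    """Return True if each RA arrives before the previous one's padded expiry.

    ra_times is a list of RA arrival times in seconds, in ascending order.
    """
    for previous, current in zip(ra_times, ra_times[1:]):
        if current - previous > lifetime + padding:
            # The nameserver would have expired before this RA refreshed it.
            return False
    return True


# Hypothetical arrival times: the third RA shows up 33 seconds after the second.
arrivals = [0, 28, 61, 90]

without_padding = nameserver_survives(arrivals)
with_padding = nameserver_survives(arrivals, padding=EXPIRY_PADDING)
print("survives without padding:", without_padding)
print("survives with padding:   ", with_padding)
```

In this model a single RA that is a few seconds late is enough to expire the nameserver when there is no padding, while the padded timeout tolerates it. An RA that is lost entirely would still exceed the padded timeout.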
Comment 6 Bug Zapper 2010-11-03 11:21:27 EDT
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping
Comment 7 Bug Zapper 2010-12-03 09:56:36 EST
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
Comment 8 Matthew Garrett 2011-06-09 21:58:08 EDT
I'm seeing this with NetworkManager-0.8.999-1.fc16.x86_64:

NetworkManager[11375]: <debug> [1307670675.785698] [nm-ip6-manager.c:265] rdnss_expired(): (wlan4): IPv6 RDNSS information expired
NetworkManager[11375]: <debug> [1307670675.785868] [nm-ip6-manager.c:299] set_rdnss_timeout(): (wlan4): removing expired RA-provided nameserver 2001:470:1f07:1371::1
NetworkManager[11375]: <info> (wlan4): device state change: activated -> failed (reason 'ip-config-unavailable') [100 120 5]
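
For anyone who wants to check their own debug log for the same sequence, something like the following could be used. It is a sketch only: the regular expressions are written against the three lines quoted above, and other NetworkManager versions may word these messages differently.

```python
import re

# The three log lines quoted above.
LOG_LINES = [
    "NetworkManager[11375]: <debug> [1307670675.785698] [nm-ip6-manager.c:265] "
    "rdnss_expired(): (wlan4): IPv6 RDNSS information expired",
    "NetworkManager[11375]: <debug> [1307670675.785868] [nm-ip6-manager.c:299] "
    "set_rdnss_timeout(): (wlan4): removing expired RA-provided nameserver "
    "2001:470:1f07:1371::1",
    "NetworkManager[11375]: <info> (wlan4): device state change: activated -> "
    "failed (reason 'ip-config-unavailable') [100 120 5]",
]

# Assumption: patterns are derived only from the lines above.
EXPIRED = re.compile(r"\((?P<dev>\w+)\): IPv6 RDNSS information expired")
FAILED = re.compile(
    r"\((?P<dev>\w+)\): device state change: \S+ -> failed "
    r"\(reason '(?P<reason>[^']+)'\)"
)


def failures_after_rdnss_expiry(lines):
    """Return (device, reason) pairs for failures that follow an RDNSS expiry."""
    expired_devices = set()
    hits = []
    for line in lines:
        match = EXPIRED.search(line)
        if match:
            expired_devices.add(match.group("dev"))
            continue
        match = FAILED.search(line)
        if match and match.group("dev") in expired_devices:
            hits.append((match.group("dev"), match.group("reason")))
    return hits


hits = failures_after_rdnss_expiry(LOG_LINES)
print(hits)
```

A non-empty result suggests the device failure followed an RDNSS expiry on the same interface, which is the pattern in the lines above.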
Comment 9 Mathieu Chouquet-Stringer 2011-06-15 12:26:52 EDT
I get the same thing with F15.  The sucky part is that I use an NFS server, and when NM does this, my xterm or any application in or using a directory mounted through NFS just dies...

NetworkManager[864]: <info> (wlan0): supplicant interface state: completed -> disconnected
NetworkManager[864]: <warn> Couldn't disconnect supplicant interface: This interface is not connected.

I'll add traces as soon as I have some.
Comment 10 Mathieu Chouquet-Stringer 2011-06-15 15:07:31 EDT
Well, strace tells me the processes just get killed; it's really annoying.

11950 poll([{fd=3, events=POLLIN}, {fd=4, events=POLLIN|POLLPRI|POLLRDNORM|POLLRDBAND}], 2, 0) = 0 (Timeout)
11950 select(6, [3 4 5], [], NULL, NULL <unfinished ...>
11952 +++ killed by SIGKILL +++
11950 <... select resumed> )            = 1 (in [5])
11950 --- {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=11952, si_status=SIGKILL, si_utime=15, si_stime=42} (Child exited) ---
11950 wait4(-1, NULL, 0, NULL)          = 11952
11950 rt_sigaction(SIGCHLD, {0x425110, [CHLD], SA_RESTORER|SA_RESTART, 0x371f035300}, {0x425110, [], SA_RESTORER|SA_RESTART|SA_NOCLDSTOP, 0x371f035300}, 8) = 0
11950 wait4(-1, NULL, WNOHANG, NULL)    = -1 ECHILD (No child processes)

11950 was an xterm and 11952 was zsh:

11950 execve("/usr/bin/xterm", ["xterm"], ["SSH_AGENT_PID=1713", "XDG_SESSION_ID=1", "HOSTNAME=...
11952 execve("/bin/zsh", ["zsh"], ["SSH_AGENT_PID=1713", "XDG_SESSION_ID=1", "HOSTNAME=...
Comment 11 Dan Williams 2011-11-30 12:04:32 EST
Is this still showing up with F15's NM 0.9 or F16's 0.9.2?
Comment 12 Scott Schmit 2011-12-04 23:54:35 EST
I've been using Fedora 16 on my laptop for about 3 weeks and I haven't noticed the connection cycling problem. Speaking only for myself, I think you can close this bug.
Comment 13 Dan Williams 2012-01-04 15:47:02 EST
(In reply to comment #12)
> I've been using Fedora 16 on my laptop for about 3 weeks and I haven't noticed
> the connection cycling problem. Speaking only for myself, I think you can close
> this bug.

Ok, thanks for the update.
Comment 14 satellitgo 2012-03-07 12:54:29 EST
https://bugzilla.redhat.com/show_bug.cgi?id=801052#c8
May be similar
