Bug 749988

Summary: Network gets lost pretty often
Product: [Fedora] Fedora Reporter: hannes <johannes.lips>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED DUPLICATE QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 16 CC: awilliam, christoph.wickert, dcbw, gansalmon, itamar, jklimes, jonathan, kernel-maint, madhu.chinakonda
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-02-01 20:16:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
dmesg NetworkManager none
/var/log/messages none

Description hannes 2011-10-29 15:03:14 UTC
Description of problem:
Since the upgrade to f16 the network gets lost pretty often.
Wifi:
lspci -nn | grep Network
04:00.0 Network controller [0280]: Intel Corporation WiFi Link 5100 [8086:4232]


Version-Release number of selected component (if applicable):
rpm -qa NetworkManager\*
NetworkManager-glib-0.9.1.90-5.git20110927.fc16.x86_64
NetworkManager-0.9.1.90-5.git20110927.fc16.x86_64
NetworkManager-pptp-0.9.0-1.fc16.x86_64
NetworkManager-gtk-0.9.1.90-5.git20110927.fc16.x86_64
NetworkManager-vpnc-0.9.0-1.fc16.x86_64
NetworkManager-gnome-0.9.1.90-5.git20110927.fc16.x86_64
NetworkManager-openconnect-0.9.0-1.fc16.x86_64

How reproducible:
Sporadically

  
Actual results:
Need to reconnect from time to time.

Expected results:
Constant connectivity like before in f15.


Additional info:
I found the following warning in dmesg:
[ 2607.008078] NetworkManager[680]: NetworkManager[680]: <warn> could not spawn process '/etc/init.d/nscd condrestart': Failed to execute child process "/etc/init.d/nscd" (No such file or directory)
[ 4283.445797] NetworkManager[680]: <warn> could not spawn process '/etc/init.d/nscd condrestart': Failed to execute child process "/etc/init.d/nscd" (No such file or directory)
[ 4283.445805] NetworkManager[680]: NetworkManager[680]: <warn> could not spawn process '/etc/init.d/nscd condrestart': Failed to execute child process "/etc/init.d/nscd" (No such file or directory)
[ 4307.007206] NetworkManager[680]: <warn> could not spawn process '/etc/init.d/nscd condrestart': Failed to execute child process "/etc/init.d/nscd" (No such file or directory)
[ 4307.007426] NetworkManager[680]: NetworkManager[680]: <warn> could not spawn process '/etc/init.d/nscd condrestart': Failed to execute child process "/etc/init.d/nscd" (No such file or directory)

I could not really determine if it's the cause but anyway it doesn't seem right. nscd is installed:
rpm -qa nscd
nscd-2.14.90-14.x86_64

Comment 1 Adam Williamson 2011-10-29 17:57:23 UTC
well, it just looks like NM tries to restart nscd in certain cases (like when the network goes down). I doubt it's the *cause*. The use of 'condrestart' implies that it's only an 'optional' thing and it failing shouldn't be an issue.

What it should be doing now is 'systemctl try-restart nscd.service' , but I doubt fixing that would fix your network.
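
To illustrate the difference between the two invocations mentioned above, here is a minimal sketch of a helper that prefers the systemd form and falls back to the old init script. The function name `nscd_restart_command` and its `which` parameter are hypothetical and are not taken from NetworkManager's source; only the two command lines come from this report.

```python
import os
import shutil


def nscd_restart_command(init_script="/etc/init.d/nscd", which=shutil.which):
    """Return an argv list for a conditional nscd restart, or None.

    Hypothetical helper: prefers 'systemctl try-restart nscd.service' and
    falls back to the SysV init script only if that script is executable.
    """
    if which("systemctl"):
        # try-restart only restarts the unit if it is already running,
        # which mirrors the intent of 'condrestart'.
        return ["systemctl", "try-restart", "nscd.service"]
    if os.access(init_script, os.X_OK):
        return [init_script, "condrestart"]
    # Neither mechanism is available; the caller can skip the restart.
    return None


print(nscd_restart_command())
```

Returning None instead of raising keeps the restart optional, which matches the point made above that a failed condrestart should not matter for connectivity.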

Can you attach *all* NM logs - something like 'grep NetworkManager /var/log/messages' ? Thanks.



-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers

Comment 2 hannes 2011-10-30 07:20:57 UTC
Well ok, /var/log/messages is empty and I am not sure if syslogd is running:
ps ax | grep syslog
  361 ?        Ss     0:00 /lib/systemd/systemd-kmsg-syslogd
  372 ?        Ss     0:00 /lib/systemd/systemd-stdout-syslog-bridge
 2082 pts/0    S+     0:00 grep syslog

I attach the output of dmesg grepped for NetworkManager. Hope this helps.

Comment 3 hannes 2011-10-30 07:21:30 UTC
Created attachment 530818 [details]
dmesg NetworkManager

Comment 4 Adam Williamson 2011-10-30 17:42:17 UTC
"Well ok, /var/log/messages is empty and I am not sure if syslogd is running:"

systemctl enable rsyslogd.service

well, from the dmesg, it looks like it stays up for 13 minutes, then:

[  805.320176] NetworkManager[681]: NetworkManager[681]: <info> (wlan0): disconnecting for new activation request.

But I'm not totally sure why that would happen. I'd more expect that message if you changed APs, or something.
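
For anyone who wants to locate such events in a longer dmesg dump, a rough sketch follows. It assumes the usual "[seconds.microseconds] message" dmesg prefix, where the number is seconds since boot; the function name `find_disconnects` is made up for this example.

```python
import re

# Matches a dmesg line such as "[  805.320176] NetworkManager[681]: ..."
DMESG_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def find_disconnects(lines, needle="disconnecting for new activation request"):
    """Return (seconds_since_boot, message) for each line containing needle."""
    hits = []
    for line in lines:
        match = DMESG_LINE.match(line.strip())
        if match and needle in match.group(2):
            hits.append((float(match.group(1)), match.group(2)))
    return hits


# The line quoted in this comment, used as sample input.
sample = [
    "[  805.320176] NetworkManager[681]: NetworkManager[681]: <info> (wlan0): "
    "disconnecting for new activation request.",
]

for seconds, message in find_disconnects(sample):
    print(f"{seconds / 60:.1f} min after boot: {message}")
```

Note that the timestamp is relative to boot, so it only approximates how long the connection stayed up if the connection was established shortly after boot.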




Comment 5 hannes 2011-10-31 06:59:59 UTC
OK, I needed to install rsyslog-sysvinit and start it with chkconfig, since your command refused to work. I didn't think of this possibility since I assumed it was already handled by systemd, sorry.
Just before dropping the connection I have:
Oct 31 07:58:19 caprica systemd[1]: Failed to read PID file /run/sendmail.pid after start. The service might be broken.

Don't know if this is also related. Will attach the whole /var/log/messages since before the disconnect there were no NetworkManager related issues.

Comment 6 hannes 2011-10-31 07:00:32 UTC
Created attachment 530907 [details]
/var/log/messages

Comment 7 Christoph Wickert 2011-10-31 19:33:07 UTC
I have the same problem with F15, also with Intel wifi

$ lspci | grep Network
00:19.0 Ethernet controller: Intel Corporation 82577LM Gigabit Network Connection (rev 06)
02:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)

dmesg:
[51344.561558] wlan0: deauthenticating from bc:05:43:81:77:02 by local choice (reason=3)
[51344.580045] cfg80211: Calling CRDA to update world regulatory domain
[51345.039317] cfg80211: World regulatory domain updated:
[51345.039322] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[51345.039328] cfg80211:     (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[51345.039334] cfg80211:     (2457000 KHz - 2482000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[51345.039338] cfg80211:     (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm)
[51345.039343] cfg80211:     (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[51345.039347] cfg80211:     (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm)
[51345.039368] cfg80211: Calling CRDA for country: DE
[51345.044088] cfg80211: Regulatory domain changed to country: DE
[51345.044093] cfg80211:     (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[51345.044099] cfg80211:     (2400000 KHz - 2483500 KHz @ 40000 KHz), (N/A, 2000 mBm)
[51345.044103] cfg80211:     (5150000 KHz - 5250000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[51345.044108] cfg80211:     (5250000 KHz - 5350000 KHz @ 40000 KHz), (N/A, 2000 mBm)
[51345.044112] cfg80211:     (5470000 KHz - 5725000 KHz @ 40000 KHz), (N/A, 2698 mBm)

Not sure if this is really NM, but I wonder what "by local choice (reason=3)" is supposed to mean.
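
On the meaning of the reason code: the number is an IEEE 802.11 reason code, and "by local choice" appears to indicate that the deauthentication was initiated on the client side rather than by the access point. The sketch below pulls the code out of such a line and looks it up in a small table. The table is only a subset, its descriptions are paraphrased, and the function name `describe_deauth` is made up for this example.

```python
import re

# Subset of IEEE 802.11 reason codes; descriptions are paraphrased.
REASON_CODES = {
    1: "unspecified reason",
    2: "previous authentication no longer valid",
    3: "deauthenticated because the sending station is leaving (or has left) the network",
    4: "disassociated due to inactivity",
}

# Matches e.g. "deauthenticating from bc:05:43:81:77:02 by local choice (reason=3)"
DEAUTH = re.compile(
    r"deauthenticating from ([0-9a-f:]{17}) by local choice \(reason=(\d+)\)"
)


def describe_deauth(line):
    """Return (bssid, code, description) for a deauth log line, else None."""
    match = DEAUTH.search(line)
    if match is None:
        return None
    code = int(match.group(2))
    return match.group(1), code, REASON_CODES.get(code, "not in this table")


line = "[51344.561558] wlan0: deauthenticating from bc:05:43:81:77:02 by local choice (reason=3)"
print(describe_deauth(line))
```

If that reading is right, reason=3 says the client itself left the network, which does not yet explain why it did so; the wpa_supplicant and NetworkManager logs around the same time would be the place to look.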

Comment 8 hannes 2011-10-31 21:15:35 UTC
Probably something related to the Intel drivers. I don't know which component or maintainer could help with a driver-related issue. Or should we rather take it upstream (Intel) in this case?

Comment 9 Jirka Klimes 2011-12-20 13:30:09 UTC
Yeah, that's probably an iwlagn driver issue.

You can try whether disabling N-mode helps:
# rmmod iwlagn
# modprobe iwlagn 11n_disable=1

To debug further follow the steps here:
http://live.gnome.org/NetworkManager/Debugging#wifi
That will enable debug logs from wpa_supplicant

Other debugging tips:
* http://intellinuxwireless.org/?n=fw_error_report
* run 'iw event -f -t' to monitor wireless events

Comment 10 hannes 2012-01-30 20:26:16 UTC
Ok, it appears again with the new kernel:
3.2.2-1.fc16.x86_64
I have "disconnects" where, for example, Firefox is not able to load pages but NetworkManager still shows the connection as up. Most probably an Intel driver issue.
04:00.0 Network controller [0280]: Intel Corporation WiFi Link 5100 [8086:4232]

Comment 11 hannes 2012-01-31 20:08:25 UTC
Changed the component to kernel since it's most probably a driver issue. I went back to a previous version of the kernel (3.1.10-2.fc16.x86_64), which doesn't have the problem.

Comment 12 Jirka Klimes 2012-02-01 07:47:22 UTC
hannes, could you test this:
https://bugzilla.redhat.com/show_bug.cgi?id=785239#c10

Comment 13 hannes 2012-02-01 20:16:00 UTC

*** This bug has been marked as a duplicate of bug 785239 ***