Created attachment 822637 [details] Backtrace of networkmanager Description of problem: NetworkManager starts to use 100% cpu and print NetworkManager[959]: <error> [1384207571.670130] [platform/nm-linux-platform.c:1109] add_object(): Netlink error: Invalid address for specified address family over and over into the system log. I have attached backtraces of all threads in network manager and parts of the system log. Might be related to: https://bugzilla.redhat.com/show_bug.cgi?id=1020068 Version-Release number of selected component (if applicable): kernel-3.11.7-300.fc20.x86_64 NetworkManager-0.9.9.0-15.git20131003.fc20.x86_64 NetworkManager-openvpn-0.9.8.2-3.fc20.x86_64 NetworkManager-vpnc-0.9.8.2-2.fc20.x86_64 NetworkManager-l2tp-0.9.8-4.fc20.x86_64 NetworkManager-glib-0.9.9.0-15.git20131003.fc20.i686 NetworkManager-glib-0.9.9.0-15.git20131003.fc20.x86_64 NetworkManager-pptp-0.9.8.2-3.fc20.x86_64 NetworkManager-openvpn-gnome-0.9.8.2-3.fc20.x86_64 NetworkManager-debuginfo-0.9.9.0-15.git20131003.fc20.x86_64 NetworkManager-openconnect-0.9.8.0-2.fc20.x86_64 NetworkManager-vpnc-gnome-0.9.8.2-2.fc20.x86_64 NetworkManager-pptp-gnome-0.9.8.2-3.fc20.x86_64 How reproducible: Hard to say. It happens during normal use of the system.
Created attachment 822638 [details] Log file
This bug has accumulated about 4G of log data during the last couple of days.
I'm hitting this bug too. For me it sometimes only repeats the log entry that Julian posted. But most of the time I get a constant repetition of the following lines: """ Nov 13 17:20:32 simon-desktop.local NetworkManager[407]: <info> Activation (em1) Stage 5 of 5 (IPv6 Commit) scheduled... Nov 13 17:20:32 simon-desktop.local NetworkManager[407]: <info> Activation (em1) Stage 5 of 5 (IPv6 Commit) started... Nov 13 17:20:32 simon-desktop.local NetworkManager[407]: <error> [1384359632.637356] [platform/nm-linux-platform.c:1109] add_object(): Netlink error: Invalid address for specified address family Nov 13 17:20:32 simon-desktop.local NetworkManager[407]: <info> Activation (em1) Stage 5 of 5 (IPv6 Commit) complete. """
Can people experiencing this problem grab the following RPMs from: http://people.redhat.com/dcbw/NetworkManager/fedora/netlink-error/ install them, and then see if you can reproduce the problem, and if so, what else gets printed to the logs. These builds have additional debugging in them which should print out the address/route which cannot be added to the kernel. This will help us debug the problem further. Thanks!
After installing NetworkManager-0.9.9.0-16.git20131003.fc20.x86_64.rpm and NetworkManager-glib-0.9.9.0-16.git20131003.fc20.x86_64.rpm the log now repeats the following lines: """ Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: <info> Activation (em1) Stage 5 of 5 (IPv6 Commit) scheduled... Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: <info> Activation (em1) Stage 5 of 5 (IPv6 Commit) started... Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: <error> [1384420546.28308] [platform/nm-linux-platform.c:1113] add_object(): Netlink error: Invalid address for specified address family Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: :: inet6 dev 3 scope nowhere Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: valid-lifetime forever preferred-lifetime forever Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: [54B blob data] Nov 14 10:15:46 simon-desktop.local NetworkManager[418]: <info> Activation (em1) Stage 5 of 5 (IPv6 Commit) complete. """
This happenes for me, too, when I'm using a dual-stack setup. The logs show the exact output as pasted above. When the repeated "Netlink error" appears in logs, there's no IPv6 connection anymore, but NM-applet shows a correctly assigned global IPv6-Adress. IPv4 connections are not affected by this. After toggling my connection with the hw-switch (or re-plugging the LAN-Cable) everything runs smooth again for some uncertain amount of time (30 mins to 2 h) and "the game" starts again...
I forgot to mention: At the moment global IPv6-connection fails, NM starts to use some address from link-local/ULA fe80::-scope as default route, which actually blocks every WAN / LAN v6-traffic.
This is an additional output on failure of the debugging pkgs previously mentioned: Nov 14 11:45:51 p170hm NetworkManager[950]: nm_platform_ip6_address_add: assertion 'lifetime > 0' failed Nov 14 11:45:51 p170hm rsyslogd-2177: imjournal: begin to drop messages due to rate-limiting
(In reply to Björn "besser82" Esser from comment #7) > I forgot to mention: At the moment global IPv6-connection fails, NM starts > to use some address from link-local/ULA fe80::-scope as default route, which > actually blocks every WAN / LAN v6-traffic. This may be your default IPv6 router. Since the IPv6 router is on the same link, it's normal that your system would use the link-local address of the router as the default route. Is there a chance you can check with wireshark and see where the IPv6 Router Advertisements come from, and if the source address of the Router Advertisements corresponds to the IPv6 address that NetworkManager uses as the default IPv6 gateway?
No, the fe80::-scope isn't advertised by my IPv6-router. My router advertises ULAs / link-local routes to fdbe:dead:beef:197::/64 But this fe80::-scope is applied by NM even if there is no IPv6-router present in the network...
Björn, would you mind running NM with more debugging? nmcli g logging level debug domains device,ip4,ip6,dhcp4,dhcp6 and then lets try reconnecting the connection that causes the problem with: nmcli con show active <find the connection uuid or name> nmcli dev disconnect em1 nmcli con up <uuid or name> And you should get a lot more journal spew, including IPv6 stuff that's going to be very interesting to us.
Created attachment 824466 [details] Debug NM log I did the following (with output): """ # nmcli g logging level debug domains device,ip4,ip6,dhcp4,dhcp6 ** (process:25329): CRITICAL **: nm_ip6_route_set_prefix: assertion 'prefix > 0' failed # nmcli con show active ** (process:25483): CRITICAL **: nm_ip6_route_set_prefix: assertion 'prefix > 0' failed NAME UUID DEVICES DEFAULT VPN MASTER-PATH Auto c90bccd1-3de6-41a7-823c-e03508f4221c em1 yes no -- # nmcli dev disconnect em1 ** (process:25490): CRITICAL **: nm_ip6_route_set_prefix: assertion 'prefix > 0' failed # nmcli con up c90bccd1-3de6-41a7-823c-e03508f4221c ** (process:25534): CRITICAL **: nm_ip6_route_set_prefix: assertion 'prefix > 0' failed Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/1) """ I waited until the bug got triggered and NM used 100 % CPU. Then I disconnected em1 and extracted the log approximately since i brought the connection up.
(In reply to Simon Gerhards from comment #12) > Created attachment 824466 [details] > Debug NM log Bizarre, your IPv6 Router is sending you a default route: Nov 15 11:29:09 simon-desktop.local NetworkManager[401]: <debug> [1384511349.912493] [rdisc/nm-rdisc.c:123] config_changed(): route ::/0 pref 3 exp 9360 which NetworkManager should completely ignore, but isn't currently doing. Can anyone else confirm that their IPv6 router is also doing this?
For documentation puroses: My router is an AVM "FRITZ!Box 6320 v2 Cable (um)" on firmware version "110.05.51". The "(um)" in the name indicates that it is running a custom firmware from my ISP (Unitymedia, Germany).
I have a Fritz!Box 7312 running firmware version 117.05.51. I'll try to reproduce the problem with tcpdump enabled.
Created attachment 826596 [details] Packet trace This is tcpdump output for icmpv6 traffic when network manager goes crazy. Network manager sends router solicitations in a loop!
And yes, my router also announces a ::/0 route.
commit 8586353b09460ec0a619058421743dd7d424a75d upstream now ignores the default route in router advertisements, like the kernel does.
Any idea whether this will be fixed until the Fedora 20 release?
Got the same problems with NetworkManager-0.9.9.0-20.git20131003.fc20.x86_64. I applied commit 799477872394a5e473cb868e8c3ae2ba5fff7f50 ("core: ignore RA-provided default routes (rh #1029213)") from upstream (I couldn't find commit 8586353b09460ec0a619058421743dd7d424a75d), which appears to have solved the busy-looping and resulting CPU usage, but there are still some error messages printed to the syslog: Dec 15 12:22:43 sloth NetworkManager[663]: <info> Activation (wlo1) Stage 5 of 5 (IPv6 Commit) scheduled... Dec 15 12:22:43 sloth NetworkManager[663]: <info> Activation (wlo1) Stage 5 of 5 (IPv6 Commit) started... Dec 15 12:22:43 sloth NetworkManager[663]: <error> [1387106563.947091] [platform/nm-linux-platform.c:1127] add_object(): Netlink error: Invalid address for specified address family Dec 15 12:22:43 sloth NetworkManager[663]: <info> Activation (wlo1) Stage 5 of 5 (IPv6 Commit) complete. Dec 15 12:22:43 sloth NetworkManager: :: inet6 dev 3 scope nowhere Dec 15 12:22:43 sloth NetworkManager: valid-lifetime forever preferred-lifetime forever Dec 15 12:22:43 sloth NetworkManager: created boot-time+p<AC>'^^D^? last-updated boot-time+p<AC>'^^D^? My router is an AVM FritzBox, so it does advertise a bogus default route information option in RA just like is the case for the other commenters. Tore
Created attachment 837049 [details] Last 2000 log lines I was a bit quick. 799477872394a5e473cb868e8c3ae2ba5fff7f50 does not help, at least not fully. My computer is now in a tight loop sending out RSes and receiving RAs as quickly as it can. I've attached the output from "journalctl -n 2000" after doing "nmcli g logging level debug". This is from NetworkManager-0.9.9.0-20.git20131003.fc20.x86_64 patched with 799477872394a5e473cb868e8c3ae2ba5fff7f50 from upstream git. Tore
Created attachment 837769 [details] same bug with OpenWrt trunk39114 I am probably observing the same bug here. Fedora 20 runs inside a Virtualbox 4.3.x on Windows 7 Router is running OpenWrt (Barrier Breaker) trunk39114 attached a logfile
This also happens in our university wifi.
(In reply to Tore Anderson from comment #21) > Created attachment 837049 [details] > Last 2000 log lines > > I was a bit quick. 799477872394a5e473cb868e8c3ae2ba5fff7f50 does not help, > at least not fully. My computer is now in a tight loop sending out RSes and > receiving RAs as quickly as it can. > > I've attached the output from "journalctl -n 2000" after doing "nmcli g > logging level debug". This is from > NetworkManager-0.9.9.0-20.git20131003.fc20.x86_64 patched with > 799477872394a5e473cb868e8c3ae2ba5fff7f50 from upstream git. It looks like this is a different bug actually; the NM RA code is soliciting an RA even though it just processed one. Perhaps the router doesn't set a minimum time between router advertisements it's sending out? In any case, lets file this one separately since it's not related to the default route thing.
NetworkManager-0.9.9.0-21.git20131003.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/NetworkManager-0.9.9.0-21.git20131003.fc20
(In reply to Tore Anderson from comment #21) > Created attachment 837049 [details] > Last 2000 log lines > > I was a bit quick. 799477872394a5e473cb868e8c3ae2ba5fff7f50 does not help, > at least not fully. My computer is now in a tight loop sending out RSes and > receiving RAs as quickly as it can. > > I've attached the output from "journalctl -n 2000" after doing "nmcli g > logging level debug". This is from > NetworkManager-0.9.9.0-20.git20131003.fc20.x86_64 patched with > 799477872394a5e473cb868e8c3ae2ba5fff7f50 from upstream git. > > Tore Tore, I've filed https://bugzilla.redhat.com/show_bug.cgi?id=1044757 to track this issue.
Package NetworkManager-0.9.9.0-21.git20131003.fc20: * should fix your issue, * was pushed to the Fedora 20 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing NetworkManager-0.9.9.0-21.git20131003.fc20' as soon as you are able to, then reboot. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-23593/NetworkManager-0.9.9.0-21.git20131003.fc20 then log in and leave karma (feedback).
NetworkManager-0.9.9.0-22.git20131003.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/NetworkManager-0.9.9.0-22.git20131003.fc20
(In reply to Fedora Update System from comment #27) > Package NetworkManager-0.9.9.0-21.git20131003.fc20: I installed this one, but I still get the "Invalid address for specified address family" error message described in the bug title (not the 100% CPU usage though, but I haven't waited long enough to trigger the RS/RA loop due to RDNSS timer expiry bug anyway). Every time an (unsolicited) RA is received, NM logs the following: des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: <info> Activation (wlo1) Stage 5 of 5 (IPv6 Commit) scheduled... des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: <info> Activation (wlo1) Stage 5 of 5 (IPv6 Commit) started... des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: <error> [1387528018.173284] [platform/nm-linux-platform.c:1127] add_object(): Netlink error: Invalid address for specified address family des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: <info> Activation (wlo1) Stage 5 of 5 (IPv6 Commit) complete. des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: :: inet6 dev 3 scope nowhere des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: valid-lifetime forever preferred-lifetime forever des. 20 09:26:58 sloth.fud.no NetworkManager[4523]: [54B blob data] As an aside, I question the wisdom of printing this information to the log (any information really) every time an RA is received, at least if it was identical to the previous one and the only thing NM needs to do to act on it is to reset lifetime counters. In some environments RAs may be sent very frequently. For example, Juniper suggest a max advertisement interval of 4 seconds when configuring RAs in conjunction with VRRP (see http://www.juniper.net/techpubs/en_US/junos/topics/example/vrrp-for-ipv6-configuring-example.html). Over time this will cause the system log to contain an enormous amount of identical log lines relating to IPv6 activation, which will probably not be appreciated by the sysadmins. Tore
NetworkManager-0.9.9.0-22.git20131003.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
The problem is not fixed with NetworkManager-0.9.9.0-22.git20131003.fc20 For me it is still spewing at 100% CPU to the log. Excerpt below: Dec 24 10:10:12 barentz NetworkManager[29936]: <error> [1387876212.130833] [platform/nm-linux-platform.c:1127] add_object(): Netlink error: Invalid address for specified address family Dec 24 10:10:12 barentz NetworkManager[29936]: <error> [1387876212.167090] [platform/nm-linux-platform.c:1127] add_object(): Netlink error: Invalid address for specified address family Dec 24 10:10:12 barentz NetworkManager[29936]: <error> [1387876212.202990] [platform/nm-linux-platform.c:1127] add_object(): Netlink error: Invalid address for specified address family
Disabling IPv6 and restarting NetworkManager works around the issue.
Going without IPv6 isn't much of a fix. Disabling NetworkManager and leaving it disabled seems okay, though I have to manually ifup my interface on reboot now.
Rather, I have to take the effort to configure my /etc/sysconfig/network-scripts/ifcfg-p20p1 config file better.
I have found one issue that prints this message, but only when DHCPv6 is used in information-only mode and the DHCP server does not provide a lease. However, it does not result in 100% CPU usage, only the log mesage. Geert, any chance you can run NetworkManager in debug mode for me? sudo nmcli g logging level debug domains core,hw,ip6,dhcp6
Created attachment 871547 [details] Diff of -20 vs -31 connection. I downgraded back to NetworkManager-0.9.9.0-20.git20131003.fc20.x86_64 (which works for me) and ran a diff of their debugging output. Oddly, both logs have the "Netlink error: Unspecific failure" message, but -20 doesn't error out the connection like -31 does. The logs were normalized with this command so I could compare them: perl -pe 's/\[\d+\]//; s/\[\d{10}\.\d{6}\]//; s/\.c:\d+\]/.c:~]/; s/0x[0-9a-f]{8,12}/0xdeadbeef/;' ...
Oops. I apologize; that attachment was meant for bug #1056053.