Bug 1302037 - Spurious NEWLINK netlink message after DELLINK when removing wifi module
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 23
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-01-26 10:29 EST by Beniamino Galvani
Modified: 2016-02-23 14:48 EST
Fixed In Version: kernel-4.3.5-300.fc23 kernel-4.3.5-200.fc22
Doc Type: Bug Fix
Last Closed: 2016-02-07 22:22:45 EST
Type: Bug
Description Beniamino Galvani 2016-01-26 10:29:37 EST
Reproducible on Fedora 23 (kernel 4.2.6-300.fc23.x86_64)

When the wifi module is removed, the kernel sends a spurious NEWLINK
netlink message after DELLINK:

 # ip monitor link &
 [1] 6793

 # modprobe iwlwifi
 59: wlan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state UNKNOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DORMANT group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DORMANT group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP>
     link/ether
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state DORMANT group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP,LOWER_UP>
     link/ether
 59: wlp4s0: <NO-CARRIER,BROADCAST,MULTICAST,UP,LOWER_UP>
     link/ether
 59: wlp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff

 # modprobe -r iwlmvm iwlwifi
 59: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 Deleted 59: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
     link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
 59: wlp4s0: <BROADCAST,MULTICAST,UP>
     link/ether

Note that the last message arrives after the "Deleted" event. As a
consequence, userspace applications such as NetworkManager, which rely
on netlink messages to build an internal state of links, believe that
the interface has appeared again.

The log above was captured with NetworkManager running, which brings
up and configures the wlp4s0 device.

No message regarding ifindex 59 should be sent after the DELLINK one.
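
To make the consequence concrete, here is a minimal Python sketch of a link cache driven purely by netlink events. The `apply_events` helper and the `(type, ifindex, name)` tuple layout are hypothetical and only illustrate the idea; this is not how NetworkManager is implemented. Only the RTM_NEWLINK and RTM_DELLINK values are taken from <linux/rtnetlink.h>.

```python
# Message type values from <linux/rtnetlink.h>.
RTM_NEWLINK = 16
RTM_DELLINK = 17


def apply_events(events):
    """Replay (msg_type, ifindex, name) tuples into a naive link cache.

    Hypothetical helper: a NEWLINK creates or updates an entry and a
    DELLINK removes it, which is roughly what a consumer that trusts the
    event stream would do.
    """
    links = {}
    for msg_type, ifindex, name in events:
        if msg_type == RTM_NEWLINK:
            links[ifindex] = name
        elif msg_type == RTM_DELLINK:
            links.pop(ifindex, None)
    return links


# Tail of the sequence seen on "modprobe -r": update, delete, stray NEWLINK.
observed = [
    (RTM_NEWLINK, 59, "wlp4s0"),
    (RTM_DELLINK, 59, "wlp4s0"),
    (RTM_NEWLINK, 59, "wlp4s0"),
]
# What the sequence should look like: nothing after the DELLINK.
expected = observed[:2]

print(apply_events(observed))  # {59: 'wlp4s0'}: the interface "reappears"
print(apply_events(expected))  # {}
```

With the stray message, the cache ends up holding an entry for an ifindex that no longer exists in the kernel.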
Comment 1 Josh Boyer 2016-01-26 11:04:06 EST
Does this happen with the 4.3.3 update or with a rawhide kernel?
Comment 2 Beniamino Galvani 2016-01-26 11:23:31 EST
(In reply to Josh Boyer from comment #1)
> Does this happen with the 4.3.3 update or with a rawhide kernel?

Upgraded to 4.3.4-300.fc23.x86_64, still happens.
Comment 3 Johannes Berg 2016-01-26 17:08:08 EST
Would you be able to rebuild "ip" (iproute2) with a change to print out n->nlmsg_pid somewhere at the beginning of accept_msg() in ip/ipmonitor.c ?
Comment 4 Beniamino Galvani 2016-01-27 03:18:45 EST
(In reply to Johannes Berg from comment #3)
> Would you be able to rebuild "ip" (iproute2) with a change to print out
> n->nlmsg_pid somewhere at the beginning of accept_msg() in ip/ipmonitor.c ?

It's always zero:

(pid 0) 17: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
(pid 0) Deleted 17: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
(pid 0) 17: wlp4s0: <BROADCAST,MULTICAST,UP>
    link/ether
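
For readers who prefer not to rebuild iproute2, the same check can be sketched in Python by unpacking the netlink header. The field layout below follows struct nlmsghdr from <linux/netlink.h>; the `sender_pid` and `from_kernel` helpers and the sample bytes are made up for illustration and are not captured traffic. A port ID of zero is how netlink marks messages originating in the kernel.

```python
import struct

# struct nlmsghdr: u32 nlmsg_len, u16 nlmsg_type, u16 nlmsg_flags,
#                  u32 nlmsg_seq, u32 nlmsg_pid (host byte order).
NLMSGHDR = struct.Struct("=IHHII")
RTM_NEWLINK = 16  # from <linux/rtnetlink.h>


def sender_pid(message: bytes) -> int:
    """Return nlmsg_pid (the sender's port ID) of the first message."""
    _len, _type, _flags, _seq, pid = NLMSGHDR.unpack_from(message)
    return pid


def from_kernel(message: bytes) -> bool:
    """Hypothetical helper: port ID 0 means the kernel sent the message."""
    return sender_pid(message) == 0


# Synthetic header-only messages, not real kernel output.
kernel_msg = NLMSGHDR.pack(NLMSGHDR.size, RTM_NEWLINK, 0, 0, 0)
user_msg = NLMSGHDR.pack(NLMSGHDR.size, RTM_NEWLINK, 0, 0, 6793)

print(from_kernel(kernel_msg), from_kernel(user_msg))  # True False
```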
Comment 5 Johannes Berg 2016-01-27 03:22:51 EST
Indicating that the message is, indeed, coming from the kernel. Very odd, I don't even see how that could be generated without the MAC address etc.
Comment 6 Johannes Berg 2016-01-27 05:00:17 EST
I can't reproduce it - if you have some time, can you ping me on IRC ("johill" on freenode or OFTC)?

I think it might also be a wext message. Can you print something like

  printf("ifla_wireless=%d\n", !!tb[IFLA_WIRELESS]);

in print_linkinfo() in ip/ipaddress.c - after parse_rtattr()?
Comment 7 Beniamino Galvani 2016-01-27 06:11:53 EST
Right, the last message has IFLA_WIRELESS set:

  wireless=0 42: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc mq state DOWN group default
      link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
  wireless=0 Deleted 42: wlp4s0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default
      link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
  wireless=1 42: wlp4s0: <BROADCAST,MULTICAST,UP>
      link/ether

and the caller seems to be:

 0xffffffff81768fe0 : wireless_send_event+0x0/0x400 [kernel]
 0xffffffffa0983185 : __cfg80211_disconnected+0x235/0x300 [cfg80211]
 0xffffffffa097ed3a : cfg80211_process_deauth+0xca/0xf0 [cfg80211]
 0xffffffffa097efef : cfg80211_tx_mlme_mgmt+0xaf/0xc0 [cfg80211]
 0xffffffffa0896843 : ieee80211_report_disconnect+0x63/0x130 [mac80211]
 0xffffffffa089c952 : ieee80211_mgd_deauth+0x132/0x220 [mac80211]
 0xffffffffa08666e8 : ieee80211_deauth+0x18/0x20 [mac80211]
 0xffffffffa097f9b2 : cfg80211_mlme_deauth+0xd2/0x130 [cfg80211]
 0xffffffffa097fbfb : cfg80211_mlme_down+0x6b/0x90 [cfg80211]
 0xffffffffa0983a45 : cfg80211_disconnect+0x175/0x190 [cfg80211]
 0xffffffffa0958ecd : __cfg80211_leave+0x8d/0x120 [cfg80211]
 0xffffffffa0958f8b : cfg80211_leave+0x2b/0x40 [cfg80211]
 0xffffffffa0959333 : cfg80211_netdev_notifier_call+0x393/0x5b0 [cfg80211]
 0xffffffff810bfc8a : notifier_call_chain+0x4a/0x70 [kernel]
 0xffffffff810bfe06 : raw_notifier_call_chain+0x16/0x20 [kernel]
 0xffffffff81665fb5 : call_netdevice_notifiers_info+0x35/0x60 [kernel]
 0xffffffff816662ca : __dev_close_many+0x5a/0x100 [kernel]
 0xffffffff816663f7 : dev_close_many+0x87/0x130 [kernel]
 0xffffffff81668745 : dev_close.part.77+0x45/0x70 [kernel]
 0xffffffff8166878a : dev_close+0x1a/0x20 [kernel]

I can reproduce this every time when NM or wpa_supplicant are managing the interface and the module is removed. I'll ping you on IRC later, thanks.
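
The `wireless=` column above reflects whether the link message carries an IFLA_WIRELESS attribute. As a rough sketch, the same test can be done in Python by walking the attributes that follow struct ifinfomsg. The struct layouts and attribute numbers are from <linux/netlink.h>, <linux/rtnetlink.h> and <linux/if_link.h>; the helper names and the synthetic messages are assumptions made for illustration, and the 8-byte wireless payload is a placeholder rather than a real wext event.

```python
import struct

NLMSGHDR = struct.Struct("=IHHII")    # struct nlmsghdr
IFINFOMSG = struct.Struct("=BxHiII")  # struct ifinfomsg
RTATTR = struct.Struct("=HH")         # struct rtattr: rta_len, rta_type

RTM_NEWLINK = 16
IFLA_IFNAME = 3
IFLA_WIRELESS = 11


def align4(n: int) -> int:
    """Round up to the 4-byte alignment used for netlink attributes."""
    return (n + 3) & ~3


def attr_types(message: bytes) -> set:
    """Return the top-level attribute types found in one link message."""
    total_len = NLMSGHDR.unpack_from(message)[0]
    offset = NLMSGHDR.size + IFINFOMSG.size
    types = set()
    while offset + RTATTR.size <= total_len:
        rta_len, rta_type = RTATTR.unpack_from(message, offset)
        if rta_len < RTATTR.size or offset + rta_len > total_len:
            break  # malformed attribute, stop walking
        types.add(rta_type)
        offset += align4(rta_len)
    return types


def is_wireless_event(message: bytes) -> bool:
    """True if the message carries IFLA_WIRELESS (a wext event)."""
    return IFLA_WIRELESS in attr_types(message)


def build_link_message(ifindex: int, attrs) -> bytes:
    """Pack a synthetic RTM_NEWLINK message (test data only)."""
    body = IFINFOMSG.pack(0, 0, ifindex, 0, 0)
    for rta_type, payload in attrs:
        rta_len = RTATTR.size + len(payload)
        body += RTATTR.pack(rta_len, rta_type) + payload
        body += b"\0" * (align4(rta_len) - rta_len)
    header = NLMSGHDR.pack(NLMSGHDR.size + len(body), RTM_NEWLINK, 0, 0, 0)
    return header + body


normal = build_link_message(42, [(IFLA_IFNAME, b"wlp4s0\0")])
wext = build_link_message(
    42, [(IFLA_IFNAME, b"wlp4s0\0"), (IFLA_WIRELESS, b"\0" * 8)]
)

print(is_wireless_event(normal), is_wireless_event(wext))  # False True
```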
Comment 8 Johannes Berg 2016-01-27 06:13:58 EST
Ah, you were connected. Perhaps with that information I can reproduce it, let me try.
Comment 9 Johannes Berg 2016-01-27 06:48:45 EST
fix: https://p.sipsolutions.net/926eac7feec5a6a5.txt
Comment 10 Josh Boyer 2016-01-27 09:22:59 EST
(In reply to Johannes Berg from comment #9)
> fix: https://p.sipsolutions.net/926eac7feec5a6a5.txt

We'd likely want both patches in the series you sent to netdev, correct?
Comment 11 Johannes Berg 2016-01-27 09:24:24 EST
Yes. After sending, I realized that there was another issue with the "UP" ordering; fixing that required the second patch.
Comment 12 Josh Boyer 2016-01-28 15:09:18 EST
I've added both to all branches in Fedora.  Thanks for such a quick fix, Johannes!
Comment 13 Josh Boyer 2016-01-29 08:32:26 EST
Looks like kernel test robot found an issue with the first patch.  Should I hold off on including these?

http://thread.gmane.org/gmane.linux.kernel/2139378
Comment 14 Johannes Berg 2016-01-29 11:14:41 EST
Ahrg. I'd fixed that issue, but discarded the change of approach (and introduced the second patch) and forgot to carry over the fix...

I've updated my tree at https://git.kernel.org/cgit/linux/kernel/git/jberg/mac80211.git/ to fix this issue.
Comment 15 Josh Boyer 2016-01-29 12:00:17 EST
(In reply to Johannes Berg from comment #14)
> Ahrg. I'd fixed that issue, but discarded the change of approach (and
> introduced the second patch) and forgot to carry over the fix...
> 
> I've updated my tree at
> https://git.kernel.org/cgit/linux/kernel/git/jberg/mac80211.git/ to fix this
> issue.

Could you point out the change you forgot to carry over?  I looked at your updated tree, and I don't see any difference in the patches there vs. the ones you sent to netdev.
Comment 16 Josh Boyer 2016-01-29 12:01:12 EST
(In reply to Josh Boyer from comment #15)
> (In reply to Johannes Berg from comment #14)
> > Ahrg. I'd fixed that issue, but discarded the change of approach (and
> > introduced the second patch) and forgot to carry over the fix...
> > 
> > I've updated my tree at
> > https://git.kernel.org/cgit/linux/kernel/git/jberg/mac80211.git/ to fix this
> > issue.
> 
> Could you point out the change you forgot to carry over?  I looked at your
> updated tree, and I don't see any difference in the patches there vs. the
> ones you sent to netdev.

Oh, wait.  I might have had a stale cached copy via gitweb.  Let me review again.
Comment 17 Josh Boyer 2016-01-29 12:02:24 EST
(In reply to Josh Boyer from comment #16)
> (In reply to Josh Boyer from comment #15)
> > (In reply to Johannes Berg from comment #14)
> > > Ahrg. I'd fixed that issue, but discarded the change of approach (and
> > > introduced the second patch) and forgot to carry over the fix...
> > > 
> > > I've updated my tree at
> > > https://git.kernel.org/cgit/linux/kernel/git/jberg/mac80211.git/ to fix this
> > > issue.
> > 
> > Could you point out the change you forgot to carry over?  I looked at your
> > updated tree, and I don't see any difference in the patches there vs. the
> > ones you sent to netdev.
> 
> Oh, wait.  I might have had a stale cached copy via gitweb.  Let me review
> again.

Yes, that was it.  I see the difference now.  I'll update the patches in Fedora git.
Comment 18 Fedora Update System 2016-02-01 21:26:05 EST
kernel-4.3.5-200.fc22 has been pushed to the Fedora 22 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-16a5625f33
Comment 19 Fedora Update System 2016-02-01 21:27:06 EST
kernel-4.3.5-300.fc23 has been pushed to the Fedora 23 testing repository. If problems still persist, please make note of it in this bug report.
See https://fedoraproject.org/wiki/QA:Updates_Testing for
instructions on how to install test updates.
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2016-fd30ad26a9
Comment 20 Fedora Update System 2016-02-07 22:22:36 EST
kernel-4.3.5-300.fc23 has been pushed to the Fedora 23 stable repository. If problems still persist, please make note of it in this bug report.
Comment 21 Fedora Update System 2016-02-23 14:48:45 EST
kernel-4.3.5-200.fc22 has been pushed to the Fedora 22 stable repository. If problems still persist, please make note of it in this bug report.
