Bug 1199727
Summary: | (iwlwifi) fail to flush all tx fifo queues
---|---
Product: | [Fedora] Fedora
Component: | kernel
Version: | 21
Hardware: | x86_64
OS: | Linux
Status: | CLOSED EOL
Severity: | high
Priority: | unspecified
Reporter: | Chris van de Sande <cvandesande>
Assignee: | fedora-kernel-wireless-iwl
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
CC: | anelson, cvandesande, extras-orphan, gansalmon, hamzy, itamar, jonathan, kernel-maint, linville, madhu.chinakonda, mchehab
Flags: | kernel-team: needinfo?
Target Milestone: | ---
Target Release: | ---
Doc Type: | Bug Fix
Story Points: | ---
Last Closed: | 2015-12-02 09:52:25 UTC
Type: | Bug
Regression: | ---
Mount Type: | ---
Documentation: | ---
Category: | ---
oVirt Team: | ---
Cloudforms Team: | ---

Description
Chris van de Sande
2015-03-07 13:55:30 UTC
Created attachment 999343 [details]
dmesg output
I see this as well on a Lenovo Thinkpad W540:
[root@hamzy-tp-w540 ~]# uname -a
Linux hamzy-tp-w540 3.18.8-201.fc21.x86_64 #1 SMP Fri Feb 27 18:18:27 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

This is actually upstream commit a0855054e59b0c5b2b00237fdb5147f7bcc18efb. We probably also want to take 5a12a07e4495d1e4d79382e05c9d6e8b4d9fa4ec (which is a bugfix for the patch above), and 4e6c48e0984e28d064ee8fbc292aee7b7920c507 (which is the same patch/fix for a related set of hardware).

(In reply to John W. Linville from comment #5)
> This is actually upstream commit a0855054e59b0c5b2b00237fdb5147f7bcc18efb.

This is already in 3.18 upstream and therefore in Fedora.

> We probably also want to take 5a12a07e4495d1e4d79382e05c9d6e8b4d9fa4ec
> (which is a bugfix for the patch above), and

That one is in 3.18.y stable as commit e7fd25db8348873b40dfa8ef882758e731de0ae1, which is in 3.18.3, so it is already in Fedora.

> 4e6c48e0984e28d064ee8fbc292aee7b7920c507 (which is the same patch/fix for a
> related set of hardware).

OK, this one went into 3.19 and hasn't been pulled into a stable tree yet. So it's possible that would fix this bug, but it would be nice to confirm that the reporters have the hardware related to that bugfix.

Going to test 3.19 today.

The output from 'lspci -n' is probably sufficient to see if you have the hardware, but testing 3.19 seems like a good idea too... :-)

[root@hamzy-tp-w540 ~]# lspci -n
00:00.0 0600: 8086:0c04 (rev 06)
00:01.0 0604: 8086:0c01 (rev 06)
00:02.0 0300: 8086:0416 (rev 06)
00:03.0 0403: 8086:0c0c (rev 06)
00:14.0 0c03: 8086:8c31 (rev 04)
00:16.0 0780: 8086:8c3a (rev 04)
00:16.3 0700: 8086:8c3d (rev 04)
00:19.0 0200: 8086:153a (rev 04)
00:1a.0 0c03: 8086:8c2d (rev 04)
00:1b.0 0403: 8086:8c20 (rev 04)
00:1c.0 0604: 8086:8c10 (rev d4)
00:1c.1 0604: 8086:8c12 (rev d4)
00:1c.2 0604: 8086:8c14 (rev d4)
00:1c.4 0604: 8086:8c18 (rev d4)
00:1d.0 0c03: 8086:8c26 (rev 04)
00:1f.0 0601: 8086:8c4f (rev 04)
00:1f.2 0106: 8086:8c03 (rev 04)
00:1f.3 0c05: 8086:8c22 (rev 04)
01:00.0 0300: 10de:0ff6 (rev a1)
02:00.0 0805: 1217:8520 (rev 01)
03:00.0 0280: 8086:08b2 (rev 83)

I built and installed 3.19.1 anyway, so far it's working but it's a bit too soon to tell.

[root@t430 cvandesande]# lspci -n
00:00.0 0600: 8086:0154 (rev 09)
00:02.0 0300: 8086:0166 (rev 09)
00:14.0 0c03: 8086:1e31 (rev 04)
00:16.0 0780: 8086:1e3a (rev 04)
00:19.0 0200: 8086:1502 (rev 04)
00:1a.0 0c03: 8086:1e2d (rev 04)
00:1b.0 0403: 8086:1e20 (rev 04)
00:1c.0 0604: 8086:1e10 (rev c4)
00:1c.1 0604: 8086:1e12 (rev c4)
00:1c.2 0604: 8086:1e14 (rev c4)
00:1d.0 0c03: 8086:1e26 (rev 04)
00:1f.0 0601: 8086:1e55 (rev 04)
00:1f.2 0106: 8086:1e03 (rev 04)
00:1f.3 0c05: 8086:1e22 (rev 04)
02:00.0 0880: 1180:e822 (rev 07)
03:00.0 0280: 8086:0891 (rev c4)
[root@t430 cvandesande]#

Well, it looks like Chris van de Sande's 8086:0891 device is a dvm device. This should have been covered by the two patches already in the 3.18.y stream. I don't find any listing for Mark Hamzy's 8086:08b2 device. Mark, is that device even working for you at all?

Yes! After I rebooted, I had a lot of problems connecting to the network. And I would see a lot of messages from iwlwifi, cfg80211, and wlp3s0 in dmesg. But now the network seems to have stabilized. This machine is a Lenovo Thinkpad W540.

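For anyone else checking their hardware against this report, the wireless device can be picked out of 'lspci -n' output mechanically. The following is only a rough sketch, not something posted in this bug: the function name `network_controllers` and the sample text are made up for illustration, and it assumes the wireless card shows up under PCI class 0280 (network controller, other), as it does in both listings above.

```python
import re

# One line of `lspci -n` output: slot, class code, vendor:device, optional revision.
LSPCI_LINE = re.compile(
    r"^(?P<slot>\S+)\s+(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
    r"(?:\s+\(rev (?P<rev>[0-9a-f]{2})\))?"
)


def network_controllers(lspci_output, pci_class="0280"):
    """Return (slot, 'vendor:device') pairs for devices of the given PCI class.

    Hypothetical helper: the default class 0280 is an assumption based on
    the two listings in this bug, not a general rule for all wireless cards.
    """
    found = []
    for line in lspci_output.splitlines():
        match = LSPCI_LINE.match(line.strip())
        if match and match.group("cls") == pci_class:
            device_id = f"{match.group('vendor')}:{match.group('device')}"
            found.append((match.group("slot"), device_id))
    return found


# A few lines copied from the W540 listing in this bug.
SAMPLE = """\
00:19.0 0200: 8086:153a (rev 04)
01:00.0 0300: 10de:0ff6 (rev a1)
03:00.0 0280: 8086:08b2 (rev 83)
"""

if __name__ == "__main__":
    for slot, device_id in network_controllers(SAMPLE):
        print(slot, device_id)
```

Feeding it the real output (for example via `subprocess.run(["lspci", "-n"], capture_output=True, text=True).stdout`) gives the vendor:device pair to quote in a comment here.
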
Interesting...could you attach the output of "modinfo iwlwifi" and the output of "ethtool -i wlp3s0"?

Created attachment 999980 [details]
Output for comment 13

OK, I see it now -- need to renew my source code search training... ;-) Mark, your device is in the MVM category and hopefully it will benefit from the 4e6c48e098 patch.

Josh, I could test a scratch koji build, if you would be willing to provide one. It is really easy to download it and install/remove it as an rpm...

Just confirmed the problem still happens in 3.19.1, as expected. Interesting though, it took a few hours for the error to manifest instead of a few minutes. Though it could be due to today's wifi climate in my area.

Created attachment 1000098 [details]
dmesg output kernel 3.19.1
I'll send a tentative fix tomorrow.

Created attachment 1000162 [details]
fix
Please test this patch.

Thanks Emmanuel! Patch applied, running now but it's getting a little late for me. Will report back tomorrow with results.

Never mind, it just happened with the patch :(

Created attachment 1000199 [details]
dmesg with patched kernel

Yes, this is because I forgot that this patch relies on 3b24f4c65386dc0f2efb41027bc6e410ea2c0049. Can you please take 3b24f4c65386dc0f2efb41027bc6e410ea2c0049 as well?

I'm unable to apply 3b24f4c65386dc0f2efb41027bc6e410ea2c0049. I get:
1 out of 1 hunk FAILED -- saving rejects to file net/mac80211/cfg.c.rej
1 out of 1 hunk FAILED -- saving rejects to file net/mac80211/ieee80211_i.h.rej
1 out of 1 hunk FAILED -- saving rejects to file net/mac80211/tx.c.rej
2 out of 3 hunks FAILED -- saving rejects to file net/mac80211/util.c.rej
I'm using the Fedora kernel, 3.18.8.

Sorry, but I won't port this patch. It is not stable material anyway. Can you please test on 4.0 with the patch attached?

No problem, will report back.

OK, so I built 4.0-rc3. The patch 3b24f4c65386dc0f2efb41027bc6e410ea2c0049 already seemed to be in, so I only applied the patch from comment 20. I then booted the new kernel, connected and started a background "ping 8.8.8.8". I've been running the ping ever since I started having this problem, just to see whether I've really lost my connection. I then watched some YouTube for a short while until I lost connectivity. While I didn't get the "failed to flush" error, I did get the other symptom:
64 bytes from 8.8.8.8: icmp_seq=1477 ttl=59 time=52.8 ms
64 bytes from 8.8.8.8: icmp_seq=1478 ttl=59 time=318 ms
64 bytes from 8.8.8.8: icmp_seq=1479 ttl=59 time=6.41 ms
64 bytes from 8.8.8.8: icmp_seq=1480 ttl=59 time=21.3 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
I waited a few minutes to see if it would recover on its own, but after about 2 minutes of no connectivity, I forced NetworkManager to reconnect. It succeeded and connectivity was restored.

Created attachment 1000395 [details]
dmesg 4.0-rc3 with patch
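
The background-ping check described above can also be tallied from a saved ping log instead of watched by eye. This is a minimal sketch and not part of the original report: the function name `summarize_ping` and the returned dictionary keys are invented for illustration, and it only recognizes the two kinds of lines quoted in this comment (normal replies and the "No buffer space available" send error, which is the text for ENOBUFS).

```python
import re

# Matches the sequence number in a normal reply line such as
# "64 bytes from 8.8.8.8: icmp_seq=1480 ttl=59 time=21.3 ms".
REPLY_SEQ = re.compile(r"icmp_seq=(\d+)")

# Error text printed by ping when sendmsg() fails with ENOBUFS.
ENOBUFS_TEXT = "No buffer space available"


def summarize_ping(lines):
    """Count replies and ENOBUFS send errors in captured ping output.

    Hypothetical helper for illustration; lines of any other kind are ignored.
    """
    replies = 0
    send_errors = 0
    last_seq = None
    for line in lines:
        match = REPLY_SEQ.search(line)
        if match and "bytes from" in line:
            replies += 1
            last_seq = int(match.group(1))
        elif ENOBUFS_TEXT in line:
            send_errors += 1
    return {"replies": replies, "send_errors": send_errors, "last_seq": last_seq}


# The ping output quoted in the comment above.
SAMPLE = """\
64 bytes from 8.8.8.8: icmp_seq=1477 ttl=59 time=52.8 ms
64 bytes from 8.8.8.8: icmp_seq=1478 ttl=59 time=318 ms
64 bytes from 8.8.8.8: icmp_seq=1479 ttl=59 time=6.41 ms
64 bytes from 8.8.8.8: icmp_seq=1480 ttl=59 time=21.3 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
"""

if __name__ == "__main__":
    print(summarize_ping(SAMPLE.splitlines()))
```

The last sequence number that got a reply, together with the count of send errors after it, gives a rough idea of when the link stalled and for how long.
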

You have disconnections here... not much I can do about that. Is the system behaving better now?

Seems much less frequent, but still happened twice a few minutes apart.

Created attachment 1000711 [details]
dmesg 4.0-rc3 patched

OK, in the case it does happen, I can't do much. This is a firmware / environment problem. The patch is on its way upstream.

Well, you're certainly right about the environment. I never had this problem until I moved into my current dwelling. So while it's true the problem still occurred, last night was by far the best night I've had since I moved here in terms of connectivity. Instead of getting drops every few minutes, I only had 2 since installing 4.0 with your patch. Where that leaves us in terms of this bug I'm not sure. Do we consider this an upstream fix?

Yup, it is already on its way: https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/iwlwifi-fixes.git/commit/?id=a3a0a5992e47869232cffcb02b7d32fe5204ac7c

FWIW, I am removing Intel from here since we've done what we could. Other issues would be a firmware problem, and we don't have firmware support for these devices.

Just to post an update, the bug really does seem to be related to the environment. Last night I tried to watch a streaming movie and got the "failed to flush" error several times. That's with the patch, running the 4.0-rc3 kernel. So I have to conclude that it's an Intel firmware bug, on hardware that they no longer support. I do appreciate Emmanuel's effort though; it seems his hands are tied. Normally, I would change wireless cards at this point, but my Lenovo T430 whitelists only a select few wireless cards, none of which are supported by Intel anymore. Not Red Hat's problem, I know. The best way to fix this problem is to get a stronger signal. Move closer or get a new AP. This is harder to do in a corporate environment, but referencing this bug might suffice in requesting a different machine. So unless anyone else has anything to add, this bug can probably be closed as WONTFIX, though the upstream patches do improve the situation.

John, is the Fedora 21 kernel going to get all of the patches mentioned in this bugzilla? Does anyone know what exactly the firmware problem is?

I'll have to defer to Jarod on that one. The patch is not marked for -stable, so it probably won't get into Fedora 21 by default (unless/until Fedora 21 gets a 4.0 kernel)...

Jarod probably doesn't want to be messing with Fedora kernels. F21 will get a 4.0 rebase around the time 4.0.1 comes out.

Josh -- brain fart! At least you knew who I meant... ;-)

*********** MASS BUG UPDATE **************

We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 21 kernel bugs.

Fedora 21 has now been rebased to 3.19.5-200.fc21. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 22, and are still experiencing this issue, please change the version to Fedora 22.

If you experience different issues, please open a new bug report for those.

This message is a reminder that Fedora 21 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 21. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '21'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 21 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Original reporter here. Later kernel versions did improve the situation, but it still occurred regularly. I've since moved to a new country and am no longer in the environment that caused the issue in the first place, so I'm no longer able to reproduce it. I'll leave it to you Fedora guys to decide what you want to do about it. I do appreciate the help and support I got from everyone. Thank you guys!

Fedora 21 changed to end-of-life (EOL) status on 2015-12-01. Fedora 21 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug.

Thank you for reporting this bug and we are sorry it could not be fixed.