Bug 1000679 - System freeze when 'ip link set wlp2s1 up' (rt2800pci)
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 19
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Assigned To: Stanislaw Gruszka
QA Contact: Fedora Extras Quality Assurance
Depends On:
Blocks:
Reported: 2013-08-24 04:39 EDT by Alexei Panov
Modified: 2013-09-22 20:35 EDT

See Also:
Fixed In Version: kernel-3.11.1-300.fc20
Doc Type: Bug Fix
Last Closed: 2013-09-12 20:59:45 EDT
Type: Bug


Attachments
/var/log/messages after system freezes and reboot (125.74 KB, text/plain), 2013-08-24 04:39 EDT, Alexei Panov
dmesg (58.36 KB, text/plain), 2013-08-24 04:40 EDT, Alexei Panov
lspci -knnn (4.01 KB, text/plain), 2013-08-24 04:40 EDT, Alexei Panov
dmidecode (22.05 KB, text/plain), 2013-08-24 04:41 EDT, Alexei Panov
bisect.log (1.08 KB, text/x-log), 2013-08-26 14:09 EDT, Igor Gnatenko
bisect.log (1.81 KB, text/plain), 2013-08-28 04:11 EDT, Igor Gnatenko
bisect.log (3.40 KB, text/plain), 2013-08-28 08:34 EDT, Igor Gnatenko
journalctl from kernel-debug (135.57 KB, text/plain), 2013-08-30 02:53 EDT, Maxim Polyakov
revert_c630ccf1a127578421a928489d51e99c05037054.patch (1.15 KB, text/plain), 2013-09-03 07:57 EDT, Stanislaw Gruszka
0001-rt2800-change-initialization-sequence.patch (1.18 KB, text/plain), 2013-09-04 11:58 EDT, Stanislaw Gruszka (ignatenko: review+)

Description Alexei Panov 2013-08-24 04:39:43 EDT
Created attachment 789798 [details]
/var/log/messages after system freezes and reboot

Description of problem:
When I run the command 'ip link set wlp2s1 up', the system freezes.
02:01.0 Network controller: Ralink corp. RT3060 Wireless 802.11n 1T/1R (see attachments)

Version-Release number of selected component (if applicable):
kernel-3.10.7-200.fc19.x86_64
kernel-3.11.0-0.rc6.git2.2.fc19.R.x86_64 (rebuild from koji)

How reproducible:
Always

Steps to Reproduce:
1. Boot with given wireless adapter
2. run command 'ip link set wlp2s1 up'
3. system freezes

Actual results:
System freezes

Expected results:
The network interface comes up and works as expected

Additional info:
Comment 1 Alexei Panov 2013-08-24 04:40:14 EDT
Created attachment 789799 [details]
dmesg
Comment 2 Alexei Panov 2013-08-24 04:40:55 EDT
Created attachment 789800 [details]
lspci -knnn
Comment 3 Alexei Panov 2013-08-24 04:41:39 EDT
Created attachment 789801 [details]
dmidecode
Comment 4 Maxim Polyakov 2013-08-24 09:51:31 EDT
 
This error also occurs on my computer. If necessary, I can retest or provide additional information.
Comment 5 Alexei Panov 2013-08-25 17:45:28 EDT
I got the same bug on a laptop with brcmsmac.
Downgrading the kernel to 3.9.9 helps.
Comment 6 Igor Gnatenko 2013-08-26 00:42:07 EDT
(In reply to Alexei Panov from comment #5)
> I got the same bug on a laptop with brcmsmac.
> Downgrade kernel to 3.9.9 helps me.
Let me start a bisect (if jwb or jforbes or ... are not opposed).
Comment 7 Maxim Polyakov 2013-08-26 02:08:26 EDT
Downgrading the kernel to 3.9.9 helps me too; I rechecked it today.
With kernel 3.9.9-302.fc19.x86_64 the system does not freeze on 'ip link set wlp2s1 up' (rt2800pci).
Comment 8 Josh Boyer 2013-08-26 11:05:32 EDT
Does this happen with 3.10.9?  There were some minstrel_ht related fixes that went into that kernel.
Comment 9 Igor Gnatenko 2013-08-26 14:09:47 EDT
Created attachment 790642 [details]
bisect.log

(In reply to Josh Boyer from comment #8)
> Does this happen with 3.10.9?  There were some minstrel_ht related fixes
> that went into that kernel.
No. The rt2800pci problem is still present on 3.10.9, but the brcmsmac problem is fixed.
We started a git bisect today but have not finished it yet. Tomorrow I should be able to say which patch causes this regression.
Comment 10 Stanislaw Gruszka 2013-08-27 03:42:23 EDT
Is this a 3.9 -> 3.10 regression or a 3.10.0 -> 3.10.9 regression?

If the latter, it is most likely the problem described here:

http://marc.info/?l=linux-kernel&m=137697458915806&w=2

In that case we also need to disable HT_CCK rates on rt2800, i.e.:

--- a/drivers/net/wireless/rt2x00/rt2800lib.c
+++ b/drivers/net/wireless/rt2x00/rt2800lib.c
@@ -5912,8 +5912,7 @@ static int rt2800_probe_hw_mode(struct rt2x00_dev *rt2x00dev)
            IEEE80211_HW_SUPPORTS_PS |
            IEEE80211_HW_PS_NULLFUNC_STACK |
            IEEE80211_HW_AMPDU_AGGREGATION |
-           IEEE80211_HW_REPORTS_TX_ACK_STATUS |
-           IEEE80211_HW_SUPPORTS_HT_CCK_RATES;
+           IEEE80211_HW_REPORTS_TX_ACK_STATUS;

        /*
         * Don't set IEEE80211_HW_HOST_BROADCAST_PS_BUFFERING for USB devices

Igor, could you check whether that helps? If not, please continue the bisection :-)
Comment 11 Igor Gnatenko 2013-08-27 05:00:49 EDT
(In reply to Stanislaw Gruszka from comment #10)
> Is this 3.9 -> 3.10 regression or 3.10.0 -> 3.10.9 regression ?
> 
> If the later, it is most likely problem described here:
> 
> http://marc.info/?l=linux-kernel&m=137697458915806&w=2
> 
> and we also need to disable HT_CCK rates on rt2800 i.e. 
> 
> --- a/drivers/net/wireless/rt2x00/rt2800lib.c
> +++ b/drivers/net/wireless/rt2x00/rt2800lib.c
> @@ -5912,8 +5912,7 @@ static int rt2800_probe_hw_mode(struct rt2x00_dev
> *rt2x00dev)
>             IEEE80211_HW_SUPPORTS_PS |
>             IEEE80211_HW_PS_NULLFUNC_STACK |
>             IEEE80211_HW_AMPDU_AGGREGATION |
> -           IEEE80211_HW_REPORTS_TX_ACK_STATUS |
> -           IEEE80211_HW_SUPPORTS_HT_CCK_RATES;
> +           IEEE80211_HW_REPORTS_TX_ACK_STATUS;
> 
>         /*
>          * Don't set IEEE80211_HW_HOST_BROADCAST_PS_BUFFERING for USB devices
> 
> Igor, could you check that if that helps, if not please continue bisection
> :-)
OK. I will prepare kernel RPMs with this patch for Maxim. If this does not help, we will continue the bisect.
Comment 12 Igor Gnatenko 2013-08-27 07:00:02 EDT
Maxim, please test koji build:
http://koji.fedoraproject.org/koji/taskinfo?taskID=5859000

Stanislaw, this regression actually appears to date from the beginning of May (between the 1st and the 3rd).
Bisecting: 373 revisions left to test after this (roughly 9 steps)

After testing this kernel we can draw conclusions.
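
The "roughly 9 steps" figure follows from bisection halving the candidate range on every test. A minimal sketch of that arithmetic, assuming a plain halving model; bisect_steps is a hypothetical helper and only approximates git's own estimate:

```shell
#!/bin/sh
# Rough estimate of remaining bisect steps: count how many doublings are
# needed to cover the number of candidate revisions, which equals the
# number of halvings needed to narrow the range down to one revision.
# This is an approximation, not the formula git itself uses.
bisect_steps() {
    remaining=$1
    steps=0
    span=1
    while [ "$span" -lt "$remaining" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 373   # prints 9, matching the bisect output quoted above
```

With the build-and-reboot cycle each test needs, nine more steps means nine more kernels to build and boot.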
Comment 13 Maxim Polyakov 2013-08-27 08:15:42 EDT
With kernel-3.11.0-0.rc7.git0.1rhbz1000679.fc19 (x86_64, http://koji.fedoraproject.org/koji/taskinfo?taskID=5859000), my system still freezes on 'ip link set wlp2s1 up'.
Tomorrow I will continue searching for the commit with bisect.
Comment 14 Igor Gnatenko 2013-08-28 04:11:16 EDT
Created attachment 791276 [details]
bisect.log

Hi again, Stanislaw!
We continued the bisect, and I was wrong.
The bad commit is somewhere between 86feff3f3eb643cc5735d414e46a8201a8c67b8f and 5743756161518f279ad0bd21437713f7bc3f0a81.
That is around March and April, not May.
There were many rt2x00 and mac80211 commits on 17-20 March, but we can't test them all: on Maxim's PC many kernels from those dates do not boot (kernel panic).
Do you have a more precise idea of which commit might be the problem?
Comment 15 Stanislaw Gruszka 2013-08-28 05:33:24 EDT
The only thing I know of that could help is "git bisect skip", but it seems you already know about it. Did you also try limiting the bisection to the relevant directories, i.e. "git bisect start -- net/mac80211/ net/wireless/ drivers/net/wireless/rt2x00/"? If not, you can try that.

But perhaps we can debug this the traditional way, i.e. get the kernel messages that show what is going wrong.

Maxim, please install kernel-debug, boot into it, switch from X to a virtual console (e.g. Ctrl+Alt+F3) and run the 'ip link set wlp2s1 up' command. It should print some messages before the kernel hangs; please take a photo of the screen and attach it here.
Comment 16 Maxim Polyakov 2013-08-28 06:48:37 EDT
I build the kernel and install it without an RPM package (to save time), which is why I don't know how to get kernel-debug. I will ask Igor Gnatenko about this. As for the test itself: I always test in a console (Ctrl+Alt+F2), and there is no message, the system just freezes.

Now I see:

Bisecting: 28 revisions left to test after this (roughly 5 steps)
[345fb3f8efe5a4bf4f0e34e0a988b1482cbe9aa2] Merge tag 'for-linville-20130318' of git://github.com/kvalo/ath6kl

:)
Comment 17 Igor Gnatenko 2013-08-28 06:53:31 EDT
(In reply to Stanislaw Gruszka from comment #15)
> Only thing I know which could help is "git bisect skip", but seems you know
> about it already. Did you also try to limit bisection to drivers i.e. "git
> bisect start -- net/mac80211/ net/wireless/ drivers/net/wireless/rt2x00/" ?
> If not you can try that.
Yes, we used skip. I didn't know about limiting the bisection to the driver directories. Does it work with a bisection that has already started? Won't we lose the current results?
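
One possible way to add a path limit without losing earlier verdicts is to save the bisect log, start over with the limit, and re-mark the commits that were already tested. A rough sketch, assuming the output of "git bisect log" contains one "git bisect good|bad|skip <commit>" line per verdict; restrict_bisect_log and the /tmp file names are hypothetical, and the paths are assumed to contain no single quotes:

```shell
#!/bin/sh
# Sketch: turn a saved "git bisect log" into a small script that restarts
# the bisection limited to the given paths and re-applies every verdict
# (good/bad/skip) that was already recorded.
# restrict_bisect_log is a hypothetical helper, not a git command.
restrict_bisect_log() {
    old_log=$1
    shift
    # New start line, limited to the paths given as remaining arguments.
    printf 'git bisect start --'
    for path in "$@"; do
        printf " '%s'" "$path"
    done
    printf '\n'
    # Keep only the recorded verdicts; drop comments and the old start line.
    grep -E '^git bisect (good|bad|skip) ' "$old_log"
}

# Usage, left commented out because it rewrites the bisection state:
#   git bisect log > /tmp/bisect.old
#   restrict_bisect_log /tmp/bisect.old net/mac80211/ net/wireless/ \
#       drivers/net/wireless/rt2x00/ > /tmp/bisect.new.sh
#   git bisect reset
#   sh /tmp/bisect.new.sh
```

Each re-marked verdict makes git check out a new candidate, so replaying a long log on a kernel tree can take a while.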
Comment 18 Stanislaw Gruszka 2013-08-28 07:40:14 EDT
(In reply to Maxim Polyakov from comment #16)
> I assemble kernel and install it without rpm-pakage (to spare time), that's
> why I don't know how to get kernel-debug.

yum install kernel-debug

and you have to set up grub to boot that kernel
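
A minimal sketch of one way to do the grub step, assuming Fedora's grubby tool is available and that the debug kernel image is installed as /boot/vmlinuz-<release>.debug (an assumption based on the ".debug" release suffix seen later in this bug; check /boot on your system). debug_kernel_path is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: build the path of the debug kernel image for a given release.
# The /boot/vmlinuz-<release>.debug naming is an assumption; verify it
# with "ls /boot" before using the result.
debug_kernel_path() {
    # $1 is a kernel release string such as 3.10.9-200.fc19.x86_64
    echo "/boot/vmlinuz-$1.debug"
}

# Usage (as root), left commented out because it changes boot settings:
#   yum install kernel-debug
#   grubby --set-default="$(debug_kernel_path 3.10.9-200.fc19.x86_64)"
#   grubby --default-kernel    # confirm the new default before rebooting
```

Alternatively, the debug kernel can simply be picked by hand from the grub menu at boot, which avoids changing the default.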
Comment 19 Igor Gnatenko 2013-08-28 08:34:01 EDT
Created attachment 791376 [details]
bisect.log

We found the bad commit.
I will soon build an RPM package with this commit reverted.
c630ccf1a127578421a928489d51e99c05037054 is the first bad commit
commit c630ccf1a127578421a928489d51e99c05037054
Author: Stanislaw Gruszka <stf_xl@wp.pl>
Date:   Sat Mar 16 19:19:46 2013 +0100

    rt2800: rearrange bbp/rfcsr initialization
    
    This makes order of initialization of various registers similar like
    on vendor driver.
    
    Based on:
    NICInitializeAsic()
    RT5592LoadRFNormalModeSetup()
    
    from:
    DPO_RT5572_LinuxSTA_2.6.1.3_20121022/common/rtmp_init.c
    DPO_RT5572_LinuxSTA_2.6.1.3_20121022/chip/rt5592.c
    
    Signed-off-by: Stanislaw Gruszka <stf_xl@wp.pl>
    Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>

:040000 040000 69266bfa97e9e808257f9cfa7196f00222410ebf 7e18f08faa95bb3dd7159381d6085e5b5287809c M      drivers
Comment 20 Igor Gnatenko 2013-08-28 13:35:28 EDT
I can't revert this commit: too many later commits touch this file.
Stanislaw, what should we do?
Comment 21 Stanislaw Gruszka 2013-08-29 10:12:25 EDT
Hmm, the bad commit does not look like it could cause freezes. Since a revert is not possible, you can verify the bisection result by:

# git checkout -b test1 c630ccf1a127578421a928489d51e99c05037054
This one should freeze.

# git checkout -b test2 c630ccf1a127578421a928489d51e99c05037054~1
If bisection was correct, this one should work ok.

Even if bisection was correct, I'm not really sure where the problem is ...

Maxim, did you try installing the debug kernel? Does it print any messages before the freeze?
Comment 22 Maxim Polyakov 2013-08-30 01:53:02 EDT
"test1" - system freeze
"test2" - Ok

"RFRemix, with Linux 3.10.9-200.fc19.x86_64.debug" - system freeze without any messages before freeze
http://s018.radikal.ru/i503/1308/1f/b54bb725c1a5.png
Comment 23 Maxim Polyakov 2013-08-30 02:53:43 EDT
Created attachment 792041 [details]
journalctl from kernel-debug
Comment 24 Stanislaw Gruszka 2013-09-03 07:57:49 EDT
Created attachment 793152 [details]
revert_c630ccf1a127578421a928489d51e99c05037054.patch

Here is a patch that reverts the non-RT5592-specific part of commit c630ccf1a127578421a928489d51e99c05037054. It applies on top of 3.11. Does it prevent the freeze?
Comment 25 Igor Gnatenko 2013-09-03 10:05:16 EDT
Applied patch to upstream 3.11.0.
http://ignatenkobrain.fedorapeople.org/kernel/rhbz1000679/
Maxim, please test this kernel.
Comment 26 Maxim Polyakov 2013-09-04 03:18:01 EDT
With that kernel everything works.
Comment 27 Stanislaw Gruszka 2013-09-04 11:58:42 EDT
Created attachment 793744 [details]
0001-rt2800-change-initialization-sequence.patch

Modified revert - proposed fix. It should be the final patch to test.

Maxim, does it still work for you? Kernel build with the patch is here:

http://koji.fedoraproject.org/koji/taskinfo?taskID=5893626
Comment 28 Maxim Polyakov 2013-09-05 08:27:32 EDT
"RFRemix, with Linux 3.9.9-302.fc19.x86_64" - system freeze without any messages before freeze
"RFRemix, with Linux 3.9.9-302.fc19.x86_64.debug" - system freeze without any messages before freeze
Comment 29 Maxim Polyakov 2013-09-05 08:54:44 EDT
Sorry,
where I wrote 3.9.9-302.fc19.x86_64 I meant kernel-3.10.10-200.
I made a mistake when writing the report.
Comment 30 Stanislaw Gruszka 2013-09-05 11:06:45 EDT
(In reply to Maxim Polyakov from comment #29)
> Sorry, 
> 3.9.9-302.fc19.x86_64 = kernel-3.10.10-200
> I was wrong when he wrote the report
I understand that the kernel from comment 27 (kernel-3.10.10-200.bz1000679.fc19) freezes, correct?
Comment 31 Igor Gnatenko 2013-09-05 12:35:34 EDT
(In reply to Stanislaw Gruszka from comment #30)
> (In reply to Maxim Polyakov from comment #29)
> > Sorry, 
> > 3.9.9-302.fc19.x86_64 = kernel-3.10.10-200
> > I was wrong when he wrote the report
> I understand that the kernel from comment 27
> (kernel-3.10.10-200.bz1000679.fc19) frezes, correct  ?
I spoke with him on Jabber. That's correct.

Stanislaw, I also think it would be better to change the status to ASSIGNED.
Comment 32 Maxim Polyakov 2013-09-06 01:05:59 EDT
Igor, thank you! Yes, that's right. The system freezes without any messages beforehand with kernel-3.10.10-200 (http://koji.fedoraproject.org/koji/taskinfo?taskID=5893626).
Comment 33 Maxim Polyakov 2013-09-06 08:06:35 EDT
I retested kernel-3.10.10-200: OK.

Everything works in Fedora 19 (Schrödinger’s Cat) with 3.10.10-200.bz1000679.fc19.x86_64.

Before retesting I removed the kernel:
# yum remove kernel*3.10.10*
and reinstalled 3.10.10-200.

I apologize for conducting the earlier test improperly.
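
A mix-up like this can be caught by comparing the running kernel against the build under test before recording a result. A minimal sketch; check_running_kernel is a hypothetical helper, and the release string is the one reported in this comment:

```shell
#!/bin/sh
# Sketch: refuse to record a test result unless the running kernel is the
# build under test. check_running_kernel is a hypothetical helper.
check_running_kernel() {
    expected=$1
    # An optional second argument overrides "uname -r" for dry runs.
    running=${2:-$(uname -r)}
    if [ "$running" = "$expected" ]; then
        echo "OK: running $running"
    else
        echo "MISMATCH: running $running, expected $expected"
        return 1
    fi
}

# Usage: run this first, and only then try 'ip link set wlp2s1 up':
#   check_running_kernel 3.10.10-200.bz1000679.fc19.x86_64
```

The non-zero return status on a mismatch lets the check gate a test script, for example with "check_running_kernel ... || exit 1".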
Comment 34 Igor Gnatenko 2013-09-06 11:37:30 EDT
Comment on attachment 793744 [details]
0001-rt2800-change-initialization-sequence.patch

fixes bug.
Comment 35 Stanislaw Gruszka 2013-09-09 06:48:41 EDT
Patch posted here:
http://marc.info/?l=linux-wireless&m=137872301718879&w=2

This is the 3.11 version; the 3.10 version is attached to comment 27.

Josh, please apply the patch as the fix for this bug.
Comment 36 Josh Boyer 2013-09-09 08:40:40 EDT
Applied, thanks!
Comment 37 Fedora Update System 2013-09-09 15:07:36 EDT
kernel-3.10.11-200.fc19 has been submitted as an update for Fedora 19.
https://admin.fedoraproject.org/updates/kernel-3.10.11-200.fc19
Comment 38 Fedora Update System 2013-09-09 15:10:00 EDT
kernel-3.10.11-100.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/kernel-3.10.11-100.fc18
Comment 39 Fedora Update System 2013-09-10 21:54:01 EDT
Package kernel-3.10.11-100.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing kernel-3.10.11-100.fc18'
as soon as you are able to, then reboot.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-16336/kernel-3.10.11-100.fc18
then log in and leave karma (feedback).
Comment 40 Fedora Update System 2013-09-12 20:59:45 EDT
kernel-3.10.11-200.fc19 has been pushed to the Fedora 19 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 41 Fedora Update System 2013-09-14 18:31:33 EDT
kernel-3.11.1-300.fc20 has been submitted as an update for Fedora 20.
https://admin.fedoraproject.org/updates/kernel-3.11.1-300.fc20
Comment 42 Fedora Update System 2013-09-15 20:23:49 EDT
kernel-3.10.11-100.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 43 Fedora Update System 2013-09-22 20:35:31 EDT
kernel-3.11.1-300.fc20 has been pushed to the Fedora 20 stable repository.  If problems still persist, please make note of it in this bug report.
