Bug 451192 - USB Subsystem (or usb-mousedriver) causes hardlocks
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: kernel
Version: 9
Hardware: i686 Linux
Priority: low, Severity: low
Assigned To: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2008-06-13 07:03 EDT by Thomas Janssen
Modified: 2008-07-22 14:33 EDT (History)
Doc Type: Bug Fix
Last Closed: 2008-07-19 17:17:25 EDT
Attachments
This is the make bzImage output with the warnings and errors (132.81 KB, text/plain)
2008-06-15 08:27 EDT, Thomas Janssen
These are the warnings from make oldconfig if needed or interesting. (11.57 KB, text/plain)
2008-06-15 08:44 EDT, Thomas Janssen
That is the dmidecode output of my box. (8.55 KB, text/plain)
2008-06-15 08:49 EDT, Thomas Janssen
And that is the lsusb output. (371 bytes, text/plain)
2008-06-15 08:51 EDT, Thomas Janssen
That is the lspci -vv output of my bad box. (10.55 KB, text/plain)
2008-06-18 10:00 EDT, Thomas Janssen
These are a few commits from 2.6.24 changelog @ kernel.org (3.55 KB, text/plain)
2008-06-18 10:05 EDT, Thomas Janssen
lspci and lsusb output from my machine (51.78 KB, text/plain)
2008-07-08 17:19 EDT, Silivrenion

Description Thomas Janssen 2008-06-13 07:03:58 EDT
Description of problem:
I have been trying to track these hardlocks down for months now. They started
with the 2.6.24 kernel series and are still there with the 2.6.25 series.
The problem exists on my Fujitsu-Siemens laptop with an AMD XP-M 2500 CPU (not just
on laptops; there are others on #fedora who have the same problem).
It seems that I finally found what causes this. If you have a Logitech USB mouse
plugged in, the system will totally hardlock (no ssh in; if you are already ssh'd
in, no response) after anywhere between 2 minutes and 18-36 hours. Out of the blue.
Nothing to find in any log.
There is one more person who has the same problem. He will add himself when he is
back home after the weekend.

Version-Release number of selected component (if applicable):
The complete kernel series 2.6.24 and 2.6.25.
The older 2.6.23 series works flawlessly.

How reproducible:
Have the mouse plugged in, work on your computer, and wait until it freezes.

Actual results:
Complete system freeze (hardlock) out of the blue.

Expected results:
No system freeze (hardlock)

Additional info:
lsusb output of the mouse:
Bus 002 Device 002: ID 046d:c03d Logitech, Inc. M-BT69a Pilot Optical Mouse

For testing over the last weeks I have plugged and unplugged USB hard disks
(Maxtor, WD Elements), a DVB-T USB box (Avermedia), and a Brother HL-5130 USB
printer. They seem to work normally and do not hardlock the system. I will now
try one last recording to see if the system hardlocks during it. Sorry for my bad
English.
Comment 1 Thomas Janssen 2008-06-13 17:03:48 EDT
OK, the mouse was not plugged in. I had irssi running (#fedora), FF3, Totem
streaming Tiesto's latest podcast, and Kaffeine recording to my USB hard disk from
my Avermedia DVB-T box. I was trying to bring Totem to the front when Xfce froze.
The sound kept streaming for about 30 seconds, then started repeating the
last 1/4 second in a stutter, and that was it. Complete freeze. No way in, nothing
out. The magic SysRq keys (REISUB) do not work (I have them enabled).
The only other thing I remember is that I use the powersave governor
(most of the time), and one freeze came right after I changed the governor to ondemand.

This really drives me nuts.
Comment 2 Thomas Janssen 2008-06-15 08:27:42 EDT
Created attachment 309392 [details]
This is the make bzImage output with the warnings and errors
Comment 3 Thomas Janssen 2008-06-15 08:40:44 EDT
I am now trying to find the patch that causes the hardlocks with git bisect.

git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git linux-git
cd linux-git
git bisect start
git bisect bad v2.6.24
git bisect good v2.6.23

The first kernel run was perfect: I booted into the kernel and it took me just 5
minutes to reproduce the hardlock. Heavy USB use: recording from the USB DVB-T box
to the USB hard disk, plus USB mouse and USB wifi.

I then restarted the box, booted into my working kernel, and ran:
git bisect bad

Then I ran:
make mrproper
(copied the .config in fresh for make oldconfig)
make oldconfig
make dep (it told me:)
 scripts/kconfig/conf -s arch/i386/Kconfig
*** Warning: make dep is unnecessary now.
make clean
make bzImage

During make bzImage I got lots of warnings and errors (see attachment id=309392).

Even after git bisect skip (3 times) and the same steps to compile the next
kernel, the build stops at the same place.
Google finds nothing for the block/compat_ioctrl.c errors, so I have no idea what
to do next to be able to compile the next kernel.

Please help me to help you to help me and others with the same problem.

thomasj

P.S. Sorry that the attachment was posted before the comment.
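
A rough sketch of one way to handle revisions that do not compile during the bisect described above: run the build through a small helper that reports whether the checked-out revision is testable, and skip it otherwise. The helper name build_verdict and the wrapper script build_kernel.sh are assumptions made for illustration; only git bisect skip itself comes from the steps above.

```shell
# Run the given build command for the revision that git bisect checked out.
# Build output is sent to stderr so that stdout carries only the verdict:
#   "testable" if the build succeeded, "skip" if it failed.
build_verdict() {
    if "$@" >&2; then
        echo testable
    else
        echo skip
    fi
}

# Possible use inside the kernel tree during a bisect. build_kernel.sh is a
# hypothetical script wrapping the make steps listed in this comment:
#   if [ "$(build_verdict ./build_kernel.sh)" = skip ]; then
#       git bisect skip
#   fi
```

This only automates the skip decision. Booting the kernel and judging it good or bad still has to be done by hand.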
Comment 4 Thomas Janssen 2008-06-15 08:44:48 EDT
Created attachment 309393 [details]
These are the warnings from make oldconfig if needed or interesting.
Comment 5 Thomas Janssen 2008-06-15 08:49:42 EDT
Created attachment 309394 [details]
That is the dmidecode output of my box.
Comment 6 Thomas Janssen 2008-06-15 08:51:43 EDT
Created attachment 309395 [details]
And that is the lsusb output.
Comment 7 Albert69 2008-06-15 12:28:55 EDT
I have the same hardlock here, at random after anything from 5 minutes to 40 hours,
with different apps running, whether or not I am using the PC (and with or without
the nvidia driver)
"- AMD® AM2+ Quad-Core CPU/AM2 CPU Support
- AMD 770 / SB600 chipset
- HyperTransport 3.0 and PCIe 2.0 Ready
- Dual Channel DDR2 1066
- ASUS Q-Shield"
ATA hard disk with F9 installed, but 3 SATA hard disks present in the system, 1 SATA DVD drive
Atheros wifi internal PCI card (but 99.9% of the time I am connected by a
wired one)
nvidia PCI card (I don't remember if it is an 8600 or 8400; let me know if it is important)
2gb of ram
usb devices:
logitech mx300 mouse
bluetooth usb device
tdk lpcw-50 (used with virtualbox; if there is any linux app for using it, that
would be well appreciated too)
logitech internet navigator keyboard

I had another system freeze just a few minutes ago (running the newest Fedora
kernel, 2.6.25.6-55.fc9.i686).

Here too, there is nothing to find in messages, SysRq does not work, and it is
not possible to connect over ssh.

So far I haven't tried detaching the USB mouse.



Comment 8 Albert69 2008-06-15 12:30:15 EDT
lsusb
Bus 001 Device 006: ID 0ea0:2126 Ours Technology, Inc. 7-in-1 Card Reader
Bus 001 Device 008: ID 0a5c:2101 Broadcom Corp. A-Link BlueUsbA2 Bluetooth
Bus 001 Device 007: ID 05e3:0606 Genesys Logic, Inc. USB 2.0 Hub / D-Link DUB-H4 USB 2.0 Hub
Bus 001 Device 003: ID 05e3:0606 Genesys Logic, Inc. USB 2.0 Hub / D-Link DUB-H4 USB 2.0 Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 006 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 005 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 003 Device 003: ID 046d:c024 Logitech, Inc. MX300 Optical Mouse
Bus 003 Device 002: ID 07cf:4007 Casio Computer Co., Ltd CW50 Device
Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Bus 002 Device 002: ID 046d:c309 Logitech, Inc. Internet Keyboard
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
Comment 9 Albert69 2008-06-15 12:35:58 EDT
lspci
00:00.0 Host bridge: ATI Technologies Inc RX780/RX790 Chipset Host Bridge
00:02.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (external gfx0 port A)
00:06.0 PCI bridge: ATI Technologies Inc RD790 PCI to PCI bridge (PCI express gpp port C)
00:12.0 SATA controller: ATI Technologies Inc SB600 Non-Raid-5 SATA
00:13.0 USB Controller: ATI Technologies Inc SB600 USB (OHCI0)
00:13.1 USB Controller: ATI Technologies Inc SB600 USB (OHCI1)
00:13.2 USB Controller: ATI Technologies Inc SB600 USB (OHCI2)
00:13.3 USB Controller: ATI Technologies Inc SB600 USB (OHCI3)
00:13.4 USB Controller: ATI Technologies Inc SB600 USB (OHCI4)
00:13.5 USB Controller: ATI Technologies Inc SB600 USB Controller (EHCI)
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 14)
00:14.1 IDE interface: ATI Technologies Inc SB600 IDE
00:14.2 Audio device: ATI Technologies Inc SBx00 Azalia
00:14.3 ISA bridge: ATI Technologies Inc SB600 PCI to LPC Bridge
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: nVidia Corporation GeForce 8500 GT (rev a1)
02:00.0 Ethernet controller: Attansic Technology Corp. L1 Gigabit Ethernet Adapter (rev b0)
03:06.0 Ethernet controller: Atheros Communications Inc. AR5212/AR5213 Multiprotocol MAC/baseband processor (rev 01)
Comment 10 Thomas Janssen 2008-06-17 02:39:17 EDT
kernel-2.6.23.15-137.fc8
kernel-2.6.24.3-12.fc8
kernel-2.6.24.3-34.fc8
kernel-2.6.24.3-50.fc8
kernel-2.6.24.4-64.fc8
kernel-2.6.24.5-85.fc8
kernel-2.6.24.7-92.fc8
kernel-2.6.25.4-10.fc8

At the top is the last working kernel from F8; the rest (2.6.24 series) hardlock.
The same goes for the 2.6.25 series, and the same in F9, so I will change the topic
to F9. Just in case :)
Comment 11 Eric Sandeen 2008-06-17 09:37:03 EDT
It would probably be good to try to capture kernel messages at the time of the
hardlock, perhaps you are oopsing?  You could do this with a serial console or
if you're lucky maybe there is something in /var/log/messages.

You could also try some of the earlier fedora 2.6.24 kernels:

http://kojipkgs.fedoraproject.org/packages/kernel/2.6.24/

for example 0.38.rc2.git6.fc9, and sort of bisect your way through by installing
& booting those (if you're lucky and that 0.38 kernel does work for you, find
the first version that stops working)

-Eric
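
The package-level bisection suggested here can be sketched as a small helper that picks the next build to install. It assumes a hand-made file (builds.txt, a hypothetical name) listing the available kernel builds one per line, oldest first; the function name next_build_to_test is likewise made up for this example.

```shell
# Print the build to try next.
#   $1  file with one kernel build per line, oldest first
#   $2  line number of the newest build known to work
#   $3  line number of the oldest build known to hardlock
# Prints nothing once the two are adjacent, i.e. when line $3 is the
# first build that stops working.
next_build_to_test() {
    list=$1
    good=$2
    bad=$3
    if [ $((bad - good)) -gt 1 ]; then
        sed -n "$(( (good + bad) / 2 ))p" "$list"
    fi
}

# Example with placeholder names instead of real package versions:
#   printf '%s\n' build-a build-b build-c build-d build-e > builds.txt
#   next_build_to_test builds.txt 1 5    # prints build-c
```

After installing and booting the printed build, the good or bad line number is moved to that build's line and the helper is run again.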
Comment 12 Stefan Richter 2008-06-17 10:24:49 EDT
About bisection:

It is alas an often encountered problem that kernels contain build bugs between
2.6.x and 2.6.(x+1)-rc1.  It's rarer in later -rc's.  You can of course try to
minimize such build failures by reducing the configuration to the bare minimum
that you need to boot and reproduce the bug.  But you probably already did so in
order to reduce compile time.


About capturing crash logs:

A lesser alternative to serial console is netconsole.  Useful if you have a 2nd
PC and can connect both via Ethernet.  There are different ways to capture
netconsole output.  I use as 2nd PC one with syslog-ng and Gentoo; I add (or
remove, to get rid of it again) the line

    udp(ip("192.168.0.42") port(6666));

in the source section for my main log in /etc/syslog-ng/syslog-ng.conf.  Then
make it active with
# /etc/init.d/syslog-ng reload

On the crashing PC, I run
# modprobe netconsole netconsole="@/,@192.168.0.42/"
# dmesg -n 8
then provoke the crash.

The IP address is the one of the 2nd PC with the logger.  The "-n 8" will cause
kernel messages of all log levels, including debug level, to go to the virtual
console and to the netconsole.

You find the netconsole build configuration option in "Device Drivers" -
"Network device support" - "Network console logging support (EXPERIMENTAL)".
Documentation is in linux/Documentation/networking/netconsole.txt.
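
The netconsole= string in the modprobe line above leaves most fields empty so that the kernel defaults apply. As a sketch, the helper below (netconsole_param is a made-up name) assembles that string from the target IP plus an optional target port and local interface, following the field layout given in netconsole.txt. Listening with netcat on the second PC is shown only as a possible alternative to syslog-ng; the exact netcat options depend on the variant installed.

```shell
# Build the value for the netconsole= module parameter.
# Field layout from netconsole.txt:
#   [src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]
# Empty fields fall back to the kernel defaults (target port 6666,
# interface eth0, broadcast MAC), as in the modprobe line above.
netconsole_param() {
    tgt_ip=$1          # IP address of the logging PC (required)
    tgt_port=${2:-}    # optional target UDP port
    dev=${3:-}         # optional local network interface
    printf '@/%s,%s@%s/\n' "$dev" "$tgt_port" "$tgt_ip"
}

# On the crashing PC, as root:
#   modprobe netconsole netconsole="$(netconsole_param 192.168.0.42)"
#   dmesg -n 8
#
# On the logging PC, instead of syslog-ng:
#   nc -u -l -p 6666      # some netcat variants want: nc -u -l 6666
```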
Comment 13 Thomas Janssen 2008-06-18 09:58:21 EDT
OK, thanks for the suggestions :) Right now I am on netconsole and I see that not
all messages come through the way they appear in dmesg. I am not recording from my
USB DVB-T box, so the system has now been up for 20 hours.

I have done some research in the meantime and will paste some changes I found in
the kernel changelog from kernel.org.

I got an email from Patrick Boettcher, the author of the usb_dvb_a800
driver (for my AverMedia DVB-T box), saying that the driver was not written from
reverse engineering.
That rules out some surprises.

Attached will be the lspci -vv output and some commits collected in one file.
Comment 14 Thomas Janssen 2008-06-18 10:00:41 EDT
Created attachment 309731 [details]
That is the lspci -vv output of my bad box.
Comment 15 Thomas Janssen 2008-06-18 10:05:56 EDT
Created attachment 309732 [details]
These are a few commits from 2.6.24 changelog @ kernel.org

I know there are way more USB changes in this changelog, but I am neither a kernel
guru nor a USB subsystem guru. So I have copied what I could imagine might
be the problem.
Comment 16 Thomas Janssen 2008-06-21 08:44:50 EDT
OK, the longest the system ran was 3 days and 3 hours until the next
hardlock happened, and there was no output in netconsole related in any way to the
hardlock. There were about 50 lines of a dvb error while querying for a remote
event, and that's it.

I have now compiled a 2.6.25 kernel from kernel.org using make
oldconfig (with the .config copied from the Fedora kernel). Let's see if it works
without any Fedora-specific patches (if there are any).
Comment 17 Thomas Janssen 2008-06-22 10:46:01 EDT
Well... the 2.6.25 from kernel.org hardlocks the box as well. I'm running
out of ideas. I will try one of the first 2.6.24 builds from koji to test if the
problem is there as well, and one of the 2.6.26-rc builds too if I can find one.
I feel kind of lonely here. All the others who have the same problem don't report
it here. They just mention it in #fedora but don't file a bug.
Comment 18 Albert69 2008-06-23 08:00:49 EDT
I don't have any DVB devices, but the problem on my PC seems exactly the same.

It could be interesting to run the same test here.
When I'm back home I will try to set up this netconsole,
or a better logging method if one exists.

Comment 19 Thomas Janssen 2008-06-28 08:21:23 EDT
Well... I'm off this bug report. I don't run Fedora anymore. I have had openSUSE 11
running flawlessly on this machine for about 7 days now (kernel 2.6.25). But thanks
to Eric Sandeen and Stefan Richter for trying to help me.
I guess I will try the Fedora 10 live CD when it is released in a few months and
see what happens.
Comment 20 Stefan Richter 2008-06-28 08:42:19 EDT
If you get bored someday, you could build the openSUSE kernel from source with a
.config from Fedora.  (I suppose you also built the vanilla kernels that crashed
for you with .configs which you derived from Fedora's.)
Comment 21 Thomas Janssen 2008-06-28 10:02:11 EDT
You make me curious again :) Yes, you're right, I built the vanilla kernels with
the .config from Fedora's kernel. I will try that. But not tomorrow: it is the
day of the Euro 2008 final, and we have a party here ;)
Comment 22 Albert69 2008-06-29 08:20:37 EDT
Thanks, let us know the results, so that maybe they can help some developer solve
this issue :(
Regards
Comment 23 Andrew Taylor 2008-07-08 17:02:50 EDT
I have a standard desktop setup, no servers or the like. I don't use any USB
devices, but I do have a USB Logitech trackball with a PS/2 converter. My PC
probably hard locks about once a week, but it's a desktop that isn't on all the
time. Hopefully this info helps some.
Comment 24 Silivrenion 2008-07-08 17:19:09 EDT
Created attachment 311316 [details]
lspci and lsusb output from my machine
Comment 25 Silivrenion 2008-07-08 17:19:37 EDT
(In reply to comment #24)
> Created an attachment (id=311316)
> lspci and lsusb output from my machine
> 

I am having the same issues as described above. System crashes randomly, but
will always crash when using USB at some time.

When a crash occurs, all system input stops. There's a few things that continue
to function:
Power button will bring up the GNOME power options. None of them are selectable
because input won't work.
The USB indicator will flash with data sent/received. I think it's maintaining
connections, but it can't talk to the main system about these connections.

This occurs randomly and at ANY time when a USB device is connected, but
especially when doing data transfer or enumeration.

Attached are the lspci and lsusb outputs for my system.

Fedora 8 - 2.6.25.9-40.fc8
Toshiba Satellite U-205
Comment 26 Chuck Ebbert 2008-07-08 17:35:03 EDT
I think these problems are fixed in the latest update. Please try
2.6.25.10-86.fc9 or 2.6.25.10-47.fc8
Comment 27 Albert69 2008-07-08 19:21:30 EDT
@Silivrenion: here the whole system hangs; e.g. the clock stops too when the hard
lock happens.
@Chuck Ebbert: thanks, I will try it (I am really tired of this issue), but I can
only test and report next week.
Regards
Comment 28 Albert69 2008-07-09 07:08:33 EDT
Yesterday I installed 2.6.25.10-86.fc9 i686, and I will leave the PC up
until Saturday. Until now it was very rare for the PC to stay up more than 24 hours
without a freeze, and nearly impossible to reach 48 hours, so if I find it still up
on Saturday I think the problem is finally solved.

A note: I see a 2.6.25.10-87 (i686 rpm) too, but I cannot install it (I get
an error when running the rpm), so I installed 2.6.25.10-86.fc9 (i686 rpm).
Comment 29 Silivrenion 2008-07-12 20:11:23 EDT
I have just verified that this bug occurs in the default installation of Fedora
9 via LiveCD. I will run yum update to the current updates and test again.
Comment 30 Albert69 2008-07-13 10:31:33 EDT
@Chuck Ebbert
With 2.6.25.10-86.fc9.i686
all is fine, finally this ******* bug SEEMS to have gone away!!!!!
The hard lock was random, but this may be the first time that this PC has run for
more than 48 hours without a freeze with F9 (I will leave another message at the
end of the week to report whether everything is still running fine).
Thanks

Thanks to Thomas Janssen too for the accurate info.





Comment 31 Albert69 2008-07-16 08:20:02 EDT
still running fine!
i think that this bug can be closed :)
Comment 32 Thomas Janssen 2008-07-18 10:57:00 EDT
Hello, I am back on F9 with the mentioned kernel 2.6.25.10-86. The original F9
kernel hardlocked after 5 minutes of heavy PC usage. The mentioned one has been
running fine for some hours now, also under heavy PC usage. I will report again in
a few days on whether it still runs stably.

Whatever you have done, if it stays stable I owe you some beer (or whatever you
drink) :)

Thanks a lot to everybody here for reporting, helping and fixing.
Comment 33 Albert69 2008-07-22 14:33:39 EDT
Bad news :(
It is more solid than before, but another hard lock happened.

In case it helps: it happened (about 40 minutes after the PC was turned on) during
a data copy from a big (8 GB) SDHC flash card to the PC (using a USB SD reader).
I don't know if it is related, but I am reporting it.

Thanks for the support and the attention.
Regards

