Bug 238806 - hal kills Acer Aspire 5610 (TSSTcorpCD/DVDW TS-L632D stops responding)
Status: CLOSED NOTABUG
Product: Fedora
Classification: Fedora
Component: kernel
Version: 8
Hardware: All Linux
Priority: medium, Severity: medium
Assigned To: Kernel Maintainer List
QA Contact: Brian Brock
Reported: 2007-05-02 19:49 EDT by Doncho N. Gunchev
Modified: 2008-02-26 17:06 EST (History)
Doc Type: Bug Fix
Last Closed: 2008-02-26 17:06:25 EST

Attachments
stripped and gzipped /var/log/messages after reboot (8.25 KB, application/x-Gzip)
2007-12-11 14:16 EST, Doncho N. Gunchev
dmesg from 2.6.24-2.fc9 after updating the firmware to SC04 (8.61 KB, application/x-gzip)
2008-02-24 18:00 EST, Doncho N. Gunchev

Description Doncho N. Gunchev 2007-05-02 19:49:10 EDT
Description of problem:
My hardware profile is ebf36068-3809-426c-ade0-d0c531b5b402. The laptop is an Acer
Aspire 5610 with a TSSTcorpCD/DVDW TS-L632D DVD-RW. I have to kill the
hald-addon-storage process that polls the drive, or the machine will freeze
after a while (anywhere from 5-10 minutes to 20-40 hours). The presence of a
DVD/CD in the device does not change anything. Without hald-addon-storage I get
no notification when a new CD/DVD is inserted, but the system can work for weeks
(hibernate, suspend...) without problems. The kernel reports an unknown error
code when this happens.

Version-Release number of selected component (if applicable):
hal-0.5.8.1-6.fc6 and any older hal that comes with FC6.

How reproducible:
Every time; sometimes it just takes longer.

Actual results:
The machine freezes after a few hours. If I'm fast enough to 'killall
hald-addon-storage' it returns to normal.

Additional info:
The DMA for the DVD Writer also does not get set by the kernel (does not matter
if it is on/off, hald-addon-storage still freezes the machine). I have to
"hdparm -d1" it to get DVD playing without skipping frames and eating too much
CPU. Looks like the problem is not present with F7t4, but I have not had the
time to fully check.
Comment 1 Doncho N. Gunchev 2007-05-03 03:09:26 EDT
Kernel 2.6.20-1.2948.fc6 does turn the DMA on by default. The machine locked
after 3-4 hours.
Comment 2 David Zeuthen 2007-05-03 12:34:27 EDT
Reassigning to kernel. 

Note that for Fedora 7, there's a new command line utility called
hal-disable-polling, see bug 204969 comment 17 for details. You should be able
to copy/paste the fdi file and put it in /etc/hal/fdi/policy...

FYI, for Fedora 8, I plan to make this even more tunable, e.g. you should be able
to configure how often to poll. That might "fix" problems with broken
hardware and/or broken drivers. The future, as stated in bug 204969, brings us
optical drives with asynchronous notification, so perhaps in 3-4 years polling
will be a problem of the past....
Comment 3 Doncho N. Gunchev 2007-05-03 19:02:31 EDT
Thank you very much! Would be nice to have this documented, in the release 
notes maybe?
Comment 4 Doncho N. Gunchev 2007-06-03 16:15:39 EDT
I tried F7 final without hal-disable-polling: it locks up again :-(

btw: A friend of mine has a similar laptop with WinXP - no problems. I'll try to
run the F7 Live CD there and see what happens.
Comment 5 Doncho N. Gunchev 2007-07-15 17:46:26 EDT
Can't try the other laptop, sorry. Currently, with hal polling disabled, the 
laptop is very stable (except for suspend to RAM) when not using optical 
disks. When the DVD stops responding it also affects my HDD (hope this log 
helps):

Jul 16 00:10:21 laptop2 kernel: ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
Jul 16 00:10:52 laptop2 kernel: ata1.01: cmd a0/01:00:00:00:00/00:00:00:00:00/b0 tag 0 cdb 0x28 data 129024 in
Jul 16 00:10:52 laptop2 kernel:          res 40/00:03:00:00:00/00:00:00:00:00/b0 Emask 0x4 (timeout)
Jul 16 00:10:52 laptop2 kernel: ata1: port is slow to respond, please be patient (Status 0xd0)
Jul 16 00:10:52 laptop2 kernel: ata1: port failed to respond (30 secs, Status 0xd0)
Jul 16 00:10:52 laptop2 kernel: ata1: soft resetting port
Jul 16 00:10:52 laptop2 kernel: ata1.00: ata_hpa_resize 1: sectors = 117210240, hpa_sectors = 117210240
Jul 16 00:10:52 laptop2 kernel: ata1.00: ata_hpa_resize 1: sectors = 117210240, hpa_sectors = 117210240
Jul 16 00:10:52 laptop2 kernel: ata1.00: configured for UDMA/100
Jul 16 00:10:53 laptop2 kernel: ata1.01: configured for UDMA/25
Jul 16 00:10:53 laptop2 kernel: ata1: EH complete
Jul 16 00:10:53 laptop2 kernel: sd 0:0:0:0: [sda] 117210240 512-byte hardware sectors (60012 MB)
Jul 16 00:10:53 laptop2 kernel: sd 0:0:0:0: [sda] Write Protect is off
Jul 16 00:10:53 laptop2 kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jul 16 00:10:53 laptop2 kernel: sd 0:0:0:0: [sda] 117210240 512-byte hardware sectors (60012 MB)
Jul 16 00:10:53 laptop2 kernel: sd 0:0:0:0: [sda] Write Protect is off
Jul 16 00:10:53 laptop2 kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

At times a harder lock happens, and these are the extra messages I get:

Jul 16 00:14:18 laptop2 kernel: sr 0:0:1:0: SCSI error: return code = 0x08000002
Jul 16 00:14:18 laptop2 kernel: sr 0:0:1:0: Sense Key : Hardware Error [current]
Jul 16 00:14:18 laptop2 kernel: sr 0:0:1:0: Add. Sense: Logical unit communication CRC error (Ultra-DMA/32)
Jul 16 00:14:18 laptop2 kernel: end_request: I/O error, dev sr0, sector 5258204
Jul 16 00:14:18 laptop2 kernel: printk: 53 messages suppressed.
Jul 16 00:14:18 laptop2 kernel: Buffer I/O error on device sr0, logical block 1314551
Jul 16 00:14:18 laptop2 kernel: Buffer I/O error on device sr0, logical block 1314552
...

In the latter situation, removing the optical disk helps (the machine stays 
unresponsive for a few minutes).
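For anyone comparing kernels or firmware versions, it may help to count how often these events occur rather than reading the log by eye. The following is only a rough sketch: the function name ata_error_summary and the default log path are assumptions for illustration, and the three patterns are simply copied from the messages quoted above.

```shell
# Hypothetical helper: count libata trouble indicators in a syslog-style file.
# The patterns come from the log excerpt in this comment; the function name
# and the default path are assumptions, not part of any existing tool.
ata_error_summary() {
    log="${1:-/var/log/messages}"
    # grep -c prints the number of matching lines; "|| true" keeps the
    # function usable under "set -e" when a pattern has no matches.
    timeouts=$(grep -c 'Emask 0x4 (timeout)' "$log" || true)
    resets=$(grep -c 'soft resetting port' "$log" || true)
    crc=$(grep -c 'communication CRC error' "$log" || true)
    echo "timeouts=$timeouts resets=$resets crc_errors=$crc"
}
```

Running it against the log before and after a change (new kernel, polling disabled, new firmware) gives a crude but comparable number for each boot.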
Comment 6 Doncho N. Gunchev 2007-09-27 13:22:15 EDT
I have no idea why or how, but things are dramatically improving with 
2.6.23-0.202.rc8.fc8... let's hope it all goes right in the final :-)
I've been testing for 15-20 hours with no problems so far, but I'm still a bit 
skeptical. Rebooting to test kernel-2.6.23-0.204.rc8.fc8.
Comment 7 Doncho N. Gunchev 2007-11-12 12:18:59 EST
Yep, things improved, but only to the point that it hangs less and does not 
crash. I had to use hal-disable-polling to avoid freezes in FC8 too.
Comment 8 Chuck Ebbert 2007-12-06 18:31:48 EST
ICH7 ATA controller
Comment 9 Doncho N. Gunchev 2007-12-10 09:29:50 EST
(In reply to comment #8)
> ICH7 ATA controller

If this was a question, then yes - my laptop uses ICH7:
http://smolt.fedoraproject.org/show?UUID=ebf36068-3809-426c-ade0-d0c531b5b402

BTW: using an external USB disk when writing CD/DVD disks results in faster and 
more reliable operation, so my DVD and HDD seem to be on the same IDE channel. 
Also, kernel-2.6.23.8-63.fc8 kills suspend, so I have not tested it.
Comment 10 Chuck Ebbert 2007-12-10 17:57:29 EST
Yes, the drives are on the same channel. Does it make any difference to add this line:

 options libata dma=1

to /etc/modprobe.conf and then rebuilding the initrd?

Initrd rebuild directions are in here:

https://fedoraproject.org/wiki/KernelCommonProblems
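
The two steps above could be scripted roughly as follows. This is a sketch, not the wiki's procedure: the helper name add_modprobe_option is made up for illustration, and the mkinitrd invocation in the comments is the usual form on Fedora releases of this era, so check it against the wiki page before relying on it.

```shell
# Hypothetical helper: append a line to a modprobe config file unless an
# identical line is already present, so repeated runs do not duplicate it.
add_modprobe_option() {
    conf="$1"
    line="$2"
    # -x: match the whole line, -F: fixed string, -q: no output
    grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Intended use on the affected machine, as root (not executed here):
#   cp -p /etc/modprobe.conf /etc/modprobe.conf.bak
#   add_modprobe_option /etc/modprobe.conf 'options libata dma=1'
#   mkinitrd -f "/boot/initrd-$(uname -r).img" "$(uname -r)"
```

Keeping the backup copy makes it easy to revert if the option makes things worse.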

Comment 11 Doncho N. Gunchev 2007-12-11 06:24:30 EST
Testing... no lockups, no error messages. On the negative side, nothing notices 
new disks in the drive (in KDE at least), DVD reading runs at 1.3MB/sec and eats 
all available CPU, and writing runs at speeds between 0.5 and 0.7 (all CPU again)... 
With DMA + external USB disk I was able to write at a speed of 8 (~11MB/sec). 
So far only one (apparently harmless) message has popped up:

Dec 11 12:17:09 f8 kernel: ata1.01: 16 bytes trailing data
Dec 11 12:17:09 f8 kernel: ata1.01: 16 bytes trailing data

OT: is there a trick like "hdparm -d0 /dev/XXX" available these days? Not 
having it around is a "terrible loss".

Will report later after more serious testing.
Comment 12 Doncho N. Gunchev 2007-12-11 14:15:32 EST
Nope, did not help :-( I'll attach gzip-ed /var/log/messages since my last 
reboot with dhclient & NetworkManager ones removed (only kernel messages).

I'm disabling polling for now...
hal-disable-polling --udi /org/freedesktop/Hal/devices/storage_model_CD/DVDW_TS_L632D
Comment 13 Doncho N. Gunchev 2007-12-11 14:16:59 EST
Created attachment 284481
stripped and gzipped /var/log/messages after reboot
Comment 14 Christopher Brown 2008-02-03 16:46:40 EST
Hello,

I'm reviewing this bug as part of the kernel bug triage project, an attempt to
isolate current bugs in the Fedora kernel.

http://fedoraproject.org/wiki/KernelBugTriage

I am CC'ing myself to this bug and will try and assist you in resolving it if I can.

Could you test with a 2.6.24-based kernel when it becomes available? Please
don't attach gzipped information as it only creates additional steps in
troubleshooting the problem. In response to comment #11 you can add:

libata.dma=0

to grub if you wish to disable dma.

Regards
Chris
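
For reference, the kernel documents libata.dma as a bitmask: 0 disables DMA for all devices, 1 enables it for disks only, 2 for ATAPI (CD/DVD) devices only, 4 for Compact Flash only, and the values can be combined. To confirm which value a booted system actually received, something like the sketch below could be used. The function name libata_dma_setting is an assumption for illustration; it only inspects the kernel command line, so a value set via /etc/modprobe.conf would not show up here.

```shell
# Hypothetical helper: print the libata.dma value from a kernel command
# line file, or "unset" if the parameter is absent.
# Defaults to /proc/cmdline, the command line of the running kernel.
libata_dma_setting() {
    cmdline="${1:-/proc/cmdline}"
    # Put each parameter on its own line, keep the value of the last
    # libata.dma= occurrence.
    value=$(tr ' ' '\n' < "$cmdline" | sed -n 's/^libata\.dma=//p' | tail -n 1)
    echo "${value:-unset}"
}
```

After editing the kernel line in grub and rebooting, running libata_dma_setting with no arguments should print the value that was added.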
Comment 15 Christopher Brown 2008-02-03 16:50:19 EST
I've had to obsolete the attachment as I am unable to extract messages from it
and in any case, we would need an updated dmesg output from 2.6.24
Comment 16 Doncho N. Gunchev 2008-02-24 17:53:33 EST
Good news: it turns out this is most likely not a hal/kernel/whatever problem 
at all. I successfully updated my DVD's firmware to TS-L632D_SC04.BIN and have 
had no more problems. In two weeks and 2-3 days it has never locked up. This 
also means I can no longer test the old firmware; there was no way to back it 
up, sorry.

For reference, I used the info from:
https://bugs.launchpad.net/linux/+bug/75295/comments/97
If someone else has to update their firmware, note the "-nocheck" parameter. One 
has to install Windows or (like me) borrow a bootable CD with Windows (Windows 
PE) or whatever they call it...
Comment 17 Doncho N. Gunchev 2008-02-24 18:00:57 EST
Created attachment 295753
dmesg from 2.6.24-2.fc9 after updating the firmware to SC04

I have no idea if this attachment will be useful at all, given that I
updated the DVD's firmware and have no problems with it any more...
Comment 18 Christopher Brown 2008-02-26 17:06:25 EST
Okay, thanks for updating, Doncho. Clearly a hardware issue, so closing NOTABUG.
