Bug 436314
Summary: | smartmontools-5.37-7.3.fc8 makes SAMSUNG HD161HJ unresponsive
---|---
Product: | [Fedora] Fedora
Component: | smartmontools
Version: | 8
Hardware: | i386
OS: | Linux
Status: | CLOSED NOTABUG
Severity: | high
Priority: | medium
Reporter: | Patrick C. F. Ernzer <pcfe>
Assignee: | Tomas Smetana <tsmetana>
QA Contact: | Fedora Extras Quality Assurance <extras-qa>
Doc Type: | Bug Fix
Last Closed: | 2008-06-11 11:31:14 UTC

Description
Patrick C. F. Ernzer
2008-03-06 14:35:31 UTC
Created attachment 297052
relevant parts from /var/log/messages

I thought that the HD161HJ drives worked well with smartmontools... Do you have experience with this drive connected to a different controller? I'll ask around whether some of my colleagues have a testing machine with this type of hard drive and try to reproduce the problem to find out what I can do. Thanks.

I have built testing packages of the smartmontools CVS version (5.38 should be officially out in the next few days). You may try to test whether the new version solves the problem (there were some updates regarding Samsung disks): http://tsmetana.fedorapeople.org/smartmontools/
If you decide to test, please let me know. If the problems persist I'll have to report them upstream; there is little I can do myself with hardware-specific issues.

Did a short test; it has been working for 5 hours without failing a drive. I have to disable it now, as I will be away from this computer for 10 days and I do not want it to fail while I am away. I will re-enable smartd in week 13 and report back how it works once it has been active for a few days.

I have pushed the official smartmontools-5.38 to F-8 testing.

Bad news; back from vacation today. Re-enabled smartd from smartmontools-5.38-1.fc8 and the system got into the "rejecting I/O to dead device" stage again after just under 4 hours. Will attach relevant bits of syslog.

Created attachment 299052
relevant parts from /var/log/messages
relevant syslog on 2008-03-25

This looks bad... And I don't think I can solve this myself. I'll try to ask about this on the smartmontools-support mailing list. There are some problems reported with the Promise controllers, so you have probably hit another one. We'll see.

Thanks to the people on the list who pointed out the following from the log (well, I saw that too but thought it was OK...):

Mar 25 18:11:24 bofferding-pcfe smartd[9845]: Device: /dev/sda, starting scheduled Offline Immediate Test.
Mar 25 18:11:24 bofferding-pcfe smartd[9845]: Device: /dev/sdb, starting scheduled Offline Immediate Test.
Mar 25 18:11:24 bofferding-pcfe smartd[9845]: Device: /dev/sdc, starting scheduled Offline Immediate Test.

The disks are being mounted during the offline test and libata reacts accordingly. So please turn off the offline testing. Please let me know if you make any progress on this issue. Thank you.

Hmm, I had the impression offline testing was supposed to be halted if a command to the drive is issued (and all my other drives, on different machines and different controllers, do work fine with offline testing). smartctl -c also tells me "Suspend Offline collection upon new command." under "Offline data collection capabilities", and toggling Automatic Offline Testing with smartctl -o {on|off} shows up as expected in smartctl -c.
But, as I have automatic offline testing on, I have removed the forced offline test from my smartd.conf. So now it reads:

DEVICESCAN -o on -S on -l error -s (S/../.././10|L/../../1/06) -M daily -m smartd -p

I'll let you know how this works (meaning: does the drive stay active, and do I still get offline tests).

Do you have any news, or should I close this bug (INSUFFICIENT_DATA)? I have tried asking on the smartmontools lists, but without much success...

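For readers unfamiliar with the `-s` directive in the configuration line quoted above: per the smartd.conf documentation, smartd builds a 12-character string of the form T/MM/DD/d/HH (test type, month, day, day of week with 1 = Monday, hour) and starts a test when the whole string matches the regular expression. The Python sketch below only illustrates that matching idea. It is not smartd's code, the function name `due_tests` is hypothetical, and Python's `re` module stands in for the POSIX extended regular expressions that smartd uses (the two behave the same for this simple pattern).

```python
import re
from datetime import datetime

# Schedule expression copied from the DEVICESCAN line quoted in this bug.
SCHEDULE = r"(S/../.././10|L/../../1/06)"


def due_tests(when, schedule=SCHEDULE):
    """Return the test-type letters whose T/MM/DD/d/HH string matches.

    Hypothetical helper for illustration; smartd's real logic may differ.
    """
    due = []
    # L = long self-test, S = short self-test, C = conveyance, O = offline.
    for test_type in "LSCO":
        # isoweekday() yields 1 (Monday) through 7 (Sunday), the same
        # numbering the smartd.conf documentation gives for the d field.
        candidate = f"{test_type}/{when:%m/%d}/{when.isoweekday()}/{when:%H}"
        # All 12 characters must match, hence fullmatch rather than search.
        if re.fullmatch(schedule, candidate):
            due.append(test_type)
    return due


if __name__ == "__main__":
    # 2008-03-25 was a Tuesday: a short test is due at 10:00, nothing at 18:00.
    print(due_tests(datetime(2008, 3, 25, 10)))
    print(due_tests(datetime(2008, 3, 25, 18)))
    # Monday 06:00 triggers the long test.
    print(due_tests(datetime(2008, 3, 24, 6)))
```

Under these assumptions the expression never produces an `O` entry, which is consistent with the reporter's statement that the forced offline test was removed from the schedule.
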
A drive dropped off the bus once after I made the change from Comment #12.
I have now simply disabled smartd on this box, as I have no way of knowing whether the fault lies with smartmontools, my motherboard (it acts up sometimes) or the controller.
I suggest we leave this on NEEDINFO for 4-8 weeks, the aim being to see whether drives stop dropping off the bus once smartd is off. But you can also close if you want; so far nobody else has chimed in, so the problem may very well be with my specific configuration (the box is old and already had to have 12 caps replaced as they were leaking).

OK. Switching to NEEDINFO again and we'll see...

'Lucky' coincidence, it's just done it again, with smartd off this time. So it seems this is just a case of heavy disk traffic triggering this :-(
Ah well, at least nothing is broken with our smartmontools package.
Closing as NOTABUG.