Bug 352311 - Broken smartd monitoring for disks attached to "cciss" controllers
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: smartmontools
Version: 5.0
Hardware: i386 Linux
Priority: low
Severity: medium
Assigned To: Michal Hlavinka
Reported: 2007-10-25 09:56 EDT by Carlos Rodrigues
Modified: 2009-01-27 09:05 EST (History)
Doc Type: Bug Fix
Last Closed: 2009-01-27 09:05:34 EST

Attachments:
smartd startup log with an HP SmartArray 5i/532 controller (2.81 KB, text/plain)
2007-10-26 10:39 EDT, Carlos Rodrigues

Description Carlos Rodrigues 2007-10-25 09:56:00 EDT
Description of problem:

SMART monitoring (smartd) for disks connected to HP/Compaq SmartArray
controllers doesn't seem to work correctly. It seems to be monitoring only the
last disk specified in the configuration file.

For example, I have the following in my "/etc/smartd.conf":

/dev/cciss/c0d0 -a -d cciss,0 -s (S/../.././01|L/../../7/06) -m root
/dev/cciss/c0d0 -a -d cciss,1 -s (S/../.././02|L/../../7/05) -m root
/dev/cciss/c0d0 -a -d cciss,2 -s (S/../.././03|L/../../7/04) -m root
/dev/cciss/c0d0 -a -d cciss,3 -s (S/../.././04|L/../../7/03) -m root
/dev/cciss/c0d0 -a -d cciss,4 -s (S/../.././05|L/../../7/02) -m root
/dev/cciss/c0d0 -a -d cciss,5 -s (S/../.././06|L/../../7/01) -m root

There are self-tests configured for all disks, but smartd is running them all on
"cciss,5", as can be seen on this disk's selftest log.
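The six entries above follow a regular pattern: the same device node, a different "cciss,N" index per physical disk, and staggered self-test hours. As a minimal sketch (the function name and its parameters are made up for illustration and are not part of smartmontools), the lines could be generated like this:

```python
def cciss_smartd_lines(device, n_disks, mail_to="root"):
    """Return one smartd.conf line per physical disk behind a cciss controller.

    Hypothetical helper: it only builds text and does not talk to smartd.
    """
    if not 1 <= n_disks <= 23:
        # Hours in the -s schedule must stay within 00..23.
        raise ValueError("n_disks must be between 1 and 23")
    lines = []
    for disk in range(n_disks):
        short_hour = disk + 1        # daily short test at 01, 02, ...
        long_hour = n_disks - disk   # long test on day 7 (Sunday), counting down
        schedule = f"(S/../.././{short_hour:02d}|L/../../7/{long_hour:02d})"
        lines.append(f"{device} -a -d cciss,{disk} -s {schedule} -m {mail_to}")
    return lines


if __name__ == "__main__":
    print("\n".join(cciss_smartd_lines("/dev/cciss/c0d0", 6)))
```

Called with "/dev/cciss/c0d0" and 6, this reproduces the configuration shown above, which makes it easier to see that each line really does name a different disk.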

I've seen this on two different machines (one i386, another x86-64), each one
with different controller models (one SCSI, another SAS, both of them "cciss" of
course).

It is important to note that "smartctl" seems to do the right thing. It's only
"smartd" that doesn't work.
Comment 1 Carlos Rodrigues 2007-10-25 10:20:13 EDT
Just a small note: I've confirmed that it isn't just the self-test functionality
that's broken on "cciss". "smartd" only monitors the last disk, as the messages
log shows: smartd reports changes in disk temperature for all disks with the
exact same value, whereas "smartctl" shows that the disks really have different
values for this attribute.
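The reasoning in this comment can be written down as a small check. This is only a sketch: the function name is invented, the readings have to be collected by hand (from the messages log and from "smartctl" output), and the sample numbers below are made up for illustration, not taken from the affected machines.

```python
def looks_like_last_disk_only(smartd_values, smartctl_values):
    """Heuristic for the symptom described in this report.

    Both arguments map a disk index (the N in "cciss,N") to the value of
    one attribute, e.g. temperature. Returns True when smartd reports a
    single identical value for several disks while smartctl sees
    different values across those disks.
    """
    smartd_distinct = set(smartd_values.values())
    smartctl_distinct = set(smartctl_values.values())
    return (
        len(smartd_values) > 1
        and len(smartd_distinct) == 1
        and len(smartctl_distinct) > 1
    )


if __name__ == "__main__":
    # Made-up readings: smartd shows 31 everywhere, smartctl does not.
    from_smartd = {0: 31, 1: 31, 2: 31}
    from_smartctl = {0: 28, 1: 30, 2: 31}
    print(looks_like_last_disk_only(from_smartd, from_smartctl))  # True
```

Identical values alone do not prove the bug, since disks can legitimately share a temperature; the comparison against "smartctl" is what makes the symptom meaningful.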
Comment 2 Tomas Smetana 2007-10-26 07:08:55 EDT
I'm trying to reproduce the bug, but I get this message in the log for all the
disks:

Device: /dev/cciss/c0d0 [cciss_disk_00], does not support SMART Self-Test Log.

I'm using smartmontools-5.36-3.1.el5 on a machine equipped with Compaq Computer
Corporation Smart Array 5i/532.  Could you please provide the part of
/var/log/messages with smartd startup log?  What version of smartmontools do you
use?
Comment 3 Carlos Rodrigues 2007-10-26 10:39:13 EDT
Created attachment 239161 [details]
smartd startup log with an HP SmartArray 5i/532 controller
Comment 4 Carlos Rodrigues 2007-10-26 10:46:46 EDT
The attached file (smart.log) has the information you requested, taken from a
machine that also has a 5i/532 controller.

I'm also using smartmontools-5.36-3.1.el5, sorry for not mentioning it before.

Maybe _only_ your last disk (in smartd.conf) doesn't support SMART Self-Test
Log, in which case you are already reproducing the bug. :) Since smartctl works
correctly, it can be used to prove or disprove that.
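One way to run that check is to ask "smartctl" for the self-test log of each disk in turn. The sketch below assumes six disks behind /dev/cciss/c0d0, as in the configuration from the description; the helper names are invented for illustration. It only uses the "-l selftest" and "-d cciss,N" options of smartctl.

```python
import subprocess


def selftest_log_commands(device, n_disks):
    """Build one smartctl argv per disk to print its self-test log."""
    return [
        ["smartctl", "-l", "selftest", "-d", f"cciss,{disk}", device]
        for disk in range(n_disks)
    ]


def run_commands(commands):
    """Run each command and collect (command line, exit code, stdout).

    Requires smartctl to be installed and usually root privileges.
    """
    results = []
    for argv in commands:
        proc = subprocess.run(argv, capture_output=True, text=True, check=False)
        results.append((" ".join(argv), proc.returncode, proc.stdout))
    return results


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in selftest_log_commands("/dev/cciss/c0d0", 6):
        print(" ".join(argv))
```

If only the last disk reports that it does not support the self-test log, that would match the behaviour described in this bug.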

I've built the SRPM from Fedora 7 updates (smartmontools-5.37-3.2.fc7) and it
works correctly. However, it would be nice to have this patched in RHEL proper,
since these controllers are so popular in datacenters (and some admins, like
myself, aren't very keen on installing all sorts of crap from hardware vendors
just to check if their disks are developing bad sectors...).
Comment 5 Tomas Smetana 2007-10-30 09:01:20 EDT
No.  Even smartctl says "Device does not support Self Test logging" for every
disk.  I'll try to find some other testing machine but this doesn't look too
promising.
Comment 6 Michal Hlavinka 2009-01-27 09:05:34 EST
cciss support was added in the RHEL 5.3 update.
