Bug 352311

Summary: Broken smartd monitoring for disks attached to "cciss" controllers
Product: Red Hat Enterprise Linux 5
Reporter: Carlos Rodrigues <cefrodrigues>
Component: smartmontools
Assignee: Michal Hlavinka <mhlavink>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low    
Version: 5.0   
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2009-01-27 14:05:34 UTC
Attachments:
smartd startup log with an HP SmartArray 5i/532 controller (flags: none)

Description Carlos Rodrigues 2007-10-25 13:56:00 UTC
Description of problem:

SMART monitoring (smartd) for disks connected to HP/Compaq SmartArray
controllers doesn't seem to work correctly. It seems to be monitoring only the
last disk specified in the configuration file.

For example, I have the following in my "/etc/smartd.conf":

/dev/cciss/c0d0 -a -d cciss,0 -s (S/../.././01|L/../../7/06) -m root
/dev/cciss/c0d0 -a -d cciss,1 -s (S/../.././02|L/../../7/05) -m root
/dev/cciss/c0d0 -a -d cciss,2 -s (S/../.././03|L/../../7/04) -m root
/dev/cciss/c0d0 -a -d cciss,3 -s (S/../.././04|L/../../7/03) -m root
/dev/cciss/c0d0 -a -d cciss,4 -s (S/../.././05|L/../../7/02) -m root
/dev/cciss/c0d0 -a -d cciss,5 -s (S/../.././06|L/../../7/01) -m root

There are self-tests configured for all disks, but smartd is running them all on
"cciss,5", as can be seen in that disk's self-test log.

I've seen this on two different machines (one i386, the other x86-64), each
with a different controller model (one SCSI, the other SAS, both of them "cciss"
of course).
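
The smartd.conf entries above follow a regular pattern, so they can be
generated rather than typed by hand. A minimal Python sketch, assuming the
staggered schedule used in this report (short test daily at hour i+1, long test
on day 7 at hour n-i); the function name `smartd_entries` and its parameters
are hypothetical and not part of smartmontools:

```python
def smartd_entries(device, n_disks, mail_to="root"):
    """Return one smartd.conf line per physical disk behind a cciss controller.

    Assumption: schedules are staggered as in the report above, so each disk
    gets its own hour for short tests and its own hour for long tests.
    """
    if not 1 <= n_disks <= 23:
        raise ValueError("n_disks must be between 1 and 23 to keep hours valid")
    lines = []
    for i in range(n_disks):
        short_hour = i + 1          # S: short self-test, every day
        long_hour = n_disks - i     # L: long self-test, day 7 of the week
        schedule = f"(S/../.././{short_hour:02d}|L/../../7/{long_hour:02d})"
        lines.append(f"{device} -a -d cciss,{i} -s {schedule} -m {mail_to}")
    return lines


if __name__ == "__main__":
    # Prints the six entries for the controller device used in this report.
    print("\n".join(smartd_entries("/dev/cciss/c0d0", 6)))
```

Generating the lines this way only helps with typing them consistently; it does
not work around the smartd behaviour described here.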

It is important to note that "smartctl" seems to do the right thing. It's only
"smartd" that doesn't work.

Comment 1 Carlos Rodrigues 2007-10-25 14:20:13 UTC
Just a small note: I've confirmed that it isn't just the self-test functionality
that's broken on "cciss". "smartd" only monitors the last disk, which can be
seen from the messages log, where smartd reports changes in disk temperature for
all disks with the exact same value (whereas "smartctl" shows that the disks
really have different values for this attribute).
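
One way to cross-check what smartd reports is to query each physical disk
directly with smartctl and compare the outputs. A minimal Python sketch,
assuming six disks behind /dev/cciss/c0d0 as in the configuration above; the
helper `smartctl_commands` is hypothetical, and the script only prints the
commands rather than running them:

```python
import shlex


def smartctl_commands(device, n_disks):
    """Build one 'smartctl -a' invocation per cciss disk index."""
    return [["smartctl", "-a", "-d", f"cciss,{i}", device] for i in range(n_disks)]


if __name__ == "__main__":
    # Print the commands; run them as root on the affected machine and
    # compare the per-disk attribute values with what smartd logs.
    for argv in smartctl_commands("/dev/cciss/c0d0", 6):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

If the per-disk values differ under smartctl but smartd logs one value for
every disk, that matches the behaviour reported in this bug.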

Comment 2 Tomas Smetana 2007-10-26 11:08:55 UTC
I'm trying to reproduce the bug, but I get this message in the log for all the
disks:

Device: /dev/cciss/c0d0 [cciss_disk_00], does not support SMART Self-Test Log.

I'm using smartmontools-5.36-3.1.el5 on a machine equipped with Compaq Computer
Corporation Smart Array 5i/532.  Could you please provide the part of
/var/log/messages with smartd startup log?  What version of smartmontools do you
use?

Comment 3 Carlos Rodrigues 2007-10-26 14:39:13 UTC
Created attachment 239161
smartd startup log with an HP SmartArray 5i/532 controller

Comment 4 Carlos Rodrigues 2007-10-26 14:46:46 UTC
The attached file (smart.log) has the information you requested, taken from a
machine that also has a 5i/532 controller.

I'm also using smartmontools-5.36-3.1.el5, sorry for not mentioning it before.

Maybe _only_ your last disk (in smartd.conf) doesn't support SMART Self-Test
Log, in which case you are already reproducing the bug. :) Since smartctl works
correctly, it can be used to prove or disprove that.

I've built the SRPM from Fedora 7 updates (smartmontools-5.37-3.2.fc7) and it
works correctly. However, it would be nice to have this patched in RHEL proper,
since these controllers are so popular in datacenters (and some admins, like
myself, aren't very keen on installing all sorts of crap from hardware vendors
just to check if their disks are developing bad sectors...).

Comment 5 Tomas Smetana 2007-10-30 13:01:20 UTC
No.  Even smartctl says "Device does not support Self Test logging" for every
disk.  I'll try to find some other testing machine but this doesn't look too
promising.

Comment 6 Michal Hlavinka 2009-01-27 14:05:34 UTC
cciss support was added in the RHEL 5.3 update.