Bug 133816 - hd failure and other message not reported
Summary: hd failure and other message not reported
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Tom Coughlan
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-09-27 18:01 UTC by Giulio Cervera
Modified: 2007-11-30 22:07 UTC (History)
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-12-10 14:28:54 UTC
Target Upstream Version:
Embargoed:


Attachments
aacraid_enable_printf.patch (509 bytes, patch)
2004-11-23 10:59 UTC, Thomas Uebermeier
aeventd rpm (72.96 KB, application/octet-stream)
2004-12-10 14:31 UTC, Tom Coughlan

Description Giulio Cervera 2004-09-27 18:01:13 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)

Description of problem:
The latest aacraid driver, 1.1-5[2340], does not report any events from
the controller (disk failure, rebuild, battery status, ...).

The alternate module aacraid_10102 works fine.

Version-Release number of selected component (if applicable):
kernel-2.4.21-20smp

How reproducible:
Always

Steps to Reproduce:
1. Remove one RAID disk.

Actual Results:  Nothing is reported to kern.* or *.emerg

Expected Results:
aacraid:ID(0:01:0); Selection Timeout [command:0x2a]
aacraid:ID(0:01:0) - IO failed, Cmd[0x2a]
aacraid:Container 0 failed REBUILD task: I/O error - drive 0:1:0 failed
aacraid:ID(0:01:0) - Drive spindown failed
aacraid:Drive 0:1:0 returning error
aacraid:Drive 0:1:0 offline on container 0:
aacraid:Mirror Container 0 Drive 0:1:0 Failure
aacraid:Mirror Failover Container 0 no failover assigned
...
aacraid:Container 0 started REBUILD task on drive 0:1:0

Additional info:

Comment 1 Tom Coughlan 2004-10-07 15:40:58 UTC
This bug is marked "Security Sensitive Bug". I do not think this was
intended.  The BZ indicates that when a RAID host bus adapter detects
a failure, the failure is not reported in the system logs.  I don't
see this as a confidential security issue.

Is there any objection if I remove this restriction?

Comment 2 Tom Coughlan 2004-10-07 15:50:38 UTC
Giulio,

Are you running any sort of Adaptec management software, like aeventd
for example, that is supposed to detect events reported by the adapter
and post them in the system log? What file is the extract posted above
under "expected results" from? 

Please post the output of sysreport to this BZ (see
/usr/share/doc/sysreport-1.3.7/README).

Thanks,

Tom



Comment 3 Giulio Cervera 2004-10-12 11:45:09 UTC
Yes, this is not a Security Sensitive Bug; sorry for my mistake.
We are not running any management software, just watching the syslog
file to trace some events.
The driver itself sends messages to syslog.
The expected results came from the messages file and are also printed
on the console, but only when using the alternative driver
aacraid_10102 in kernel 2.4.21-20.ELsmp.

We are now using afacli (the command-line tool) from Dell to get the
disk status, and it works with both versions of aacraid.
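
As an illustration of the syslog-watching approach described in this comment, here is a minimal Python sketch that picks the aacraid lines out of a block of log text. The function name aacraid_lines is hypothetical, and the timestamps and surrounding syslog fields in the sample are made up; only the aacraid message texts are taken from this report. It can only find messages that a driver build actually prints.

```python
def aacraid_lines(log_text):
    """Return the log lines that mention the aacraid driver."""
    return [line for line in log_text.splitlines() if "aacraid:" in line]


# Illustrative input only: timestamps and syslog fields are assumptions;
# the aacraid message texts are the ones quoted in this report.
sample = "\n".join([
    "Sep 27 18:00:01 test01 kernel: aacraid:Drive 0:1:0 returning error",
    "Sep 27 18:00:02 test01 sshd[123]: session opened",
    "Sep 27 18:00:03 test01 kernel: aacraid:Mirror Container 0 Drive 0:1:0 Failure",
])

for line in aacraid_lines(sample):
    print(line)
```

A plain substring match is used here on purpose: it catches both the driver's own messages and anything else logged under an aacraid tag.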

Comment 4 Thomas Uebermeier 2004-11-23 10:59:04 UTC
These messages are received directly from the controller (through
the driver). Unfortunately, the code that prints them has been
conditionally compiled out (intentionally?) in the newer version of
the driver:
commsup.c:764
#if (defined(AAC_PRINTF_ENABLED)) 
[the print commands] 
#endif 
 
The following patch defines this macro, but I don't know whether this
is the preferred way to do it, since it could also be defined in the
Makefile.

Comment 5 Thomas Uebermeier 2004-11-23 10:59:59 UTC
Created attachment 107288
aacraid_enable_printf.patch

Comment 6 Tom Coughlan 2004-12-10 14:28:54 UTC
I contacted Mark Salyzyn, the driver maintainer.  He says:

-----

Our new Firmware no longer prints these messages, and uses the AIF
event mechanism to report conditions. Also, we have had a lot of
pressure from OEMs and confused customers to drop the existing
messages as too noisy.

I published an `aeventd' tool for acquiring this information in the
userspace that is provided by the newer firmware.

------

Given this, we do not plan to re-enable the printf's in RHEL. You
should use the attached aeventd utility instead. If you find that this
is not adequate you can re-open this BZ, or take it up with Adaptec
directly.






Comment 7 Tom Coughlan 2004-12-10 14:31:17 UTC
Created attachment 108312
aeventd rpm

Comment 8 Giulio Cervera 2004-12-10 15:35:53 UTC
Thanks for the information.

Comment 9 Giulio Cervera 2004-12-20 13:31:40 UTC
I have tried aeventd on a Dell PE 2650 with the latest BIOS (A20) and
Adaptec firmware 2.8.0 build 6092, but it seems to be broken.

When it starts I get this line:
Dec 20 14:11:09 test01 aacraid: aeventd(v1.0-4): Startup /dev/aac0 (v1.1-5[2340])
and then the daemon dies.
I have also tried all of the parameters (aeventd -?) without any
success.

I know this is not Red Hat's fault; I just replied to this thread to
inform you.

At the moment the only thing that seems to work is the annoying afacli
from Dell (I capture the output and then diff it against a known-good
state).
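
The capture-and-diff workaround described above could be sketched roughly as follows in Python, using the standard difflib module. The function name state_changes and the sample status text are hypothetical; the sample is not real afacli output, and how the tool's output gets captured is deliberately left out.

```python
import difflib


def state_changes(good_output, current_output):
    """Return lines removed from ("- ") or added to ("+ ") the known-good capture."""
    diff = difflib.ndiff(good_output.splitlines(), current_output.splitlines())
    # ndiff also emits unchanged ("  ") and hint ("? ") lines; keep only real changes.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical status text, NOT real afacli output.
good = "disk 0:0:0 online\ndisk 0:1:0 online"
current = "disk 0:0:0 online\ndisk 0:1:0 missing"

for change in state_changes(good, current):
    print(change)
```

An empty result means the current capture matches the known-good one; anything else is worth an alert.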

