Bug 133816 - hd failure and other message not reported
Status: CLOSED NOTABUG
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Version: 3.0
Hardware: i686 Linux
Priority: medium
Severity: medium
Assigned To: Tom Coughlan
QA Contact: Brian Brock
Reported: 2004-09-27 14:01 EDT by Giulio Cervera
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-12-10 09:28:54 EST


Attachments
aacraid_enable_printf.patch (509 bytes, patch)
2004-11-23 05:59 EST, Thomas Uebermeier
aeventd rpm (72.96 KB, application/octet-stream)
2004-12-10 09:31 EST, Tom Coughlan

Description Giulio Cervera 2004-09-27 14:01:13 EDT
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; 
SV1; .NET CLR 1.1.4322)

Description of problem:
The latest aacraid driver, 1.1-5[2340], does not report any events from
the controller (disk failure, rebuild, battery status, ...).

The alternate module aacraid_10102 works fine.

Version-Release number of selected component (if applicable):
kernel-2.4.21-20smp

How reproducible:
Always

Steps to Reproduce:
1. Remove one RAID disk

Actual Results:  Nothing is reported to kern.* or *.emerg

Expected Results:
aacraid:ID(0:01:0); Selection Timeout [command:0x2a]
aacraid:ID(0:01:0) - IO failed, Cmd[0x2a]
aacraid:Container 0 failed REBUILD task: I/O error - drive 0:1:0 failed
aacraid:ID(0:01:0) - Drive spindown failed
aacraid:Drive 0:1:0 returning error
aacraid:Drive 0:1:0 offline on container 0:
aacraid:Mirror Container 0 Drive 0:1:0 Failure
aacraid:Mirror Failover Container 0 no failover assigned
...
aacraid:Container 0 started REBUILD task on drive 0:1:0

Additional info:
Comment 1 Tom Coughlan 2004-10-07 11:40:58 EDT
This bug is marked "Security Sensitive Bug". I do not think this was
intended.  The BZ indicates that when a RAID host bus adapter detects
a failure, the failure is not reported in the system logs.  I don't
see this as a confidential security issue.

Is there any objection if I remove this restriction?
Comment 2 Tom Coughlan 2004-10-07 11:50:38 EDT
Giulio,

Are you running any sort of Adaptec management software, like aeventd
for example, that is supposed to detect events reported by the adapter
and post them in the system log? What file is the extract posted above
under "expected results" from? 

Please post the output of sysreport to this BZ (see
/usr/share/doc/sysreport-1.3.7/README).

Thanks,

Tom

Comment 3 Giulio Cervera 2004-10-12 07:45:09 EDT
Yes, this is not a Security Sensitive Bug; sorry for my mistake.
We are not running any management software, just watching the syslog
file to trace some events.
The driver itself sends the messages to syslog.
The expected results came from the messages log and are also printed on
the console, but only when using the alternative driver aacraid_10102 in
kernel 2.4.21-20.ELsmp.
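
As an illustration of the kind of syslog watching described here, the following is a minimal sketch in C. It assumes that driver messages carry the literal "aacraid:" tag seen in the expected-results excerpt; the function names and the keyword list are hypothetical and not part of the driver or of any Adaptec tool.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: every driver message contains the literal tag "aacraid:",
 * as in the expected-results excerpt of this report. */
static bool aac_line_is_event(const char *line)
{
    return strstr(line, "aacraid:") != NULL;
}

/* Hypothetical classifier: true when an aacraid line looks like a failure.
 * The keyword list is illustrative only, not exhaustive. */
static bool aac_line_is_failure(const char *line)
{
    static const char *const keywords[] = {
        "failed", "Failure", "offline", "error"
    };
    size_t i;

    if (!aac_line_is_event(line))
        return false;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strstr(line, keywords[i]) != NULL)
            return true;
    }
    return false;
}
```

A log watcher could feed each new syslog line to aac_line_is_failure() and raise an alert on a match. With the driver that no longer prints these messages, such a filter would simply never match.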

We are now using afacli (a command-line tool) from DELL to get the disk
status, and it works with both versions of aacraid.
Comment 4 Thomas Uebermeier 2004-11-23 05:59:04 EST
These messages are received directly from the controller
(through the driver). Unfortunately, the code that prints them has been
disabled (intentionally?) in the newer version of the driver:
commsup.c:764 
#if (defined(AAC_PRINTF_ENABLED)) 
[the print commands] 
#endif 
 
The following patch defines this macro, but I don't know whether this is
the preferred way to set it, as this could also be done in the
Makefile...
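
To show the mechanism in isolation, here is a minimal user-space sketch of compile-time gating. It is not the driver code: aac_sketch_report() is a hypothetical helper, and only the macro name AAC_PRINTF_ENABLED comes from the driver source quoted above. Defining the macro in the source file is one option; passing -DAAC_PRINTF_ENABLED on the compiler command line (for example from a Makefile) has the same effect on the preprocessor.

```c
#include <stddef.h>
#include <stdio.h>

/* Assumption: defined in the source file for this sketch. Removing this
 * line and compiling with -DAAC_PRINTF_ENABLED gives the same result. */
#define AAC_PRINTF_ENABLED

/* Hypothetical helper (not from the driver): formats a controller message
 * into buf. Returns the snprintf() result when printing is enabled, or 0
 * with an empty buffer when the macro is undefined. */
static int aac_sketch_report(char *buf, size_t len, const char *msg)
{
#if (defined(AAC_PRINTF_ENABLED))
    return snprintf(buf, len, "aacraid:%s", msg);
#else
    (void)msg;              /* unused when printing is compiled out */
    if (len > 0)
        buf[0] = '\0';
    return 0;
#endif
}
```

With the macro undefined, the formatting branch is removed by the preprocessor before compilation, which would explain why no messages appear at all rather than appearing at a lower log level.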
Comment 5 Thomas Uebermeier 2004-11-23 05:59:59 EST
Created attachment 107288 [details]
aacraid_enable_printf.patch
Comment 6 Tom Coughlan 2004-12-10 09:28:54 EST
I contacted Mark Salyzyn, the driver maintainer.  He says:

-----

Our new Firmware no longer prints these messages, and uses the AIF
event mechanism to report conditions. Also, we have had a lot of
pressure from OEMs and confused customers to drop the existing
messages as too noisy.

I published an `aeventd' tool for acquiring this information in the
userspace that is provided by the newer firmware.

------

Given this, we do not plan to re-enable the printf's in RHEL. You
should use the attached aeventd utility instead. If you find that this
is not adequate you can re-open this BZ, or take it up with Adaptec
directly.
Comment 7 Tom Coughlan 2004-12-10 09:31:17 EST
Created attachment 108312 [details]
aeventd rpm
Comment 8 Giulio Cervera 2004-12-10 10:35:53 EST
Thanks for the information.
Comment 9 Giulio Cervera 2004-12-20 08:31:40 EST
I have tried aeventd on a DELL PE 2650 with the latest BIOS (A20) and
Adaptec firmware 2.8.0 build 6092, but it seems to be broken ...

When it starts, I get this line:
Dec 20 14:11:09 test01 aacraid: aeventd(v1.0-4): Startup /dev/aac0 (v1.1-5[2340])
and afterwards the daemon dies.
I have also tried all the parameters ( aeventd -? ) without any
success.

I know this is not RH's fault; I just replied to this thread to inform
you.

At the moment, the only thing that works seems to be the annoying afacli
from DELL (I grab the output, then diff it against a known-good state).
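
The compare-against-a-good-state workaround can be sketched as follows. This is only an illustration in C of a line-by-line comparison of two captured text outputs; the function name first_changed_line() is hypothetical, and nothing here assumes any particular afacli command or output format.

```c
#include <stddef.h>
#include <string.h>

/* Returns the 1-based number of the first line at which `current` differs
 * from `baseline`, or 0 when the two captures match. A single trailing
 * newline at the end of either capture is not treated as a difference. */
static size_t first_changed_line(const char *baseline, const char *current)
{
    size_t line = 1;

    while (*baseline != '\0' && *current != '\0') {
        size_t blen = strcspn(baseline, "\n");   /* length of this line */
        size_t clen = strcspn(current, "\n");

        if (blen != clen || memcmp(baseline, current, blen) != 0)
            return line;

        baseline += blen;
        current += clen;
        /* Skip the newline that terminates this line, if there is one. */
        if (*baseline == '\n')
            baseline++;
        if (*current == '\n')
            current++;
        line++;
    }

    /* If one capture still has text left, that extra text is the change. */
    return (*baseline == '\0' && *current == '\0') ? 0 : line;
}
```

A periodic job could store the tool's output from a healthy array as the baseline, capture it again later, and raise an alert whenever the function returns a non-zero line number.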
