Bug 139874 - aacraid driver in 2.1ES causes data corruption on HP 4M RAID card
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 2.1
Classification: Red Hat
Component: kernel
Version: 2.1
Hardware: i686 Linux
Priority: medium
Severity: high
Assigned To: Don Howard
QA Contact: Brian Brock
Depends On:
Blocks: 132992
Reported: 2004-11-18 11:14 EST by Alan Ferrier
Modified: 2007-11-30 17:06 EST
CC List: 4 users

Fixed In Version: 2.4.9-e.40
Doc Type: Bug Fix
Last Closed: 2006-02-03 16:35:13 EST

Attachments: None

Description Alan Ferrier 2004-11-18 11:14:34 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5)
Gecko/20041107 Firefox/1.0

Description of problem:
The aacraid driver included in all 2.1 ES kernel versions from
2.4.9-e.24 through 2.4.9-e.49 causes massive data corruption on HP 4M RAID
controllers (and, we suspect, on other hardware relying on this driver).

We are using Oracle 8i on this kit, and have conducted comprehensive
tests on kernels from 2.4.9-e.3 through 2.4.9-e.49. Patching the kernel
with the 1.1.4 aacraid driver from Adaptec fixes this issue, but we
note that the problematic 2.1 ES kernels still carry an aacraid
version number of 0.9.9-test6.

This problem does not occur on Red Hat 3 ES, where the aacraid
driver is likewise at a higher version.

Version-Release number of selected component (if applicable):
kernel-2.4.9-e.24 and above

How reproducible:
Always

Steps to Reproduce:
dbverify is an Oracle-supplied tool that examines Oracle datafiles for
block corruption.

With kernel-2.4.9-e.24 or higher on Red Hat ES 2.1, dbverify reports
Oracle datablock corruption:

DBVERIFY: Release 8.1.7.3.0 - Production on Sat Nov 13 22:47:02 2004

(c) Copyright 2000 Oracle Corporation.  All rights reserved.

DBVERIFY - Verification starting : FILE = /data/ukhot8i/sms01.dbf
Page 3935 is influx - most likely media corrupt
***
Corrupt block relative dba: 0x02400f5f (file 0, block 3935)
Fractured block found during dbv:
Data in bad block -
type: 6 format: 2 rdba: 0x02400f5f
last change scn: 0x0000.00007c5a seq: 0x1 flg: 0x00
consistency value in tail: 0x07706378
check value in block header: 0x0, block checksum disabled
spare1: 0x0, spare2: 0x0, spare3: 0x0
***
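
For anyone wanting to scan a larger set of dbverify logs for the same signature, here is a minimal sketch in Python. It pulls every "Corrupt block relative dba" value out of the text and splits it into a relative file number and a block number. It assumes the usual Oracle layout for a relative data block address (upper 10 bits file number, lower 22 bits block number); that layout is not stated in this report, and the function names are hypothetical. Under that assumption the block number decoded from the output above agrees with the "block 3935" that dbverify prints.

```python
import re

# Assumption (not stated in this bug report): a relative data block
# address packs a 10-bit relative file number above a 22-bit block number.
BLOCK_BITS = 22
BLOCK_MASK = (1 << BLOCK_BITS) - 1

# Matches the line format shown in the dbverify output above.
CORRUPT_RE = re.compile(r"Corrupt block relative dba:\s*(0x[0-9a-fA-F]+)")


def decode_rdba(rdba):
    """Split an rdba into (relative file number, block number)."""
    return rdba >> BLOCK_BITS, rdba & BLOCK_MASK


def corrupt_blocks(dbv_output):
    """Return a list of (rdba, file_no, block_no) for each corrupt block reported."""
    found = []
    for match in CORRUPT_RE.finditer(dbv_output):
        rdba = int(match.group(1), 16)
        file_no, block_no = decode_rdba(rdba)
        found.append((rdba, file_no, block_no))
    return found


# Excerpt copied from the dbverify output in this report.
SAMPLE = """\
DBVERIFY - Verification starting : FILE = /data/ukhot8i/sms01.dbf
Page 3935 is influx - most likely media corrupt
***
Corrupt block relative dba: 0x02400f5f (file 0, block 3935)
Fractured block found during dbv:
"""

if __name__ == "__main__":
    for rdba, file_no, block_no in corrupt_blocks(SAMPLE):
        print("rdba=0x%08x file=%d block=%d" % (rdba, file_no, block_no))
```

The sketch only reads text that dbverify has already produced; it does not run dbverify itself or touch the datafiles.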

The media corruption is sometimes fixed by a filesystem journal
rebuild on reboot, but occasionally this does not resolve the problem,
leaving the Oracle data corrupt.

Additional info:
Comment 1 Tom Coughlan 2004-12-16 15:12:02 EST
We will investigate this, and will update the driver in U7 as appropriate.
Comment 3 Don Howard 2006-02-03 16:35:13 EST
It looks like this was corrected prior to e.40.

From the pensacola changelog: 

* Fri Jun 18 2004 Jason Baron <jbaron@redhat.com>
- update mpt fusion to version 2.05.16, 2.05.11 to backup (Adam Manthei)
- update ips to v. 7.00.15, 6.11.07 to backup (Adam Manthei)
- update aacraid to 1.1.5-2440, backup 0.9.9 (Adam Manthei)

Note that the old driver is still shipped, in drivers/addon. The more recent
driver can be found in drivers/scsi.
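
As a rough aid for checking a host against the kernel range given in the description, the sketch below extracts the errata level from a 2.4.9-e.N release string and tests it against the reported bounds (e.24 through e.49). The function names are hypothetical, and the pattern assumes release strings shaped like the ones quoted in this bug, optionally with a prefix such as "kernel-" or a trailing suffix. It says nothing about which of the two aacraid drivers a given kernel actually loads.

```python
import re

# Release strings in this bug look like "2.4.9-e.24"; a "kernel-" prefix
# or a trailing suffix is tolerated (assumption).
RELEASE_RE = re.compile(r"2\.4\.9-e\.(\d+)")

# Bounds taken from the description in this report.
REPORTED_FIRST = 24
REPORTED_LAST = 49


def errata_level(release):
    """Return N from a '2.4.9-e.N' release string, or None if it does not match."""
    match = RELEASE_RE.search(release)
    return int(match.group(1)) if match else None


def in_reported_range(release):
    """True if the release falls in the range the reporter describes as affected."""
    level = errata_level(release)
    return level is not None and REPORTED_FIRST <= level <= REPORTED_LAST


if __name__ == "__main__":
    for release in ("2.4.9-e.3", "kernel-2.4.9-e.24", "2.4.9-e.49"):
        print(release, in_reported_range(release))
```

On a live system the release string would typically come from `uname -r`.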
