109251 – Kickstart raid5 parity not properly initialized

Summary: Kickstart raid5 parity not properly initialized

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	anaconda
Sub Component:
Version:	9
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	high
Target Milestone:	---
Assignee:	Jeremy Katz
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	108613
Depends On:
Blocks:

Reported:	2003-11-06 03:58 UTC by Hrunting Johnson
Modified:	2007-04-18 16:59 UTC
CC List:	1 user
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-10-05 03:34:22 UTC
Embargoed:

Attachments
Patch to force raid5 initializations to resync (1.66 KB, patch) 2003-11-06 18:24 UTC, Hrunting Johnson	no flags

Description Hrunting Johnson 2003-11-06 03:58:42 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5a)
Gecko/20030811 Mozilla Firebird/0.6.1

Description of problem:
Using a kickstart installation file, one can set up a simple raid5
across many drives, with no spares.  Creating this raid5 array goes
awfully fast.  It turns out that the parity information isn't
initialized on the disks.  When the first disk fails out, either
manually or because of a disk error, the array is almost always
corrupt.  If you recreate the array using mdadm, the very first thing
the raid does is a parity reconstruct (or resync, if you use
--force).  Fail out a disk from that array, and things are good.

Version-Release number of selected component (if applicable):
anaconda-9.0-4

How reproducible:
Always

Steps to Reproduce:
1. create a kickstart installation file that sets up a raid5 array
2. install system using kickstart file
3. use raidsetfaulty to fail out one disk from raid5
4. umount and fsck the raid array
    

Actual Results:  Corruption on the filesystem, numerous warnings about
trying to access beyond the end of the raid array, etc.

Expected Results:  Raid operates just fine in degraded mode,
filesystem integrity is preserved.

Additional info:

Comment 1 Hrunting Johnson 2003-11-06 18:24:43 UTC

Created attachment 95767
Patch to force raid5 initializations to resync

Comment 3 Hrunting Johnson 2003-11-06 22:55:25 UTC

*** Bug 108613 has been marked as a duplicate of this bug. ***

Comment 4 Hrunting Johnson 2004-03-24 21:10:39 UTC

Has anyone had a chance to review this?  This affects all versions of
Anaconda I've used, from Red Hat 9 up to Fedora Core 2 Test 1.  I
supplied a patch, and this is fairly serious: anyone who creates a
RAID5 filesystem with Anaconda has a ticking time bomb that causes
immediate data loss if they lose one drive.  The patch is simple.  For
RAID5 filesystems, don't enable the (aptly named)
'--dangerous-no-resync' option.
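
The idea can be sketched roughly as follows.  This is not the attached
patch, only a minimal illustration of the rule described above: the
function name mkraid_args and the base_args parameter are hypothetical,
and the only option taken from this report is '--dangerous-no-resync'.

```python
def mkraid_args(raid_level, device, base_args=("mkraid",)):
    """Build an argument list for creating one md device.

    Hypothetical helper for illustration; base_args stands in for
    whatever command and options the installer already passes.
    """
    args = list(base_args)
    # RAID5 needs its initial parity sync, so the resync is only
    # skipped for other RAID levels.
    if raid_level != 5:
        args.append("--dangerous-no-resync")
    args.append(device)
    return args
```

With this rule a RAID5 device is created with a normal initial resync,
while other levels keep the behaviour they had before.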

For RAID5, unless you wipe the disks first (i.e. put them into a known
state by writing all 0's or all 1's to them), you MUST do an initial
parity sync.  Otherwise, you have no idea what your initial parity
state is and when a drive fails, you're essentially reconstructing
the bits from possibly/probably/almost-always unknown (i.e. wrong)
parity data.  In my experience, it always leads to corruption.
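
To make the failure mode concrete, here is a small simulation of one
stripe on a three-disk RAID5, assuming plain XOR parity.  It is a toy
model, not the md driver: the helper names and the byte values are
made up for illustration, and the parity block is simply placed on the
last disk.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(stripe, failed):
    """Rebuild the block of the failed disk from all surviving blocks."""
    return xor_blocks([blk for i, blk in enumerate(stripe) if i != failed])

def resync(stripe, parity_index):
    """Recompute the parity block from the data blocks (the initial sync)."""
    synced = list(stripe)
    synced[parity_index] = reconstruct(stripe, parity_index)
    return synced

# Two data blocks plus whatever junk was already on the third disk.
unsynced = [b"\x12\x34", b"\xab\xcd", b"\x00\x99"]

# Disk 0 fails: without a sync, the rebuilt block is not the original.
rebuilt_unsynced = reconstruct(unsynced, failed=0)

# After an initial parity sync, the same failure is recoverable.
synced = resync(unsynced, parity_index=2)
rebuilt_synced = reconstruct(synced, failed=0)

# Zeroed disks are already consistent: the XOR of zeros is zero.
zeroed = [b"\x00\x00"] * 3
rebuilt_zeroed = reconstruct(zeroed, failed=0)
```

Here rebuilt_unsynced differs from the original block on disk 0, while
rebuilt_synced and rebuilt_zeroed match it.  Under the same XOR
assumption, disks filled with all 1's are only consistent when the
number of data disks is odd, so zeroing is the safer known state.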

Comment 5 Jeremy Katz 2004-10-05 03:34:22 UTC

We're using mdadm now which does this automatically.
