Bug 198845

Summary: initrd's raidautorun brings up degraded raid devices because of the limit of 127 auto-detected raid members
Product: [Fedora] Fedora
Reporter: Alexandre Oliva <oliva>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED WORKSFORME
QA Contact: Brian Brock <bbrock>
Severity: medium
Docs Contact:
Priority: medium
Version: rawhide
CC: triage, wtogami
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard: bzcl34nup
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-04-04 05:53:30 UTC
Type: ---
Attachments:
Description                                   Flags
Patch that fixes the bug                      none
Patch that avoids crashes on out-of-memory    none

Description Alexandre Oliva 2006-07-14 00:16:08 UTC
Description of problem:
md/md.c has a limit of 127 detected devices.  I've accidentally exceeded this
limit by one, and now my box always comes up with the last raid6 array missing
one disk.  Would it be a bad idea to turn this fixed-size array into a
dynamically-allocated list?
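
To illustrate the failure mode described here, the following is a minimal userspace sketch in C, not the actual md.c code. It assumes a table with room for 127 entries and an add function that quietly drops anything beyond that. The names fixed_table, fixed_table_add and fixed_table_fill are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a fixed-size detection table.
 * The capacity of 127 mirrors the limit named in this report; the
 * struct and function names are invented for illustration. */
#define MAX_DETECTED 127

struct fixed_table {
	unsigned int devs[MAX_DETECTED];
	size_t count;
};

/* Record a device number. Returns 1 if it was stored, 0 if the table
 * was already full and the device was dropped. */
static int fixed_table_add(struct fixed_table *t, unsigned int dev)
{
	assert(t != NULL);
	if (t->count >= MAX_DETECTED)
		return 0;	/* dropped; a caller that ignores this never notices */
	t->devs[t->count++] = dev;
	return 1;
}

/* Register n devices numbered 0..n-1 and return how many were dropped. */
static size_t fixed_table_fill(struct fixed_table *t, size_t n)
{
	size_t dropped = 0;

	for (size_t i = 0; i < n; i++)
		if (!fixed_table_add(t, (unsigned int)i))
			dropped++;
	return dropped;
}
```

In this model, registering 128 partitions drops exactly one of them, which is consistent with the last array coming up with a single member missing.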

Version-Release number of selected component (if applicable):
kernel-2.6.17-1.2366.fc6

How reproducible:
Every time

Steps to Reproduce:
1. Create exactly 128 partitions of about the same size, and set up, say, 64
two-member raid 1 arrays
2. Arrange for raidautorun to be in initrd's init
3. Reboot
  
Actual results:
The last array will be missing a member

Expected results:
It shouldn't

Additional info:

Comment 1 Alexandre Oliva 2006-07-30 06:10:14 UTC
Created attachment 133288 [details]
Patch that fixes the bug

Turning the fixed-size array into a list works.  It also fixes a race condition
in the increment-and-use-old-value of dev_cnt.  The compiler certainly won't do
it atomically, and it's actually expensive to generate code that does this
atomically.
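
As a rough illustration of the array-to-list idea (a userspace sketch, not the attached patch), the code below keeps detected devices in a singly linked list that grows one node at a time, so there is no fixed upper bound. The names dev_node, dev_list, dev_list_init, dev_list_add and dev_list_free are hypothetical, malloc stands in for the kernel allocator, and the sketch does not model locking or concurrent callers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list node holding one detected device number. */
struct dev_node {
	unsigned int dev;
	struct dev_node *next;
};

/* List with a tail pointer so that appends keep detection order. */
struct dev_list {
	struct dev_node *head;
	struct dev_node *tail;
	size_t count;
};

static void dev_list_init(struct dev_list *l)
{
	assert(l != NULL);
	l->head = NULL;
	l->tail = NULL;
	l->count = 0;
}

/* Append a device. Returns 0 on success, -1 if allocation fails
 * (the list is left unchanged in that case). */
static int dev_list_add(struct dev_list *l, unsigned int dev)
{
	struct dev_node *n;

	assert(l != NULL);
	n = malloc(sizeof(*n));
	if (n == NULL)
		return -1;
	n->dev = dev;
	n->next = NULL;
	if (l->tail != NULL)
		l->tail->next = n;
	else
		l->head = n;
	l->tail = n;
	l->count++;
	return 0;
}

/* Release every node and reset the list to empty. */
static void dev_list_free(struct dev_list *l)
{
	struct dev_node *n;

	assert(l != NULL);
	n = l->head;
	while (n != NULL) {
		struct dev_node *next = n->next;
		free(n);
		n = next;
	}
	dev_list_init(l);
}
```

With this layout a 128th device is appended like any other, at the cost of one allocation per detected device.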

Comment 2 Alexandre Oliva 2006-07-30 06:37:37 UTC
Created attachment 133289 [details]
Patch that avoids crashes on out-of-memory

I realized I failed to check the result of kzalloc.  This patch fixes that and
arranges for an error message with the name of the device to be printed when
the allocation fails (AFAICT only the callers have that name).
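
The pattern described here can be sketched in userspace C as follows; this is an assumption-laden illustration, not the attached patch. The low-level add function only reports failure, and the caller, which knows the device name, formats the message. The names dev_push, register_named_dev, alloc_fn and always_fail are hypothetical, as is the message text, and the allocator is passed in only so that a failure can be simulated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node holding one detected device number. */
struct dev_node {
	unsigned int dev;
	struct dev_node *next;
};

/* Allocator signature, so a failing allocator can be injected. */
typedef void *(*alloc_fn)(size_t);

/* Test helper that simulates an out-of-memory condition. */
static void *always_fail(size_t size)
{
	(void)size;
	return NULL;
}

/* Push dev onto *head. Returns 0 on success, -1 if the allocation
 * fails; *head is not modified on failure. */
static int dev_push(struct dev_node **head, unsigned int dev, alloc_fn alloc)
{
	struct dev_node *n;

	assert(head != NULL && alloc != NULL);
	n = alloc(sizeof(*n));
	if (n == NULL)
		return -1;
	n->dev = dev;
	n->next = *head;
	*head = n;
	return 0;
}

/* Caller-side wrapper: only this level knows the device name, so the
 * error message is produced here rather than inside dev_push. */
static int register_named_dev(struct dev_node **head, const char *name,
			      unsigned int dev, alloc_fn alloc,
			      char *msg, size_t msglen)
{
	if (dev_push(head, dev, alloc) != 0) {
		snprintf(msg, msglen, "md: out of memory detecting %s", name);
		return -1;
	}
	return 0;
}
```

Keeping the message in the caller means the low-level function does not need the device name passed down to it just for error reporting.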

Comment 3 Dave Jones 2007-04-23 18:45:05 UTC
Please post this upstream to the linux-kernel mailing list.

Comment 4 Bug Zapper 2008-04-03 17:47:25 UTC
Based on the date this bug was created, it appears to have been reported
against rawhide during the development of a Fedora release that is no
longer maintained. In order to refocus our efforts as a project we are
flagging all of the open bugs for releases which are no longer
maintained. If this bug remains in NEEDINFO thirty (30) days from now,
we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or
rawhide), please change this bug to the respective version and change
the status to ASSIGNED. (If you're unable to change the bug's version
or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled
these issues to this point.

The process we're following is outlined here:
http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this
doesn't happen again.

Comment 5 Alexandre Oliva 2008-04-04 05:53:30 UTC
The patch was posted upstream.  I don't know whether it made it in, but we now
use mdadm, which doesn't run into this limitation, so we're fine.