Description of problem:
md/md.c has a limit of 127 detected devices. I've accidentally exceeded this limit by one, and now my box always comes up with the last raid6 array missing one disk. Would it be a bad idea to turn this fixed-size array into a dynamically-allocated list?

Version-Release number of selected component (if applicable):
kernel-2.6.17-1.2366.fc6

How reproducible:
Every time

Steps to Reproduce:
1. Create exactly 128 partitions of about the same size, and set up, say, 64 pairs of raid 1 devices
2. Arrange for raidautorun to be in initrd's init
3. Reboot

Actual results:
The last array will be missing a member

Expected results:
It shouldn't

Additional info:
Created attachment 133288: Patch that fixes the bug

Turning the fixed-size array into a list works. It also fixes a race condition in the increment-and-use-old-value of dev_cnt: the compiler certainly won't emit that read-then-increment as a single atomic operation, and generating code that does it atomically is actually expensive.
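For readers without access to the attachment, here is a rough userspace sketch of the general idea: a singly linked queue that grows on demand instead of a fixed-size array indexed by a counter. It is not the attached patch. The names `detected_dev`, `detected_list`, `detected_list_add` and `detected_list_pop` are hypothetical, `unsigned int` stands in for the kernel's `dev_t`, and `calloc` stands in for `kzalloc`. The sketch is single-threaded; kernel code would presumably still need a lock around the list updates.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type; these names are illustrative, not md.c's. */
struct detected_dev {
    unsigned int dev;              /* stand-in for the kernel's dev_t */
    struct detected_dev *next;
};

/* FIFO of detected devices; zero-initialise before first use. */
struct detected_list {
    struct detected_dev *head;
    struct detected_dev *tail;
    size_t count;
};

/* Append one device. Returns 0 on success, -1 if allocation fails. */
static int detected_list_add(struct detected_list *list, unsigned int dev)
{
    struct detected_dev *node;

    assert(list != NULL);
    node = calloc(1, sizeof(*node));   /* userspace stand-in for kzalloc */
    if (node == NULL)
        return -1;
    node->dev = dev;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
    list->count++;
    return 0;
}

/* Remove the oldest device into *dev. Returns 1, or 0 if the list is empty. */
static int detected_list_pop(struct detected_list *list, unsigned int *dev)
{
    struct detected_dev *node;

    assert(list != NULL && dev != NULL);
    node = list->head;
    if (node == NULL)
        return 0;
    *dev = node->dev;
    list->head = node->next;
    if (list->head == NULL)
        list->tail = NULL;
    list->count--;
    free(node);
    return 1;
}
```

With this shape there is no hard-coded capacity, so the 128th device is queued like any other. The only way to lose a device is an allocation failure, which the caller can see from the return value.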
Created attachment 133289: Patch that avoids crashes on out-of-memory

I realized I had failed to check the result of kzalloc. This patch adds the check and arranges for an error message naming the device to be printed when the allocation fails (AFAICT only the callers have the name).
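One way such a check could be structured is sketched below in userspace C. It is an assumption about the shape of the fix, not the contents of the attachment: the callee knows only the device number, so it returns an error code, and the caller, which has the name, formats the message. All identifiers (`pending_dev`, `pending_add`, `autodetect_one`, `zalloc_fn`, `failing_zalloc`) and the message text are hypothetical. The allocator is passed in only so that the failure path can be exercised.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the code from the attachment. */
struct pending_dev {
    unsigned int dev;              /* stand-in for the kernel's dev_t */
    struct pending_dev *next;
};

/* Same signature as calloc, so a failing allocator can be injected. */
typedef void *(*zalloc_fn)(size_t nmemb, size_t size);

/* Callee: knows only the device number, so it reports failure by code. */
static int pending_add(struct pending_dev **head, unsigned int dev,
                       zalloc_fn zalloc)
{
    struct pending_dev *node = zalloc(1, sizeof(*node));

    if (node == NULL)
        return -ENOMEM;            /* never dereference a failed allocation */
    node->dev = dev;
    node->next = *head;
    *head = node;
    return 0;
}

/* Caller: has the device name, so it is the one to format the diagnostic. */
static int autodetect_one(struct pending_dev **head, const char *name,
                          unsigned int dev, zalloc_fn zalloc,
                          char *msg, size_t msglen)
{
    int err = pending_add(head, dev, zalloc);

    if (err != 0)
        snprintf(msg, msglen, "md: out of memory, skipping %s", name);
    return err;
}

/* Always-failing allocator, used only to exercise the error path. */
static void *failing_zalloc(size_t nmemb, size_t size)
{
    (void)nmemb;
    (void)size;
    return NULL;
}
```

Calling `autodetect_one(&head, "sdb1", 2, failing_zalloc, msg, sizeof msg)` leaves the list untouched, returns `-ENOMEM`, and fills `msg` with a line that names the device, instead of crashing on a NULL pointer.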
Please post this upstream to linux-kernel.org.
Based on the date this bug was created, it appears to have been reported against rawhide during the development of a Fedora release that is no longer maintained. In order to refocus our efforts as a project we are flagging all of the open bugs for releases which are no longer maintained. If this bug remains in NEEDINFO thirty (30) days from now, we will automatically close it.

If you can reproduce this bug in a maintained Fedora version (7, 8, or rawhide), please change this bug to the respective version and change the status to ASSIGNED. (If you're unable to change the bug's version or status, add a comment to the bug and someone will change it for you.)

Thanks for your help, and we apologize again that we haven't handled these issues to this point.

The process we're following is outlined here: http://fedoraproject.org/wiki/BugZappers/F9CleanUp

We will be following the process here: http://fedoraproject.org/wiki/BugZappers/HouseKeeping to ensure this doesn't happen again.
The patch was posted upstream. I don't know whether it was merged, but we now use mdadm, which doesn't run into this limitation, so we're fine.