Bug 501126

Summary: checking software raid arrays on fedora 2.6.29 kernels locks up system
Product: [Fedora] Fedora
Component: kernel
Version: 11
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: medium
Priority: low
Reporter: Kyle Liddell <bugs>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: itamar, kernel-maint
Doc Type: Bug Fix
Last Closed: 2010-06-28 12:34:28 UTC
Attachments:
Description     Flags
lspci -v        none
/proc/mdstat    none

Description Kyle Liddell 2009-05-16 18:21:09 UTC
Created attachment 344292 [details]
lspci -v

Description of problem:
On Fedora 11 (and possibly 10 - had some issues there, but never narrowed them down), if a software raid array is marked as dirty, or is forced to do a resync, the system locks up.  This also happens in the installer.
I'm not sure it's in the kernel itself - I'm filing this under kernel, but it might be in the initrd or some such.
This happens with several fedora kernels.  I also have debian on the machine, and raid works with debian kernel 2.6.26, and more interestingly, on 2.6.29-2-686.

This looks similar to Bug 484743 (but I'm creating a new bug since I'm seeing this on F11).

Version-Release number of selected component (if applicable):
2.6.29.1-102.fc11.i686.PAE
2.6.29.3-140.fc11.i686.PAE

How reproducible:
Always happens on resync.
  
Attaching some system info (from the debian system).  I can chroot into Fedora and collect more info if needed.

Comment 1 Kyle Liddell 2009-05-16 18:22:21 UTC
Created attachment 344293 [details]
/proc/mdstat

Comment 2 Kyle Liddell 2009-05-18 01:00:37 UTC
I've tried a few things, maybe I'm running into something known already?

I've built a copy of 2.6.29.3 with no outside patches, and it works in debian (using debian's mkinitrd), and does not work in fedora (using fedora's mkinitrd).  Didn't test .2 on debian, but it also doesn't work on fedora.

If I build a fedora initrd for the debian kernel 2.6.29-2-686, it works, so far anyway.  (Usually, starting a check on all the md arrays fails immediately, and this one has been making progress for about 10 minutes -  I'll know more once all the arrays have finished checking in a couple hours.)
I suppose I'll try running the fedora kernel in debian next.

Maybe there's some patch that is applied to the debian kernel only?  I'll try to check into that later.  For a dumb question, would having differing versions of mdadm installed make a difference?  Debian has 2.something, and fedora has 3.something.  It seems that array checking wouldn't involve anything outside the kernel...
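
For anyone trying to reproduce this: the report does not say how the check on the md arrays was started. A minimal sketch, assuming the kernel's standard md sysfs interface (writing "check" to each array's sync_action file), might look like the following. The function name start_md_checks and its optional root argument are hypothetical, added only so the loop can be pointed at a test directory instead of the real /sys.

```shell
#!/bin/sh
# Sketch: request a consistency check on every md array through sysfs.
# Assumption: arrays appear as <root>/block/md*/md/sync_action, as on
# kernels that expose the md sysfs interface. Needs root on a real system.
start_md_checks() {
    root="${1:-/sys}"   # hypothetical override; defaults to the real sysfs
    for action in "$root"/block/md*/md/sync_action; do
        # Skip the unexpanded glob pattern when no md arrays exist.
        [ -e "$action" ] || continue
        # Only request a check when the array is currently idle.
        if [ "$(cat "$action")" = "idle" ]; then
            echo check > "$action"
            echo "started check: $action"
        fi
    done
}
```

Calling start_md_checks with no argument as root would act on the live arrays; progress can then be followed in /proc/mdstat. On the affected Fedora kernels described here, that is the point where the lockup would be expected.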

Comment 3 Kyle Liddell 2009-05-18 08:05:35 UTC
fedora on debian's 2.6.29-2-686 has worked for several hours now.
debian on fedora's 2.6.29.3-140.fc11.i686.PAE does not work.

I'm pretty much out of ideas now.

Comment 4 Chuck Ebbert 2009-05-19 22:23:46 UTC
Did you try the i586 fedora kernel?

Comment 5 Kyle Liddell 2009-05-20 00:45:16 UTC
Just tried it, and it still locks up.
To fill in a hole in testing, vanilla 2.6.29.2 (compiled with debian's .config) works.

Comment 6 Kyle Liddell 2009-05-23 05:12:44 UTC
Still seeing the bug on 2.6.29.3-155.fc11.i686.PAE.

Comment 7 Kyle Liddell 2009-05-25 09:33:50 UTC
I thought I'd give something destructive a try, and it turned out to be pretty destructive.

Tried to do a fresh install of F11.  Deleted all partitions on all disks, created new partition tables, RAID arrays, and LVM stuff.
(By the way, using the RAID - Clone device to create RAID device button crashes installer here.)

Set up all the disks with no trouble, and clicked next, yes, etc.  The installer wrote the partition tables and created the first md device, and then the system locked up.  (I'm assuming creating md0 worked - the progress bar was nearly full, then the colored bar went away, and then it locked up.)

Once I get this system bootable again, I'll try F10.

Comment 8 Kyle Liddell 2009-05-25 22:46:05 UTC
Fedora 10 locks up at around the same point (while working with partitions and such).  I'm going to have to give up on fedora on this machine.
If I can help in tracking this down more (aside from re-installing again), let me know.

Comment 9 Bug Zapper 2009-06-09 15:57:04 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 11 development cycle.
Changing version to '11'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 10 Bug Zapper 2010-04-27 14:21:57 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Bug Zapper 2010-06-28 12:34:28 UTC
Fedora 11 changed to end-of-life (EOL) status on 2010-06-25. Fedora 11 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.