Bug 487761 - mdadm fails to create/build/assemble raid arrays under kernel-2.6.27.15-170.2.24
Summary: mdadm fails to create/build/assemble raid arrays under kernel-2.6.27.15-170.2.24
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 10
Hardware: x86_64
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2009-02-27 19:18 UTC by Greg Huber
Modified: 2009-12-18 08:55 UTC
CC List: 5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-12-18 08:55:35 UTC
Type: ---
Embargoed:


Attachments: none

Description Greg Huber 2009-02-27 19:18:47 UTC
Description of problem:
Under kernel-2.6.27.12-170.2.5 I can use mdadm-2.6.7.1-1 to create, build, and
maintain RAID arrays and have them auto-mounted at boot. After updating to
kernel-2.6.27.15-170.2.24 no mdadm operations on the RAID drives work, and mdadm
aborts in udev during boot. Under the failing kernel mdadm claims one or more
drives are in use (they fail exclusive open), even though the system has just
booted and nothing should be holding the drives, unless they were left in a
stuck state by the earlier failed mdadm command.
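
For reference, these are the checks I run right after boot when this happens
(a rough sketch; device names as above):

   cat /proc/mdstat                                # check whether any array is already active
   mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1   # reports the drives as in use under the failing kernel
   mdadm --stop /dev/md0                           # attempt to release anything half-assembled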


Version-Release number of selected component (if applicable):
working kernel: kernel-2.6.27.12-170.2.5
failing kernel: kernel-2.6.27.15-170.2.24
mdadm (both)  : mdadm-2.6.7.1-1

How reproducible:
Every time.

Steps to Reproduce:
Environment: Fedora 10 x86_64, fully up to date.
My RAID drives are /dev/sdb and /dev/sdc; substitute yours as needed.

1. Boot kernel-2.6.27.12-170.2.5
2. initialize and partition 2 drives (I use the entire drive in 1 partition)
3. use mdadm to create a simple raid drive. 
   I use: "mdadm --create /dev/md0 --level=0 --raid-devices=2 --auto=md
   /dev/sdb1 /dev/sdc1"
4. make an ext3 filesystem on /dev/md0 - "mkfs.ext3 /dev/md0"
4a. go get some coffee...
5. use:  "mdadm --detail --scan" to get the ARRAY line for the mdadm.conf file
6. create /etc/mdadm.conf 

   DEVICE /dev/sdb1 /dev/sdc1
   ARRAY /dev/md0 level=raid0 num-devices=2 UUID=5a530c80:4819869b:d2cc1675:6719d07e devices=/dev/sdb1,/dev/sdc1

substituting your drives and the ARRAY line from step 5. 

7. add a line to /etc/fstab to mount /dev/md0 on boot (see the example line after these steps).
8. reboot and verify that raid drive is assembled and mounted properly.
9. install kernel-2.6.27.15-170.2.24 (keep previous kernel)
10. reboot to new kernel.

Grub should allow you to bounce between working and non-working kernels.
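
The /etc/fstab line for step 7 looks roughly like this (the mount point is just
an example):

   # /mnt/raid is an example mount point; substitute your own
   /dev/md0    /mnt/raid    ext3    defaults    1 2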
  
Actual results:
mdadm aborts during udev on boot.
When attempting to rebuild the array, mdadm claims one or both drives are busy.

Expected results:
udev should report that /dev/md0 was started with 2 drives.
The array should be mounted without incident.

Additional info:
Occasionally removing /var/run/mdadm/map will allow the array to be built, but
it can never be mounted or used; even fsck on the raw device fails.
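
The sequence I use when retrying after removing the map file (a sketch; adjust
device names as needed):

   rm -f /var/run/mdadm/map
   mdadm --stop /dev/md0
   mdadm --assemble /dev/md0 /dev/sdb1 /dev/sdc1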

mdadm-3.0-0.devel2.2.fc11.x86_64 exhibits the same behaviour as mdadm-2.6.7.1-1.fc10.x86_64.

Information from the working kernel follows...
(information from the failing kernel will be added after I reboot)
----------------------------------------------------
>>> cat /proc/mdstat

Personalities : [raid0] 
md0 : active raid0 sdb1[0] sdc1[1]
      976767872 blocks 64k chunks
      
unused devices: <none>

----------------------------------------------------
>>> mdadm --detail /dev/md0

/dev/md0:
        Version : 0.90
  Creation Time : Thu Feb 26 21:06:23 2009
     Raid Level : raid0
     Array Size : 976767872 (931.52 GiB 1000.21 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Feb 26 21:06:23 2009
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 64K

           UUID : 5a530c80:4819869b:d2cc1675:6719d07e
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       33        1      active sync   /dev/sdc1

----------------------------------------------------

Comment 1 Damian Brasher 2009-03-02 20:52:18 UTC
This udev rule may be responsible. I had a similar problem with an upgrade: the
ext2 partition on md_d0 (on sdc1, partition type 'fd Linux raid auto') was not
usable after boot. Removing the rule completely allowed me to assemble md_d0
manually after boot the first time, but with the modified rule below md_d0
assembled properly on boot:

/etc/udev/rules.d/70-mdadm.rules

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="/sbin/mdadm -I --auto=yes $root/%k"

I changed this to:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
	RUN+="/sbin/mdadm --assemble /dev/md_d0 /dev/sdb1 /dev/sdc1 $root/%k"

However, this is not a generic rule; it is specific to my system.
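
A possibly more generic variant (untested here) would be to let mdadm read the
arrays from /etc/mdadm.conf instead of hard-coding the devices:

SUBSYSTEM=="block", ACTION=="add|change", ENV{ID_FS_TYPE}=="linux_raid*", \
        RUN+="/sbin/mdadm --assemble --scan"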

Comment 2 Greg Huber 2009-03-03 19:06:59 UTC
Thanks, Damian.
I tried your suggestion and it mostly worked. The problem I seem to be having
now is that the drives are moving around on every reboot. I tried using the
/dev/disk/... drive links, but they don't appear to be valid at the point when
udev runs. I also tried kernel-2.6.27.19-170.2.24, which just came out. Except
for the drives bouncing around, both udev rules seem to work with this kernel
as long as the drive locations are correct.
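
One thing I may try is keying the assembly off the array UUID instead of fixed
device names, along these lines (a sketch using the UUID reported above, not
yet tested):

   mdadm --assemble /dev/md0 --uuid=5a530c80:4819869b:d2cc1675:6719d07e /dev/sd*1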

Thanks again for your help.

Greg

Comment 3 Chuck Ebbert 2009-03-05 04:17:17 UTC
(In reply to comment #2)
> Thanks, Damian.
> I tried your suggestion and it mostly worked. The problem I seem to be having
> now is that the drives are moving around on every reboot. I tried using the
> /dev/disk/... drive links, but they don't appear to be valid at the point when
> udev runs. I also tried kernel-2.6.27.19-170.2.24, which just came out. Except
> for the drives bouncing around, both udev rules seem to work with this kernel
> as long as the drive locations are correct.

Does that mean this bug is fixed in kernel-2.6.27.19-170.2.24?

Comment 4 Greg Huber 2009-03-05 14:32:56 UTC
Looks like one problem was fixed in kernel-2.6.27.19-170.2.24. If the array
fails to start because the drives moved, you can stop the failed array and
rebuild it. The previously captured drives are no longer locked up.

The second problem, the drives bouncing around on boot, seems more serious. I
may have to dynamically create the mdadm.conf file before the udev mdadm
assemble command runs.
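
Something along these lines is what I have in mind (a rough sketch, not yet
tested):

   echo "DEVICE partitions" > /etc/mdadm.conf
   mdadm --examine --scan >> /etc/mdadm.conf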

Comment 5 Chuck Ebbert 2009-03-26 22:06:34 UTC
(In reply to comment #4)
> 
> The second problem, the drives bouncing around on boot, seems more serious. I
> may have to dynamically create the mdadm.conf file before the udev mdadm
> assemble command runs.

My mdadm.conf doesn't list any partitions at all:


DEVICE partitions
MAILADDR root

ARRAY /dev/md0 level=raid0 num-devices=2 UUID=ac3336b3:b6900dfe:be5d147a:8ddc3694
ARRAY /dev/md1 level=raid0 num-devices=2 UUID=d66e2256:f58953a9:fa1248fb:883774c6
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=26e1fa44:e557d955:3f47da4e:396d75e0
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=9176f5cb:2272ecfc:1f4059bd:ee29140e

Comment 6 Greg Huber 2009-03-27 14:55:27 UTC
Thanks, Chuck.
I restructured my mdadm.conf file to use "DEVICE partitions"
and removed the devices= attribute from the ARRAY line. A reboot
without adding drives worked, and a reboot with a FireWire drive also
worked, though that drive was placed after the existing drives.
I'll try again with a USB drive; they seem to get placed
earlier in the chain.
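
For reference, the restructured file now looks roughly like this (UUID taken
from the --detail output above):

   DEVICE partitions
   ARRAY /dev/md0 level=raid0 num-devices=2 UUID=5a530c80:4819869b:d2cc1675:6719d07e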

Comment 7 Bug Zapper 2009-11-18 09:53:27 UTC
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '10'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 10's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 10 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 8 Bug Zapper 2009-12-18 08:55:35 UTC
Fedora 10 changed to end-of-life (EOL) status on 2009-12-17. Fedora 10 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.

