Bug 655387 - raid array automatically degraded on boot

Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	mdadm
Version:	14
Hardware:	Unspecified
OS:	Unspecified
Priority:	low
Severity:	high
Target Milestone:	---
Assignee:	Doug Ledford
QA Contact:	Fedora Extras Quality Assurance

Reported:	2010-11-20 19:05 UTC by Andrew McNabb
Modified:	2011-01-13 18:54 UTC
CC List:	1 user
Last Closed:	2011-01-13 04:03:28 UTC
Type:	---

Description Andrew McNabb 2010-11-20 19:05:24 UTC

On startup, my raid device is automatically started in degraded mode using only one of the two attached disks.  This behavior is incorrect for two reasons:

1) Both disks should be used--it shouldn't ignore one of them.

2) If there is some reason that the array appears to be degraded on boot, it shouldn't automatically run the array; instead it should come up in read-only mode or fall to an emergency console or something.

In any case, both disks were available, so there was no reason that the array should be degraded in the first place.

I noticed a few unusual lines in /var/log/messages related to the array (/dev/md/big aka /dev/md127):

Nov 20 10:39:56 mcbain kernel: [    8.968786] md: array md127 already has disks!
...
Nov 20 10:39:56 mcbain kernel: [   14.794615] md/raid1:md127: active with 1 out of 2 mirrors
Nov 20 10:39:56 mcbain kernel: [   14.796011] md127: detected capacity change from 0 to 1500299129856
Nov 20 10:39:56 mcbain kernel: [   14.798371]  md127: unknown partition table
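
A degraded start like the one logged above can be spotted mechanically from the kernel messages. The following is a minimal Python sketch, not part of mdadm; the function name find_degraded_arrays and the regular expression are assumptions based only on the "active with 1 out of 2 mirrors" message format quoted in this report.

```python
import re

# Matches kernel lines such as:
#   md/raid1:md127: active with 1 out of 2 mirrors
# The pattern is an assumption derived from the message quoted in this report.
MIRROR_RE = re.compile(
    r"md/raid1:(?P<array>md\d+): active with (?P<active>\d+) "
    r"out of (?P<total>\d+) mirrors"
)


def find_degraded_arrays(log_lines):
    """Return {array: (active, total)} for RAID1 arrays started short of mirrors."""
    degraded = {}
    for line in log_lines:
        match = MIRROR_RE.search(line)
        if match is None:
            continue
        active = int(match.group("active"))
        total = int(match.group("total"))
        if active < total:
            degraded[match.group("array")] = (active, total)
    return degraded


if __name__ == "__main__":
    sample = [
        "Nov 20 10:39:56 mcbain kernel: [    8.968786] md: array md127 already has disks!",
        "Nov 20 10:39:56 mcbain kernel: [   14.794615] md/raid1:md127: active with 1 out of 2 mirrors",
    ]
    print(find_degraded_arrays(sample))  # {'md127': (1, 2)}
```

Run against /var/log/messages after boot, a check of this kind would report md127 as running on one of its two mirrors.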

In boot.log, I see the following line just after "Setting hostname":
mdadm: started array /dev/md/big

The mdadm.conf reads as follows (the file was autogenerated except for the MAILADDR line and the last two lines):

# mdadm.conf written out by anaconda
MAILADDR amcnabb
AUTO +imsm +1.x -all
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=eb215d71:b2842b79:bfe78010:bc810f04
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=716da2d6:578c1864:bfe78010:bc810f04
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=4e0fba26:dcaa14d1:bfe78010:bc810f04
DEVICE partitions
CREATE metadata=1.1 auto=md
ARRAY /dev/md/big name=big auto=md
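
To make the difference between the entries easier to see, here is a minimal Python sketch that lists which ARRAY lines in a file like the one above carry a UUID and which are identified some other way. The helper names parse_array_lines and arrays_without_uuid are hypothetical, and the parser is a simplification: it handles only single-line entries that spell out the full ARRAY keyword, and ignores continuation lines. It only describes the file; it does not claim that any of these entries caused the degraded start.

```python
SAMPLE_CONF = """\
# mdadm.conf written out by anaconda
MAILADDR amcnabb
AUTO +imsm +1.x -all
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=eb215d71:b2842b79:bfe78010:bc810f04
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=716da2d6:578c1864:bfe78010:bc810f04
ARRAY /dev/md3 level=raid1 num-devices=2 UUID=4e0fba26:dcaa14d1:bfe78010:bc810f04
DEVICE partitions
CREATE metadata=1.1 auto=md
ARRAY /dev/md/big name=big auto=md
"""


def parse_array_lines(conf_text):
    """Return a list of (device, options) tuples, one per ARRAY line.

    Option keys are lower-cased so that UUID= and uuid= compare equal.
    """
    arrays = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        words = line.split()
        if words[0].upper() != "ARRAY" or len(words) < 2:
            continue  # not an ARRAY entry, or no device name given
        options = {}
        for word in words[2:]:
            if "=" in word:
                key, value = word.split("=", 1)
                options[key.lower()] = value
        arrays.append((words[1], options))
    return arrays


def arrays_without_uuid(conf_text):
    """Return the device names of ARRAY entries that have no uuid= option."""
    return [dev for dev, opts in parse_array_lines(conf_text) if "uuid" not in opts]


if __name__ == "__main__":
    print(arrays_without_uuid(SAMPLE_CONF))  # ['/dev/md/big']
```

For this file, the three anaconda-generated entries are matched by UUID, while /dev/md/big is matched by name only.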

I'm not aware of any obvious user error in the above--it shouldn't be this easy for things to break. :)  Please let me know if there is any other information that would be helpful.

Comment 1 Andrew McNabb 2011-01-12 16:26:03 UTC

This happened again when the machine rebooted Monday night.  All of the disks were available, but one of the arrays was started in degraded mode.  Fortunately, I was lucky and didn't lose any data.  Is there any way to change this behavior?

Comment 2 Doug Ledford 2011-01-13 04:03:28 UTC

You need to update to the latest mdadm package and then rebuild your initramfs using dracut.

As for starting in degraded mode, a redundant array wouldn't be much good if the system didn't keep running when the array is degraded.  And the md raid stack wouldn't be much good if it couldn't transition from degraded to clean/active without corrupting data.

Comment 3 Andrew McNabb 2011-01-13 17:46:34 UTC

Could you point me to a bug report that shows what was fixed (and in which version of mdadm), just so I can understand better what happened?  Thanks.

I'm fine with it running when it degrades.  However, it's really frustrating for it to automatically degrade an array at boot that isn't actually missing any disks.  If I remember correctly (and I probably don't), there was once a way to make it not automatically assemble a degraded array without user interaction.  This is a nice behavior because it doesn't affect what happens if a running array loses a disk, but it does help avoid weird problems at boot time.

Comment 4 Doug Ledford 2011-01-13 18:54:14 UTC

Bug 616596
