613186 – MD

Bug 613186 - MD

Summary: MD

Keywords:
Status:	CLOSED NEXTRELEASE
Alias:	None
Product:	Red Hat Enterprise Linux 6
Classification:	Red Hat
Component:	doc-Migration_Guide
Sub Component:
Version:	6.0
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Laura Bailey
QA Contact:	ecs-bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2010-07-09 21:44 UTC by Doug Ledford
Modified:	2013-02-05 23:57 UTC (History)
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-07-18 23:13:45 UTC
Target Upstream Version:
Embargoed:

Description Doug Ledford 2010-07-09 21:44:39 UTC

Comment 2 Doug Ledford 2010-07-09 22:26:49 UTC

Oops, hit return by accident while entering the subject and it entered the bug
:-/

Anyway, for RHEL6, although we don't support upgrading from a dmraid raid set
to an mdraid raid set, we do support both upgrades on existing mdraid raid sets
and creation of new mdraid raid sets.  Unfortunately, these are handled
differently in that newly created mdraid raid sets will, by default, use a
different version of the md raid superblock.  Previous versions of RHEL defaulted
to the old version 0.90 md raid superblock, which was at the end of the
partition/disk and which means that the data was at the beginning and visible
whether the raid array was running or not (aka, if you had a raid1 array and it
was not running, then both of the disks that make up the raid1 array would be
visible to the system and things like lvm or filesystem mount commands could
accidentally mount the bare disks instead of the raid array).  The new default
superblock format (used on all devices except when creating a raid1 /boot
partition) is at the beginning of the array and therefore any filesystem or lvm
data is offset from the beginning of the partition.  That means that when the
array is not running, lvm and filesystem mount commands will not see the device
as having valid lvm or filesystem data.  This is intentional.  It means that if
you want to mount a single disk of a raid1 array, you need to start the array
with only that single disk in it, and then mount the array.  You can *not*
mount just the bare disk.  This change has been made because mounting the bare
disk is an easy way to silently corrupt a raid1 array if you forget to force a
resync from the disk you mounted to the other raid disk the next time you
assemble the raid array.  By starting the array in degraded mode with only one
disk present, it will update the generation count in the superblock on that
disk and the raid subsystem will then know that the other disk is out of date. 
This also means that on subsequent reboots the raid system will consider the
disk that was left out of the array as "non-fresh" and kick that device from
the array.  This is intentional.  When you are ready to add the other disk
back into the array, use the mdadm command to hot add the disk to the array.  A
resync of the changed parts of the disk (if you have write intent bitmaps) or
the whole disk (if there is no bitmap) will then be performed, and the array
will once again be in sync.  From then on the array will be assembled properly
instead of the old "non-fresh" drive being kicked from the array.
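
The "non-fresh" logic described above can be illustrated with a toy model.
This is only a sketch under stated assumptions, not the md driver's actual
code: the Member class, the integer "events" field (standing in for the
superblock generation count), and all function names are hypothetical.  The
mdadm invocations in the comments use hypothetical device names and are
meant as examples only; check the mdadm man page before running anything.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    events: int  # stand-in for the generation count stored in the superblock


def run_degraded(member):
    """Start the array with a single disk; only that disk's count advances.

    Roughly what happens after something like
    `mdadm --assemble --run /dev/md0 /dev/sda1` (hypothetical devices).
    """
    member.events += 1


def assemble(members):
    """Split members into (fresh, non_fresh) by comparing generation counts."""
    newest = max(m.events for m in members)
    fresh = [m for m in members if m.events == newest]
    non_fresh = [m for m in members if m.events < newest]
    return fresh, non_fresh


def hot_add(stale, fresh):
    """Model a completed resync after e.g. `mdadm /dev/md0 --add /dev/sdb1`."""
    stale.events = fresh[0].events


if __name__ == "__main__":
    a, b = Member("sda1", 100), Member("sdb1", 100)
    run_degraded(a)                  # array was started with sda1 only
    fresh, kicked = assemble([a, b])
    print("kicked as non-fresh:", [m.name for m in kicked])
    hot_add(b, fresh)                # resync brings sdb1 up to date
    fresh, kicked = assemble([a, b])
    print("members after re-add:", [m.name for m in fresh])
```

The point of the model is that mounting a bare disk would change the data
without advancing any counter, so the raid subsystem would have no way to
tell which disk is stale.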

Secondly, the new raid superblock supports the concept of named md raid arrays.
Named md raid arrays do not depend on the old method of array enumeration (for
instance, /dev/md0 then /dev/md1, etc.) for distinguishing between arrays. 
Instead it allows you to pick an arbitrary name to call the array (such as home
or data or opt).  If you were to create an array with the --name=home option
added, then the default name of the device file you would use to mount the
array would be /dev/md/home.  Whatever name you give the array, that name will
be created in /dev/md/ (unless you specify a full path in the name, then that
path will be created, or unless you specify a simple number, such as 0, in
which case mdadm will attempt to start the array using the old /dev/md# scheme
and assign it the number in the name field).  The RHEL 6 anaconda installer does
not yet allow for the setting of array names and instead just uses the simple
number scheme to emulate how arrays were created in the past.  We expect this
to change in the future.
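
The naming rule described above can be written down as a small function.
This is a sketch of the rule as stated in this comment only; the function
name md_device_path is hypothetical, and real mdadm may handle further cases
(see the mdadm man page) that are not modelled here.

```python
def md_device_path(name: str) -> str:
    """Map an array name to the device node path described above."""
    if name.startswith("/"):
        # A full path in the name is used as given.
        return name
    if name.isascii() and name.isdigit():
        # A simple number falls back to the old /dev/md# scheme.
        return f"/dev/md{name}"
    # Any other name is created under /dev/md/.
    return f"/dev/md/{name}"


if __name__ == "__main__":
    for example in ("home", "0", "/dev/raid/data"):
        print(example, "->", md_device_path(example))
```

With this rule an array created with --name=home would be mounted through
/dev/md/home, which matches the example given above.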

Third, the new md raid arrays support the use of write intent bitmaps.  These
bitmaps help the system know which parts of an array are dirty and which parts
are clean so that in the event of an unclean shutdown, only the dirty portion
of an array needs to be resynced.  This can drastically speed up the resync
process (from days to just minutes or at worst hours on petabyte sized arrays).
Newly created arrays will automatically have a write intent bitmap added to
them when it makes sense (arrays used for swap, and very small arrays such as
/boot partition arrays do not benefit from write intent bitmaps).  It may be
possible to add a write intent bitmap to your previously existing arrays after
the upgrade is complete by manually issuing an mdadm --grow command on the
device (the alignment of the superblock on the pre-existing raid array
determines whether or not there is room for an internal write intent bitmap,
see the mdadm man page for full details on Grow mode and how to add a write
intent bitmap to an array using Grow mode).  However, write intent bitmaps do
incur a modest performance penalty (about 3-5% at a bitmap chunk size of 65536,
but can go up to 10% or more at small bitmap chunk sizes such as 8192), so if
you add a write intent bitmap to an array it's best to keep the chunk size
reasonably large.  We recommend using the default of 65536 for the bitmap
chunk.
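
To get a feel for the chunk size trade-off, here is a rough arithmetic
sketch.  It assumes the bitmap chunk value is given in kilobytes (which is
how the mdadm man page describes --bitmap-chunk), so 65536 would mean 64 MiB
of array covered per bitmap bit.  The function names and the device name
/dev/md0 are hypothetical, and the command is only built as a list here,
never executed.

```python
KIB = 1024


def bitmap_bits(array_bytes: int, chunk_kib: int = 65536) -> int:
    """Number of bitmap bits needed to cover the array (rounded up)."""
    chunk_bytes = chunk_kib * KIB
    return -(-array_bytes // chunk_bytes)  # ceiling division


def resync_bytes(dirty_chunks: int, chunk_kib: int = 65536) -> int:
    """Data that has to be resynced if this many chunks are marked dirty."""
    return dirty_chunks * chunk_kib * KIB


def grow_bitmap_argv(array: str, chunk_kib: int = 65536) -> list:
    """Build (but do not run) an mdadm Grow command adding an internal bitmap."""
    return ["mdadm", "--grow", array, "--bitmap=internal",
            f"--bitmap-chunk={chunk_kib}"]


if __name__ == "__main__":
    one_tib = 2 ** 40
    print("bits at 65536:", bitmap_bits(one_tib))
    print("bits at 8192:", bitmap_bits(one_tib, 8192))
    print("resync for 10 dirty chunks:", resync_bytes(10), "bytes")
    print(" ".join(grow_bitmap_argv("/dev/md0")))
```

A smaller chunk means more bits to keep up to date on every write (hence the
higher penalty mentioned above) but less data to resync per dirty bit.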

Comment 3 Scott Radvan 2010-07-18 23:13:45 UTC

Have added this to the Migration Guide and changes will appear on next publish. Thank you for so much detail, it's appreciated!
