Bug 735306 - mdadm may grow an array beyond supported limits
Summary: mdadm may grow an array beyond supported limits
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: mdadm
Version: 15
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Jes Sorensen
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: 748731 748732
 
Reported: 2011-09-02 09:18 UTC by Pim Zandbergen
Modified: 2013-05-22 09:29 UTC (History)
CC List: 5 users

Fixed In Version: mdadm-3.2.2-15.fc15
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 748731 748732 (view as bug list)
Environment:
Last Closed: 2011-12-14 23:37:07 UTC



Description Pim Zandbergen 2011-09-02 09:18:09 UTC
Description of problem:
Current kernel/mdadm code does not support members >2TB when 0.9X metadata is used. Yet mdadm does not refuse to grow an existing array with this metadata version to >2TB members. Attempting to do so will render the array unusable.


Version-Release number of selected component (if applicable):
mdadm-3.2.2-6.fc15.x86_64

How reproducible:
Always

Steps to Reproduce:
1. create an array with 0.9x metadata using <2TB members
2. grow/replace the members to >2TB
3. mdadm --grow /dev/md0 --size max
  
Actual results:
mdadm does not refuse the operation or warn the user
and renders the array unusable


Expected results:
mdadm refuses the operation


Additional info:
kernel-2.6.38.6-27.fc15.x86_64
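The expected behavior amounts to a simple size guard before the grow is attempted. A minimal sketch of that check in shell, with `check_grow` as a hypothetical helper (not part of mdadm) and the 2TB limit expressed in kilobytes as the 0.90 superblock counts them:

```shell
# 2TB limit, in kilobytes (the unit the 0.90 superblock uses),
# that pre-3.1 kernels mishandle for 0.90-metadata members.
LIMIT_KB=$(( 2 * 1024 * 1024 * 1024 ))

# Hypothetical pre-flight check: refuse to grow a 0.90 member past 2TB.
check_grow() {
    new_size_kb=$1
    if [ "$new_size_kb" -gt "$LIMIT_KB" ]; then
        echo "refuse"
    else
        echo "ok"
    fi
}

check_grow $(( 3 * 1024 * 1024 * 1024 ))   # 3TB member: prints "refuse"
check_grow $(( 1024 * 1024 * 1024 ))       # 1TB member: prints "ok"
```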

Comment 1 Pim Zandbergen 2011-09-02 12:32:57 UTC
It has been suggested on the linux-raid mailing list that mdadm ought to have
complained at step 2 as well, the step where >2TB members were added to the array.

Comment 2 Jes Sorensen 2011-10-20 13:39:17 UTC
Pim,

When you're saying 2TB members, do you mean individual drives > 2TB or
a raid assembly that is larger than 2TB?

Jes

Comment 3 Pim Zandbergen 2011-10-20 14:50:29 UTC
Jes,

My point is that mdadm should refuse to add drives >2TB to an array with 0.9X metadata and/or should refuse to grow this array in a way that would require individual drives to be >2TB.

However, it now turns out that the real limit for individual drives in a 0.9X metadata array is 4TB. The fact that damage occurs long before that is another bug.
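The 4TB figure follows directly from the 0.90 superblock format, which stores the per-device size as an unsigned 32-bit count of kilobytes. A quick sketch of the arithmetic:

```shell
# 0.90 metadata: per-device size is an unsigned 32-bit kilobyte count.
max_kb=$(( (1 << 32) - 1 ))            # largest representable value
max_bytes=$(( (max_kb + 1) * 1024 ))   # rounded up to the power of two
echo "$max_bytes"                      # prints 4398046511104
echo $(( max_bytes / 1024**4 ))        # prints 4  (i.e. 4 TiB per device)
```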

See this thread on the linux-raid mailing list:
http://marc.info/?l=linux-raid&m=131489162105812&w=2

Pim

Comment 4 Jes Sorensen 2011-10-25 08:11:49 UTC
Pim,

Thanks for the details. It looks like we need to apply the following three
patches to mdadm in F15 and F16, plus one patch to the F15 kernel (below).

I am listing them all here for reference before cloning the bug to the
right release versions.

Jes


commit 20a4675688e0384a1b4eac61b05f60fbf7747df9
Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 8 13:08:51 2011 +1000

    Grow: refuse to grow a 0.90 array beyond 2TB
    
    A kernel bug makes handling for arrays using more than 2TB per device
    incorrect, and the kernel doesn't stop an array from growing beyond
    any limit.
    This is fixed in 3.1
    
    So prior to 3.1, make sure not to ask for an array to grow bigger than
    2TB per device.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit 11b391ece9fa284a151362537af093aa44883696
Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 8 13:05:31 2011 +1000

    Discourage large devices from being added to 0.90 arrays.
    
    0.90 arrays can only use up to 4TB per device.  So when a larger
    device is added, complain a bit.  Still allow it if --force is given
    as there could be a valid use.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

commit 01619b481883926f13da2b1b88f3125359a6a08b
Author: NeilBrown <neilb@suse.de>
Date:   Thu Sep 8 12:20:36 2011 +1000

    Fix component size checks in validate_super0.
    
    A 0.90 array can use at most 4TB of each device - 2TB between
    2.6.39 and 3.1 due to a kernel bug.
    
    The test for this in validate_super0 is very wrong.  'size' is sectors
    and the number it is compared against is just confusing.
    
    So fix it all up and correct the spelling of terabytes and remove
    a second redundant test on 'size'.
    
    Signed-off-by: NeilBrown <neilb@suse.de>

=========================================================================

This patch is also needed for the F15 kernel:

From 27a7b260f71439c40546b43588448faac01adb93 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Sat, 10 Sep 2011 17:21:28 +1000
Subject: [PATCH] md: Fix handling for devices from 2TB to 4TB in 0.90
 metadata.

0.90 metadata uses an unsigned 32bit number to count the number of
kilobytes used from each device.
This should allow up to 4TB per device.
However we multiply this by 2 (to get sectors) before casting to a
larger type, so sizes above 2TB get truncated.

Also we allow rdev->sectors to be larger than 4TB, so it is possible
for the array to be resized larger than the metadata can handle.
So make sure rdev->sectors never exceeds 4TB when 0.90 metadata is in
used.

Also the sanity check at the end of super_90_load should include level
1 as it used ->size too. (RAID0 and Linear don't use ->size at all).

Reported-by: Pim Zandbergen <P.Zandbergen@macroscoop.nl>
Cc: stable@kernel.org
Signed-off-by: NeilBrown <neilb@suse.de>
---
 drivers/md/md.c |   12 ++++++++++--
 1 files changed, 10 insertions(+), 2 deletions(-)
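The truncation the patch describes - multiplying the 32-bit kilobyte count by 2 before widening - can be sketched in shell arithmetic, emulating the 32-bit wrap with an explicit mask (the kernel code itself is C; this is only an illustration):

```shell
size_kb=$(( (1 << 32) - 1 ))   # ~4TB, the maximum a 0.90 superblock can record

# Buggy pattern: KB -> 512-byte sectors (x2) computed in 32 bits, then widened,
# so the product wraps and anything above 2TB is truncated.
buggy_sectors=$(( (size_kb * 2) & 0xFFFFFFFF ))

# Fixed pattern: widen first, then multiply.
fixed_sectors=$(( size_kb * 2 ))

echo "$buggy_sectors"   # prints 4294967294  (~2TB worth of sectors)
echo "$fixed_sectors"   # prints 8589934590  (~4TB worth of sectors)
```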

Comment 5 Jes Sorensen 2011-10-25 08:36:56 UTC
Pim,

I have created a test rpm with the upstream fixes for F15. If you have time
and disk space to test it, that would be great. If not, I will try to find
another way to test that it does the right thing.

I don't have any 3TB drives around, though I should probably order a couple.

http://koji.fedoraproject.org/koji/taskinfo?taskID=3459052

Cheers,
Jes

Comment 6 Jes Sorensen 2011-10-31 09:45:05 UTC
Pim,

I am having problems reproducing this one - how does the problem show itself?
Does the RAID become unusable the moment you issue the grow command, or
only after a reboot?

I ordered two 3TB drives and did the following:

mdadm -Cv /dev/md64 --raid-devices=2 -z 16G --level 1 --metadata=0.9 /dev/sdh1 /dev/sdi1 

created a partition on that, then ran:
mdadm --grow /dev/md64 --size max

I am then able to create devices on the new space without any problems.
After a reboot it looks to be all fine as well.

Jes

Comment 7 Fedora Update System 2011-11-10 09:56:45 UTC
mdadm-3.2.2-14.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/mdadm-3.2.2-14.fc15

Comment 8 Fedora Update System 2011-11-11 01:26:11 UTC
Package mdadm-3.2.2-14.fc15:
* should fix your issue,
* was pushed to the Fedora 15 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.2.2-14.fc15'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2011-15749
then log in and leave karma (feedback).

Comment 9 Pim Zandbergen 2011-11-14 12:37:33 UTC
(In reply to comment #5)
> If you have time and disk space to test it, that would be great.

Sorry, I was on vacation. Also, I have rebuilt my hosed array using 1.X metadata, so there's no way I can repeat this action.

I only filed this report to prevent what happened to me from happening to others.

Comment 10 Fedora Update System 2011-11-23 10:49:43 UTC
mdadm-3.2.2-15.fc15 has been submitted as an update for Fedora 15.
https://admin.fedoraproject.org/updates/mdadm-3.2.2-15.fc15

Comment 11 Fedora Update System 2011-12-14 23:37:07 UTC
mdadm-3.2.2-15.fc15 has been pushed to the Fedora 15 stable repository.  If problems still persist, please make note of it in this bug report.

