Bug 830542

Summary: cannot add device to auto-read-only array
Product: [Fedora] Fedora
Component: mdadm
Version: 17
Status: CLOSED WONTFIX
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Michal Schmidt <mschmidt>
Assignee: Jes Sorensen <Jes.Sorensen>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agk, dledford, Jes.Sorensen
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-08-01 17:11:37 UTC

Description Michal Schmidt 2012-06-10 13:58:53 UTC
Description of problem:
I have a RAID5 array of 3 devices. I assemble it incrementally, adding only two of the devices, and then let the array start in auto-read-only mode using "mdadm -IRs". After that I cannot add the 3rd device. This contradicts the mdadm manpage, which states:

  [mdadm -IRs] will  try  to start all arrays that are being incrementally
  assembled. They are started in "read-auto" mode [...]
  Further devices that are found before the first write can still be added
  safely.

Version-Release number of selected component (if applicable):
mdadm-3.2.5-1.fc17.x86_64
kernel-3.4.0-1.fc17.x86_64

How reproducible:
always

Steps to Reproduce:

#!/bin/sh
# Let's have three block devices for testing
for i in 1 2 3; do
  dd if=/dev/zero of=/root/disk$i bs=1M count=100
  losetup /dev/loop$i /root/disk$i
done
# Create a RAID5 array of them
mdadm --create /dev/md/test --level=5 --raid-devices=3 /dev/loop[123]
# Stop the array
mdadm --stop /dev/md/test
# Incrementally assemble enough devices to start the array degraded
mdadm -I /dev/loop1
mdadm -I /dev/loop2
mdadm -IRs
# See that the array is running auto-read-only
cat /proc/mdstat
# Try adding the 3rd device
mdadm -I /dev/loop3
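
The "cat /proc/mdstat" step above can also be checked from a script. A minimal sketch, assuming the kernel marks such arrays with the literal string "(auto-read-only)" on the array's status line and that the array is listed under its kernel name (for example md127, which is an assumption; /dev/md/test is normally a symlink to it). The function name is_auto_read_only is hypothetical:

```shell
# Succeeds if the named array (kernel name, e.g. md127) is shown as
# auto-read-only. Reads /proc/mdstat-formatted text on stdin, so it can
# also be tried on saved output.
is_auto_read_only() {
  grep -E "^$1 : .*\(auto-read-only\)" >/dev/null
}

# Usage (assumed array name):
#   is_auto_read_only md127 </proc/mdstat && echo "still auto-read-only"
```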
  

Actual results:
...
mdadm: not adding /dev/loop3 to active array (without --run) /dev/md/test

Expected results:
/dev/loop3 should be added to the array, because it's still auto-read-only and all the component devices have the same event count.
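
The claim that all component devices have the same event count can be checked against the superblocks. A minimal sketch, assuming "mdadm --examine" prints a line of the form "Events : N" (as it does for v1.x metadata); the helper names event_count and all_same are hypothetical:

```shell
# Print the event count found in `mdadm --examine` output read on stdin.
event_count() {
  awk -F' : ' '/^[[:space:]]*Events : / { print $2; exit }'
}

# Succeeds if all lines read on stdin carry the same value.
all_same() {
  [ "$(sort -u | wc -l | tr -d ' ')" -eq 1 ]
}

# Usage (needs root and the loop devices from the reproducer):
#   for d in /dev/loop1 /dev/loop2 /dev/loop3; do
#     mdadm --examine "$d" | event_count
#   done | all_same && echo "event counts match"
```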

Additional info:
"mdadm -IR /dev/loop3" results in:
  mdadm: failed to add /dev/loop3 to /dev/md/test: Invalid argument.
... and the degraded array switches to read-write.
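
The switch to read-write can be observed through the md sysfs attributes. A minimal sketch, assuming the array's kernel name is known (md127 here is an assumption) and that array_state reports "read-auto" while the array is auto-read-only; the function name md_state and its optional sysfs-root argument are hypothetical:

```shell
# Print "<array_state> <sync_action>" for an md device (kernel name).
# The optional second argument overrides the sysfs root; it exists only
# so the function can be tried against a fake directory tree.
md_state() {
  dir="${2:-/sys}/block/$1/md"
  printf '%s %s\n' "$(cat "$dir/array_state")" "$(cat "$dir/sync_action")"
}

# Usage: take a snapshot before and after the failing command.
#   md_state md127; mdadm -IR /dev/loop3; md_state md127
```

Comparing the two snapshots shows whether the array left read-auto and whether any sync activity was started by the command.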

"mdadm /dev/md/test --add /dev/loop3" results in successful adding of the device to the array, but it will do an unnecessary resync.

Comment 1 Jes Sorensen 2012-07-18 11:24:48 UTC
Note that mdadm does return the correct information when you try to add
the third device, i.e.:
mdadm: not adding /dev/loop3 to active array (without --run) /dev/md/test

If you add the third device this way:

> mdadm -IR /dev/loop3

it does get added correctly. Because you previously started the array with
-IRs, the same check is applied here too. If you add all three devices using
just -I, it works as expected.

Note that the man page states that one can safely add the additional disks
to the array; it doesn't state that one can do so without -R, so I think it
is correct as is.

Why this results in an additional resync when the array has been started and
is sitting in auto-read-only mode is a little puzzling. I will check with
Neil about that.

Cheers,
Jes

Comment 2 Michal Schmidt 2012-07-18 15:28:27 UTC
Jes,

as discussed on IRC, you and I were seeing different behaviour in response to "mdadm -IR ..." when the array was running as auto-read-only. We suspected the difference might be due to my use of loop devices, while you used real disks.

I have now discovered that the problem on my side was that I did not wait long enough after the initial creation of the array (i.e. between "mdadm --create ..." and "mdadm --stop ..."). Thus my array was not fully synced. If I wait properly before stopping the array, I get the same results as you do. Sorry for this confusion.

This leaves us with the following questions:
 1. Should "mdadm -I /dev/loop3" add the component to the running auto-read-only
    array? You are right that the manpage does not say whether -R is necessary.
    I really do not see why it should not work, but I can accept the opposite
    view.
 2. There's the unnecessary resync after "mdadm -IR /dev/loop3", which is
    puzzling, as we both agree.
 3. In my original example where I did the stupid thing (forgot to wait for the
    sync after the initial creation), why did the state of the array change
    from auto-read-only to read-write, even though the command I ran failed
    with "Invalid argument"? I'd expect no change in the state of the array.
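
The missing wait after the initial creation could be closed with mdadm's --wait (-W) option, which blocks until resync or recovery activity on the given array has finished. A sketch of the amended creation steps from the original reproducer; the wrapper function create_synced_array, the run helper and the DRY_RUN switch are hypothetical additions for illustration:

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical helper).
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create the test array and stop it only after the initial sync is done.
create_synced_array() {
  run mdadm --create /dev/md/test --level=5 --raid-devices=3 \
      /dev/loop1 /dev/loop2 /dev/loop3
  # Blocks until resync/recovery has finished. The exit status is not
  # checked here; mdadm reports failure if there was nothing to wait for.
  run mdadm --wait /dev/md/test
  run mdadm --stop /dev/md/test
}

# Usage: DRY_RUN=1 create_synced_array   (prints the commands only)
```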

Comment 3 Fedora End Of Life 2013-07-04 05:54:08 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 17 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora, you are encouraged to change the
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 4 Fedora End Of Life 2013-08-01 17:11:45 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.