Bug 886123

Summary: [Intel F18 Bug] Failed disk is still available in volume/container
Product: Fedora
Component: mdadm
Version: 18
Hardware: All
OS: Linux
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: unspecified
Reporter: Maciej Patelczyk <maciej.patelczyk>
Assignee: Jes Sorensen <Jes.Sorensen>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: agk, dledford, ed.ciechanowski, Jes.Sorensen, lukasz.dorau, marcin.tomczak
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-01-11 23:18:13 UTC
Attachments: udev rules patch

Description Maciej Patelczyk 2012-12-11 14:59:39 UTC
Created attachment 661502 [details]
udev rules patch

Description of problem:
When one of the disks in a RAID volume fails, the failed disk remains present in the volume and the container. The RAID volume stays in the normal state (it should be degraded) and the rebuild cannot start.

How reproducible:
Always

Steps to Reproduce:
mdadm -Ss
mdadm --zero-superblock /dev/sd[b-d]
mdadm -C /dev/md/imsm0 -amd -e imsm -n 3 /dev/sdb /dev/sdc /dev/sdd -R
mdadm -C /dev/md/raid5 -amd -l5 -n 3 /dev/sdb /dev/sdc /dev/sdd -R
mdadm --wait /dev/md/raid5
# power off a raid member disk (e.g. /dev/sdd)
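
# A quick way to confirm the reported state after the disk is powered off
# (a sketch using the device name from the steps above):
cat /proc/mdstat
mdadm -D /dev/md/raid5 | grep -E 'State|Failed Devices'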

Actual results:
The failed disk is still present in the container/volume. The State in the 'mdadm -D /dev/md/raid5' output is 'clean'.

Expected results:
The failed disk should disappear from the container and the volume. The State in the 'mdadm -D /dev/md/raid5' output should be 'clean, degraded'.

Additional info:
When one of the disks fails, udev applies "/usr/lib/udev/rules.d/65-md-incremental.rules". The "65-md-incremental.rules" file contains the following rules:
SUBSYSTEM=="block", ACTION=="remove", ENV{ID_FS_TYPE}=="linux_raid_member", \
        RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
SUBSYSTEM=="block", ACTION=="remove", ENV{ID_FS_TYPE}=="isw_raid_member", \
        RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"

If a disk fails and "$env{ID_PATH}" is empty, udev runs "/sbin/mdadm -If sdd --path", which does nothing because it is an invalid mdadm invocation, instead of "/sbin/mdadm -If sdd".
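
For illustration, with an empty ID_PATH the RUN command expands to the first invocation below (sdd taken from the example above); the second is the intended fallback:
/sbin/mdadm -If sdd --path   # --path is left without an argument, so the command does nothing
/sbin/mdadm -If sdd          # intended invocation when no path is available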

The following patch fixes this bug:
correct-65-md-incremental-rules-in-case-a-raid-disk-fails.patch
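
The attachment itself is not reproduced in this report; a fix along these lines (guarding the --path form on ID_PATH being non-empty and adding a fallback rule without --path) would be one way to address it, sketched here as an assumption rather than the exact contents of the patch:

# run mdadm with --path only when ID_PATH is set; otherwise fall back to the device name alone
SUBSYSTEM=="block", ACTION=="remove", ENV{ID_PATH}=="?*", ENV{ID_FS_TYPE}=="linux_raid_member", \
        RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
SUBSYSTEM=="block", ACTION=="remove", ENV{ID_PATH}!="?*", ENV{ID_FS_TYPE}=="linux_raid_member", \
        RUN+="/sbin/mdadm -If $name"
SUBSYSTEM=="block", ACTION=="remove", ENV{ID_PATH}=="?*", ENV{ID_FS_TYPE}=="isw_raid_member", \
        RUN+="/sbin/mdadm -If $name --path $env{ID_PATH}"
SUBSYSTEM=="block", ACTION=="remove", ENV{ID_PATH}!="?*", ENV{ID_FS_TYPE}=="isw_raid_member", \
        RUN+="/sbin/mdadm -If $name"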

Comment 1 Fedora Update System 2012-12-11 16:25:48 UTC
mdadm-3.2.6-7.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/mdadm-3.2.6-7.fc18

Comment 2 Fedora Update System 2012-12-11 20:04:45 UTC
Package mdadm-3.2.6-7.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.2.6-7.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2012-20173/mdadm-3.2.6-7.fc18
then log in and leave karma (feedback).

Comment 3 Lukasz Dorau 2012-12-13 08:39:27 UTC
Intel has tested the package mdadm-3.2.6-7.fc18 and confirms the bug is fixed in this build.

Comment 4 Fedora Update System 2013-01-11 23:18:15 UTC
mdadm-3.2.6-7.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.