Bug 150421
Summary: | --fail: fails more devices than specified. (IDE RAID5)
---|---
Product: | [Fedora] Fedora
Component: | mdadm
Version: | 3
Hardware: | i386
OS: | Linux
Status: | CLOSED WORKSFORME
Severity: | high
Priority: | medium
Reporter: | Need Real Name <raymond>
Assignee: | Doug Ledford <dledford>
Doc Type: | Bug Fix
Last Closed: | 2005-10-17 17:38:28 UTC
Description
Need Real Name
2005-03-06 02:19:12 UTC
Moving a drive from one controller to another doesn't require removing/adding the drive from the array. You simply shut down the machine, move the drive, and at startup the kernel detects that the drive has moved and puts it back into the array from its new device location. If you want to change the physical disk that the data resides on, then you have to do what you tried to do. A word of caution, though: IDE drives nowadays are, unfortunately, not what I would call high-reliability devices. Any time you move data from one drive to another like this, you are taking a device offline and forcing the array into degraded mode, at which point it is no longer fault tolerant, and then telling it to rebuild onto a different drive. The risk is that something will go wrong during that rebuild. For IDE drives, I recommend that prior to doing something like this you always run something like dd if=/dev/hda of=/dev/null against each drive in your current array as a quick read test, to make sure there are no bad blocks hiding in rarely or never used parts of the drives.

Now, to your specific case: when you added /dev/hdg1 it should have just become a hot spare. Once you then removed /dev/hdd5, it should have been marked as Failed and reconstruction should have started on /dev/hdg1. At that point the raid subsystem would have to read every single block on /dev/hda5 and /dev/hdc5 in order to reconstruct /dev/hdg1, and if /dev/hdc5 had any bad sectors, then it would end up failing as a result and taking the array offline. I'm guessing that's what happened here. If you still can, check your logs for any error messages indicating I/O errors on /dev/hdc5.

If that's what happened, then your next option is to reboot into rescue mode and use mdadm to manually assemble the raid5 array. To do that, do something like: mdadm -A /dev/md7 --force --run --update=summaries /dev/hda5 /dev/hdc5 failed

I wouldn't try to add /dev/hdd5 back into the array; I would just try to get it back into the degraded state it was in before. However, if you know for certain that you didn't write to the array after failing /dev/hdd5, then you could bring the array back up with all three devices. The problem is, if the array was still active after you removed /dev/hdd5, then any writes that would have gone to /dev/hdd5 would have been stored in parity blocks on /dev/hda5 and /dev/hdc5 instead, and if you bring /dev/hdd5 back into the array as a clean device, we'll read from it instead of the parity blocks and get stale data, possibly resulting in a corrupted filesystem. Instead, you have to re-add /dev/hdd5 as a new disk and let it get rebuilt (although since you have to rebuild a drive anyway, rebuilding /dev/hdg1 makes more sense than rebuilding to /dev/hdd5 and having to start the move process over again). Hope that helps.

No activity in multiple months, closing.
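For reference, a rough sketch of the recovery sequence described in the comment above, using the device names from this report (/dev/md7, /dev/hda5, /dev/hdc5, /dev/hdd5, /dev/hdg1). The exact commands and their ordering are an illustration of the advice, not something taken from the original bug, so treat it as a starting point rather than a recipe:

```sh
# 1. Read-test every remaining member before any rebuild, to flush out
#    bad sectors hiding in rarely used parts of the drives.
for dev in /dev/hda5 /dev/hdc5; do
    dd if="$dev" of=/dev/null bs=1M || echo "read errors on $dev"
done

# 2. Check the kernel log for I/O errors that may have kicked out /dev/hdc5.
dmesg | grep -i hdc

# 3. From rescue mode, force-assemble the degraded array from the two
#    known-good members, as suggested above.
mdadm -A /dev/md7 --force --run --update=summaries /dev/hda5 /dev/hdc5

# 4. Add the new disk as a spare and let the RAID5 rebuild onto it,
#    rather than re-adding /dev/hdd5 and risking stale data.
mdadm /dev/md7 --add /dev/hdg1

# 5. Watch the reconstruction progress.
cat /proc/mdstat
mdadm --detail /dev/md7
```

This is only an outline; whether /dev/hdd5 can be safely reintroduced instead depends on whether the array was written to after it was removed, as explained in the comment.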