832616 – RAID1 "failure" not reported, silently split into "competing" instances

Keywords:
Status:	CLOSED NOTABUG
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	mdadm
Sub Component:
Version:	17
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Jes Sorensen
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	838957

Reported:	2012-06-15 23:48 UTC by Gareth Jones
Modified:	2015-07-20 15:16 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Clones:	838957
Environment:
Last Closed:	2012-07-10 14:02:11 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
Logs relating to RAID (12.58 KB, application/octet-stream) 2012-06-28 17:54 UTC, Gareth Jones
Complete /var/log/messages of first boot, up to the "firstboot" set-up screen. (14.67 KB, application/octet-stream) 2012-07-02 15:25 UTC, Gareth Jones

Description Gareth Jones 2012-06-15 23:48:34 UTC

Description of problem:

Summary:
The (non-hardware) failure of an encrypted RAID1 mirrored partition was not reported to the user (outside of /var/log/messages and /proc/mdstat).  A subsequent unknown change that led to the "failed" partition being revived and the more up-to-date partition being dropped was not reported either.  This potentially resulted in significant loss of data, which could have been avoided if it had been reported as hardware disk errors are.

For a blow-by-blow account, see http://forums.fedoraforum.org/showthread.php?t=281211.  Otherwise:

(1) Firstly the set-up:
/boot: ext4 partition sda2;
/: md-raid RAID0 (striped) ext4 of sda3 & sdb1;
/var & /tmp: encrypted ext4, sda5 & sdb3 respectively;
/home: encrypted md-raid RAID1 (mirrored) ext4 of sda6 & sdb4;
(Other partitions for BIOS boot and 2 x swap).

(2) At some point, possibly due to a crash (I don't know), sdb4 became regarded as out-of-sync with its mirror sda6.  Only sda6 was used and sdb4 was left to drift further out-of-sync:
    Jun 10 08:26:55 gareth-desktop kernel: [   21.679114] md: bind<sda6>
    Jun 10 08:26:55 gareth-desktop kernel: [   21.679768] md: kicking non-fresh sdb4 from array!
Aside from these lines in /var/log/messages on every boot, this was not reported to the user, so I was completely unaware of it.
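
Because these kernel messages were the only trace of the degraded array, a small log scan can surface them. The following is a minimal Python sketch, not part of mdadm and not something used in this report; the function name find_kicked_members and the regular expression are assumptions based only on the two log lines quoted above.

```python
import re

# Matches kernel lines such as "md: kicking non-fresh sdb4 from array!"
# (pattern is an assumption derived from the log excerpt in this report).
KICK_RE = re.compile(r"md: kicking non-fresh (\S+) from array")


def find_kicked_members(lines):
    """Return the device names that md reported as kicked for being non-fresh."""
    kicked = []
    for line in lines:
        match = KICK_RE.search(line)
        if match:
            kicked.append(match.group(1))
    return kicked


if __name__ == "__main__":
    sample = [
        "Jun 10 08:26:55 gareth-desktop kernel: [   21.679114] md: bind<sda6>",
        "Jun 10 08:26:55 gareth-desktop kernel: [   21.679768] md: kicking non-fresh sdb4 from array!",
    ]
    print(find_kicked_members(sample))
```

Run against the lines of /var/log/messages, a non-empty result would indicate that a member was dropped at assembly time.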

(3) At a later point, again for reasons wholly unknown (definitely not a crash), the system decided to use sdb4 instead of sda6, silently swapping the file-system being mounted on /home and losing recent files as a result.  At this point, becoming aware of the problem, I could re-sync the disks, but because some new files were on one file-system and some on the other, a manual merge process was needed first.

(4) The hardware is fine: SMART reports both disks as healthy, and the RAID0 root file-system across both disks is fine.  The file-systems are also both fine, but have diverged.

There are two aspects to this bug: firstly, that nothing was reported, and the only visible effect was the sudden apparently inexplicable disappearance of recent files; and secondly, the apparently random switch between which file-system was actually used.


Steps to Reproduce:
I'm not sure how to simulate this situation artificially, as from my perspective it just happened.

Comment 1 Jes Sorensen 2012-06-18 15:58:47 UTC

Please attach /proc/mdstat output, info from /var/log/messages, your
/etc/mdadm.conf, and partition information.

Please also make sure you have the latest updated mdadm - currently
mdadm-3.2.5 is sitting in testing-updates

The fact that the disks get kicked off the raid like this repeatedly
sounds like you are having a hardware problem. If the disks are sound,
this really shouldn't happen.

Jes

Comment 2 Gareth Jones 2012-06-18 17:07:41 UTC

I'm away from home at the moment so I won't be able to get at the logs or config etc. until next week.

I remember that after I noticed the problem (when the missing/working partitions had already swapped, and sdb4 was now active), /proc/mdstat looked normal, except for the absence of sda6 and "_" instead of "U" in the corresponding status.  I didn't see mdstat when sdb4 first went offline before the swap.  After re-adding sda6, and successfully re-syncing the array over-night, the problem recurred after a reboot - sdb4 was dropped.  This time there was no message about it being kicked in /var/log/messages, but mdstat showed only sda6 as present and "_" for sdb4's status, even though it had been used as the source mirror when re-adding sda6 just the night before.  At no point through any of this did I change /etc/mdadm.conf; it was as Anaconda created it.
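
The "_" versus "U" status described above can be checked mechanically. Below is a minimal Python sketch, assuming the usual /proc/mdstat layout in which an array header line is followed by a line carrying a field such as "[2/1] [_U]". The function name degraded_arrays is hypothetical, and the sample text is illustrative only: its block counts are made up and it is not the reporter's actual output.

```python
import re

# Array header line, e.g. "md1 : active raid1 sdb4[1]"
ARRAY_RE = re.compile(r"^(md\S+) : ")
# Status field on a following line, e.g. "[2/1] [_U]"
STATUS_RE = re.compile(r"\[\d+/\d+\] \[([U_]+)\]")

# Illustrative sample only (numbers invented, not taken from this report).
SAMPLE = """\
Personalities : [raid1] [raid0]
md1 : active raid1 sdb4[1]
      1000000 blocks super 1.2 [2/1] [_U]

md0 : active raid0 sda3[0] sdb1[1]
      2000000 blocks super 1.2 512k chunks

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Map array name -> status string for arrays with a missing member."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current and "_" in status.group(1):
            degraded[current] = status.group(1)
    return degraded


if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live system the same function could be given the contents of /proc/mdstat instead of the sample string.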

SMART reported both disks as perfectly healthy, and / (RAID0 across the same disks) and all other partitions on both disks are fine.  Neither of the file-systems on the mirrored devices were broken either, at least not beyond ext4's journalling abilities.  (I mounted the lost sda6 outside of RAID to retrieve the missing files to sdb4 before re-syncing it to the array.)

I'll get logs etc. next week.

Comment 3 Jes Sorensen 2012-06-20 08:15:27 UTC

Ok, I am curious to see how your mdadm.conf file looks.

The normal way for mdadm to report failures is via an email sent to the
email address specified in /etc/mdadm.conf using the MAILADDR variable.
If Anaconda didn't set one, then I don't think mdadm will mail out warnings
in case of error. Looking briefly through the code, that is what it looks
like at least.

If there is a MAILADDR entry and no mail was sent out when the failures were
detected, that would be a real issue.

If there is no MAILADDR entry in the config file, then I would say this is
an Anaconda bug that should be addressed there.

Cheers,
Jes
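
Whether a configuration file carries a MAILADDR entry can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: the function name mail_addresses is hypothetical, the sample text is illustrative rather than a real Anaconda-generated file, and the parsing is simplified (it treats everything after a "#" as a comment and does not handle abbreviated keywords or continuation lines).

```python
def mail_addresses(conf_text):
    """Return the first address from each MAILADDR line in mdadm.conf-style text."""
    addresses = []
    for line in conf_text.splitlines():
        # Simplification: drop anything after '#', then split into words.
        words = line.split("#", 1)[0].split()
        # Keywords are matched case-insensitively; abbreviations are not handled.
        if len(words) >= 2 and words[0].lower() == "mailaddr":
            addresses.append(words[1])
    return addresses


if __name__ == "__main__":
    # Illustrative sample, not the reporter's configuration.
    sample = "# illustrative sample\nMAILADDR root\n"
    found = mail_addresses(sample)
    if found:
        print("Alerts would be mailed to:", ", ".join(found))
    else:
        print("No MAILADDR entry found")
```

An empty result for /etc/mdadm.conf would correspond to the case described above where no address was configured.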

Comment 4 Gareth Jones 2012-06-28 17:28:41 UTC

Unfortunately, because this kept recurring and I need this machine to work, I gave up on md-raid and switched to Btrfs/RAID instead, which so far is working fine.  I no longer have /etc/mdadm.conf, but from what I remember, the email line contained "root" as the address, without any "@localhost" or similar.  Please take this with a pinch of salt, as it's from my memory, and it might be what is intended anyway.  I did save the logs though.  I would suggest that a local email is not a particularly good way to report RAID problems on a desktop in any case.

To check the hardware before reinstalling I ran a "badblocks" pass on both drives and rechecked the SMART data, and both drives are perfectly healthy.  Btrfs RAID is working perfectly fine.

I'm just going through the log files now.

Comment 5 Gareth Jones 2012-06-28 17:54:21 UTC

Created attachment 595098
Logs relating to RAID

Generated with: cat messages* | grep -i '![ae]md\|mdadm\|md0\|md1\|raid\|sda\|sdb' > messages.txt

Notes:
Line 964: Last complete RAID1 array.
Line 1019: First degraded array (sda6 only), sdb4 not mentioned.
Line 1073: Kicking stale sdb4.
Line 2189: Switch from sda6 to sdb4, no mention of sda6, missing files.
Line 2462: Around here I rebuilt the array using sdb4 as source (after separately mounting sda6 and copying missing files).
Line 2559: sda6 only again, no mention of sdb4.

Comment 6 Jes Sorensen 2012-07-02 14:16:48 UTC

Gareth,

It's puzzling that the drives get kicked off like that. One question: are
they both connected to the same SATA controller?

The logs you posted didn't include info about the probing of the drives.

Thanks,
Jes

Comment 7 Gareth Jones 2012-07-02 15:24:19 UTC

I think they are on the same controller – I'm using an ASUS P6T Deluxe motherboard, which has three controllers, but only one of them is plain SATA (6 ports), the others being 2xSAS/SATA and PATA+eSATA.  I'll attach a log of a complete boot in a moment, I didn't realize I'd filtered the probing out, sorry!

Comment 8 Gareth Jones 2012-07-02 15:25:43 UTC

Created attachment 595761
Complete /var/log/messages of first boot, up to the "firstboot" set-up screen.

Comment 9 Jes Sorensen 2012-07-02 20:20:30 UTC

An update on F17 and raid error reporting.

I did a fresh install on a test system here and created a raid device
during the installation. I verified that /etc/mdadm.conf does indeed get
the correct MAILADDR line added.

I then tried to fail a drive on the array and as expected the error message
shows up in root's mail folder.

We can certainly discuss whether just defaulting to root is the right thing
to do. However if Anaconda should be made to ask for an email address, then
that really should be filed as an RFE against Anaconda.

I am still curious why your drives get kicked out of the array though.

Jes

Comment 10 Gareth Jones 2012-07-02 21:10:06 UTC

Me too.  If there's any other information I can provide just ask.

As for the error reporting, while email makes sense for a server, it doesn't seem right for a desktop, where the local email system isn't really connected to anything anyway.  A direct on-screen notification would make more sense, but I'm not sure how practical that is to implement.
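
One possible route to such a notification is the event hook that mdadm's monitor mode already provides: a PROGRAM line in mdadm.conf (or the --program option) names a command that is run with the event name, the md device and, where relevant, a component device. The Python sketch below is an assumption about how such a handler might look, not something proposed in this report. The name format_event and the choice of which events count as urgent are hypothetical, and actually displaying the message on a desktop is left out.

```python
import sys

# Event names reported by mdadm's monitor mode for a lost or missing member.
# Treating exactly these as urgent is an assumption made for this sketch.
URGENT_EVENTS = {"Fail", "DegradedArray", "DeviceDisappeared"}


def format_event(event, md_device, component=None):
    """Build a one-line, human-readable message for an mdadm monitor event."""
    prefix = "URGENT" if event in URGENT_EVENTS else "info"
    message = f"[{prefix}] RAID event {event} on {md_device}"
    if component:
        message += f" (member {component})"
    return message


if __name__ == "__main__":
    # mdadm passes: event name, md device, and optionally a component device.
    if len(sys.argv) >= 3:
        print(format_event(*sys.argv[1:4]))
```

A handler like this only formats the text; how it reaches the logged-in user is a separate question.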

Comment 11 Jes Sorensen 2012-07-09 12:21:49 UTC

Gareth,

Thanks for the log - I looked at it, and there is about a 1 second delay
between the probing of the two SATA drives, with the DVD drive showing
up in the middle. This really shouldn't make a difference (I have seen
issues where some of the drives are on a separate controller and the
probe delay is > 10 seconds). It could be worth a try to move the DVD
drive to a different port so it is found after the hard drives, but it
really shouldn't matter.

That said, everything here is pointing at mdadm having reported the
errors as expected, but they were not noticed since you weren't monitoring
the root mail address (like most users, as you rightfully point out).

Now the issue is how/where to address it. Writing an mdadm-specific tool
that pops up a warning would be rather silly. What really needs to be
implemented is some daemon-level service that can monitor all the
different types of storage and report errors that way.

To be honest, I don't know what currently happens for other things,
like SMART, dm-raid, fail-over etc., so I am not sure where we should
file this RFE.

Cheers,
Jes

Comment 12 Jes Sorensen 2012-07-10 14:02:11 UTC

Gareth,

I have created a new bug to handle the issue of which email address the
error messages should be sent to. I think we should start there. I will
also try to start a discussion on how to handle this on a broader scale.

I am going to close this bug since the problem itself doesn't seem to be
in mdadm.

Cheers,
Jes
