Red Hat Bugzilla – Bug 116685
reboot while raid 1 primary waits to be reconstructed brings array up fully synced
Last modified: 2007-11-30 17:10:37 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040217
Description of problem:
I had two raid 1 devices across two different hard disks. One of the
disks had a temporary failure, and the devices got degraded.
After a reboot, I raidhotadded the two failed partitions, one to each
raid device, and they started syncing.
Figuring this would be a good opportunity to test raid 1 features in
the new kernel, I thought I'd raidsetfaulty the newly-added
partitions. The one that was syncing was marked as faulty, but
syncing continued. The other was marked as faulty, but it couldn't be
raidhotremoved any more. After a reboot into FC1, the one that was
syncing when marked as faulty is brought back on-line and finishes
syncing. The other never completes syncing, and produces a lot of
noise in /var/log/messages (I'll attach one of them; the raid devices
are <hdc5><hdm5> and <hdc6><hdm6>). hdm6 is the one that won't be
activated and, if marked as faulty, can't be removed (both mdadm and
raidhotremove report the syscall to do so failed). The only way I
could find to get the array back to a functional state was to mark the
partitions with some type other than raid auto-detect, such that the
arrays would be brought up as failed, reboot into the latest FC1
kernel and then change the partition types back and raidhotadd the partitions again.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. raidhotadd partitions to two degraded raid1 devices that can't be
synced at the same time
2. mark the one that hasn't started syncing yet as faulty
3. try to raidhotremove it
Actual Results: Operation failed
Expected Results: The partition should be removed, not brought back
after reboot as a spare that never gets synced.
Created attachment 97985
extract from FC1 /var/log/messages with the unsyncable/unremovable spare (hdm6)
Here's a 100% reliable procedure to duplicate the problem:
- create two separate, degraded raid 1 devices, using partitions from
the same disk
- raidhotadd one partition to each raid 1 device. One of them will
start syncing, and the other will wait for it
- before the first raid 1 device finishes syncing, raidsetfaulty the
partition raidhotadded to the second raid 1 device.
- attempt to raidhotremove the partition marked as faulty: it will fail
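The raidtools commands used above are long obsolete; with mdadm, raidhotadd/raidsetfaulty/raidhotremove correspond to --add/--fail/--remove. A rough sketch of the recipe, assuming illustrative device names (md0/md1 built from sda5/sda6 on one disk, with sdb5/sdb6 on a second); by default it only echoes the mdadm commands, since running them for real requires root and scratch devices:

```shell
#!/bin/sh
# Dry-run sketch of the reproduction recipe above, translated from the
# obsolete raidtools commands to mdadm equivalents (raidhotadd -> --add,
# raidsetfaulty -> --fail, raidhotremove -> --remove).  All device names
# are illustrative.  Setting MDADM=mdadm (as root) would run the commands
# for real; the default just prints them.
MDADM="${MDADM:-echo mdadm}"

repro() {
    # Two degraded RAID-1 arrays, each missing its second member.
    $MDADM --create /dev/md0 --level=1 --raid-devices=2 /dev/sda5 missing
    $MDADM --create /dev/md1 --level=1 --raid-devices=2 /dev/sda6 missing

    # Hot-add one partition from the second disk to each array; the kernel
    # resyncs one array and delays the other, since both targets share a disk.
    $MDADM --add /dev/md0 /dev/sdb5
    $MDADM --add /dev/md1 /dev/sdb6

    # While md0 is still resyncing, fail the delayed member of md1 ...
    $MDADM --fail /dev/md1 /dev/sdb6
    # ... and try to remove it: on the affected kernels this step fails.
    $MDADM --remove /dev/md1 /dev/sdb6
}

repro
```

Run for real, /proc/mdstat should show one array recovering and the other queued behind it.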
Another, probably related problem:
- create two raid devices, just like in the previous case.
- raidhotadd one partition to each device, just like in the previous case.
- reboot before the first one finishes syncing
- the second raid device will be marked as fully synced, even though
it was never synced. Potential for data loss, or even failure to
boot, is pretty high, depending on how critical the filesystem is.
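The vulnerable window is visible in /proc/mdstat before the reboot: the array whose reconstruction is queued reports resync=DELAYED while the other shows recovery progress. A minimal sketch of detecting that state; the snapshot below is fabricated for illustration (modeled loosely on the hdc/hdm devices in this report), not captured output:

```shell
#!/bin/sh
# Print any md array whose resync is queued behind another one.
# Reads /proc/mdstat-shaped text on stdin.
find_delayed() {
    awk '/^md[0-9]+ :/ { dev = $1 }
         /resync=DELAYED/ { print dev " is waiting to resync - unsafe to reboot" }'
}

# Illustrative snapshot (not captured from a real system):
sample='md0 : active raid1 hdm5[2] hdc5[0]
      1048576 blocks [2/1] [U_]
      [=====>...............]  recovery = 27.5% (288512/1048576)
md1 : active raid1 hdm6[2] hdc6[0]
      1048576 blocks [2/1] [U_]
        resync=DELAYED'

printf '%s\n' "$sample" | find_delayed
```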
It seems to me that the problem has to do with the resync thread
already marking the devices as active, such that when they come back
up after a reboot, they appear to be good.
FWIW, the "probably related problem" above (that an unfinished sync to a
spare will not restart syncing after reboot) is fixed in 2.6.4-1.305.
I haven't checked the first problem yet.
*** Bug 129608 has been marked as a duplicate of this bug. ***
Looks like this is all fixed in rawhide.
I was mistaken. The problem of rebooting while one array syncs and
another waits for syncing, such that the array comes back up as if the
sync had completed, is still present.
Still a problem with the 2.6.9-based update?
2.6.9-1.715_FC3 handles the reboot-with-array-waiting-to-resync case correctly.
Ugh. The problem in which a reboot, while there's an array waiting to
reconstruct its primary copy, causes the array to come back up fully
synced, even though nothing was actually synced, is back in
2.6.10-1.1155_FC4 (or maybe it was never fixed and I goofed when
testing :-(). I got it on x86_64 this time. Odds are that it affects
RHEL4 as well :-(
I have seen similar symptoms in FC3 with kernel-2.6.11-1.35_FC3. This bug should
have higher priority as it may cause major data loss.
I have a system with a failing disk, which I wanted to replace with a raid on
two new disks. I did the following.
First I partitioned one of the new disks and created degraded raid-1 arrays. I
copied the data from the old disk to the raid and shut down. While working with
the degraded raid, I noticed that when the processes are signaled during
shutdown, some "immediate safe mode" messages are produced by md.
Then I replaced the old disk with another new disk which I partitioned and added
to the raids. Before recovery had completed, I shut down the system. At shutdown
a lot of error messages were produced, but they didn't get logged.
After this, the system couldn't boot. It would fail to mount the root fs and
panic. I put the old disk in a tray to boot from it. I found that all the raid
devices for which the recovery had been delayed had been marked as fully synced,
and reading from the noninitialized disk had caused mounting the root device to fail.
I was able to recover the raid by marking the noninitialized partitions as
faulty. But if the problem is not noticed and solved immediately, it may cause
serious data loss.
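The recovery described above (evict the member that was wrongly marked in-sync, then re-add it so the kernel performs a real reconstruction) would look roughly like this with mdadm; the device names are illustrative, and the default only prints the commands:

```shell
#!/bin/sh
# Dry-run sketch: evict a member that was wrongly marked in-sync, then
# re-add it to force an actual reconstruction.  Names are illustrative;
# set MDADM=mdadm (as root) to run for real.
MDADM="${MDADM:-echo mdadm}"

recover() {
    $MDADM --fail /dev/md1 /dev/sdb6     # mark the never-synced member faulty
    $MDADM --remove /dev/md1 /dev/sdb6   # drop it from the array
    $MDADM --add /dev/md1 /dev/sdb6     # re-add it: this triggers a full resync
}

recover
```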
I noticed that this does not affect only new disks: an unclean raid-1
with multiple disks is not resynced on boot the way it was on FC1.
Kasper, I think what you're observing is just the result of an optimization in
the RAID subsystem, that's present in newer kernels: when all members of a RAID
1 device are stable and up-to-date, they're marked as clean, such that, should
an unexpected reboot ensue, they don't have to be resynced, since they're known
to be in sync.
It is possible that the missing sync on unexpected reboot is an optimization. I
haven't yet had a chance to investigate that further. How soon after the last
write is it supposed to mark the raid clean?
The missing sync of a newly added disk however is clearly a bug.
Alex, does this still affect current rawhide?
I tried to trigger it again, and this time it worked, although I had to follow a
slightly different procedure (run mdadm -A by hand to bring it up, since
initrd.img will no longer bring up all raids it can find). I guess this means