Description of problem:
Running a RAID check, either via raid-check or manually with "echo check > sync_action", causes the array to start the check, but no I/O is generated. The check runs at the full speed limit of 200MB/s even though the devices cannot run that fast. Running iostat shows zero I/O while /proc/mdstat reports a check in progress.

Version-Release number of selected component (if applicable):

How reproducible:
Every time.

Steps to Reproduce:
1. Run raid-check
2. Run iostat
3. Confirm /proc/mdstat shows a check in progress at 200MB/s
4. Confirm there is no disk I/O

Actual results:
The check runs to completion but the array is not actually checked.

Expected results:
The check should actually check the consistency of the array.

Additional info:
I reported this to linux-raid, and a bug was identified and a patch apparently submitted. I first noticed this behavior with kernel 3.3.0-4 and have confirmed it is still occurring with 3.3.0-8. Here is the email I received from linux-raid:

From 4d79586ebffac308ba11b363d81525882fdf6abe Mon Sep 17 00:00:00 2001
From: majianpeng <majianpeng>
Date: Thu, 29 Mar 2012 11:12:59 +0800
Subject: [PATCH] md/raid5: Fix a bug about judging whether the operation is
 syncing or replacing in analyse_stripe().

When creating a raid5 with --assume-clean and echoing check or repair to
sync_action, the component disks see no I/O but the raid check/resync runs
faster than normal. This is because of the judgement in analyse_stripe():

	if (do_recovery ||
	    sh->sector >= conf->mddev->recovery_cp)
		s->syncing = 1;
	else
		s->replacing = 1;

During check or repair, recovery_cp == MaxSector, so syncing is set to zero
instead of one.

Signed-off-by: majianpeng <majianpeng>
---
 drivers/md/raid5.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 23ac880..4d43ad3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3276,12 +3276,14 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 	/* If there is a failed device being replaced,
 	 * we must be recovering.
 	 * else if we are after recovery_cp, we must be syncing
+	 * else if MD_RECOVERY_REQUESTED is set,we all in syning.
 	 * else we can only be replacing
 	 * sync and recovery both need to read all devices, and so
 	 * use the same flag.
 	 */
 	if (do_recovery ||
-	    sh->sector >= conf->mddev->recovery_cp)
+	    sh->sector >= conf->mddev->recovery_cp ||
+	    test_bit(MD_RECOVERY_REQUESTED, &(conf->mddev->recovery)))
 		s->syncing = 1;
 	else
 		s->replacing = 1;
--
1.7.5.4

--------------
majianpeng
2012-03-29
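To make the failure mode concrete, here is a minimal user-space C sketch of the decision the patch changes. This is only an illustrative model, not kernel code: classify() and main() are invented for this sketch, and only the names recovery_cp, MaxSector and MD_RECOVERY_REQUESTED are taken from the patch above. Because an --assume-clean array starts with recovery_cp already at MaxSector, a requested check never satisfies the sh->sector >= recovery_cp test, so the pre-patch logic falls into the "replacing" branch and never reads the member disks:

/* repro_logic.c - simplified model of the analyse_stripe() decision
 * discussed above.  Build with: cc -o repro_logic repro_logic.c
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MaxSector ((uint64_t)~0ULL)  /* recovery_cp of a clean array */

static const char *classify(uint64_t sector, uint64_t recovery_cp,
                            bool do_recovery, bool check_requested,
                            bool patched)
{
	/* Mirrors the if/else that the patch modifies. */
	if (do_recovery ||
	    sector >= recovery_cp ||
	    (patched && check_requested))  /* extra test added by the patch */
		return "syncing   -> member disks are read and compared";
	return "replacing -> no reads issued; check runs at the speed limit";
}

int main(void)
{
	/* "echo check > sync_action" on an --assume-clean raid5:
	 * no failed device, recovery_cp already at MaxSector. */
	printf("before patch: %s\n",
	       classify(4096, MaxSector, false, true, false));
	printf("after patch:  %s\n",
	       classify(4096, MaxSector, false, true, true));
	return 0;
}

The second printf corresponds to the patched kernel: once MD_RECOVERY_REQUESTED is taken into account, a user-requested check is treated as syncing and real reads are issued, which matches the symptoms (and the fix) described in the quoted email.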
Larkin,

Can you provide me with details on how you created and re-created this array for the error to occur? I tried creating a raid5 array and re-creating it with --assume-clean here, but was not able to reproduce the problem you are reporting.

Thanks,
Jes
Larkin,

Actually, ignore me - I can reproduce it; I was testing against the wrong kernel :( I checked the upstream kernel tree and the fix is in Linus' tree as c6d2e084c7411f61f2b446d94989e5aaf9879b0f, and I have just requested that it go into stable-3.3. It should ripple into Fedora automatically after that.

Cheers,
Jes
This appears to be fixed in 3.3.6-3.
Per Benjamin's comment, closing.