Bug 191723 - device-mapper mirror: Need proper notification of sync status change on write failure
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Jonathan Earl Brassow
QA Contact: Brian Brock
Keywords: Regression
Depends On:
Blocks: 181409
Reported: 2006-05-15 10:26 EDT by Jonathan Earl Brassow
Modified: 2007-11-30 17:07 EST
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 19:17:36 EDT

Attachments
supplemental patch (1.61 KB, patch)
2006-05-15 10:26 EDT, Jonathan Earl Brassow
Description Jonathan Earl Brassow 2006-05-15 10:26:24 EDT
Description of problem:
This bug addresses a problem with the fix for bug 186004, specifically when
considering cluster mirroring.
Problem statement: the cluster mirroring log server does not get proper
notification when a region goes out-of-sync due to a failed write.

The original bug (186004) addressed the issue that the mirror log did not
properly handle the sync status of a region where a write failed to some (but
not all) of the mirror devices.  However, the mechanism it used to notify the
logging code was suitable only for single-machine mirroring.
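
As a rough illustration of the partial-failure case that bug 186004 dealt
with, the sketch below classifies a mirrored write from its per-device
results.  It is a user-space model only: the names write_outcome and
classify_write are hypothetical and are not taken from the device-mapper
source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome of one write fanned out to every mirror device. */
enum write_outcome {
    WRITE_IN_SYNC,      /* every device took the write */
    WRITE_OUT_OF_SYNC,  /* some, but not all, devices failed */
    WRITE_FAILED        /* no device took the write */
};

/*
 * errors[i] is 0 if the write to device i succeeded, non-zero otherwise.
 * A partial failure leaves the copies different from one another, so the
 * region has to be recorded as out-of-sync in the log.
 */
enum write_outcome classify_write(const int *errors, size_t ndevs)
{
    size_t failed = 0;

    assert(errors != NULL || ndevs == 0);

    for (size_t i = 0; i < ndevs; i++)
        if (errors[i] != 0)
            failed++;

    if (failed == 0)
        return WRITE_IN_SYNC;
    return failed == ndevs ? WRITE_FAILED : WRITE_OUT_OF_SYNC;
}
```

Only the WRITE_OUT_OF_SYNC case is the one that needs the log to be told
about a changed region state.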

For cluster mirroring, it is not acceptable to perform a suspend/resume cycle to
make the logging code pick up new out-of-sync regions.  This is because a
machine that does the suspend/resume cycle may not be the machine acting as the
cluster log server.  A more direct (and correct) method of informing the logging
code must be used when a region goes out-of-sync due to write failures.

This more direct method replaces the suspend/resume cycle with a
call to complete_resync_work(/*failure*/).  By going this route, we use
the action of the function to get the desired result, rather than getting the
desired result as a side effect.
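
The idea can be sketched in user space as follows.  This is not the attached
patch, whose contents are not reproduced here; toy_log,
toy_complete_resync_work and toy_write_done are hypothetical stand-ins, and
the real complete_resync_work signature is not assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_REGIONS 8  /* arbitrary size for the sketch */

/* Toy stand-in for the state held by the (possibly remote) log server. */
struct toy_log {
    bool in_sync[NR_REGIONS];
};

/*
 * Toy stand-in for the log's resync-completion hook.  Passing
 * success == false records the region as out-of-sync immediately;
 * no suspend/resume cycle on the calling machine is involved.
 */
void toy_complete_resync_work(struct toy_log *log, size_t region, bool success)
{
    assert(region < NR_REGIONS);
    log->in_sync[region] = success;
}

/* Write-completion path: report a partial failure straight to the log. */
void toy_write_done(struct toy_log *log, size_t region, bool all_devices_ok)
{
    if (!all_devices_ok)
        toy_complete_resync_work(log, region, false /* failure */);
}
```

In this model the region state changes because the log function was asked to
change it, which is the behaviour a cluster log server can rely on regardless
of which machine saw the write failure.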

Version-Release number of selected component (if applicable):
RHEL4 U4
Comment 1 Jonathan Earl Brassow 2006-05-15 10:26:24 EDT
Created attachment 129065
supplemental patch
Comment 5 Jason Baron 2006-05-22 15:11:30 EDT
Committed in stream U4 build 36.1. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 8 Red Hat Bugzilla 2006-08-10 19:17:38 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html
