Bug 191723 - device-mapper mirror: Need proper notification of sync status change on write failure
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Jonathan Earl Brassow
QA Contact: Brian Brock
Keywords: Regression
Depends On:
Blocks: 181409
Reported: 2006-05-15 10:26 EDT by Jonathan Earl Brassow
Modified: 2007-11-30 17:07 EST
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 19:17:36 EDT

Attachments
supplimental patch (1.61 KB, patch)
2006-05-15 10:26 EDT, Jonathan Earl Brassow

External Trackers
Tracker: Red Hat Product Errata
ID: RHSA-2006:0575
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4
Last Updated: 2006-08-10 00:00:00 EDT

Description Jonathan Earl Brassow 2006-05-15 10:26:24 EDT
Description of problem:
This bug addresses a problem with the fix for bug 186004 - specifically when
considering cluster mirroring.
Problem statement: the cluster mirroring log server is not properly notified
when a region goes out-of-sync due to a failed write.

The original bug (186004) addressed the issue that the mirror log did not
properly handle the sync status of a region where a write failed to some (but
not all) of the mirror devices.  However, the mechanism it used to notify the
logging code is only suitable for single-machine mirroring.

For cluster mirroring, it is not acceptable to perform a suspend/resume cycle to
make the logging code pick up new out-of-sync regions.  This is because a
machine that does the suspend/resume cycle may not be the machine acting as the
cluster log server.  A more direct (and correct) method of informing the logging
code must be used when a region goes out-of-sync due to write failures.

The more direct method replaces the suspend/resume cycle with a call to
complete_resync_work(/*failure*/).  By going this route, we use the intended
action of the function to get the desired result, rather than getting the
desired result as a side effect.
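
To illustrate the idea only, here is a minimal user-space sketch. It is not the
attached patch and not the kernel code: struct toy_log, toy_log_init(),
write_done() and NR_REGIONS are hypothetical names invented for this example,
and the three-argument form of complete_resync_work() is an assumption modeled
on the call named above. A cluster-aware log would presumably forward the
notification to the log server at that point; that forwarding is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_REGIONS 8	/* hypothetical mirror size, in regions */

/* Toy stand-in for a mirror dirty log: one in-sync flag per region. */
struct toy_log {
	bool in_sync[NR_REGIONS];
	unsigned int notifications;	/* direct notifications received */
};

/* Start with every region in sync and no notifications recorded. */
static void toy_log_init(struct toy_log *log)
{
	for (size_t i = 0; i < NR_REGIONS; i++)
		log->in_sync[i] = true;
	log->notifications = 0;
}

/*
 * Assumed shape of the notification: record the outcome for one region.
 * success == 0 marks the region out-of-sync in the log.
 */
static void complete_resync_work(struct toy_log *log, size_t region,
				 int success)
{
	assert(region < NR_REGIONS);
	log->in_sync[region] = (success != 0);
	log->notifications++;
}

/*
 * Hypothetical write-completion path. If the write failed on some, but
 * not all, mirror legs, tell the log directly instead of relying on a
 * suspend/resume cycle to pick the change up as a side effect.
 */
static void write_done(struct toy_log *log, size_t region,
		       unsigned int failed_legs, unsigned int total_legs)
{
	if (failed_legs > 0 && failed_legs < total_legs)
		complete_resync_work(log, region, 0 /* failure */);
}
```

In this model, a write that fails on one leg of a two-leg mirror leaves that
region marked out-of-sync immediately, with no suspend/resume involved, while
fully successful writes do not touch the log.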

Version-Release number of selected component (if applicable):
Comment 1 Jonathan Earl Brassow 2006-05-15 10:26:24 EDT
Created attachment 129065
supplimental patch
Comment 5 Jason Baron 2006-05-22 15:11:30 EDT
Committed in stream U4 build 36.1. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/
Comment 8 Red Hat Bugzilla 2006-08-10 19:17:38 EDT
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.