
Bug 191723

Summary: device-mapper mirror: Need proper notification of sync status change on write failure
Product: Red Hat Enterprise Linux 4
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Reporter: Jonathan Earl Brassow <jbrassow>
Assignee: Jonathan Earl Brassow <jbrassow>
QA Contact: Brian Brock <bbrock>
CC: agk, jbaron, jbrassow, rkenna
Keywords: Regression
Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Last Closed: 2006-08-10 23:17:36 UTC
Bug Blocks: 181409
Attachments:
Description: supplemental patch
Flags: none

Description Jonathan Earl Brassow 2006-05-15 14:26:24 UTC
Description of problem:
This bug addresses a problem with the fix for bug 186004, specifically with
regard to cluster mirroring.
Problem statement: the cluster mirroring log server does not get proper
notification when a region goes out-of-sync due to a failed write.

The original bug (186004) addressed the issue that the mirror log did not
properly handle the sync status of a region where a write failed to some (but
not all) of the mirror devices.  However, the mechanism it used to notify the
logging code is only suitable for single-machine mirroring.

For cluster mirroring, it is not acceptable to perform a suspend/resume cycle to
make the logging code pick up new out-of-sync regions.  This is because a
machine that does the suspend/resume cycle may not be the machine acting as the
cluster log server.  A more direct (and correct) method of informing the logging
code must be used when a region goes out-of-sync due to write failures.

This more direct method is implemented by replacing the suspend/resume cycle
with a call to complete_resync_work(/*failure*/).  By going this route, we are
using the action of the function to get the desired result, rather than getting
the desired result as a side effect.
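
As a rough illustration of the idea (not the attached patch), the sketch below models the log as a per-region in-sync flag and reports a partial write failure to it directly. Everything prefixed with toy_ is hypothetical and exists only for this example. The only name taken from this report is complete_resync_work, and the (log, region, success) argument order used here is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_REGIONS 8	/* hypothetical size, for illustration only */

/* Toy stand-in for the mirror log: one in-sync flag per region. */
struct toy_log {
	bool in_sync[TOY_NR_REGIONS];
};

/* Start with every region marked in-sync. */
static void toy_log_init(struct toy_log *log)
{
	for (size_t i = 0; i < TOY_NR_REGIONS; i++)
		log->in_sync[i] = true;
}

/*
 * Models the log's complete_resync_work() call (signature assumed):
 * success != 0 marks the region in-sync, success == 0 marks it
 * out-of-sync.
 */
static void toy_complete_resync_work(struct toy_log *log, size_t region,
				     int success)
{
	if (region < TOY_NR_REGIONS)
		log->in_sync[region] = (success != 0);
}

/*
 * Write completion: ok_devs is how many of the nr_devs mirror devices
 * accepted the write.  A write that reached some but not all devices
 * is reported to the log directly, with no suspend/resume cycle.
 * A write that failed on every device is not modeled here.
 */
static void toy_write_done(struct toy_log *log, size_t region,
			   unsigned int ok_devs, unsigned int nr_devs)
{
	if (ok_devs > 0 && ok_devs < nr_devs)
		toy_complete_resync_work(log, region, 0 /* failure */);
}

/* Query helper: is the region currently marked in-sync? */
static bool toy_region_in_sync(const struct toy_log *log, size_t region)
{
	return region < TOY_NR_REGIONS && log->in_sync[region];
}
```

In this model, calling toy_write_done(&log, 3, 1, 2) leaves region 3 marked out-of-sync immediately, while other regions are untouched. In the cluster case the log object would live on the log server, so the notification reaches the right place regardless of which machine saw the write failure.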

Version-Release number of selected component (if applicable):
RHEL4 U4

Comment 1 Jonathan Earl Brassow 2006-05-15 14:26:24 UTC
Created attachment 129065
supplemental patch

Comment 5 Jason Baron 2006-05-22 19:11:30 UTC
Committed in stream U4 build 36.1. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 8 Red Hat Bugzilla 2006-08-10 23:17:38 UTC
An advisory has been issued which should address the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html