Red Hat Bugzilla – Bug 191723
device-mapper mirror: Need proper notification of sync status change on write failure
Last modified: 2007-11-30 17:07:25 EST
Description of problem:
This bug addresses a problem with the fix for bug 186004, specifically when considering cluster mirroring.

Problem statement: the cluster mirroring log server does not get proper notification when a region goes out-of-sync due to a failed write.

The original bug (186004) addressed the issue that the mirror log did not properly handle the sync status of a region where a write failed to some (but not all) of the mirror devices. However, the mechanism it used to notify the logging code was only suitable for single-machine mirroring. For cluster mirroring, it is not acceptable to perform a suspend/resume cycle to make the logging code pick up new out-of-sync regions. This is because the machine that does the suspend/resume cycle may not be the machine acting as the cluster log server.

A more direct (and correct) method of informing the logging code must be used when a region goes out-of-sync due to write failures. This is done by replacing the suspend/resume cycle with a call to complete_resync_work(/*failure*/). By going this route, we use the action of the function to get the desired result, rather than getting the desired result as a side effect.

Version-Release number of selected component (if applicable):
RHEL4 U4
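To illustrate the approach described above, here is a minimal user-space sketch of the idea, not the actual patch. It assumes a simplified in-memory log; `toy_log`, `toy_complete_resync_work()` and `toy_write_done()` are hypothetical names made up for this example and only loosely model the kernel's log interface. The point it shows is that a partially failed write reports the region's new state to the log directly, instead of relying on a suspend/resume cycle to make the log notice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_REGIONS 8

/* Hypothetical stand-in for a mirror log: one sync flag per region. */
struct toy_log {
    bool in_sync[NR_REGIONS];
};

/* Start with every region marked in-sync. */
static void toy_log_init(struct toy_log *log)
{
    for (size_t i = 0; i < NR_REGIONS; i++)
        log->in_sync[i] = true;
}

/*
 * Modeled on the idea behind complete_resync_work(): tell the log,
 * explicitly, whether the region is in sync (success) or not (failure).
 * In a cluster, this call is what would reach the log server.
 */
static void toy_complete_resync_work(struct toy_log *log, size_t region,
                                     bool success)
{
    assert(region < NR_REGIONS);
    log->in_sync[region] = success;
}

/*
 * Write-completion path. If the write failed on some, but not all, of
 * the mirror devices, the region is now out-of-sync, so report the
 * failure to the log directly. Other cases are left untouched here.
 */
static void toy_write_done(struct toy_log *log, size_t region,
                           unsigned nr_failed, unsigned nr_mirrors)
{
    if (nr_failed > 0 && nr_failed < nr_mirrors)
        toy_complete_resync_work(log, region, false /* failure */);
}
```

For example, after `toy_log_init(&log)`, a call such as `toy_write_done(&log, 3, 1, 2)` (one of two mirror devices failed) leaves region 3 marked out-of-sync, while a fully successful write leaves the log unchanged.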
Created attachment 129065 [details] supplemental patch
Committed in stream U4 build 36.1. A test kernel with this patch is available from http://people.redhat.com/~jbaron/rhel4/
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHSA-2006-0575.html