Bug 199622

Summary: failed mirror log causes mirror I/O errors
Product: Red Hat Enterprise Linux 4 Reporter: Corey Marthaler <cmarthal>
Component: kernelAssignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: high    
Version: 4.0CC: agk, dwysocha, jbaron, jbrassow, k.georgiou, mbroz, rkenna
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0304 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2007-05-08 02:52:40 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 218946    
Bug Blocks:    
Attachments:
Description Flags
Patch to fix problem - pending dm-suspend-add-noflush-pushback.patch none

Description Corey Marthaler 2006-07-20 19:46:20 UTC
Description of problem:
We started I/O to a cmirror and then caused the mirror log device to fail. That
failure then caused our I/O to fail due to I/O errors. With both legs of the
mirror still fine, the user shouldn't have it's application fail. It can slow
down or hang for a couple seconds while the log converts to core, but it should
fail. 


[root@taft-02 ~]# lvs -a -o +devices
  LV                VG   Attr   LSize Origin Snap%  Move Log         Copy% 
Devices                          
  mirror            vg   mwi-a- 1.00G                    mirror_mlog 100.00
mirror_mimage_0(0),mirror_mimage_1(0)
  [mirror_mimage_0] vg   iwi-ao 1.00G                                      
/dev/sdb1(0)                     
  [mirror_mimage_1] vg   iwi-ao 1.00G                                      
/dev/sdc1(0)                     
  [mirror_mlog]     vg   lwi-ao 4.00M                                      
/dev/sdd1(0)                     


Our application (b_iogen/b_doio):
<xior magic="0xfeed10"><write
syscall="writev"><path>/dev/vg/mirror</path><oflags>O_RDWR|O_SYNC</oflags><offset>437221822</offset><count>4883685</count><pattern>S</pattern></write></xior>
<xior magic="0xfeed10"><write
syscall="writev"><path>/dev/vg/mirror</path><oflags>O_RDWR|O_SYNC</oflags><offset>825501307</offset><count>3613689</count><pattern>P</pattern></write></xior>

Can not writev() 3613689 bytes to 825501307 on /dev/vg/mirror: Input/output error

[...]
Jul 20 09:45:17 taft-04 dmeventd[4446]: vgreduce operation failed (5).
Jul 20 09:45:17 taft-04 dmeventd[4446]: Failed to remove faulty devices in vg-mirror


[...]
Jul 20 08:37:32 taft-02 kernel: end_request: I/O error, dev sdd, sector 63
Jul 20 08:37:32 taft-02 lvm[4446]: /dev/sdd1: read failed after 0 of 2048 at 0:
Input/output error
Jul 20 08:37:32 taft-02 kernel: device-mapper: sync_count (2049) > region_count
(2048) - (Rzl7P97b)!
Jul 20 08:37:32 taft-02 kernel: device-mapper: I'm the cluster mirror log server
for HaPLS7br
Jul 20 08:37:32 taft-02 kernel: device-mapper: Disk Resume::
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Live nodes        :: 4
Jul 20 08:37:32 taft-02 kernel: device-mapper:   In-Use Regions    :: 0
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Good IUR's        :: 0
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Bad IUR's         :: 0
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Sync count        :: 0
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Disk Region count :: 0
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Region count      :: 2048
Jul 20 08:37:32 taft-02 kernel: device-mapper:   NOTE:  Mapping has changed.
Jul 20 08:37:32 taft-02 kernel: device-mapper: Marked regions::
Jul 20 08:37:32 taft-02 kernel: device-mapper:   0 - 2048
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Total = 2048
Jul 20 08:37:32 taft-02 kernel: device-mapper: Out-of-sync regions::
Jul 20 08:37:32 taft-02 kernel: device-mapper:   0 - 2048
Jul 20 08:37:32 taft-02 kernel: device-mapper:   Total = 2048
Jul 20 08:37:32 taft-02 dmeventd[4579]: Monitoring mirror device, vg-mirror for
events
Jul 20 08:37:34 taft-02 kernel: device-mapper: 2000 out-of-sync regions
remaining for HaPLS7br.
Jul 20 08:37:48 taft-02 kernel: device-mapper: 1000 out-of-sync regions
remaining for HaPLS7br.
Jul 20 08:37:50 taft-02 kernel: device-mapper: 900 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:51 taft-02 kernel: device-mapper: 800 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:52 taft-02 kernel: device-mapper: 700 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:54 taft-02 kernel: device-mapper: 600 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:55 taft-02 kernel: device-mapper: 500 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:57 taft-02 kernel: device-mapper: 400 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:58 taft-02 kernel: device-mapper: 300 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:59 taft-02 kernel: device-mapper: 200 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:37:59 taft-02 kernel: device-mapper: 175 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:00 taft-02 kernel: device-mapper: 150 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:00 taft-02 kernel: device-mapper: 125 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:01 taft-02 kernel: device-mapper: 100 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:01 taft-02 kernel: device-mapper: 75 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:02 taft-02 kernel: device-mapper: 50 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:02 taft-02 kernel: device-mapper: 25 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 kernel: device-mapper: 5 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 kernel: device-mapper: 4 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 kernel: device-mapper: 3 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 kernel: device-mapper: 2 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 kernel: device-mapper: 1 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 kernel: device-mapper: 0 out-of-sync regions remaining
for HaPLS7br.
Jul 20 08:38:03 taft-02 dmeventd[4579]: vg-mirror is now in-sync



[root@taft-03 ~]# lvs
  LV     VG   Attr   LSize Origin Snap%  Move Log         Copy%
  mirror vg   mwi-a- 1.00G                    mirror_mlog 100.00
[root@taft-03 ~]# lvs
  /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error
  /dev/sdd2: read failed after 0 of 2048 at 0: Input/output error
  LV     VG   Attr   LSize Origin Snap%  Move Log Copy%
  mirror vg   mwi-a- 1.00G                        100.00


Version-Release number of selected component (if applicable):
[root@taft-03 ~]# rpm -q lvm2
lvm2-2.02.06-6.0.RHEL4
[root@taft-03 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.06-6.0.RHEL4
[root@taft-03 ~]# uname -ar
Linux taft-03 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64
x86_64 GNU/Linux


How reproducible:
very

Comment 1 Jonathan Earl Brassow 2006-07-20 20:01:53 UTC
This is a known issue; and unfortunately, that's the way it is suppose to work.

The log entry _must_ be written before a write can happen.  If the log can not
be written, the write has three options:
1) delay until mirror is reconfigured - this would be the best, but requires
core device-mapper changes to allow requeuing of I/O

2) continue before mirror is reconfigured - this has the potential of causing
pain.  If the write continues and the machine doing the write dies before both
mirrors are written _and_ the log device comes back to life on the reboot, there
*might* be corruption.

3) return EIO

Because #1 is not an option, and #3 is safer than #2, we default to #3. 
HOWEVER, since #2 is such a contrived case, we offer the ability to use #2.  You
need only specify 'dm_mirror_error_on_log_failure = 0' as a module parameter
when loading dm-mirror.  Then you will not see the EIOs.


Comment 3 Kiersten (Kerri) Anderson 2006-09-06 20:08:10 UTC
Defect fix required in the base kernel device mapper module.  Changing product
to RHEL and kernel component.  Also will set the 4.5 appropriate flags for
inclusion in the next update release.

Comment 4 Kiersten (Kerri) Anderson 2006-09-06 20:10:56 UTC
Defect fix required in the base kernel device mapper module.  Changing product
to RHEL and kernel component.  Also will set the 4.5 appropriate flags for
inclusion in the next update release.

Comment 6 Jonathan Earl Brassow 2006-11-28 19:49:02 UTC
This cannot be fixed until the ability to resubmit bios to the device-mapper
core exists.  Agk has added this to rhel5, but I don't see it in rhel4.

Until this happens, people will have to make due with the workaround in comment #1



Comment 7 Kiersten (Kerri) Anderson 2006-11-29 19:37:40 UTC
Moving this out to 4.6.  We will need to figure out how to address this in a
rhel4 environment as well as rhel5.

Comment 8 Kiersten (Kerri) Anderson 2006-11-29 19:46:57 UTC
Okay, there is a fix already for this in rhel5, moving this back onto the 4.5
list due and making it a blocker.  We need to do this for rhel4.5.  So,
reverting all of my previous changes.

Comment 9 Kiersten (Kerri) Anderson 2006-11-29 19:49:07 UTC
IRC discussion regarding fix for rhel4.5
<visegrips_> agk_: do we plan on putting the requeue I/O patch into RHEL4.5?  If
not, why not?
<agk_> so you can see what operation actually happened to trigger the failure -
then we need to find where that happens in the code
<agk_> and check the locking there
<agk_> the requeue thing:  it's up to mbroz if he has time after doing the other
patches still to go to 4.5
<visegrips_> agk_: k.  I was just having a conversation about it with kanderso
on #cluster
<agk_> the interface change was backwardly compatible
<agk_> so I can think of no reason not to do it except for lack of time
<visegrips_> agk_: there is a mirror bug that hinges on it
<visegrips_> (a pretty important one)
<agk_> then create a bug for tracking getting those patches in and make it a 4.5
blocker?
<kanderso> okay - am putting it back on the blocker list - it was already there
prior to our discussion 
<agk_> it seems the multipath tools change to make use of it was only one new
line of code, so that's easy to do too

Comment 10 Jonathan Earl Brassow 2006-11-29 19:59:19 UTC
Reassigning to mbroz because the fix depends on the requeue patch for core
device-mapper (which is already done in RHEL5).

When that part of the fix is submitted, I'll take the bug back and push the
changes to dm-raid1.c


Comment 11 Jonathan Earl Brassow 2006-11-29 21:34:58 UTC
Created attachment 142436 [details]
Patch to fix problem - pending dm-suspend-add-noflush-pushback.patch

Comment 12 Milan Broz 2006-12-09 13:25:32 UTC
Patches preapared and sent for review to Jon, to the rhekrnel will be sent as
soon s possible in patchset resolving bug 218946 (making this bug depends on it).

Reassigning back to jbrassow for dm raid1 patch.

Comment 15 Jonathan Earl Brassow 2006-12-12 17:37:46 UTC
Yes, I have a patch that seems to work.  I am continuing to test it and have
sent it to mbroz and agk for review before posting.

I plan to post on 12/13/2006 unless more problems are uncovered.


Comment 17 Jason Baron 2007-01-08 19:19:31 UTC
committed in stream U5 build 42.39. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 19 Corey Marthaler 2007-03-20 19:37:41 UTC
fix verified. 

Comment 21 Red Hat Bugzilla 2007-05-08 02:52:41 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html