Bug 193728 - A write to a cluster mirror volume that is not in sync will hang and also cause the sync itself to hang
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 180185 181411
 
Reported: 2006-05-31 21:21 UTC by Corey Marthaler
Modified: 2007-11-30 22:07 UTC
CC List: 4 users

Fixed In Version: RHSA-2006-0575
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-08-10 23:25:42 UTC
Target Upstream Version:
Embargoed:


Attachments
Potential Patch (waiting for a little feedback before posting) (1.42 KB, patch)
2006-06-15 13:29 UTC, Jonathan Earl Brassow


Links
System ID: Red Hat Product Errata RHSA-2006:0575
Priority: normal
Status: SHIPPED_LIVE
Summary: Important: Updated kernel packages available for Red Hat Enterprise Linux 4 Update 4
Last Updated: 2006-08-10 04:00:00 UTC

Description Corey Marthaler 2006-05-31 21:21:57 UTC
Description of problem:
This may just be a different version of bz 193704, but in this issue, the mirror
syncing also stopped/hung along with the write attempt.

[root@taft-04 ~]# lvcreate -L 750M -m 1 -n coreymirror mirror_1
  Rounding up size to full physical extent 752.00 MB
  Logical volume "coreymirror" created



May 31 11:16:02 taft-04 lvm[5195]: mirror_1-coreymirror is now in-sync
May 31 11:20:00 taft-04 kernel: device-mapper: A node has left the cluster.
May 31 11:20:10 taft-04 kernel: device-mapper: Cluster log server is shutting down.
May 31 11:21:00 taft-04 kernel: device-mapper: I'm the cluster log server for
LVM-0I9prx81Ohk
May 31 11:21:00 taft-04 kernel: device-mapper: Disk Resume::
May 31 11:21:00 taft-04 kernel: device-mapper:   Live nodes        :: 1
May 31 11:21:00 taft-04 kernel: device-mapper:   In-Use Regions    :: 0
May 31 11:21:00 taft-04 kernel: device-mapper:   Good IUR's        :: 0
May 31 11:21:00 taft-04 kernel: device-mapper:   Bad IUR's         :: 0
May 31 11:21:00 taft-04 kernel: device-mapper:   Sync count        :: 0
May 31 11:21:00 taft-04 kernel: device-mapper:   Disk Region count ::
18446744073709551615
May 31 11:21:00 taft-04 kernel: device-mapper:   Region count      :: 1504
May 31 11:21:00 taft-04 kernel: device-mapper:   NOTE:  Mapping has changed.
May 31 11:21:00 taft-04 kernel: device-mapper: Marked regions::
May 31 11:21:00 taft-04 kernel: device-mapper:   0 - -1
May 31 11:21:00 taft-04 kernel: device-mapper:   Total = -1
May 31 11:21:00 taft-04 kernel: device-mapper: Out-of-sync regions::
May 31 11:21:00 taft-04 kernel: device-mapper:   0 - -1
May 31 11:21:00 taft-04 kernel: device-mapper:   Total = -1
May 31 11:21:01 taft-04 lvm[5311]: Monitoring mirror device,
mirror_1-coreymirror for events



[root@taft-04 ~]# lvs
  LV          VG       Attr   LSize   Origin Snap%  Move Log              Copy%
  coreymirror mirror_1 mwi-a- 752.00M                    coreymirror_mlog  39.36
[root@taft-04 ~]# lvs
  LV          VG       Attr   LSize   Origin Snap%  Move Log              Copy%
  coreymirror mirror_1 mwi-a- 752.00M                    coreymirror_mlog  47.34
[root@taft-04 ~]# lvs
  LV          VG       Attr   LSize   Origin Snap%  Move Log              Copy%
  coreymirror mirror_1 mwi-a- 752.00M                    coreymirror_mlog  49.47

Different node:
[root@taft-03 ~]# gfs_mkfs -j 4 -p lock_dlm -t TAFT_CLUSTER:mirror
/dev/mirror_1/coreymirror -O
[HANG]

This caused the syncing to get stuck as well:
[root@taft-04 ~]# lvs
  LV          VG       Attr   LSize   Origin Snap%  Move Log              Copy%
  coreymirror mirror_1 mwi-a- 752.00M                    coreymirror_mlog  49.47
[root@taft-04 ~]# lvs
  LV          VG       Attr   LSize   Origin Snap%  Move Log              Copy%
  coreymirror mirror_1 mwi-a- 752.00M                    coreymirror_mlog  49.47
[root@taft-04 ~]# lvs
  LV          VG       Attr   LSize   Origin Snap%  Move Log              Copy%
  coreymirror mirror_1 mwi-a- 752.00M                    coreymirror_mlog  49.47


May 31 10:08:04 taft-02 lvm[5280]: Monitoring mirror device,
mirror_1-coreymirror for events
May 31 10:12:09 taft-02 kernel: device-mapper: A node has left the cluster.
May 31 10:12:19 taft-02 last message repeated 2 times
May 31 10:12:24 taft-02 kernel: device-mapper: Cluster log server is shutting down.
May 31 10:13:28 taft-02 lvm[5396]: Monitoring mirror device,
mirror_1-coreymirror for events



Version-Release number of selected component (if applicable):
[root@taft-04 ~]# rpm -q cmirror-kernel-smp
cmirror-kernel-smp-2.6.9-4.2

Comment 1 Corey Marthaler 2006-05-31 22:36:05 UTC
Looks like I/O will hang even to an in-sync mirror:

May 31 11:34:26 taft-03 clvmd: Activating VGs: succeeded
device-mapper: unable to get server (2) to mark region (120)
device-mapper: Reason :: 1
May 31 11:35:52 taft-03 lvm[4538]: mirror_1-coreymirror is now in-sync
May 31 11:35:52 taft-03 kernel: device-mapper: unable to get server (2) to mark
region (120)
May 31 11:35:52 taft-03 kernel: device-mapper: Reason :: 1
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 128
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 128
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 256
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 384
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 512
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 512
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 640
May 31 11:35:57 taft-03 kernel: device-mapper: clear_region_count :: 640
May 31 11:35:58 taft-03 kernel: device-mapper: clear_region_count :: 768


Comment 2 Corey Marthaler 2006-06-01 15:34:43 UTC
I created a GFS filesystem on a 900M cmirror last night and attempted to run
simple I/O on it, and this morning it was all hung. 

Comment 4 Jonathan Earl Brassow 2006-06-15 13:24:26 UTC
Switching this to RHEL4/kernel

Comment 5 Jonathan Earl Brassow 2006-06-15 13:29:33 UTC
Created attachment 130973 [details]
Potential Patch (waiting for a little feedback before posting)

The problem was in drivers/md/dm-raid1.c:do_writes, where I was adding writes
that are blocked behind remotely recovering regions ('requeue') to a bio list
('writes') that was only valid within that function.  The result is that the
memory is leaked and the request is lost, stalling all I/O for that process as
it waits for the write to complete.

The fix simply adds the bio to the main write queue via queue_bio - effectively
causing the write to be deferred until the region is recovered.
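
For illustration, here is a minimal sketch of that do_writes change.  This is
not the attached patch: rh_in_remote_recovery and issue_write are placeholder
names for the existing region check and write-issuing path, and only the
requeue handling is shown; struct mirror_set, queue_bio, and the bio_list
helpers are assumed to behave as in the RHEL4-era dm-raid1.c.

/* Sketch only -- simplified from drivers/md/dm-raid1.c:do_writes. */
static void do_writes(struct mirror_set *ms, struct bio_list *writes)
{
        struct bio *bio;

        while ((bio = bio_list_pop(writes))) {
                /* Placeholder check: is this region being recovered remotely? */
                if (rh_in_remote_recovery(&ms->rh, bio)) {
                        /*
                         * Buggy version: the bio went onto a function-local
                         * 'requeue' bio_list, which disappeared when
                         * do_writes returned -- the write was lost and the
                         * writer hung waiting for it to complete.
                         *
                         * Fix: push it back onto the mirror set's main write
                         * queue so it is retried once the region recovers.
                         */
                        queue_bio(ms, bio, WRITE);
                        continue;
                }

                /* Normal path (placeholder): hand the write to the mirrors. */
                issue_write(ms, bio);
        }
}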

The reason for the change to do_mirror is that the call to 'wake' in queue_bio
has no effect here, since the thread is effectively telling itself to wake up.
(That wake-up call is disregarded because the thread is already running.)  The
while loop therefore continues until do_writes no longer requeues bios due to
remote recovery.

You may ask why it is OK to check 'ms->writes.head' without holding the spin
lock.  The reason is that do_mirror is the only place that can clear the list,
and any additions to the list that we are concerned about happen in do_writes
(that is, earlier in the same thread, before the check).
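
Along the same lines, a sketch of the do_mirror side (again an assumed shape,
not the literal patch; rh_update_states, do_recovery, do_reads, and do_writes
are the existing worker steps):

static void do_mirror(struct mirror_set *ms)
{
        struct bio_list reads, writes;

        do {
                spin_lock(&ms->lock);
                reads = ms->reads;
                writes = ms->writes;
                bio_list_init(&ms->reads);
                bio_list_init(&ms->writes);
                spin_unlock(&ms->lock);

                rh_update_states(&ms->rh);
                do_recovery(ms);
                do_reads(ms, &reads);
                do_writes(ms, &writes);

                /*
                 * Unlocked peek is safe: this thread is the only one that
                 * empties ms->writes, and the requeues we care about were
                 * added just above, in do_writes, on this same thread.
                 */
        } while (ms->writes.head);
}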

Comment 9 Jason Baron 2006-06-23 18:39:32 UTC
Committed in stream U4 build 39.2. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 13 Red Hat Bugzilla 2006-08-10 23:25:48 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0575.html


