Bug 199635 - I/O from multiple nodes during mirror log failure causes clvmd to hang
Summary: I/O from multiple nodes during mirror log failure causes clvmd to hang
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: cmirror
Version: 4
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-07-20 21:01 UTC by Corey Marthaler
Modified: 2010-01-12 02:01 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2008-08-05 21:32:47 UTC
Embargoed:



Description Corey Marthaler 2006-07-20 21:01:08 UTC
Description of problem:
We ran the mirror log failure test case again, this time with I/O running from
all nodes in the cluster. When we failed the log, we hit the I/O error issue (bz
199622), and shortly afterwards clvmd hung. All nodes logged the following
message over and over:

device-mapper: Failed to receive election results from server
device-mapper: Failed to receive election results from server
device-mapper: Failed to receive election results from server
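
To put a number on "over and over", the occurrences can be counted in a saved copy of the kernel log. This is a minimal Python sketch, assuming the log text is already available as a string; the function name `count_election_failures` and the sample text are illustrative and not part of cmirror or LVM.

```python
# The message text is copied from the report above.
ELECTION_FAILURE = "device-mapper: Failed to receive election results from server"

def count_election_failures(log_text):
    """Count log lines that contain the election-failure message."""
    return sum(1 for line in log_text.splitlines() if ELECTION_FAILURE in line)

# Sample input: the three lines quoted in this report.
sample = """\
device-mapper: Failed to receive election results from server
device-mapper: Failed to receive election results from server
device-mapper: Failed to receive election results from server
"""

print(count_election_failures(sample))  # prints 3
```

Running the same count on each node's log gives a rough way to compare how often the message appears across the cluster.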

[...]
stat("/dev/sdd2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 50), ...}) = 0
stat("/dev/sdd2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 50), ...}) = 0
open("/dev/sdd2", O_RDONLY|O_DIRECT|O_NOATIME) = 4
fstat(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 50), ...}) = 0
ioctl(4, BLKBSZGET, 0x68d810)           = 0
lseek(4, 0, SEEK_SET)                   = 0
read(4, 0x7fbfffa200, 2048)             = -1 EIO (Input/output error)
write(2, "  ", 2  )                       = 2
write(2, "/dev/sdd2: read failed after 0 o"..., 63/dev/sdd2: read failed after 0 of 2048 at 0: Input/output error) = 63
write(2, "\n", 1
)                       = 1
close(4)                                = 0
stat("/proc/lvm/VGs/vg", 0x7fbfffb500)  = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
write(3, "3\1\377\277\0\0\0\0\0\0\0\0\7\0\0\0\0\1\4V_vg\0\0", 25) = 25
read(3, 
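
The failing calls in a trace like the one above can be picked out mechanically. The following is a rough Python sketch, assuming one completed strace call per line in the `call(args) = ret ERRNO (message)` layout shown here; the regular expression and the `failed_calls` helper are illustrative and not part of strace.

```python
import re

# Matches completed strace lines such as:
#   read(4, 0x7fbfffa200, 2048)             = -1 EIO (Input/output error)
STRACE_LINE = re.compile(
    r"^(?P<call>\w+)\((?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)"
    r"(?:\s+(?P<errno>[A-Z0-9]+)\s+\((?P<msg>.*)\))?\s*$"
)

def failed_calls(trace_text):
    """Return (syscall, errno) pairs for lines whose return value is -1."""
    failures = []
    for line in trace_text.splitlines():
        m = STRACE_LINE.match(line.strip())
        if m and m.group("ret") == "-1":
            failures.append((m.group("call"), m.group("errno")))
    return failures

# Sample input: lines copied from the trace in this report.
sample = """\
lseek(4, 0, SEEK_SET)                   = 0
read(4, 0x7fbfffa200, 2048)             = -1 EIO (Input/output error)
close(4)                                = 0
stat("/proc/lvm/VGs/vg", 0x7fbfffb500)  = -1 ENOENT (No such file or directory)
read(3, 
"""

print(failed_calls(sample))  # prints [('read', 'EIO'), ('stat', 'ENOENT')]
```

The unfinished `read(3, ` line has no return value, so the sketch skips it rather than reporting it as a failure.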

[root@taft-03 ~]# dmsetup ls
vg-mirror_mimage_1      (253, 4)
vg-mirror_mimage_0      (253, 3)
vg-mirror       (253, 5)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
vg-mirror_mlog  (253, 2)
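
For scripted checks, the listing above can be turned into a lookup table to confirm that both mirror images and the log device are still mapped. This is a minimal Python sketch, assuming the `name (major, minor)` layout shown in this report; `parse_dmsetup_ls` and `mirror_parts` are hypothetical helper names, not LVM or device-mapper APIs.

```python
import re

# One device per line: a name, whitespace, then "(major, minor)".
DM_LINE = re.compile(r"^(?P<name>\S+)\s+\((?P<major>\d+),\s*(?P<minor>\d+)\)$")

def parse_dmsetup_ls(output):
    """Map each device-mapper name to its (major, minor) pair."""
    devices = {}
    for line in output.splitlines():
        m = DM_LINE.match(line.strip())
        if m:
            devices[m.group("name")] = (int(m.group("major")), int(m.group("minor")))
    return devices

def mirror_parts(devices, mirror):
    """List the sub-device suffixes (e.g. mimage_0, mlog) mapped for a mirror."""
    prefix = mirror + "_"
    return sorted(name[len(prefix):] for name in devices if name.startswith(prefix))

# Sample input: the listing from taft-03 in this report.
sample = """\
vg-mirror_mimage_1      (253, 4)
vg-mirror_mimage_0      (253, 3)
vg-mirror       (253, 5)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
vg-mirror_mlog  (253, 2)
"""

devices = parse_dmsetup_ls(sample)
print(mirror_parts(devices, "vg-mirror"))  # prints ['mimage_0', 'mimage_1', 'mlog']
```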


Version-Release number of selected component (if applicable):
[root@taft-03 ~]# uname -ar
Linux taft-03 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64 x86_64 GNU/Linux
[root@taft-03 ~]# rpm -q lvm2
lvm2-2.02.06-6.0.RHEL4
[root@taft-03 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.06-6.0.RHEL4
[root@taft-03 ~]# rpm -q cmirror
cmirror-1.0.1-0
[root@taft-03 ~]# rpm -q cmirror-kernel
cmirror-kernel-2.6.9-10.2

Comment 2 Kiersten (Kerri) Anderson 2006-09-20 16:08:47 UTC
Devel ACK.

Comment 3 Jonathan Earl Brassow 2007-01-08 23:07:54 UTC
This should be fixed with the latest kernel changes.

Some outstanding LVM patches, slated to go in, are required so that mirrors do
not return EIO on log failure, but this bug is not about that.


Comment 4 Corey Marthaler 2007-03-20 19:35:19 UTC
Verified the fix: clvmd no longer hangs after mirror log failure.

Comment 5 Chris Feist 2008-08-05 21:32:47 UTC
Closing as this has been fixed in the current (4.7) release.

