Bug 199635

Summary: I/O from multiple nodes during mirror log failure causes clvmd to hang
Product: [Retired] Red Hat Cluster Suite
Reporter: Corey Marthaler <cmarthal>
Component: cmirror
Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED CURRENTRELEASE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Docs Contact:
Priority: high
Version: 4
CC: agk, cfeist, dwysocha, mbroz
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2008-08-05 21:32:47 UTC
Type: ---

Description Corey Marthaler 2006-07-20 21:01:08 UTC
Description of problem:
We tried the mirror log failure test case again, this time with I/O running from
all nodes in the cluster. When we failed the log, we hit the I/O error issue (bz
199622) and shortly afterwards clvmd hung. From all nodes we see the following
messages over and over:

device-mapper: Failed to receive election results from server
device-mapper: Failed to receive election results from server
device-mapper: Failed to receive election results from server
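
When a message repeats "over and over" as above, it can help to collapse the runs before attaching a log. The following is a minimal sketch, not part of the original report; the function name `collapse_repeats` is hypothetical and it assumes the log is available as plain text with one message per line.

```python
from itertools import groupby


def collapse_repeats(log_text):
    """Collapse runs of identical consecutive lines into (line, count) pairs."""
    # Blank lines are dropped so they do not split a run of repeated messages.
    lines = [line.strip() for line in log_text.splitlines() if line.strip()]
    return [(line, sum(1 for _ in run)) for line, run in groupby(lines)]


if __name__ == "__main__":
    sample = "\n".join(
        ["device-mapper: Failed to receive election results from server"] * 3
    )
    for line, count in collapse_repeats(sample):
        print(f"{count}x {line}")
```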

[...]
stat("/dev/sdd2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 50), ...}) = 0
stat("/dev/sdd2", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 50), ...}) = 0
open("/dev/sdd2", O_RDONLY|O_DIRECT|O_NOATIME) = 4
fstat(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 50), ...}) = 0
ioctl(4, BLKBSZGET, 0x68d810)           = 0
lseek(4, 0, SEEK_SET)                   = 0
read(4, 0x7fbfffa200, 2048)             = -1 EIO (Input/output error)
write(2, "  ", 2  )                       = 2
write(2, "/dev/sdd2: read failed after 0 o"..., 63/dev/sdd2: read failed after 0
of 2048 at 0: Input/output error) = 63
write(2, "\n", 1
)                       = 1
close(4)                                = 0
stat("/proc/lvm/VGs/vg", 0x7fbfffb500)  = -1 ENOENT (No such file or directory)
rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0
write(3, "3\1\377\277\0\0\0\0\0\0\0\0\7\0\0\0\0\1\4V_vg\0\0", 25) = 25
read(3, 
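
To pick the failing calls out of a longer trace like the one above, a small filter can help. This is a rough sketch under stated assumptions, not a tool used in the original report: the name `failed_syscalls` is hypothetical, and the pattern only matches completed strace lines of the form `name(args) = -1 ERRNO (message)`. Unfinished calls, such as the final `read(3, ` that never returns, are deliberately skipped.

```python
import re

# A completed, failed strace line: name(args) = -1 ERRNO (message)
FAILED_CALL = re.compile(
    r"^(?P<name>\w+)\((?P<args>.*)\)\s*=\s*-1\s+"
    r"(?P<errno>E[A-Z0-9]+)\s+\((?P<msg>[^)]*)\)\s*$"
)


def failed_syscalls(trace_text):
    """Return (syscall, errno, message) for every failed call in the trace."""
    failures = []
    for line in trace_text.splitlines():
        match = FAILED_CALL.match(line.strip())
        if match:
            failures.append(
                (match.group("name"), match.group("errno"), match.group("msg"))
            )
    return failures
```

Applied to the excerpt above, this would report the `read` on `/dev/sdd2` failing with EIO and the `stat` of `/proc/lvm/VGs/vg` failing with ENOENT.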

[root@taft-03 ~]# dmsetup ls
vg-mirror_mimage_1      (253, 4)
vg-mirror_mimage_0      (253, 3)
vg-mirror       (253, 5)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)
vg-mirror_mlog  (253, 2)
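
To check on each node that the mirror's sub-devices (log and images) are still mapped, the listing above can be parsed mechanically. The following is a sketch only: `parse_dmsetup_ls` and `mirror_components` are hypothetical helpers, the pattern assumes the `name (major, minor)` layout shown above, and the name matching assumes volume group and LV names without hyphens (as with `vg` and `mirror` here).

```python
import re

# One line of `dmsetup ls` in the layout shown above: name (major, minor)
DM_LINE = re.compile(r"^(?P<name>\S+)\s+\((?P<major>\d+),\s*(?P<minor>\d+)\)$")


def parse_dmsetup_ls(output):
    """Map each device-mapper name to its (major, minor) pair."""
    devices = {}
    for line in output.splitlines():
        match = DM_LINE.match(line.strip())
        if match:
            devices[match.group("name")] = (
                int(match.group("major")),
                int(match.group("minor")),
            )
    return devices


def mirror_components(devices, vg, lv):
    """List the sub-devices of a mirror LV, e.g. vg-mirror_mlog."""
    # Assumes vg and lv contain no hyphens, so no name escaping is needed.
    prefix = f"{vg}-{lv}_"
    return sorted(name for name in devices if name.startswith(prefix))
```

For the listing above, `mirror_components(parse_dmsetup_ls(text), "vg", "mirror")` would return the two `_mimage_` devices and `vg-mirror_mlog`.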


Version-Release number of selected component (if applicable):
[root@taft-03 ~]# uname -ar
Linux taft-03 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:32:02 EDT 2006 x86_64 x86_64
x86_64 GNU/Linux
[root@taft-03 ~]# rpm -q lvm2
lvm2-2.02.06-6.0.RHEL4
[root@taft-03 ~]# rpm -q lvm2-cluster
lvm2-cluster-2.02.06-6.0.RHEL4
[root@taft-03 ~]# rpm -q cmirror
cmirror-1.0.1-0
[root@taft-03 ~]# rpm -q cmirror-kernel
cmirror-kernel-2.6.9-10.2

Comment 2 Kiersten (Kerri) Anderson 2006-09-20 16:08:47 UTC
Devel ACK.

Comment 3 Jonathan Earl Brassow 2007-01-08 23:07:54 UTC
This should be fixed with the latest kernel changes.

Some outstanding LVM patches, slated to go in, are still required so that
mirrors do not return EIO on log failure, but that is outside the scope of
this bug.


Comment 4 Corey Marthaler 2007-03-20 19:35:19 UTC
Verified the fix: clvmd no longer hangs after a mirror log failure.

Comment 5 Chris Feist 2008-08-05 21:32:47 UTC
Closing as this has been fixed in the current (4.7) release.