Bug 966943

Summary: [cluster] [mirror] lvcreate-ing cluster mirror hangs: after timeout it's going into infinite cycle: Log already exists, unable to create.
Product: Red Hat Enterprise Linux 7 Reporter: Marian Csontos <mcsontos>
Component: lvm2Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: Default / Unclassified QA Contact: cluster-qe <cluster-qe>
Status: CLOSED DUPLICATE Docs Contact:
Severity: high    
Priority: unspecified CC: agk, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, zkabelac
Version: 7.0   
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-05-29 13:33:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Marian Csontos 2013-05-24 10:50:58 UTC
Description of problem:

    May 24 04:00:31 zaphodc1-node03 qarshd[2686]: Running cmdline: lvconvert --mirrorlog core -m 1 centipede/centi_base /dev/vdc1
    May 24 04:00:31 zaphodc1-node03 kernel: [  110.731370] device-mapper: dm-log-userspace: version 1.1.0 loaded

After 15 seconds it times out and retries:

    May 24 04:00:46 zaphodc1-node03 kernel: [  125.798115] device-mapper: dm-log-userspace: [IFDjvUNL] Request timed out: [1/0] - retrying

But the log is already created, and instead of raising error goes into infinite cycle:

    May 24 04:00:46 zaphodc1-node03 cmirrord[2458]: [IFDjvUNL/18446612133002470656u] Log already exists, unable to create.
    May 24 04:00:46 zaphodc1-node03 cmirrord[2458]: Failed to create cluster log (LVM-4lLfMUwq3Pb3X0dNEPVYnNrGEQngSJ3eO3p3hq1dT3r361odF5qRgHMSIFDjvUNL)
    May 24 04:00:46 zaphodc1-node03 cmirrord[2458]: argv[0] = clustered-core
    May 24 04:00:46 zaphodc1-node03 cmirrord[2458]: argv[1] = 1024

Nothing on the console except the above kernel messages.

Version-Release number of selected component (if applicable):

kernel-3.9.0-0.55.el7.x86_64
lvm2-2.02.99-0.36.el7.x86_64
dlm-4.0.1-1.el7.x86_64
corosync-2.3.0-3.el7.x86_64
pcs-0.9.41-1.el7.x86_64
pacemaker-1.1.10-1.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. create a mirror: either a new one or lvconvert (or pvmove)

Actual results:
- cluster and lvm hangs

Expected results:
- either the mirror should be created or error risen in finite time

Additional info:

It may look like another instance of Bug 948073, except selinux is not involved:

    [20130523-104957] zaphodc1-node03: lvcreate -m 4 -n cascading_splits -L 300M split_image
    [20130523-105158]   Error locking on node 2: Command timed out
    [20130523-105158]   Error locking on node 1: Command timed out
    [20130523-105158]   Error locking on node 3: Command timed out
    [20130523-105158]   Failed to activate new LV.
    [20130523-105358]   Error locking on node 2: Command timed out
    [20130523-105358]   Error locking on node 1: Command timed out
    [20130523-105358]   Error locking on node 3: Command timed out
    [20130523-105358]   Unable to deactivate failed new LV. Manual intervention required.
    [20130523-105558]   Error locking on node 2: Command timed out
    [20130523-105558]   Error locking on node 1: Command timed out
    [20130523-105558]   Error locking on node 3: Command timed out

I will gather more logs.

Comment 1 Marian Csontos 2013-05-28 07:01:35 UTC
Actually it may be selinux related. After turning it off the test is passing.
Will switch to permissive mode and upload AVCs.

Comment 2 Marian Csontos 2013-05-29 13:33:59 UTC
After all it IS related to selinux, just my selinux-fu is weak.
Following suggests it is a duplicate of Bug 948073. Closing as such.

(08:28:44) [root@zaphodc2-node03:~]$ ausearch -m avc
----
time->Wed May 29 08:13:08 2013
type=SYSCALL msg=audit(1369833188.408:65): arch=c000003e syscall=2 success=yes exit=7 a0=7ffffcf74340 a1=4002 a2=c8 a3=7ffffcf73ee0 items=0 ppid=1 pid=2451 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 ses=4294967295 tty=(none) comm="cmirrord" exe="/usr/sbin/cmirrord" subj=system_u:system_r:cmirrord_t:s0 key=(null)
type=AVC msg=audit(1369833188.408:65): avc:  denied  { open } for  pid=2451 comm="cmirrord" path="/dev/dm-3" dev="devtmpfs" ino=29752 scontext=system_u:system_r:cmirrord_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
type=AVC msg=audit(1369833188.408:65): avc:  denied  { read write } for  pid=2451 comm="cmirrord" name="dm-3" dev="devtmpfs" ino=29752 scontext=system_u:system_r:cmirrord_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file
----
time->Wed May 29 08:13:08 2013
type=SYSCALL msg=audit(1369833188.418:66): arch=c000003e syscall=2 success=yes exit=9 a0=7ffffcf6f270 a1=2 a2=180 a3=7ffffcf6ee90 items=0 ppid=1 pid=2451 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 ses=4294967295 tty=(none) comm="cmirrord" exe="/usr/sbin/cmirrord" subj=system_u:system_r:cmirrord_t:s0 key=(null)
type=AVC msg=audit(1369833188.418:66): avc:  denied  { open } for  pid=2451 comm="cmirrord" path="/dev/shm/qb-cpg-request-2377-2451-29-header" dev="tmpfs" ino=29817 scontext=system_u:system_r:cmirrord_t:s0 tcontext=system_u:object_r:cluster_tmpfs_t:s0 tclass=file
type=AVC msg=audit(1369833188.418:66): avc:  denied  { read write } for  pid=2451 comm="cmirrord" name="qb-cpg-request-2377-2451-29-header" dev="tmpfs" ino=29817 scontext=system_u:system_r:cmirrord_t:s0 tcontext=system_u:object_r:cluster_tmpfs_t:s0 tclass=file

*** This bug has been marked as a duplicate of bug 948073 ***