Bug 455453

Summary: RHEL5 cmirror tracker: small mirror creation fails
Product: Red Hat Enterprise Linux 5 Reporter: Corey Marthaler <cmarthal>
Component: cmirror Assignee: Jonathan Earl Brassow <jbrassow>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: high Docs Contact:
Priority: urgent    
Version: 5.3 CC: agk, ccaulfie, dwysocha, edamato, heinzm, mbroz
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-01-20 21:25:20 UTC Type: ---

Description Corey Marthaler 2008-07-15 15:57:46 UTC
Description of problem:
[root@hayes-02 ~]# lvcreate -m 1 -n mirrorC -L 10M hayes
  Rounding up size to full physical extent 12.00 MB
  Error locking on node hayes-02: Command timed out
  Aborting. Failed to activate new LV to wipe the start of it.
  Error locking on node hayes-03: Command timed out
  Error locking on node hayes-02: Command timed out
  Unable to deactivate failed new LV. Manual intervention required.

Jul 15 09:54:43 hayes-01 clogd[3152]: [Qs5RVdFb] Failed to open checkpoint for 3
Jul 15 09:54:43 hayes-01 clogd[3152]: Failed to export checkpoint
Jul 15 09:54:43 hayes-01 clogd[3152]: [Qs5RVdFb] Failed to open checkpoint for 3
Jul 15 09:54:43 hayes-01 clogd[3152]: Failed to export checkpoint
Jul 15 09:54:43 hayes-01 clogd[3152]: [Qs5RVdFb] Failed to open checkpoint for 3   


Jul 15 10:50:14 hayes-02 clogd[3144]: cpg_message_callback:  Preallocated transfer
device-mapper: dm-log-clustered: [Qs5RVdFb] Request timed out: [DM_CLOG_RESUME/112g
Jul 15 10:50:29 hayes-02 kernel: device-mapper: dm-log-clustered: [Qs5RVdFb] Requeg
Jul 15 10:50:29 hayes-02 clogd[3144]: kernel_recv:  Preallocated transfer structs
Jul 15 10:50:29 hayes-02 clogd[3144]: cpg_message_callback:  Preallocated transfer 


Jul 15 10:48:35 hayes-03 clogd[3132]: cpg_message_callback:  Preallocated transfer
Jul 15 10:48:35 hayes-03 kernel: device-mapper: dm-log-clustered: [Qs5RVdFb] Requeg
Jul 15 10:48:35 hayes-03 clogd[3132]: kernel_recv:  Preallocated transfer structs
Jul 15 10:48:35 hayes-03 clogd[3132]: cpg_message_callback:  Preallocated transfer 


Version-Release number of selected component (if applicable):
2.6.18-92.1.5.el5

lvm2-2.02.39-2.el5    BUILT: Wed Jul  9 07:26:29 CDT 2008
lvm2-cluster-2.02.39-1.el5    BUILT: Thu Jul  3 09:31:57 CDT 2008
device-mapper-1.02.27-1.el5    BUILT: Thu Jul  3 03:22:29 CDT 2008
cmirror-1.1.19-2.el5    BUILT: Tue Jul  8 11:15:54 CDT 2008
kmod-cmirror-0.1.10-1.el5    BUILT: Tue May 20 14:55:48 CDT 2008


How reproducible:
Every time

Comment 1 Corey Marthaler 2008-07-15 16:06:03 UTC
*** Bug 454337 has been marked as a duplicate of this bug. ***

Comment 2 Corey Marthaler 2008-07-15 16:10:42 UTC
Non-mirrored volumes still work with small sizes:

[root@hayes-02 ~]# lvcreate -n linear -L 5M hayes
  Rounding up size to full physical extent 8.00 MB
  Logical volume "linear" created

Looks like the failure threshold is somewhere between 30M and 40M:
[root@hayes-02 ~]# lvcreate -m 1 -n mirror1 -L 100M hayes
  Logical volume "mirror1" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror2 -L 90M hayes
  Rounding up size to full physical extent 92.00 MB
  Logical volume "mirror2" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror3 -L 80M hayes
  Logical volume "mirror3" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror4 -L 70M hayes
  Rounding up size to full physical extent 72.00 MB
  Logical volume "mirror4" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror5 -L 60M hayes
  Logical volume "mirror5" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror6 -L 50M hayes
  Rounding up size to full physical extent 52.00 MB
  Logical volume "mirror6" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror7 -L 40M hayes
  Logical volume "mirror7" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirror8 -L 30M hayes
  Rounding up size to full physical extent 32.00 MB
[ FAIL ]                                                         
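
The "Rounding up size to full physical extent" messages above are consistent with a 4 MB physical extent size. That value is an assumption (it is LVM's default; the actual setting of the "hayes" volume group is not shown in this report). A minimal sketch of the rounding, useful for working out the real size of each test mirror:

```python
EXTENT_MB = 4  # assumed extent size; not stated anywhere in this report


def round_up_to_extent(size_mb, extent_mb=EXTENT_MB):
    """Round a requested LV size up to a whole number of extents."""
    extents = -(-size_mb // extent_mb)  # ceiling division on integers
    return extents * extent_mb


# Requested sizes taken from the lvcreate commands in this report.
for requested in (5, 10, 30, 41, 90):
    print(requested, "->", round_up_to_extent(requested))
```

Under that assumption the 30M request becomes a 32 MB mirror, while the 40M request needs no rounding.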

Comment 3 Jonathan Earl Brassow 2008-07-15 16:26:48 UTC
Here are the failure messages:
Jul 15 11:12:02 bp-xen-03 clogd[2176]: [1ir9GXk6] Checkpoint prepared for 2
Jul 15 11:12:02 bp-xen-03 clogd[2176]: [1ir9GXk6] Checkpoint data available for node 2
Jul 15 11:12:02 bp-xen-03 clogd[2176]: Sending checkpointed data to 2
Jul 15 11:12:02 bp-xen-03 clogd[2176]: [1ir9GXk6] Failed to open checkpoint for 2:  Reason = 7
Jul 15 11:12:02 bp-xen-03 clogd[2176]: Failed to export checkpoint

'Reason = 7' stands for 'SA_AIS_ERR_INVALID_PARAM'.
So I will need to check with Steve Dake to figure out the proper ranges for the
parameters. Everything should be the same, except the size of the sections.
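
For reading these logs, a small decoder for the "Reason = N" field may help. This is only a sketch: the table is a subset of the SaAisErrorT values as defined in the SA Forum AIS header (saAis.h), and only the mapping for 7 is confirmed by this report. The function name decode_reason is made up for illustration.

```python
import re

# Subset of SaAisErrorT values (SA Forum AIS, saAis.h).
# Only code 7 is confirmed by the log in this bug report.
SA_AIS_ERRORS = {
    1: "SA_AIS_OK",
    2: "SA_AIS_ERR_LIBRARY",
    3: "SA_AIS_ERR_VERSION",
    4: "SA_AIS_ERR_INIT",
    5: "SA_AIS_ERR_TIMEOUT",
    6: "SA_AIS_ERR_TRY_AGAIN",
    7: "SA_AIS_ERR_INVALID_PARAM",
    8: "SA_AIS_ERR_NO_MEMORY",
    9: "SA_AIS_ERR_BAD_HANDLE",
}


def decode_reason(log_line):
    """Return the symbolic error name for a 'Reason = N' log line, or None."""
    match = re.search(r"Reason = (\d+)", log_line)
    if match is None:
        return None  # line carries no reason code
    code = int(match.group(1))
    return SA_AIS_ERRORS.get(code, "unknown (%d)" % code)


line = "clogd[2176]: [1ir9GXk6] Failed to open checkpoint for 2:  Reason = 7"
print(decode_reason(line))
```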


Comment 4 Corey Marthaler 2008-07-15 16:28:47 UTC
The magic number seems to be either 39M or 40M. 
40M worked in the example above, but not here:

[root@hayes-02 ~]# lvcreate -m 1 -n mirrorA -L 50M hayes
  Rounding up size to full physical extent 52.00 MB
  Logical volume "mirrorA" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirrorB -L 41M hayes
  Rounding up size to full physical extent 44.00 MB
  Logical volume "mirrorB" created
[root@hayes-02 ~]# lvcreate -m 1 -n mirrorC -L 40M hayes
  Error locking on node hayes-03: Command timed out
  Error locking on node hayes-02: Command timed out
  Error locking on node hayes-01: Command timed out
  Aborting. Failed to activate new LV to wipe the start of it.

Comment 5 Jonathan Earl Brassow 2008-07-15 16:59:51 UTC
commit 6c8d7408095782bb00b5361a7df5973f3dcda183
Author: Jonathan Brassow <jbrassow>
Date:   Tue Jul 15 11:58:26 2008 -0500

    clogd:  Fix for bug 455453: small mirror creation fails

    Was setting the checkpoint attribute 'attr.maxSectionSize'
    with the size of the bitmap.  However, when mirrors are
    really small (<= 30M) other sections may have a larger
    size and need to be considered.
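
A minimal sketch of the idea in the commit message above, written in Python rather than the C used by clogd. The section names and byte sizes are hypothetical and are not taken from the clogd source; only the rule itself (size 'maxSectionSize' by the largest section, not by the bitmap alone) comes from the commit message.

```python
def max_section_size(section_sizes):
    """Return the size of the largest section, in bytes."""
    return max(section_sizes.values())


# Hypothetical section sizes in bytes. For a very small mirror the bitmap
# is tiny, so some other section ends up being the largest one.
small_mirror = {"bitmap": 8, "other_section": 64}
large_mirror = {"bitmap": 4096, "other_section": 64}

buggy = small_mirror["bitmap"]          # old behaviour: bitmap size only
fixed = max_section_size(small_mirror)  # new behaviour: consider every section

print("small mirror: buggy =", buggy, "fixed =", fixed)
print("large mirror: fixed =", max_section_size(large_mirror))
```

With the old rule, the small mirror's limit is smaller than one of its own sections, which would fit the invalid-parameter failure seen in comment 3. For the large mirror both rules give the same answer, which would explain why bigger mirrors were unaffected.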


Comment 7 Corey Marthaler 2008-07-15 22:22:14 UTC
Fix verified in cmirror-1.1.20-1.el5/kmod-cmirror-0.1.11-2.el5.

Comment 9 errata-xmlrpc 2009-01-20 21:25:20 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHEA-2009-0158.html