Bug 702065 - HA LVM mirror repair fails: Unable to send cluster log request [DM_CLOG_CTR] to server: -3
Summary: HA LVM mirror repair fails: Unable to send cluster log request [DM_CLOG_CTR] ...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Version: 5.6
Hardware: x86_64
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Jonathan Earl Brassow
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 786154
TreeView+ depends on / blocked
 
Reported: 2011-05-04 16:55 UTC by Corey Marthaler
Modified: 2012-02-21 06:04 UTC (History)
12 users (show)

Fixed In Version: lvm2-2.02.88-7.el5
Doc Type: Bug Fix
Doc Text:
LVM mirrors can be operated in a single-machine or cluster-aware mode. The cluster-aware mode requires a service daemon and an additional kernel module. HA LVM makes use of the single-machine mode and should not require the extra daemon or module as long as the LVM mirrors are active exclusively. Cause: This bug is caused by a failure to maintain the exclusive activation of a mirrored logical volume when that volume is being repaired or converted. When the exclusive nature of the activation is lost, the cluster-aware mode is attempted. This fails due to the lack of the necessary daemon or kernel module. Consequence: Mirrors that are managed by HA-LVM are unable to handle device failures. Fix: The commands that are used to repair or convert LVM mirrors were fixed in order to preserve exclusive activation. Result: When a mirror is repaired or converted, devices are added or removed. The kernel's representation of the device is changed when this happens. The fix to preserve exclusive activation ensures that only the single-machine kernel representation is used and the extra daemon and kernel module necessary for cluster-aware operation - which may not be present - are not invoked.
Clone Of:
Environment:
Last Closed: 2012-02-21 06:04:16 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
disk -> core log convert of exclusively active mirror in a cluster (58.99 KB, text/plain)
2011-06-20 22:27 UTC, Jonathan Earl Brassow
no flags Details
Patch to preserve local kernel target upon resume of exclusive LV (5.35 KB, patch)
2011-06-21 18:22 UTC, Jonathan Earl Brassow
no flags Details | Diff
Patch to preserve local kernel target upon suspend of exclusive LV (4.52 KB, patch)
2011-06-21 18:23 UTC, Jonathan Earl Brassow
no flags Details | Diff
Patch to preserve local kernel target when up-converting exclusive cluster mirror (2.76 KB, patch)
2011-06-22 00:59 UTC, Jonathan Earl Brassow
no flags Details | Diff
Rework of 2nd patch so it applies smoothly to current code base (3.48 KB, patch)
2012-01-20 00:39 UTC, Jonathan Earl Brassow
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2012:0161 0 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2012-02-20 15:07:59 UTC

Description Corey Marthaler 2011-05-04 16:55:41 UTC
Description of problem:

[root@taft-01 ~]# lvs -a -o +devices
  LV             VG    Attr   LSize  Log      Copy%  Devices
  ha1            TAFT  Mwi-ao 25.00G ha1_mlog 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] TAFT  iwi-ao 25.00G                 /dev/sdb1(0)
  [ha1_mimage_1] TAFT  iwi-ao 25.00G                 /dev/sdc1(0)
  [ha1_mlog]     TAFT  lwi-ao  4.00M                 /dev/sdh1(0)

[root@taft-01 ~]# clustat
Cluster Status for TAFT @ Wed May  4 10:42:13 2011
Member Status: Quorate

 Member Name    ID   Status
 ------ ----    ---- ------
 taft-01           1 Online, Local, rgmanager
 taft-02           2 Online, rgmanager
 taft-03           3 Online, rgmanager
 taft-04           4 Online, rgmanager

 Service Name    Owner (Last)   State
 ------- ----    ----- ------   -----
 service:halvm   taft-01        started


[cmarthal@silver bin]$ ./revolution_9 -l /home/msp/cmarthal/work/rhel5/sts-root -r /usr/tests/sts-rhel5.6 -R ../../var/share/resource_files/taft.xml -v TAFT -A -w NONE
mirror device(s) (ha1) found in TAFT on taft-01

================================================================================
Iteration 0.1 started at Wed May  4 10:43:10 CDT 2011
================================================================================
Scenario: Kill random devices (log and legs)

********* Mirror info for this scenario *********
* mirrors:            ha1
* leg devices:        /dev/sdb1 /dev/sdc1
* log devices:        /dev/sdh1
* failpv(s):          /dev/sdh1
* failnode(s):        taft-01
* leg fault policy:   remove
* log fault policy:   allocate

* HA mirror:          YES
* HA service name:    halvm
* HA current owner:   taft-01
*************************************************

PV=/dev/sdh1
        ha1_mlog: 1.2
PV=/dev/sdh1
        ha1_mlog: 1.2


Disabling device sdh on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
1+0 records in
1+0 records out
512 bytes (512 B) copied, 3.8e-05 seconds, 13.5 MB/s
[DEADLOCK]


taft-01 qarshd[24091]: Running cmdline: echo offline > /sys/block/sdh/device/state &
taft-01 qarshd[24097]: Running cmdline: dd if=/dev/zero of=/dev/TAFT/ha1 count=1
taft-01 qarshd[24099]: Running cmdline: sync
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
taft-01 lvm[22876]: Log device 253:2 has failed (D).
taft-01 lvm[22876]: Device failure in TAFT-ha1.
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
[...]
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
taft-01 lvm[22876]: Mirror log status: 1 of 1 images failed - switching to core
taft-01 kernel: device-mapper: dm-log-clustered: (built Dec 22 2009 14:39:43) installed
taft-01 kernel: device-mapper: dm-log-clustered: Unable to send cluster log request [DM_CLOG_CTR] to server: -3
taft-01 kernel: device-mapper: dm-log-clustered:  Userspace cluster log server not found
taft-01 kernel: device-mapper: table: 253:5: mirror: Error creating mirror dirty log
taft-01 kernel: device-mapper: ioctl: error adding target to table
taft-01 lvm[22876]: Error locking on node taft-01: device-mapper: reload ioctl failed: Invalid argument
taft-01 lvm[22876]: Failed to lock ha1
taft-01 lvm[22876]: Repair of mirrored LV TAFT/ha1 failed.
taft-01 lvm[22876]: Failed to remove faulty devices in TAFT-ha1.
taft-01 kernel: INFO: task pdflush:303 blocked for more than 120 seconds.
taft-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
taft-01 kernel: pdflush       D ffffffff801540f1     0   303     87           304   302 (L-TLB)
taft-01 kernel:  ffff81021f53bd80 0000000000000046 ffff81021f53be3c 000000028006ec3f
taft-01 kernel:  ffff81021f53be30 000000000000000a ffff8101ffde3100 ffff8101fff24080
taft-01 kernel:  00003d20702f39b6 0000000000000d66 ffff8101ffde32e8 0000000200000000



Version-Release number of selected component (if applicable):
2.6.18-256.el5

lvm2-2.02.84-3.el5    BUILT: Wed Apr 27 03:42:24 CDT 2011
lvm2-cluster-2.02.84-3.el5    BUILT: Wed Apr 27 03:42:43 CDT 2011
device-mapper-1.02.63-2.el5    BUILT: Fri Mar  4 10:23:17 CST 2011
device-mapper-event-1.02.63-2.el5    BUILT: Fri Mar  4 10:23:17 CST 2011
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

Comment 1 Milan Broz 2011-06-06 11:31:21 UTC
Why is cmirror user here (HA)?
I'll reassing it to cmirror for now, problably something Jon could disentangle :-)

Comment 2 Jonathan Earl Brassow 2011-06-20 18:06:19 UTC
I was just going to ask the same question.  Perhaps the 'exclusive' flag is being lost upon disk -> core log conversion?

Comment 3 Jonathan Earl Brassow 2011-06-20 20:36:45 UTC
also, does this happen everytime?

Comment 4 Jonathan Earl Brassow 2011-06-20 21:09:58 UTC
simplified steps to reproduce:

1) create disk log mirror (in a cluster)
2) activate 'aey'
3) killall clogd on all machines
4) convert to core log mirror

The above should have no reason to contact clogd, because local targets should be used.

Comment 5 Jonathan Earl Brassow 2011-06-20 22:27:04 UTC
An exclusively activated disk-logged mirror that converts to corelog keeps the exclusivity, BUT uses the cluster target instead of the local target.

will attach verbose output of lvconvert...

Comment 6 Jonathan Earl Brassow 2011-06-20 22:27:57 UTC
Created attachment 505714 [details]
disk -> core log convert of exclusively active mirror in a cluster

Comment 7 Jonathan Earl Brassow 2011-06-20 22:44:57 UTC
convert of the log from either direction leaves the resulting target as the cluster-aware target (i.e. core <-> disk)

Comment 8 Jonathan Earl Brassow 2011-06-20 22:45:46 UTC
does it still fail if 'clogd' is properly running on all cluster machines?

Comment 9 Jonathan Earl Brassow 2011-06-20 22:50:24 UTC
ANY convert (i.e +/- images or log converts) results in using cluster-aware target.

Comment 10 Jonathan Earl Brassow 2011-06-20 22:51:13 UTC
Will change subject or open new bug depending on answer to comment #8

Comment 11 Corey Marthaler 2011-06-20 23:23:17 UTC
Jon I had forgotten that the test specifically kills cmirror/clogd once creation is finished in order to attempt to use local targets. I'm now running the tests with cmirror/clogd allowed to run...

Comment 12 Jonathan Earl Brassow 2011-06-21 03:08:40 UTC
ok, so those two factors explain the problem.

It seems that RHEL5 did not pull in the changes that were made to
daemons/clvmd/lvm-functions.c:do_resume_lv() that ensures that the exclusive attribute is passed down.

Comment 13 Jonathan Earl Brassow 2011-06-21 03:10:41 UTC
(By the way, it was a good thing you were killing the log server after creation...

Comment 14 Jonathan Earl Brassow 2011-06-21 17:01:37 UTC
it's more than just commit 8f1cc91a848e90e7e1c03bf141cf1a88ac17e6e4 (from read-only git repo of LVM2)... it is still loading the cluster targets.

Still looking...

Comment 15 Jonathan Earl Brassow 2011-06-21 18:21:14 UTC
Two patches are needed.  The one mentioned in comment #14 preserves the loading of the local kernel target when resuming an exclusive LV.  It was originally necessary for snapshots of mirrors (something that came in RHEL6, which is why the patch was not thought to be necessary in rhel5).

When converting mirrors, a second patch is necessary - one that preserves the local kernel target when /suspending/ an exclusive LV.  This is because a suspend can perform a preload of the kernel target and therefore must be aware of the fact that the LV is exclusive.

I will attach the 2 patches below, as they are to be applied to the rhel5 LVM2 tree.

Comment 16 Jonathan Earl Brassow 2011-06-21 18:22:26 UTC
Created attachment 505875 [details]
Patch to preserve local kernel target upon resume of exclusive LV

Comment 17 Jonathan Earl Brassow 2011-06-21 18:23:08 UTC
Created attachment 505876 [details]
Patch to preserve local kernel target upon suspend of exclusive LV

Comment 18 Jonathan Earl Brassow 2011-06-21 18:27:01 UTC
A further patch may be needed, to clean things up entirely.

If you convert an exclusive 2-way mirror to an exclusive 3-way mirror, I think the temporary mirror that is present while converting may not have the exclusive flag and may therefore be a cluster-aware kernel target.

Comment 19 Jonathan Earl Brassow 2011-06-22 00:59:29 UTC
Created attachment 505905 [details]
Patch to preserve local kernel target when up-converting exclusive cluster mirror

Comment 20 Jonathan Earl Brassow 2011-06-22 01:00:41 UTC
Need to forward port patches now.  The attached patches above are for rhel5.

Comment 23 Jonathan Earl Brassow 2011-06-23 14:05:03 UTC
Patches have been forward ported.  Committed into LVM 2.02.86, entry:
"Fix to preserve exclusive activation of mirror while up-converting."

Comment 30 Corey Marthaler 2012-01-19 16:25:26 UTC
This bug still exists in the latest rpms.

2.6.18-301.el5

lvm2-2.02.88-6.el5    BUILT: Wed Jan 18 03:34:29 CST 2012
lvm2-cluster-2.02.88-6.el5    BUILT: Wed Jan 18 03:33:26 CST 2012
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5    BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009



================================================================================
Iteration 0.1 started at Wed Jan 18 16:56:36 CST 2012
================================================================================
Scenario kill_log: Kill log

********* Mirror info for this scenario *********
* mirrors:            ha1
* leg devices:        /dev/sdb1 /dev/sdc1
* log devices:        /dev/sdh1
* failpv(s):          /dev/sdh1
* failnode(s):        taft-02
* leg fault policy:   allocate
* log fault policy:   allocate

* HA mirror:          YES
* HA service name:    halvm
* HA current owner:   taft-02
*************************************************

Mirror Structure(s):
  LV             Attr   LSize  Copy%  Devices                        
  ha1            Mwi-ao  5.00G 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] iwi-ao  5.00G        /dev/sdb1(0)                   
  [ha1_mimage_1] iwi-ao  5.00G        /dev/sdc1(0)                   
  [ha1_mlog]     lwi-ao  4.00M        /dev/sdh1(0)                   

PV=/dev/sdh1
        ha1_mlog: 1.2
PV=/dev/sdh1
        ha1_mlog: 1.2


Disabling device sdh on taft-02

Attempting I/O to cause mirror down conversion(s) on taft-02
1+0 records in
1+0 records out
512 bytes (512 B) copied, 3.7e-05 seconds, 13.8 MB/s

Comment 31 Jonathan Earl Brassow 2012-01-20 00:38:18 UTC
You don't need to use HA LVM to expose this bug, you can simply activate a mirror exclusively in a cluster and then attempt to convert it.


BEFORE FIX:
[root@taft-01 LVM2.2.02.88]# lvchange -aey vg/lv
[root@taft-01 LVM2.2.02.88]# devices vg
  LV            Copy%  Devices                      
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)                 
  [lv_mimage_1]        /dev/sdc1(0)                 
  [lv_mlog]            /dev/sdh1(0)                 
[root@taft-01 LVM2.2.02.88]# lvconvert --corelog vg/lv
  Error locking on node taft-01: device-mapper: reload ioctl failed: Invalid argument
  Failed to lock lv
[root@taft-01 LVM2.2.02.88]# tail /var/log/messages 
Jan 19 18:36:57 taft-01 kernel: dlm: got connection from 3
Jan 19 18:36:57 taft-01 kernel: dlm: connecting to 2
Jan 19 18:36:57 taft-01 kernel: dlm: got connection from 2
Jan 19 18:36:58 taft-01 clvmd: Cluster LVM daemon started - connected to CMAN
Jan 19 18:37:13 taft-01 lvm[7189]: Monitoring mirror device vg-lv for events.
Jan 19 18:37:13 taft-01 lvm[7189]: vg-lv is now in-sync.
Jan 19 18:37:37 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send cluster log request [DM_CLOG_CTR] to server: -3
Jan 19 18:37:37 taft-01 kernel: device-mapper: dm-log-clustered:  Userspace cluster log server not found
Jan 19 18:37:37 taft-01 kernel: device-mapper: table: 253:5: mirror: Error creating mirror dirty log
Jan 19 18:37:37 taft-01 kernel: device-mapper: ioctl: error adding target to table


AFTER FIX:
[root@taft-01 LVM2.2.02.88]# lvchange -aey vg/lv
[root@taft-01 LVM2.2.02.88]# devices 
  LV            Copy%  Devices                      
  LogVol00             /dev/sda2(0)                 
  LogVol01             /dev/sda2(1868)              
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)                 
  [lv_mimage_1]        /dev/sdc1(0)                 
  [lv_mlog]            /dev/sdh1(0)                 
[root@taft-01 LVM2.2.02.88]# lvconvert --corelog vg/lv
  Logical volume lv converted.
[root@taft-01 LVM2.2.02.88]# devices 
  LV            Copy%  Devices                      
  LogVol00             /dev/sda2(0)                 
  LogVol01             /dev/sda2(1868)              
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)                 
  [lv_mimage_1]        /dev/sdc1(0)

Comment 32 Jonathan Earl Brassow 2012-01-20 00:39:49 UTC
Created attachment 556431 [details]
Rework of 2nd patch so it applies smoothly to current code base

Comment 33 Milan Broz 2012-01-20 20:16:52 UTC
Fixed in lvm2-2.02.88-7.el5.

Comment 35 Corey Marthaler 2012-01-25 23:14:31 UTC
Fix verified in the latest rpms.

2.6.18-305.el5

lvm2-2.02.88-7.el5    BUILT: Fri Jan 20 14:19:36 CST 2012
lvm2-cluster-2.02.88-7.el5    BUILT: Fri Jan 20 14:21:13 CST 2012
device-mapper-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5    BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-13.el5    BUILT: Wed Oct 26 14:50:40 CDT 2011
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009



Scenario kill_log: Kill log

********* Mirror info for this scenario *********
* mirrors:            ha1
* leg devices:        /dev/sdb1 /dev/sdc1
* log devices:        /dev/sdh1
* failpv(s):          /dev/sdh1
* failnode(s):        taft-01
* leg fault policy:   remove
* log fault policy:   allocate

* HA mirror:          YES
* HA service name:    halvm
* HA current owner:   taft-01
*************************************************

Mirror Structure(s):
  LV             Attr   LSize  Copy%  Devices                        
  ha1            Mwi-ao  5.00G 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] iwi-ao  5.00G        /dev/sdb1(0)                   
  [ha1_mimage_1] iwi-ao  5.00G        /dev/sdc1(0)                   
  [ha1_mlog]     lwi-ao  4.00M        /dev/sdh1(0)                   

PV=/dev/sdh1
        ha1_mlog: 1.2
PV=/dev/sdh1
        ha1_mlog: 1.2


Disabling device sdh on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
1+0 records in
1+0 records out
512 bytes (512 B) copied, 3.7e-05 seconds, 13.8 MB/s

RELOCATING halvm FROM taft-01 TO taft-02

Verifying current sanity of lvm after the failure

Mirror Structure(s):
  LV             Attr   LSize  Copy%  Devices                        
  ha1            Mwi-ao  5.00G 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] iwi-ao  5.00G        /dev/sdb1(0)                   
  [ha1_mimage_1] iwi-ao  5.00G        /dev/sdc1(0)                   
  [ha1_mlog]     lwi-ao  4.00M        /dev/sdg1(0)                   

Verify that each of the mirror repairs finished successfully

Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
olog: 2.2
Verifying LOG device(s) /dev/sdh1 ARE *NOT* in the mirror/linear(s)
Verifying LEG device /dev/sdb1 *IS* in the volume(s)
Verifying LEG device /dev/sdc1 *IS* in the volume(s)
verify the newly allocated dm devices were added as a result of the failures
Checking EXISTENCE of ha1_mlog on:  taft-02

Verify that the mirror image order remains the same after the down conversion
Enabling device sdh on taft-01

Up converting linear(s) back to mirror(s) on taft-02...
taft-02: lvconvert --mirrorlog disk -m 1 -b TAFT/ha1 /dev/sdb1:0-20000 /dev/sdc1:0-20000 /dev/sdh1:0-150
Verifying the up conversions by checking if all original leg devices are back in the mirror(s)
Verifying device /dev/sdb1 *IS* one of the legs in the mirror(s)
Verifying device /dev/sdc1 *IS* one of the legs in the mirror(s)

Waiting until all mirrors become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )

Comment 36 Jonathan Earl Brassow 2012-02-01 15:37:23 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
LVM mirrors can be operated in a single-machine or cluster-aware mode.  The cluster-aware mode requires a service daemon and an additional kernel module.  HA LVM makes use of the single-machine mode and should not require the extra daemon or module as long as the LVM mirrors are active exclusively.

Cause:
This bug is caused by a failure to maintain the exclusive activation of a mirrored logical volume when that volume is being repaired or converted.  When the exclusive nature of the activation is lost, the cluster-aware mode is attempted.  This fails due to the lack of the necessary daemon or kernel module.

Consequence:
Mirrors that are managed by HA-LVM are unable to handle device failures.

Fix:
The commands that are used to repair or convert LVM mirrors were fixed in order to preserve exclusive activation.

Result:
When a mirror is repaired or converted, devices are added or removed.  The kernel's representation of the device is changed when this happens.  The fix to preserve exclusive activation ensures that only the single-machine kernel representation is used and the extra daemon and kernel module necessary for cluster-aware operation - which may not be present - are not invoked.

Comment 37 errata-xmlrpc 2012-02-21 06:04:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0161.html


Note You need to log in before you can comment on or make changes to this bug.