Description of problem:
[root@taft-01 ~]# lvs -a -o +devices
  LV             VG    Attr   LSize  Log      Copy%  Devices
  ha1            TAFT  Mwi-ao 25.00G ha1_mlog 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] TAFT  iwi-ao 25.00G                 /dev/sdb1(0)
  [ha1_mimage_1] TAFT  iwi-ao 25.00G                 /dev/sdc1(0)
  [ha1_mlog]     TAFT  lwi-ao  4.00M                 /dev/sdh1(0)

[root@taft-01 ~]# clustat
Cluster Status for TAFT @ Wed May  4 10:42:13 2011
Member Status: Quorate

 Member Name    ID   Status
 ------ ----    ---- ------
 taft-01        1    Online, Local, rgmanager
 taft-02        2    Online, rgmanager
 taft-03        3    Online, rgmanager
 taft-04        4    Online, rgmanager

 Service Name   Owner (Last)   State
 ------- ----   ----- ------   -----
 service:halvm  taft-01        started

[cmarthal@silver bin]$ ./revolution_9 -l /home/msp/cmarthal/work/rhel5/sts-root -r /usr/tests/sts-rhel5.6 -R ../../var/share/resource_files/taft.xml -v TAFT -A -w NONE

mirror device(s) (ha1) found in TAFT on taft-01
================================================================================
Iteration 0.1 started at Wed May 4 10:43:10 CDT 2011
================================================================================
Scenario: Kill random devices (log and legs)

********* Mirror info for this scenario *********
* mirrors:          ha1
* leg devices:      /dev/sdb1 /dev/sdc1
* log devices:      /dev/sdh1
* failpv(s):        /dev/sdh1
* failnode(s):      taft-01
* leg fault policy: remove
* log fault policy: allocate
* HA mirror:        YES
* HA service name:  halvm
* HA current owner: taft-01
*************************************************

PV=/dev/sdh1
        ha1_mlog: 1.2
PV=/dev/sdh1
        ha1_mlog: 1.2

Disabling device sdh on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
1+0 records in
1+0 records out
512 bytes (512 B) copied, 3.8e-05 seconds, 13.5 MB/s

[DEADLOCK]

taft-01 qarshd[24091]: Running cmdline: echo offline > /sys/block/sdh/device/state &
taft-01 qarshd[24097]: Running cmdline: dd if=/dev/zero of=/dev/TAFT/ha1 count=1
taft-01 qarshd[24099]: Running cmdline: sync
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
taft-01 lvm[22876]: Log device 253:2 has failed (D).
taft-01 lvm[22876]: Device failure in TAFT-ha1.
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
[...]
taft-01 kernel: sd 1:0:0:7: rejecting I/O to offline device
taft-01 lvm[22876]: Mirror log status: 1 of 1 images failed - switching to core
taft-01 kernel: device-mapper: dm-log-clustered: (built Dec 22 2009 14:39:43) installed
taft-01 kernel: device-mapper: dm-log-clustered: Unable to send cluster log request [DM_CLOG_CTR] to server: -3
taft-01 kernel: device-mapper: dm-log-clustered: Userspace cluster log server not found
taft-01 kernel: device-mapper: table: 253:5: mirror: Error creating mirror dirty log
taft-01 kernel: device-mapper: ioctl: error adding target to table
taft-01 lvm[22876]: Error locking on node taft-01: device-mapper: reload ioctl failed: Invalid argument
taft-01 lvm[22876]: Failed to lock ha1
taft-01 lvm[22876]: Repair of mirrored LV TAFT/ha1 failed.
taft-01 lvm[22876]: Failed to remove faulty devices in TAFT-ha1.
taft-01 kernel: INFO: task pdflush:303 blocked for more than 120 seconds.
taft-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
taft-01 kernel: pdflush       D ffffffff801540f1     0   303     87   304   302 (L-TLB)
taft-01 kernel:  ffff81021f53bd80 0000000000000046 ffff81021f53be3c 000000028006ec3f
taft-01 kernel:  ffff81021f53be30 000000000000000a ffff8101ffde3100 ffff8101fff24080
taft-01 kernel:  00003d20702f39b6 0000000000000d66 ffff8101ffde32e8 0000000200000000

Version-Release number of selected component (if applicable):
2.6.18-256.el5
lvm2-2.02.84-3.el5                 BUILT: Wed Apr 27 03:42:24 CDT 2011
lvm2-cluster-2.02.84-3.el5         BUILT: Wed Apr 27 03:42:43 CDT 2011
device-mapper-1.02.63-2.el5        BUILT: Fri Mar  4 10:23:17 CST 2011
device-mapper-event-1.02.63-2.el5  BUILT: Fri Mar  4 10:23:17 CST 2011
cmirror-1.1.39-10.el5              BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5          BUILT: Tue Dec 22 13:39:47 CST 2009
Why is cmirror used here (HA)? I'll reassign it to cmirror for now; probably something Jon could disentangle :-)
I was just going to ask the same question. Perhaps the 'exclusive' flag is being lost upon disk -> core log conversion?
Also, does this happen every time?
Simplified steps to reproduce:
1) create a disk-log mirror (in a cluster)
2) activate it exclusively ('-aey')
3) killall clogd on all machines
4) convert to a core-log mirror

The above should have no reason to contact clogd, because local targets should be used.
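The four steps above can be sketched as a shell sequence. This is only illustrative, not a turnkey script: the VG name "vg", the LV name "lv", and the mirror size are placeholders, and the real commands need root on every cluster node. The DRY_RUN guard (the default) only prints the commands instead of executing them.

```shell
# Sketch of the reproduction steps; "vg"/"lv" and sizes are placeholders.
# DRY_RUN=1 (default) prints each command instead of running it.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run lvcreate -m 1 --mirrorlog disk -L 5G -n lv vg   # 1) disk-log mirror in a cluster
run lvchange -aey vg/lv                             # 2) activate exclusively
run killall clogd                                   # 3) repeat on ALL cluster nodes
run lvconvert --corelog vg/lv                       # 4) disk -> core log conversion
```

Set DRY_RUN=0 on a disposable test cluster to actually run the commands.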
An exclusively activated disk-logged mirror that is converted to a core log keeps the exclusivity, BUT uses the cluster target instead of the local target. Will attach verbose output of lvconvert...
Created attachment 505714 [details] disk -> core log convert of exclusively active mirror in a cluster
Converting the log in either direction (i.e. core <-> disk) leaves the resulting target as the cluster-aware target.
Does it still fail if 'clogd' is properly running on all cluster machines?
ANY conversion (i.e. adding/removing images, or log conversions) results in using the cluster-aware target.
Will change the subject or open a new bug depending on the answer to comment #8.
Jon, I had forgotten that the test specifically kills cmirror/clogd once creation is finished, in order to attempt to use local targets. I'm now running the tests with cmirror/clogd allowed to run...
OK, so those two factors explain the problem. It seems that RHEL5 did not pull in the changes made to daemons/clvmd/lvm-functions.c:do_resume_lv() that ensure the exclusive attribute is passed down.
(By the way, it was a good thing you were killing the log server after creation...
It's more than just commit 8f1cc91a848e90e7e1c03bf141cf1a88ac17e6e4 (from the read-only git repo of LVM2)... it is still loading the cluster targets. Still looking...
Two patches are needed. The one mentioned in comment #14 preserves the loading of the local kernel target when resuming an exclusive LV. It was originally necessary for snapshots of mirrors (a feature that arrived in RHEL6, which is why the patch was not thought to be necessary in RHEL5). When converting mirrors, a second patch is necessary: one that preserves the local kernel target when /suspending/ an exclusive LV. This is because a suspend can perform a preload of the kernel target and must therefore be aware of the fact that the LV is exclusive. I will attach the two patches below, as they are to be applied to the RHEL5 LVM2 tree.
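To make the failure mode concrete, here is a small Python model (not actual LVM2 code; the function name and return strings are invented for illustration) of the decision these patches fix: which dirty-log implementation gets loaded depends on whether the "exclusive" attribute survives the suspend/resume path.

```python
# Illustrative model of the bug described in this report, NOT LVM2 source.
# In a clustered VG, an exclusively activated mirror should use the local
# core/disk log target; if the exclusive flag is dropped during suspend or
# resume, the cluster-aware target is loaded instead, which requires clogd.

def log_target(clustered_vg: bool, lv_exclusive: bool, fix_applied: bool) -> str:
    """Pick the mirror dirty-log implementation for the dm table load."""
    if not clustered_vg:
        return "local"                      # single-machine VG: always local
    if lv_exclusive and fix_applied:
        # Fixed behaviour: exclusive activation means only one node touches
        # the mirror, so the local target is safe and clogd is not needed.
        return "local"
    # Buggy behaviour (or genuinely shared activation): cluster-aware target
    # is loaded; fails with "Userspace cluster log server not found" if the
    # clogd/cmirror daemon is not running.
    return "clustered"
```

Note that a shared (non-exclusive) activation must still pick the clustered target even with the fix; the patches only stop exclusivity from being lost.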
Created attachment 505875 [details] Patch to preserve local kernel target upon resume of exclusive LV
Created attachment 505876 [details] Patch to preserve local kernel target upon suspend of exclusive LV
A further patch may be needed, to clean things up entirely. If you convert an exclusive 2-way mirror to an exclusive 3-way mirror, I think the temporary mirror that is present while converting may not have the exclusive flag and may therefore be a cluster-aware kernel target.
Created attachment 505905 [details] Patch to preserve local kernel target when up-converting exclusive cluster mirror
Need to forward-port the patches now. The patches attached above are for RHEL5.
Patches have been forward ported. Committed into LVM 2.02.86, entry: "Fix to preserve exclusive activation of mirror while up-converting."
This bug still exists in the latest rpms.

2.6.18-301.el5
lvm2-2.02.88-6.el5                 BUILT: Wed Jan 18 03:34:29 CST 2012
lvm2-cluster-2.02.88-6.el5         BUILT: Wed Jan 18 03:33:26 CST 2012
device-mapper-1.02.67-2.el5        BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5  BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-14.el5              BUILT: Wed Nov  2 17:25:33 CDT 2011
kmod-cmirror-0.1.22-3.el5          BUILT: Tue Dec 22 13:39:47 CST 2009

================================================================================
Iteration 0.1 started at Wed Jan 18 16:56:36 CST 2012
================================================================================
Scenario kill_log: Kill log

********* Mirror info for this scenario *********
* mirrors:          ha1
* leg devices:      /dev/sdb1 /dev/sdc1
* log devices:      /dev/sdh1
* failpv(s):        /dev/sdh1
* failnode(s):      taft-02
* leg fault policy: allocate
* log fault policy: allocate
* HA mirror:        YES
* HA service name:  halvm
* HA current owner: taft-02
*************************************************

Mirror Structure(s):
  LV             Attr   LSize Copy%  Devices
  ha1            Mwi-ao 5.00G 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] iwi-ao 5.00G        /dev/sdb1(0)
  [ha1_mimage_1] iwi-ao 5.00G        /dev/sdc1(0)
  [ha1_mlog]     lwi-ao 4.00M        /dev/sdh1(0)

PV=/dev/sdh1
        ha1_mlog: 1.2
PV=/dev/sdh1
        ha1_mlog: 1.2

Disabling device sdh on taft-02

Attempting I/O to cause mirror down conversion(s) on taft-02
1+0 records in
1+0 records out
512 bytes (512 B) copied, 3.7e-05 seconds, 13.8 MB/s
You don't need to use HA LVM to expose this bug; you can simply activate a mirror exclusively in a cluster and then attempt to convert it.

BEFORE FIX:
[root@taft-01 LVM2.2.02.88]# lvchange -aey vg/lv
[root@taft-01 LVM2.2.02.88]# devices vg
  LV            Copy%  Devices
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mlog]            /dev/sdh1(0)
[root@taft-01 LVM2.2.02.88]# lvconvert --corelog vg/lv
  Error locking on node taft-01: device-mapper: reload ioctl failed: Invalid argument
  Failed to lock lv
[root@taft-01 LVM2.2.02.88]# tail /var/log/messages
Jan 19 18:36:57 taft-01 kernel: dlm: got connection from 3
Jan 19 18:36:57 taft-01 kernel: dlm: connecting to 2
Jan 19 18:36:57 taft-01 kernel: dlm: got connection from 2
Jan 19 18:36:58 taft-01 clvmd: Cluster LVM daemon started - connected to CMAN
Jan 19 18:37:13 taft-01 lvm[7189]: Monitoring mirror device vg-lv for events.
Jan 19 18:37:13 taft-01 lvm[7189]: vg-lv is now in-sync.
Jan 19 18:37:37 taft-01 kernel: device-mapper: dm-log-clustered: Unable to send cluster log request [DM_CLOG_CTR] to server: -3
Jan 19 18:37:37 taft-01 kernel: device-mapper: dm-log-clustered: Userspace cluster log server not found
Jan 19 18:37:37 taft-01 kernel: device-mapper: table: 253:5: mirror: Error creating mirror dirty log
Jan 19 18:37:37 taft-01 kernel: device-mapper: ioctl: error adding target to table

AFTER FIX:
[root@taft-01 LVM2.2.02.88]# lvchange -aey vg/lv
[root@taft-01 LVM2.2.02.88]# devices
  LV            Copy%  Devices
  LogVol00             /dev/sda2(0)
  LogVol01             /dev/sda2(1868)
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mlog]            /dev/sdh1(0)
[root@taft-01 LVM2.2.02.88]# lvconvert --corelog vg/lv
  Logical volume lv converted.
[root@taft-01 LVM2.2.02.88]# devices
  LV            Copy%  Devices
  LogVol00             /dev/sda2(0)
  LogVol01             /dev/sda2(1868)
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
Created attachment 556431 [details] Rework of 2nd patch so it applies smoothly to current code base
Fixed in lvm2-2.02.88-7.el5.
Fix verified in the latest rpms.

2.6.18-305.el5
lvm2-2.02.88-7.el5                 BUILT: Fri Jan 20 14:19:36 CST 2012
lvm2-cluster-2.02.88-7.el5         BUILT: Fri Jan 20 14:21:13 CST 2012
device-mapper-1.02.67-2.el5        BUILT: Mon Oct 17 08:31:56 CDT 2011
device-mapper-event-1.02.67-2.el5  BUILT: Mon Oct 17 08:31:56 CDT 2011
cmirror-1.1.39-13.el5              BUILT: Wed Oct 26 14:50:40 CDT 2011
kmod-cmirror-0.1.22-3.el5          BUILT: Tue Dec 22 13:39:47 CST 2009

Scenario kill_log: Kill log

********* Mirror info for this scenario *********
* mirrors:          ha1
* leg devices:      /dev/sdb1 /dev/sdc1
* log devices:      /dev/sdh1
* failpv(s):        /dev/sdh1
* failnode(s):      taft-01
* leg fault policy: remove
* log fault policy: allocate
* HA mirror:        YES
* HA service name:  halvm
* HA current owner: taft-01
*************************************************

Mirror Structure(s):
  LV             Attr   LSize Copy%  Devices
  ha1            Mwi-ao 5.00G 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] iwi-ao 5.00G        /dev/sdb1(0)
  [ha1_mimage_1] iwi-ao 5.00G        /dev/sdc1(0)
  [ha1_mlog]     lwi-ao 4.00M        /dev/sdh1(0)

PV=/dev/sdh1
        ha1_mlog: 1.2
PV=/dev/sdh1
        ha1_mlog: 1.2

Disabling device sdh on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
1+0 records in
1+0 records out
512 bytes (512 B) copied, 3.7e-05 seconds, 13.8 MB/s

RELOCATING halvm FROM taft-01 TO taft-02

Verifying current sanity of lvm after the failure
Mirror Structure(s):
  LV             Attr   LSize Copy%  Devices
  ha1            Mwi-ao 5.00G 100.00 ha1_mimage_0(0),ha1_mimage_1(0)
  [ha1_mimage_0] iwi-ao 5.00G        /dev/sdb1(0)
  [ha1_mimage_1] iwi-ao 5.00G        /dev/sdc1(0)
  [ha1_mlog]     lwi-ao 4.00M        /dev/sdg1(0)

Verify that each of the mirror repairs finished successfully
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
        olog: 2.2
Verifying LOG device(s) /dev/sdh1 ARE *NOT* in the mirror/linear(s)
Verifying LEG device /dev/sdb1 *IS* in the volume(s)
Verifying LEG device /dev/sdc1 *IS* in the volume(s)
verify the newly allocated dm devices were added as a result of the failures
Checking EXISTENCE of ha1_mlog on: taft-02
Verify that the mirror image order remains the same after the down conversion

Enabling device sdh on taft-01

Up converting linear(s) back to mirror(s) on taft-02...
taft-02: lvconvert --mirrorlog disk -m 1 -b TAFT/ha1 /dev/sdb1:0-20000 /dev/sdc1:0-20000 /dev/sdh1:0-150
Verifying the up conversions by checking if all original leg devices are back in the mirror(s)
Verifying device /dev/sdb1 *IS* one of the legs in the mirror(s)
Verifying device /dev/sdc1 *IS* one of the legs in the mirror(s)
Waiting until all mirrors become fully syncd...
   1/1 mirror(s) are fully synced: ( 100.00% )
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
LVM mirrors can be operated in a single-machine or a cluster-aware mode. The cluster-aware mode requires a service daemon and an additional kernel module. HA LVM makes use of the single-machine mode and should not require the extra daemon or module as long as the LVM mirrors are activated exclusively.

Cause: This bug is caused by a failure to maintain the exclusive activation of a mirrored logical volume while that volume is being repaired or converted. When the exclusive nature of the activation is lost, the cluster-aware mode is attempted. This fails due to the lack of the necessary daemon or kernel module.

Consequence: Mirrors that are managed by HA LVM are unable to handle device failures.

Fix: The commands used to repair or convert LVM mirrors were fixed to preserve exclusive activation.

Result: When a mirror is repaired or converted, devices are added or removed, and the kernel's representation of the device changes. The fix to preserve exclusive activation ensures that only the single-machine kernel representation is used, and that the extra daemon and kernel module necessary for cluster-aware operation (which may not be present) are not invoked.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2012-0161.html