Bug 430590
| Summary: | RHEL5 cmirror tracker: failed leg recovery is broken | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | cmirror | Assignee: | Jonathan Earl Brassow <jbrassow> |
| Status: | CLOSED WORKSFORME | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 5.2 | CC: | agk, ccaulfie, dwysocha, heinzm, mbroz |
| Target Milestone: | rc | Keywords: | TestBlocker |
| Target Release: | --- | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2008-07-15 16:32:36 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 430797 | | |
Description Corey Marthaler 2008-01-28 22:00:35 UTC
Although this operation didn't fail every time I attempted it, I was still able to reproduce it with the latest packages:

kmod-cmirror-0.1.5-2.el5
cmirror-1.1.8-1.el5

Here's the output from the failed lvconvert after the device was disabled and then re-enabled.

```
[root@taft-04 ~]# lvs -a -o +devices
  LV                               VG             Attr   LSize   Origin Snap% Move Log                        Copy%  Convert Devices
  LogVol00                         VolGroup00     -wi-ao  66.19G                                                             /dev/sda2(0)
  LogVol01                         VolGroup00     -wi-ao   1.94G                                                             /dev/sda2(2118)
  syncd_primary_2legs_1            helter_skelter mwi-ao 800.00M                   syncd_primary_2legs_1_mlog 100.00         syncd_primary_2legs_1_mimage_0(0),syncd_primary_2legs_1_mimage_1(0)
  [syncd_primary_2legs_1_mimage_0] helter_skelter iwi-ao 800.00M                                                             /dev/sdd1(0)
  [syncd_primary_2legs_1_mimage_1] helter_skelter iwi-ao 800.00M                                                             /dev/sdf1(0)
  [syncd_primary_2legs_1_mlog]     helter_skelter lwi-ao   4.00M                                                             /dev/sdc1(0)
[...]
```

```
Disabling device sdd on taft-01
Disabling device sdd on taft-02
Disabling device sdd on taft-03
Disabling device sdd on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-03
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.080716 seconds, 520 MB/s

Verifying the down conversion of the failed mirror(s)
  /dev/sdd1: open failed: No such device or address
Verifying FAILED device /dev/sdd1 is *NOT* in the volume(s)
  /dev/sdd1: open failed: No such device or address
Verifying LEG device /dev/sdb1 *IS* in the volume(s)
  /dev/sdd1: open failed: No such device or address
Verifying LEG device /dev/sdc1 *IS* in the volume(s)
  /dev/sdd1: open failed: No such device or address
Verifying LOG device /dev/sde1 *IS* in the mirror(s)
  /dev/sdd1: open failed: No such device or address
Verifying files (checkit) on mirror(s) on...

Enabling device sdd on taft-01
Enabling device sdd on taft-02
Enabling device sdd on taft-03
Enabling device sdd on taft-04

Recreating PV /dev/sdd1
  WARNING: Volume group helter_skelter is not consistent
  WARNING: Volume Group helter_skelter is not consistent
  WARNING: Volume group helter_skelter is not consistent
Extending the recreated PV back into VG helter_skelter

Since we can't yet up convert existing mirrors, down converting to linear(s)
on taft-03 before re-converting back to original mirror(s)

Up converting linear(s) back to mirror(s) on taft-03...
taft-03: lvconvert -m 2 helter_skelter/syncd_primary_3legs_1 /dev/sdd1:0-1000 /dev/sdb1:0-1000 /dev/sdc1:0-1000 /dev/sde1:0-150
Error locking on node taft-04: Command timed out
Problem reactivating syncd_primary_3legs_1
Error locking on node taft-04: Command timed out
couldn't up convert mirror syncd_primary_3legs_1 on taft-03
```

The last portion of the above is interesting:

```
taft-03: lvconvert -m 2 helter_skelter/syncd_primary_3legs_1 /dev/sdd1:0-1000 /dev/sdb1:0-1000 /dev/sdc1:0-1000 /dev/sde1:0-150
Error locking on node taft-04: Command timed out
Problem reactivating syncd_primary_3legs_1
Error locking on node taft-04: Command timed out
couldn't up convert mirror syncd_primary_3legs_1 on taft-03
```

Why did it time out? Are there "Timed out waiting for cluster log server" messages in the log, or did clvmd simply time out because of all the disk traffic?

This bz doesn't appear to have been reproduced in the past 5 months, so closing. Will reopen if seen again.
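For reference, the recovery sequence the test harness narrates above maps onto standard LVM commands. A minimal sketch, assuming the device, VG, and LV names from this report; the dd target path and the use of `vgreduce --removemissing` are illustrative guesses, since the harness only logs its intent:

```sh
# Write through the mirror while /dev/sdd is disabled to trigger the
# down conversion (10 x 4 MiB matches the dd totals in the log above);
# the mount point is hypothetical.
dd if=/dev/zero of=/mnt/mirror/ddfile bs=4M count=10

# After re-enabling sdd: drop the missing PV from the VG metadata,
# recreate it, and extend it back into the VG. (vgreduce
# --removemissing is an assumption; the log only says "Recreating PV".)
vgreduce --removemissing helter_skelter
pvcreate /dev/sdd1
vgextend helter_skelter /dev/sdd1

# Down convert to linear, then up convert back to the original mirror.
# The -m 2 line with its PV:PE ranges is verbatim from the report.
lvconvert -m 0 helter_skelter/syncd_primary_3legs_1
lvconvert -m 2 helter_skelter/syncd_primary_3legs_1 \
  /dev/sdd1:0-1000 /dev/sdb1:0-1000 /dev/sdc1:0-1000 /dev/sde1:0-150
```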
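One way to answer the timeout question above is to grep each node's syslog for the message quoted in the comment, to distinguish a cluster log server stall from a generic clvmd lock timeout. A sketch, assuming the messages land in /var/log/messages (the actual log destination may differ):

```sh
# Check every node in the cluster for cluster-log-server timeouts;
# /var/log/messages is the assumed syslog destination.
for node in taft-01 taft-02 taft-03 taft-04; do
  echo "== $node =="
  ssh "$node" 'grep -i "Timed out waiting for cluster log server" /var/log/messages'
done
```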