Bug 531637
Summary:           core conversion or log allocation doesn't take place when lowest ID node doesn't experience the failure
Product:           Red Hat Enterprise Linux 5
Component:         Documentation-cluster
Version:           5.4
Reporter:          Corey Marthaler <cmarthal>
Assignee:          Steven J. Levine <slevine>
QA Contact:        ecs-bugs
Status:            CLOSED CURRENTRELEASE
Severity:          high
Priority:          high
CC:                agk, ccaulfie, coughlan, dwysocha, heinzm, iannis, jbrassow, jha, mbroz, mhideo, prockai
Target Milestone:  rc
Keywords:          Documentation
Hardware:          All
OS:                Linux
Doc Type:          Bug Fix
Doc Text:
    With clustered mirrors, the mirror log management is completely the responsibility of the cluster node with the currently lowest cluster ID. Therefore, when the device holding the cluster mirror log becomes unavailable on a subset of the cluster, the clustered mirror can continue operating without any impact, as long as the cluster node with lowest ID retains access to the mirror log. Since the mirror is undisturbed, no automatic corrective action (repair) is issued, either. When the lowest-ID cluster node loses access to the mirror log, however, automatic action will kick in (regardless of accessibility of the log from other nodes).
Clones:            642400 (view as bug list)
Bug Blocks:        642400, 656090
Last Closed:       2011-04-14 04:48:57 UTC
Description
Corey Marthaler
2009-10-28 22:45:09 UTC
I just reproduced this by killing the log on 2/4 nodes in the cluster.

Scenario: Kill disk log of non synced 2 leg mirror(s)

********* Mirror hash info for this scenario *********
* names:             nonsyncd_log_2legs_1
* sync:              0
* disklog:           /dev/sdh1
* failpv(s):         /dev/sdh1
* failnode(s):       taft-03 taft-04
* leg devices:       /dev/sdf1 /dev/sde1
* leg fault policy:  remove
* log fault policy:  remove
******************************************************

Creating mirror(s) on taft-04...
taft-04: lvcreate -m 1 -n nonsyncd_log_2legs_1 -L 600M helter_skelter /dev/sdf1:0-1000 /dev/sde1:0-1000 /dev/sdh1:0-150
Continuing on without fully syncd mirrors, currently at... ( 3.58% )

Creating gfs on top of mirror(s) on taft-01...
Mounting mirrored gfs filesystems on taft-01...
Mounting mirrored gfs filesystems on taft-02...
Mounting mirrored gfs filesystems on taft-03...
Mounting mirrored gfs filesystems on taft-04...

Writing verification files (checkit) to mirror(s) on...
---- taft-01 ----
---- taft-02 ----
---- taft-03 ----
---- taft-04 ----

Sleeping 10 seconds to get some outstanding GFS I/O locks before the failure

Verifying files (checkit) on mirror(s) on...
---- taft-01 ----
---- taft-02 ----
---- taft-03 ----
---- taft-04 ----

Disabling device sdh on taft-03
Disabling device sdh on taft-04

Attempting I/O to cause mirror down conversion(s) on taft-03
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.107447 seconds, 390 MB/s

Verifying the down conversion of the failed mirror(s)
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 145669664768: Input/output error
/dev/sdh1: read failed after 0 of 512 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
[...]
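The "remove" leg and log fault policies driving this scenario come from LVM's configuration file. As a rough illustration only (the parameter names below match my reading of the RHEL 5 lvm.conf; verify them against your release's shipped lvm.conf, since names and defaults have changed between releases), the policies above would correspond to settings like:

```
# Fragment of /etc/lvm/lvm.conf -- illustrative, not taken from this bug.
activation {
    # Policy when a mirror image (leg) device fails:
    # "remove" down-converts the mirror instead of allocating a new leg.
    mirror_device_fault_policy = "remove"

    # Policy when the mirror log device fails:
    # "remove" switches the mirror to an in-memory (core) log.
    mirror_log_fault_policy = "remove"
}
```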
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
[...]
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Couldn't find device with uuid 'tXGiD0-3zwu-VXo7-YtTK-omuL-xIqr-dAKOnE'.

log policy (if failed) is remove: remove

Verifying LOG device /dev/sdh1 is *NOT* in the linear(s)
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 145669554176: Input/output error
[...]
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Couldn't find device with uuid 'tXGiD0-3zwu-VXo7-YtTK-omuL-xIqr-dAKOnE'.

Verifying LEG device /dev/sdf1 *IS* in the volume(s)
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
[...]
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Couldn't find device with uuid 'tXGiD0-3zwu-VXo7-YtTK-omuL-xIqr-dAKOnE'.

Verifying LEG device /dev/sde1 *IS* in the volume(s)
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
[...]
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Couldn't find device with uuid 'tXGiD0-3zwu-VXo7-YtTK-omuL-xIqr-dAKOnE'.
Verify the dm devices associated with /dev/sdh1 are no longer present
nonsyncd_log_2legs_1_mlog on taft-01 should no longer be there
FI_engine: recover() method failed

Here's the mirror view on each node after the partial failure:

[root@taft-01 sts-rhel5.4]# lvs -a -o +devices
  LV                              VG             Attr   LSize   Origin Snap% Move Log                       Copy% Convert Devices
  nonsyncd_log_2legs_1            helter_skelter mwi-ao 600.00M             nonsyncd_log_2legs_1_mlog 100.00         nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0)
  [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 600.00M                                                     /dev/sdf1(0)
  [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 600.00M                                                     /dev/sde1(0)
  [nonsyncd_log_2legs_1_mlog]     helter_skelter lwi-ao   4.00M                                                     /dev/sdh1(0)

[root@taft-02 sts-rhel5.4]# lvs -a -o +devices
  LV                              VG             Attr   LSize   Origin Snap% Move Log                       Copy% Convert Devices
  nonsyncd_log_2legs_1            helter_skelter mwi-ao 600.00M             nonsyncd_log_2legs_1_mlog 100.00         nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0)
  [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 600.00M                                                     /dev/sdf1(0)
  [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 600.00M                                                     /dev/sde1(0)
  [nonsyncd_log_2legs_1_mlog]     helter_skelter lwi-ao   4.00M                                                     /dev/sdh1(0)

[root@taft-03 sts-rhel5.4]# lvs -a -o +devices
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 4128768: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 4186112: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 4096: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 145669554176: Input/output error
/dev/sdh1: read failed after 0 of 512 at 145669664768: Input/output error
/dev/sdh1: read failed after 0 of 512 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Couldn't find device with uuid 'tXGiD0-3zwu-VXo7-YtTK-omuL-xIqr-dAKOnE'.
[...]
  LV                              VG             Attr   LSize   Origin Snap% Move Log                       Copy% Convert Devices
  nonsyncd_log_2legs_1            helter_skelter mwi-ao 600.00M             nonsyncd_log_2legs_1_mlog 100.00         nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0)
  [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 600.00M                                                     /dev/sdf1(0)
  [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 600.00M                                                     /dev/sde1(0)
  [nonsyncd_log_2legs_1_mlog]     helter_skelter lwi-ao   4.00M                                                     unknown device(0)

[root@taft-04 sts-rhel5.4]# lvs -a -o +devices
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 4128768: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 4186112: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 4096: Input/output error
/dev/mapper/helter_skelter-nonsyncd_log_2legs_1_mlog: read failed after 0 of 4096 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 145669554176: Input/output error
/dev/sdh1: read failed after 0 of 512 at 145669664768: Input/output error
/dev/sdh1: read failed after 0 of 512 at 0: Input/output error
/dev/sdh1: read failed after 0 of 512 at 4096: Input/output error
/dev/sdh1: read failed after 0 of 2048 at 0: Input/output error
Couldn't find device with uuid 'tXGiD0-3zwu-VXo7-YtTK-omuL-xIqr-dAKOnE'.
[...]
  LV                              VG             Attr   LSize   Origin Snap% Move Log                       Copy% Convert Devices
  nonsyncd_log_2legs_1            helter_skelter mwi-ao 600.00M             nonsyncd_log_2legs_1_mlog 100.00         nonsyncd_log_2legs_1_mimage_0(0),nonsyncd_log_2legs_1_mimage_1(0)
  [nonsyncd_log_2legs_1_mimage_0] helter_skelter iwi-ao 600.00M                                                     /dev/sdf1(0)
  [nonsyncd_log_2legs_1_mimage_1] helter_skelter iwi-ao 600.00M                                                     /dev/sde1(0)
  [nonsyncd_log_2legs_1_mlog]     helter_skelter lwi-ao   4.00M                                                     unknown device(0)

This bug exists when 3/4 nodes have the log device fail.
[...]
Disabling device sdf on taft-02
Disabling device sdf on taft-04
Disabling device sdf on taft-03

Attempting I/O to cause mirror down conversion(s) on taft-02
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.105072 seconds, 399 MB/s

Verifying current sanity of lvm after the failure
Verifying FAILED device /dev/sdf1 is *NOT* in the volume(s)
failed device /dev/sdf1 should no longer be in volume on taft-01

log on taft-01:
  [nonsyncd_log_2legs_1_mlog] helter_skelter lwi-ao 4.00M /dev/sdf1(0)
log on the other tafts taft-[234]:
  [nonsyncd_log_2legs_1_mlog] helter_skelter lwi-ao 4.00M unknown device(0)

In comment #4, the fault policy was allocate, so a new log should have appeared.

Corey, as far as I understand clustered mirrors, the log is only written by the cluster node with the lowest ID. Presumably, what happens is that you are failing the log on cluster nodes that are not writing to the log, so the conversion does not happen. I think this behaviour is correct.

It would be helpful if you could verify that this is the case, though: if a log device is failed on the lowest-ID node, it should be replaced or removed as dictated by policy. If it is only failed on nodes other than the lowest-ID one, nothing should happen; the clustered mirror should function properly as long as the log is available on that one node. If you can confirm this is the case, we can close this bug. It may be necessary to update the documentation to clarify this.
Petr, that does appear to be the case. If I fail the log on only the lowest ID node, everything appears to be repaired properly. However, if the lowest ID node isn't failed, then the repair doesn't happen. If this isn't going to be fixed, then we should update the docs with this issue.

Corey, in that case, yes, we need to update the docs. I should point out that the cmirror will keep functioning properly -- only the lowest-ID cluster node actually needs the log to be accessible. To change this behaviour, a considerable redesign of cmirror would be necessary, which I don't think is feasible, nor useful: the current behaviour in this regard is, in my opinion, very reasonable.

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
With clustered mirrors, the mirror log management is completely the responsibility of the cluster node with the currently lowest cluster ID. Therefore, when the device holding the cluster mirror log becomes unavailable on a subset of the cluster, the clustered mirror can continue operating without any impact, as long as the cluster node with lowest ID retains access to the mirror log. Since the mirror is undisturbed, no automatic corrective action (repair) is issued, either. When the lowest-ID cluster node loses access to the mirror log, however, automatic action will kick in (regardless of accessibility of the log from other nodes).

This BZ only requires a 5.6 Tech Note. No patch. I believe Ryan will see the flag and add the Tech Note, independent of the state of the BZ. This text should also go into whichever manual covers this area. I am reassigning to Documentation-cluster. I assume this is needed for both the RHEL 5 and RHEL 6 documentation. Peter, do you think we need a 6.1 Tech Note as well?

I believe a manual update is good enough for 6.1; a separate technical note is probably not needed.
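Petr's rule of thumb -- only the lowest-ID cluster member manages the mirror log -- can be checked by comparing the cluster membership list against where the failure was injected. A minimal sketch, assuming a cman-based stack whose `cman_tool nodes` output has `Node Sts ... Name` columns with `M` marking members; the helper function name and the sample membership data are hypothetical, not taken from this bug:

```shell
# Hypothetical helper: given `cman_tool nodes`-style output on stdin,
# print the lowest member node ID and its name -- the node whose view
# of the mirror log device actually matters.
lowest_log_owner() {
    # Skip the header row, keep members ("M" status),
    # sort numerically by node ID, emit the first (lowest) entry.
    awk 'NR > 1 && $2 == "M" { print $1, $NF }' | sort -n | head -n 1
}

# Illustrative membership table (not real output from the taft cluster):
lowest_log_owner <<'EOF'
Node  Sts   Inc   Joined               Name
   1   M    104   2009-10-28 12:00:00  taft-01
   2   M    108   2009-10-28 12:00:05  taft-02
   3   M    112   2009-10-28 12:00:09  taft-03
   4   M    116   2009-10-28 12:00:12  taft-04
EOF
# -> 1 taft-01
```

In this bug's scenario, the log was failed on taft-03 and taft-04 while the lowest-ID node kept access to it, which is why no down-conversion or reallocation was triggered.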
Petr: I am updating the LVM docs (in RHEL 5.6 and 6.1) with this info and trying to determine where to put the information. When you say that "automatic action will kick in", do you mean the action specified by the mirror_log_fault_policy parameter in the configuration file? -Steven

I have added the paragraph in Comment 9 to the draft of the RHEL 5.6 document. This will be available when RHEL 5.6 is released.

The 5.6 document is complete and checked in for 5.6. At the 5.6 release this note will be included in the document.