Bug 732124
Summary: | 2-leg mirrored filesystem turned read only after primary image and primary log leg device both fail | |||
---|---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Corey Marthaler <cmarthal> | |
Component: | lvm2 | Assignee: | Jonathan Earl Brassow <jbrassow> | |
Status: | CLOSED ERRATA | QA Contact: | Corey Marthaler <cmarthal> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 6.2 | CC: | agk, coughlan, dwysocha, heinzm, jbrassow, nperic, prajnoha, prockai, thornber, zkabelac | |
Target Milestone: | rc | |||
Target Release: | --- | |||
Hardware: | x86_64 | |||
OS: | Linux | |||
Whiteboard: | ||||
Fixed In Version: | lvm2-2.02.98-4.el6 | Doc Type: | Bug Fix | |
Doc Text: |
A mirror logical volume can itself have a mirrored log device. When a device backing a mirror image and a device backing the mirrored log failed at the same time, I/O errors could surface on the mirror LV instead of being handled. That is, the kernel did not absorb the I/O errors from the failed device by relying on the remaining device. File systems built on the LV then reacted to the I/O errors (ext3 and ext4, for example, turned read-only).
The cause was that the mirror was suspended for repair without the 'noflush' flag. This flag allows the kernel to requeue I/O requests that need to be retried. Because the kernel was not allowed to requeue the requests, it had no choice but to fail them with an I/O error. This issue has been corrected, and the mirror is now suspended with the 'noflush' flag.
|
Story Points: | --- | |
Clone Of: | ||||
: | 749883 | Environment: | ||
Last Closed: | 2013-02-21 08:03:19 UTC | Type: | --- | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | 825323 | |||
Bug Blocks: | 749883, 756082, 886216 |
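The 'noflush' behavior described in the Doc Text can be illustrated with a toy model. This is only a sketch in Python and not device-mapper or lvm2 code: the class `ToyMirror`, its fields, and the `run` helper are hypothetical names invented for illustration. It assumes nothing beyond what the Doc Text states: a suspend without 'noflush' forces outstanding I/O to be failed, while a suspend with 'noflush' lets it be requeued and retried after the repair.

```python
from dataclasses import dataclass, field


@dataclass
class ToyMirror:
    """Toy stand-in for a mirror device (hypothetical; not device-mapper code)."""
    pending: list = field(default_factory=list)   # I/O that hit the failed devices
    requeued: list = field(default_factory=list)  # I/O held back for a later retry
    errored: list = field(default_factory=list)   # I/O failed back to the caller

    def suspend(self, noflush: bool) -> None:
        # With noflush, outstanding I/O may be pushed back and retried later.
        # Without it, the I/O must be finished now, so it can only be failed.
        destination = self.requeued if noflush else self.errored
        destination.extend(self.pending)
        self.pending.clear()

    def resume(self) -> list:
        # After the repair, any held I/O is retried and completes.
        completed, self.requeued = self.requeued, []
        return completed


def run(noflush: bool):
    """Suspend for repair, resume, and report (completed, errored) I/O."""
    mirror = ToyMirror(pending=["write-1", "write-2"])
    mirror.suspend(noflush=noflush)
    completed = mirror.resume()
    return completed, mirror.errored


print(run(noflush=False))  # both writes end up errored
print(run(noflush=True))   # both writes complete after resume
```

In the model, the errored list in the first case corresponds to the errors that reached the file system and made it turn read-only; in the second case the caller never sees an error.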
Description
Corey Marthaler
2011-08-19 21:32:14 UTC
Similar to bug 732098:

I have not hit this bug in a weekend's worth of testing (using the helter_skelter test). I hit a different bug where some of the mirror sub-LVs did not come out of suspension. This may have blocked me from seeing this bug, but I have tested for a cumulative duration of 72+ hours and not hit this bug. I'll continue to look for it, but in the absence of hitting it myself, it will have to be verified by the reporter - after fixes for 746254/743112 are in place. (I am currently testing with the proposed fixes for 746254/743112.)

Marking as NEEDINFO until either:
1) I hit it with my continued helter_skelter testing or
2) The reporter is able to get new rpms with the aforementioned patches and is able to confirm this bug.

Attempting to reproduce this issue is blocked until redundant log mirrors work again (bug 794904).

bug 794904 has been cleared. Testing for this case is blocked behind bug 825323.

Adding QA ack for 6.4. However, testing of this bug remains blocked by bug 825323. Devel will need to provide unit testing results however before this bug can be ultimately verified by QA.

The same fix for bug 825323 appears to fix this bug also:

...
================================================================================
Iteration 92.1 started at Wed Nov 14 17:05:27 CST 2012
================================================================================
Scenario kill_pri_log_and_pri_leg_2_legs_2_logs: Kill primary leg and primary log of synced 2 leg redundant log mirror(s)
********* Mirror hash info for this scenario *********
* names: syncd_pri_leg_pri_log_2legs_2logs_1 syncd_pri_leg_pri_log_2legs_2logs_2 syncd_pri_leg_pri_log_2legs_2logs_3
* sync: 1
* striped: 0
* leg devices: /dev/sdc1 /dev/sdf1
* log devices: /dev/sdh1 /dev/sde1
* no MDA devices:
* failpv(s): /dev/sdc1 /dev/sdh1
* failnode(s): bp-01
* leg fault policy: remove
* log fault policy: remove
******************************************************
...

Ran 10 iterations of the tests without problems:

================================================================================
Iteration 9.1 started at Wed Dec 19 11:59:28 CST 2012
================================================================================
Scenario kill_pri_log_and_pri_leg_2_legs_2_logs: Kill primary leg and primary log of synced 2 leg redundant log mirror(s)
********* Mirror hash info for this scenario *********
* names: syncd_pri_leg_pri_log_2legs_2logs_1
* sync: 1
* striped: 0
* leg devices: /dev/sdh1 /dev/sdf1
* log devices: /dev/sdb1 /dev/sdd1
* no MDA devices:
* failpv(s): /dev/sdh1 /dev/sdb1
* failnode(s): r6-node01
* additional snap: /dev/sdf1
* leg fault policy: remove
* log fault policy: remove
******************************************************

Verified with:
lvm2-2.02.98-6.el6.x86_64
device-mapper-1.02.77-6.el6.x86_64
kernel-2.6.32-347.el6.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html