Red Hat Bugzilla – Bug 185785
[RHEL4 U3] device-mapper mirror: Data corruption by temporal errors during recovery.
Last modified: 2013-04-02 19:51:29 EDT
Description of problem:
Temporary device errors during recovery cause data corruption,
because all errors that occur during recovery are ignored.
Version-Release number of selected component:
Steps to Reproduce:
1. Prepare some PVs (more than 2) and create a VG from them.
- /dev/sda, /dev/sdb, /dev/sdc as PVs
- vg0 contains these 3 PVs
2. Create a mirror LV and activate it.
# lvcreate -L 200M -n lv0 -m 1 vg0
3. Make filesystem on the mirror LV.
# mke2fs -j /dev/mapper/vg0-lv0
4. Disconnect one of the PVs used for the mirror LV.
# echo offline > /sys/block/sdb/device/state
This step must be completed before the recovery has finished.
5. Re-connect the PV.
# echo running > /sys/block/sdb/device/state
6. Wait until the recovery has finished.
7. Check whether the filesystem is intact.
Repeat this check many times, because read balancing may direct the
reads to the good device, so a single run may not detect the errors.
# while true; do e2fsck -f /dev/mapper/vg0-lv0; done
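Steps 6 and 7 can be scripted. The following is a minimal sketch, not part of the original report. It assumes that the 'dmsetup status' line for the mirror contains one field of the form <synced>/<total> regions, and the helper names (sync_ratio, is_synced, count_failures) are hypothetical. It also runs e2fsck with -n (read-only), unlike the loop above, so that the check itself never modifies the filesystem.

```shell
#!/bin/sh
# Hypothetical helpers for steps 6 and 7; names and the assumed status
# format are illustrative, not taken from the original report.

# Print the first field of the form <digits>/<digits> found in a status line.
# ASSUMPTION: for a mirror target this field is the synced/total region count.
sync_ratio() {
    printf '%s\n' "$1" | awk '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+\/[0-9]+$/) { print $i; exit }
    }'
}

# Succeed only if a ratio was found and synced == total.
is_synced() {
    ratio=$(sync_ratio "$1")
    [ -n "$ratio" ] && [ "${ratio%/*}" -eq "${ratio#*/}" ]
}

# Run a command N times and print how many of the runs failed.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example usage (requires root and the vg0-lv0 mirror from the steps above):
#   until is_synced "$(dmsetup status vg0-lv0)"; do sleep 1; done
#   count_failures 20 e2fsck -fn /dev/mapper/vg0-lv0
```

A non-zero count from count_failures means at least one e2fsck run reported a problem. If the LV has more than one segment, 'dmsetup status' prints several lines and this sketch only looks at the first ratio it finds.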
Actual results:
e2fsck reports filesystem errors, although no error is recorded in the
kernel log and 'dmsetup status' shows no failure.
This happens because the temporary PV failure from Step 4 through
Step 5 is ignored by the kernel.
Expected results:
e2fsck should not detect any error.
On the kernel side, errors during recovery should be handled.
The error handler should mark the failed device as "failed".
A region that had errors during recovery should remain
out-of-sync until the failed device is removed from the mirror map
or is restored and recovered correctly.
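The rule described above can be written as a toy model. This is only an illustration of the expected outcome, not kernel code. The function name recovery_outcome and its single argument (non-zero if an I/O error occurred while the region was being recovered) are hypothetical.

```shell
#!/bin/sh
# Toy model of the expected handling; the real logic belongs in the
# kernel's device-mapper mirror code. Names here are hypothetical.

# recovery_outcome ERR
#   ERR = 0        : the region was copied without error
#   ERR = non-zero : an I/O error occurred during the recovery copy
recovery_outcome() {
    if [ "$1" -ne 0 ]; then
        # Error seen: mark the device failed and keep the region
        # out-of-sync so that it is recovered again later.
        echo "device=failed region=out-of-sync"
    else
        echo "device=alive region=in-sync"
    fi
}
```

The point of the model is that a region is reported as in-sync only when its recovery completed without error.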
If the region were marked "in-sync" instead, a different data corruption could occur:
- Errors caused by a temporary device failure are detected during recovery.
- The errors are handled and the failed device is marked as "failed",
but the corresponding regions are marked as "in-sync".
- The system goes down before dmeventd takes action.
- The temporary device failure clears while the system is down.
- The system boots and the mirror map is activated with "no error" and
committed in stream U4 build 34.26. A test kernel with this patch is available
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.