Bug 821454
Summary: | LVM2 issues invalid ioctl sequence that crashes kernel when snapshots of mounted raid volumes are taken | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Alasdair Kergon <agk> |
Component: | lvm2 | Assignee: | Alasdair Kergon <agk> |
Status: | CLOSED ERRATA | QA Contact: | Cluster QE <mspqa-list> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 6.3 | CC: | agk, borgan, cmarthal, dwysocha, heinzm, jbrassow, mbroz, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac |
Target Milestone: | rc | ||
Target Release: | 6.3 | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | lvm2-2.02.95-9.el6 | Doc Type: | Bug Fix |
Doc Text: | RAID is a new feature to RHEL6.3. No tech note needed. | ||
Story Points: | --- |
Clone Of: | 818371 | Environment: | |
Last Closed: | 2012-06-20 15:03:52 UTC | Type: | Bug |
Bug Depends On: | 818371 | ||
Bug Blocks: |
Description
Alasdair Kergon
2012-05-14 14:28:36 UTC
So, there is a table resume on the -real device that assumes it's a linear mapping. This is needed to set the correct size of the device before the snapshot table references it.

The strictest fix is to always load that table as an error device by that point to set the device size and then to load it with the correct table without resuming it (suppressed in the case of a linear table).

A less good alternative is to suppress the resume at that point, which would rely on the kernel not attempting to access the -real device in the ctr and skipping the device size validation.

A reminder that all 'snapshot of mirror' cases need to be tested:

- creation of first snapshot of a mirror
- creation of a second snapshot of a mirror
- removal of 2nd snapshot
- removal of 1st snapshot
- removal of mirror with 1 snapshot
- removal of mirror with 2 snapshots
- activation/deactivation of mirror with 1 snapshot
- activation/deactivation of mirror with 2 snapshots
- each removal/creation case with the mirror(+snapshots) active/inactive

All snapshot merge cases must also be tested with a mirror underneath.

I have a one-line patch (using the 'less good' method) that seems to fix the problems reported here. However, working through the other cases I listed has thrown up a related problem with the order of device removal upon the completion of a snapshot merge.

Patch to fix the basic problem.
http://sourceware.org/cgi-bin/cvsweb.cgi/LVM2/libdm/libdm-deptree.c.diff?cvsroot=lvm2&r1=text&tr1=1.163&r2=text&tr2=1.166&f=u

I can't yet spot an easy fix for snapshot merging and it might need mentioning in the technical note instead. (It might need to load an error table into the old -real device. Or we might find a way to re-order the tree dependencies for the removal. Or we might be able to suppress the resume prior to deletion safely.)

These test cases no longer fail with the latest kernel/lvm rpms. Marking verified.
2.6.32-274.el6.x86_64
lvm2-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-libs-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
lvm2-cluster-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
udev-147-2.41.el6 BUILT: Thu Mar 1 13:01:08 CST 2012
device-mapper-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-libs-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
device-mapper-event-libs-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012
cmirror-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012

Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
RAID is a new feature to RHEL6.3. No tech note needed.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0962.html
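
For readers following the description, the ordering of the 'strictest fix' can be sketched as an equivalent dmsetup command sequence. This is only an illustration under assumptions: lvm2 issues these operations through libdevmapper ioctls rather than the dmsetup CLI, the -real device is assumed to be newly created, and the patch referenced above used the 'less good' method instead. The function name, the device name `vg-lv-real`, and the placeholder table string are hypothetical. The commands are built as lists and never executed.

```python
from typing import List


def strict_real_device_commands(real_name: str, size_sectors: int,
                                real_table: str) -> List[List[str]]:
    """Build (but do not run) dmsetup calls for the 'strictest fix' ordering.

    Hypothetical helper for illustration; not part of lvm2.
    """
    if size_sectors <= 0:
        raise ValueError("size_sectors must be positive")
    # An error target covering the whole device: "<start> <length> error".
    error_table = f"0 {size_sectors} error"
    return [
        # Create the -real device with no table loaded yet.
        ["dmsetup", "create", real_name, "--notable"],
        # Load and resume the error table so the device has its final size
        # before any snapshot table references it.
        ["dmsetup", "load", real_name, "--table", error_table],
        ["dmsetup", "resume", real_name],
        # Load the correct table into the inactive slot; deliberately no
        # resume follows at this point.
        ["dmsetup", "load", real_name, "--table", real_table],
    ]


if __name__ == "__main__":
    # Placeholder table string (assumption); a real one names a target and its arguments.
    placeholder_table = "0 2048 <target and arguments>"
    for cmd in strict_real_device_commands("vg-lv-real", 2048, placeholder_table):
        print(" ".join(cmd))
```

The point of the ordering is that the only resume in the sequence applies to the error table, so the device size is set without activating a table whose constructor might be accessed too early.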