Bug 1290874
| Summary: | --test cache\|cachepool conversion appears to lead to corrupted locks on disk |
|---|---|
| Product: | Red Hat Enterprise Linux 7 |
| Component: | lvm2 |
| lvm2 sub component: | LVM lock daemon / lvmlockd |
| Reporter: | Corey Marthaler <cmarthal> |
| Assignee: | David Teigland <teigland> |
| QA Contact: | cluster-qe <cluster-qe> |
| Status: | CLOSED ERRATA |
| Severity: | high |
| Priority: | high |
| CC: | agk, heinzm, jbrassow, jkachuck, prajnoha, teigland, zkabelac |
| Version: | 7.2 |
| Target Milestone: | rc |
| Hardware: | x86_64 |
| OS: | Linux |
| Fixed In Version: | lvm2-2.02.156-1.el7 |
| Doc Type: | Bug Fix |
| Last Closed: | 2016-11-04 04:13:30 UTC |
| Type: | Bug |
| Bug Blocks: | 1213541, 1295577, 1313485, 1364088 |

Description
Corey Marthaler
2015-12-11 18:16:58 UTC
Created attachment 1104755: first lvconvert attempt
Created attachment 1104756: second lvconvert attempt

It's not as terrible as it seemed at first. The lvmlockd steps in lvconvert ignore the '--test' option, so they perform the normal steps of freeing the locks for the two original LVs being converted. The lvconvert then quits before creating the actual cache pool, which leaves the two LVs without locks. Any subsequent command that tries to use the lock on either LV gets error -227, which indicates that no lock was found where one was expected.

I just need to add awareness of the --test option to short-circuit the lvmlockd actions. I'm not sure how to clean up from this state at the moment. (I once had special lock options to override specific locking calls, which would allow us to easily clean up and correct unexpected failures.)

Until I can find all the locations that need to check the test mode, I've disabled test mode in shared VGs.

Disabled test mode in commit 48f270970fc526f9f0ac7d074639e8ed90346586.

Verified that the --test flag no longer works with sanlock in the latest rpms.

3.10.0-418.el7.x86_64

    lvm2-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    lvm2-libs-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    lvm2-cluster-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    device-mapper-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    device-mapper-libs-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    device-mapper-event-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    device-mapper-event-libs-1.02.126-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    device-mapper-persistent-data-0.6.2-0.1.rc8.el7    BUILT: Wed May 4 02:56:34 CDT 2016
    cmirror-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016
    sanlock-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
    sanlock-lib-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
    lvm2-lockd-2.02.156-1.el7    BUILT: Mon Jun 13 03:05:51 CDT 2016

SCENARIO - [test_cache_create]
Test cache pool volume creation and combining cache data and cache metadata volumes

    *** Cache info for this scenario ***
    *  origin (slow):  /dev/mapper/mpathc1
    *  pool (fast):    /dev/mapper/mpatha1
    ************************************

Adding "slow" and "fast" tags to corresponding pvs
Create origin (slow) volume

    lvcreate --activate ey -L 4G -n corigin cache_sanity @slow
    lvcreate --activate ey -L 2G -n test_cache cache_sanity /dev/mapper/mpatha1
    lvcreate --activate ey -L 8M -n test_cache_meta cache_sanity /dev/mapper/mpatha1

1A. Test that cache pool volume creation works by combining the cache data and cache metadata (fast) volumes

    lvconvert --yes --test --type cache-pool --poolmetadata cache_sanity/test_cache_meta cache_sanity/test_cache
      TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
      Test mode is not yet supported with lock type sanlock
    couldn't create combined cache pool volume

1B. Test that cache pool volume creation doesn't work by combining a nonexistent cache data and cache metadata (fast) volumes

    lvconvert --yes --test --type cache-pool --poolmetadata cache_sanity/test_cache_meta cache_sanity/FAKE
      TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
      Failed to find logical volume "cache_sanity/FAKE"

Check that no cache pool volume was actually created when attempted with '--test'

2A. Test the origin volume can be cached with lvm auto creating a cash pool out of cache_sanity/test_cache

    lvconvert --yes --test --type cache --cachepool cache_sanity/test_cache cache_sanity/corigin
      TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
      Test mode is not yet supported with lock type sanlock
    couldn't create cache volume and required cache pool

2B. Test the origin volume can't be cached since we are attempting with a non existing cache pool

      TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
      Test mode is not yet supported with lock type sanlock

Now actually create the cache pool by combining the cache data and cache metadata (fast) volumes

    lvconvert --yes --type cache-pool --poolmetadata cache_sanity/test_cache_meta cache_sanity/test_cache
      WARNING: Converting logical volume cache_sanity/test_cache and cache_sanity/test_cache_meta to pool's data and metadata volumes.
      THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)

3. Test the origin volume can be cached now since we have a proper cache pool (fast) volumes now

    lvconvert --yes --test --type cache --cachepool cache_sanity/test_cache cache_sanity/corigin
      TEST MODE: Metadata will NOT be updated and volumes will not be (de)activated.
      Test mode is not yet supported with lock type sanlock
    couldn't create combined cached volume

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html
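
To illustrate the failure mode analysed earlier in this report (the lock steps running even under --test, so the two LVs end up without locks), here is a minimal Python toy model. It is a sketch only and is not lvm2 or lvmlockd code: `LockStore`, `convert_to_cache_pool`, `honor_test_mode` and `NO_LOCK_FOUND` are hypothetical names invented for the example, and the only value taken from the report is the error number -227.

```python
# Toy model only: every name here is hypothetical, not lvm2 or lvmlockd code.
NO_LOCK_FOUND = -227  # error value quoted in the report for a missing LV lock


class LockStore:
    """Stands in for the per-LV locks kept on disk in a shared VG."""

    def __init__(self, lv_names):
        self.locks = set(lv_names)

    def free(self, lv_name):
        self.locks.discard(lv_name)

    def acquire(self, lv_name):
        # Returns 0 on success, NO_LOCK_FOUND when the lock is missing.
        return 0 if lv_name in self.locks else NO_LOCK_FOUND


def convert_to_cache_pool(store, data_lv, meta_lv, test_mode, honor_test_mode):
    """Model of the conversion: free the two LV locks, then build the pool."""
    if test_mode and honor_test_mode:
        # Guarded path: skip the lock steps entirely in test mode.
        return "test mode: nothing changed"
    # Lock steps. In the unguarded path they run even when test_mode is set.
    store.free(data_lv)
    store.free(meta_lv)
    if test_mode:
        # The command quits before creating the pool; the locks are already gone.
        return "test mode: quit before creating pool"
    return "pool created"  # pool creation itself is not modelled


# Unguarded: a later command finds no lock on the LV.
unguarded = LockStore(["test_cache", "test_cache_meta"])
convert_to_cache_pool(unguarded, "test_cache", "test_cache_meta",
                      test_mode=True, honor_test_mode=False)
print(unguarded.acquire("test_cache"))  # -227

# Guarded: test mode leaves the locks alone.
guarded = LockStore(["test_cache", "test_cache_meta"])
convert_to_cache_pool(guarded, "test_cache", "test_cache_meta",
                      test_mode=True, honor_test_mode=True)
print(guarded.acquire("test_cache"))  # 0
```

The guarded branch corresponds to the "short-circuit" idea from the analysis. The commit referenced in this report took a simpler route and disabled test mode in shared VGs, which is why the verification output shows "Test mode is not yet supported with lock type sanlock".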