Bug 235948
| Field | Value |
|---|---|
| Summary | clvm 2-way mirrored volume with log crashes if one mirror leg and the log is lost |
| Product | [Retired] Red Hat Cluster Suite |
| Component | lvm2-cluster |
| Version | 4 |
| Status | CLOSED CURRENTRELEASE |
| Severity | medium |
| Priority | medium |
| Hardware | i686 |
| OS | Linux |
| Reporter | Mattias Haern <mattias.haern> |
| Assignee | Jonathan Earl Brassow <jbrassow> |
| QA Contact | Corey Marthaler <cmarthal> |
| CC | agk, ccaulfie, dwysocha, jbrassow, mbroz, prockai, rkenna |
| URL | http://intranet.corp.redhat.com/ic/intranet/ClusterMirrorBeta45.html |
| Fixed In Version | beta1 |
| Doc Type | Bug Fix |
| Last Closed | 2007-04-19 18:42:35 UTC |
Created attachment 152187: Cluster configuration file
Perform lvmdump to gather lvm/device-mapper information.

"* Force sudden removal of mirror disk and mirror log disk Not OK as expected. Volume crash when log disk is removed. Writing to the file system stopped and corruption occurred."

"Volume crash" - what does this mean? What was printed/logged?
"corruption occurred" - what kind of corruption? Data corruption? Metadata corruption?

New tests with beta1 showed that this no longer occurs.

If your continued testing shows that this is truly fixed, please close bug.

assigned -> modified
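The information requested above could be gathered roughly as follows. This is a minimal sketch, not something taken from the report: the helper names `run` and `collect_diag` are made up for illustration, `testvg1` is the volume group named in the report, and the script assumes the lvm2 and device-mapper tools are installed. Option support may differ between lvm2 versions, so check lvmdump(8) on the affected nodes. By default it only prints the commands (dry run).

```shell
#!/bin/sh
# Sketch only: run and collect_diag are illustrative helper names.
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Collect LVM and device-mapper state for one volume group.
collect_diag() {
    vg="$1"
    run lvmdump -m                 # dump LVM state, also gathering metadata from the PVs
    run dmsetup status             # per-target status as reported by the kernel
    run dmsetup table              # mapping tables currently loaded in the kernel
    run lvs -a -o +devices "$vg"   # all LVs, including hidden mirror sub-LVs, with their devices
}

collect_diag "${1:-testvg1}"
```

Running it with `DRY_RUN=0` as root on each cluster node, right after the failure, would execute the commands instead of printing them. The kernel messages in /var/log/messages from the same time window would answer the "what was printed/logged" question.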
Description of problem:
After creating a 2-way clustered LVM2 mirror with log, the volume crashes if both one mirror leg and the volume mirror log are removed at the same time.

Version-Release number of selected component (if applicable):
4.5 beta

How reproducible:
Every time.

Steps to Reproduce:
1. Install RHEL 4.5 beta
2. Install RHEL 4.5 cluster beta
3. Configure a mirrored clustered LVM2 volume with a log
4. Remove one mirror leg and the log

Actual results:
Volume crashed.

Expected results:
Volume continues to be available, with only one copy.

Additional info:

Test environment
----------------

Infrastructure;
* 2 x IBM xSeries 346 installed with Redhat ES 4U5beta_64
* EMC SAN with shared disks (2 x Emulex LP10000 HBAs on each server)

Cluster configuration;
* 2 nodes
* Fencing based on RSA II
* Cluster service based on following resources;
  o IP address
  o Logical volume on shared disk
  o Mount of LVM based filesystem

Tests with cluster (all tests are done on SAN disk)
---------------------------------------------------

* Convert linear volume to mirror volume with mirror log on disk
  OK.

* Initially create mirror volume with mirror log on disk
  OK.

* Initially create mirror volume with mirror log in memory (corelog)
  OK.

* Force sudden removal of mirror disk with mirror log volume intact
  OK. Volume automatically converted to linear volume. Cluster in unchanged status.

* Force sudden removal of mirror disk and mirror log disk
  Not OK as expected. Volume crash when log disk is removed. Writing to the file system stopped and corruption occurred.

* Force sudden power off on active node in cluster (log disk), with both sides of mirror intact
  OK. Mirrored volume is moved to remaining node in cluster.

* Force sudden removal of mirror disk and mirror log disk (corelog)
  Not OK. Volume is online and can be accessed, but status of volume is strange:

  [root@tnscl02cn001 ~]# vgdisplay -v testvg1
    Loaded external locking library liblvm2clusterlock.so
    Using volume group(s) on command line
    Finding volume group "testvg1"
    Wiping cache of LVM-capable devices
    Couldn't find device with uuid 'jccjQF-Ql0I-CYAp-N5Ak-tRaV-Z2IR-uWFLh4'.
    Couldn't find all physical volumes for volume group testvg1.
    Couldn't find device with uuid 'jccjQF-Ql0I-CYAp-N5Ak-tRaV-Z2IR-uWFLh4'.
    Couldn't find all physical volumes for volume group testvg1.
    Couldn't find device with uuid 'jccjQF-Ql0I-CYAp-N5Ak-tRaV-Z2IR-uWFLh4'.
    Couldn't find all physical volumes for volume group testvg1.
    Couldn't find device with uuid 'jccjQF-Ql0I-CYAp-N5Ak-tRaV-Z2IR-uWFLh4'.
    Couldn't find all physical volumes for volume group testvg1.
    Volume group "testvg1" not found

  We do not understand. First we removed only one part of the mirror, but the vgdisplay output indicates problems. Still it is possible to write to the file system. But when the node fails (and the mirror log then disappears, because it is kept in the memory of the failing node), the service fails to come up on the adaptive node, because the logical volume cannot be activated.

* Force sudden removal of mirror disk and mirror log disk (corelog) and simultaneously force sudden power off on active node in cluster
  Not OK. Cluster is trying to fail over the volume, but the volume is in the same strange status as in the previous test, and it is not possible to reactivate it.
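For anyone trying to reproduce the tests above, the mirror setups and the check for missing physical volumes might look roughly like this. It is a hedged sketch: the report does not show the exact commands used, so the helper names, the LV name and the size are placeholders, and only `testvg1` and the "Couldn't find device with uuid" message format come from the report. The lvcreate/lvconvert options are the ones documented in lvcreate(8) and lvconvert(8); the commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: helper names, LV names and sizes are placeholders,
# not the commands actually used in the report.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create a 2-way mirror (one extra copy); assumes the default log type is a disk log.
# Arguments: VG name, LV name, size.
create_mirror() {
    run lvcreate -m 1 -L "$3" -n "$2" "$1"
}

# Create a 2-way mirror whose log is kept in memory only (corelog).
create_corelog_mirror() {
    run lvcreate -m 1 --corelog -L "$3" -n "$2" "$1"
}

# Convert an existing linear LV into a 2-way mirror.
# Arguments: VG name, LV name.
convert_to_mirror() {
    run lvconvert -m 1 "$1/$2"
}

# Read LVM command output on stdin and print each missing PV UUID once.
missing_pv_uuids() {
    sed -n "s/.*Couldn't find device with uuid '\([^']*\)'.*/\1/p" | sort -u
}
```

For example, `vgdisplay -v testvg1 2>&1 | missing_pv_uuids` reduces the repeated warnings in the output above to the single UUID of the removed device, which can then be compared against `pvs -o +pv_uuid` output captured before the disk was pulled.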