Bug 732098
| Summary: | '-missing' dm devices continue to exist after mirror leg and log leg failure | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Jonathan Earl Brassow <jbrassow> |
| Status: | CLOSED WORKSFORME | QA Contact: | Corey Marthaler <cmarthal> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.2 | CC: | agk, coughlan, dwysocha, heinzm, jbrassow, prajnoha, prockai, thornber, zkabelac |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2012-07-13 19:02:48 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 756082 | ||
I have not hit this bug in a weekend's worth of testing (using the helter_skelter test). I hit a different bug where some of the mirror sub-LVs did not come out of suspension. This may have blocked me from seeing this bug, but I have tested for a cumulative duration of 72+ hours and not hit this bug. I'll continue to look for it, but in the absence of hitting it myself, it will have to be verified by the reporter - after fixes for 746254/743112 are in place. (I am currently testing with the proposed fixes for 746254/743112.) Marking as NEEDINFO until either: 1) I hit it with my continued helter_skelter testing or 2) The reporter is able to get new rpms with the aforementioned patches and is able to confirm this bug. Attempting to reproduce this issue is blocked until redundant log mirrors work again (bug 794904). bug 794904 has been cleared. I can no longer reproduce this with the latest rpms. 2.6.32-278.el6.x86_64 lvm2-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 lvm2-libs-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 lvm2-cluster-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 udev-147-2.41.el6 BUILT: Thu Mar 1 13:01:08 CST 2012 device-mapper-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 device-mapper-libs-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 device-mapper-event-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 device-mapper-event-libs-1.02.74-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 cmirror-2.02.95-10.el6 BUILT: Fri May 18 03:26:00 CDT 2012 |
Description of problem: This iteration of device failure included a 2 legged mirror with a redundant mirror log. A few issues appeared this iteration when the primary leg and primary log were failed. 1. after the repair finished, there are remaining '-missing_' dm devices 2. there are remaining "leg devices" that are not mapped to any physical device 3. the expected mimage numbers after the failure should have remained mimage_0 and mimage_1 if everything had finish properly, instead they became mimage_2 and mimage_3. Scenario: Kill primary leg and primary log of synced 2 leg redundant log mirror(s) ********* Mirror hash info for this scenario ********* * names: syncd_pri_leg_pri_log_2legs_2logs_1 * sync: 1 * striped: 0 * leg devices: /dev/sdd1 /dev/sdg1 * log devices: /dev/sdc1 /dev/sdf1 * no MDA devices: * failpv(s): /dev/sdd1 /dev/sdc1 * failnode(s): taft-01 * leg fault policy: allocate * log fault policy: allocate ****************************************************** Creating mirror(s) on taft-01... taft-01: lvcreate --mirrorlog mirrored -m 1 -n syncd_pri_leg_pri_log_2legs_2logs_1 -L 300M helter_skelter /dev/sdd1:0-1000 /dev/sdg1:0-1000 /dev/sdc1:0-150 /dev/sdf1:0-150 PV=/dev/sdd1 syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0: 5.1 PV=/dev/sdc1 syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0: 1.3 PV=/dev/sdd1 syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0: 5.1 PV=/dev/sdc1 syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0: 1.3 Waiting until all mirrors become fully syncd... 1/1 mirror(s) are fully synced: ( 100.00% ) Creating ext on top of mirror(s) on taft-01... mke2fs 1.41.12 (17-May-2010) Mounting mirrored ext filesystems on taft-01... Writing verification files (checkit) to mirror(s) on... ---- taft-01 ---- Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure Verifying files (checkit) on mirror(s) on... ---- taft-01 ---- Disabling device sdd on taft-01 Disabling device sdc on taft-01 Attempting I/O to cause mirror down conversion(s) on taft-01 10+0 records in 10+0 records out 41943040 bytes (42 MB) copied, 0.460628 s, 91.1 MB/s Verifying current sanity of lvm after the failure Verifying FAILED device /dev/sdd1 is *NOT* in the volume(s) Verifying FAILED device /dev/sdc1 is *NOT* in the volume(s) olog: 1 Verifying LOG device(s) /dev/sdf1 *ARE* in the mirror(s) Verifying LEG device /dev/sdg1 *IS* in the volume(s) verify the newly allocated dm devices were added as a result of the failures Checking EXISTENCE of syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0 on: taft-01 Checking EXISTENCE of syncd_pri_leg_pri_log_2legs_2logs_1_mimage_1 on: taft-01 Checking EXISTENCE of syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0 on: taft-01 Verifying files (checkit) on mirror(s) on... ---- taft-01 ---- Enabling device sdd on taft-01 Enabling device sdc on taft-01 Verify that each of the mirror repairs finished successfully Checking for leftover '-missing_0_0' or 'unknown devices' there should no longer be any 'missing' dm devices on taft-01 [root@taft-01 ~]# grep lvm\\[ /var/log/messages Aug 19 12:20:41 taft-01 lvm[2188]: Monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog for events. Aug 19 12:20:41 taft-01 lvm[2188]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog is now in-sync. Aug 19 12:20:41 taft-01 lvm[2188]: No longer monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog for events. Aug 19 12:20:41 taft-01 lvm[2188]: Monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog for events. Aug 19 12:20:41 taft-01 lvm[2188]: Monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 for events. Aug 19 12:20:42 taft-01 lvm[2188]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog is now in-sync. Aug 19 12:20:50 taft-01 lvm[2188]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 is now in-sync. Aug 19 12:21:28 taft-01 lvm[2188]: Primary mirror device 253:3 has failed (D). Aug 19 12:21:28 taft-01 lvm[2188]: Device failure in helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog. Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdc1: read failed after 0 of 512 at 145669554176: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdc1: read failed after 0 of 512 at 145669664768: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdc1: read failed after 0 of 512 at 0: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdc1: read failed after 0 of 512 at 4096: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdc1: read failed after 0 of 2048 at 0: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdd1: read failed after 0 of 512 at 145669554176: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdd1: read failed after 0 of 512 at 145669664768: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdd1: read failed after 0 of 512 at 0: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdd1: read failed after 0 of 512 at 4096: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: /dev/sdd1: read failed after 0 of 2048 at 0: Input/output error Aug 19 12:21:28 taft-01 lvm[2188]: Couldn't find device with uuid A1ftXq-5gK6-0Adu-RibR-CycP-S6tP-KQDoN8. Aug 19 12:21:28 taft-01 lvm[2188]: Couldn't find device with uuid Gr10TR-Zrvh-afk6-Jp8q-MdGi-0SJB-jQC28l. Aug 19 12:21:29 taft-01 lvm[2188]: Another thread is handling an event. Waiting... Aug 19 12:21:31 taft-01 lvm[2188]: LV helter_skelter/syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0 in use: not deactivating Aug 19 12:21:31 taft-01 lvm[2188]: Mirror status: 1 of 2 images failed. Aug 19 12:21:31 taft-01 lvm[2188]: Mirror log status: 1 of 2 images failed. Aug 19 12:21:31 taft-01 lvm[2188]: Trying to up-convert to 2 images, 2 logs. Aug 19 12:21:31 taft-01 lvm[2188]: Unable to create LV syncd_pri_leg_pri_log_2legs_2logs_1_mlog in Volume Group helter_skelter: name already in use. Aug 19 12:21:31 taft-01 lvm[2188]: Failed to create mirror log. Aug 19 12:21:31 taft-01 lvm[2188]: Trying to up-convert to 2 images, 1 logs. Aug 19 12:21:31 taft-01 lvm[2188]: Unable to create LV syncd_pri_leg_pri_log_2legs_2logs_1_mlog in Volume Group helter_skelter: name already in use. Aug 19 12:21:31 taft-01 lvm[2188]: Failed to create mirror log. Aug 19 12:21:31 taft-01 lvm[2188]: Trying to up-convert to 2 images, 0 logs. Aug 19 12:21:32 taft-01 lvm[2188]: WARNING: Failed to replace 2 of 2 logs in volume syncd_pri_leg_pri_log_2legs_2logs_1 Aug 19 12:21:32 taft-01 lvm[2188]: helter_skelter/syncd_pri_leg_pri_log_2legs_2logs_1: Converted: 2.7% Aug 19 12:21:47 taft-01 lvm[2188]: helter_skelter/syncd_pri_leg_pri_log_2legs_2logs_1: Converted: 100.0% Aug 19 12:22:02 taft-01 lvm[2188]: Repair of mirrored LV helter_skelter/syncd_pri_leg_pri_log_2legs_2logs_1 finished successfully. Aug 19 12:22:02 taft-01 lvm[2188]: Primary mirror device 253:6 read failed. Aug 19 12:22:02 taft-01 lvm[2188]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 is now in-sync. Aug 19 12:22:02 taft-01 lvm[2188]: helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 is now in-sync. Aug 19 12:22:04 taft-01 lvm[2188]: No longer monitoring mirror device helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog for events. [root@taft-01 ~]# dmsetup ls helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1 (253, 8) vg_taft01-lv_home (253, 2) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mimage_3 (253, 11) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_1 (253, 4) vg_taft01-lv_swap (253, 1) vg_taft01-lv_root (253, 0) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mimage_2 (253, 10) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0 (253, 3) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mimage_1 (253, 7) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0 (253, 6) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog (253, 5) helter_skelter-syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0-missing_0_0 (253, 9) [root@taft-01 ~]# lvs -a -o +devices LV Attr LSize Log Copy% Devices syncd_pri_leg_pri_log_2legs_2logs_1 mwi-ao 300.00m 100.00 syncd_pri_leg_pri_log_2legs_2logs_1_mimage_2(0),syncd_pri_leg_pri_log_2legs_2logs_1_mimage_3(0) syncd_pri_leg_pri_log_2legs_2logs_1_mimage_0 -wi-a- 300.00m /dev/sdd1(0) syncd_pri_leg_pri_log_2legs_2logs_1_mimage_1 vwi-a- 300.00m [syncd_pri_leg_pri_log_2legs_2logs_1_mimage_2] iwi-ao 300.00m /dev/sdg1(0) [syncd_pri_leg_pri_log_2legs_2logs_1_mimage_3] iwi-ao 300.00m /dev/sdh1(0) syncd_pri_leg_pri_log_2legs_2logs_1_mlog vwi-a- 4.00m syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_0 -wi-a- 4.00m /dev/sdc1(0) syncd_pri_leg_pri_log_2legs_2logs_1_mlog_mimage_1 -wi-a- 4.00m /dev/sdf1(0) Version-Release number of selected component (if applicable): 2.6.32-188.el6.x86_64 lvm2-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 lvm2-libs-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 lvm2-cluster-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 udev-147-2.37.el6 BUILT: Wed Aug 10 07:48:15 CDT 2011 device-mapper-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 device-mapper-libs-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 device-mapper-event-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 device-mapper-event-libs-1.02.66-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011 cmirror-2.02.87-1.el6 BUILT: Fri Aug 12 06:11:57 CDT 2011