637936 – killing both redundant logs causes deadlock

Bug 637936 - killing both redundant logs causes deadlock

Summary: killing both redundant logs causes deadlock

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	lvm2
Sub Component:
Version:	5.5
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	rc
Target Release:	---
Assignee:	Petr Rockai
QA Contact:	Corey Marthaler
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2010-09-27 19:35 UTC by Corey Marthaler
Modified:	2011-01-13 22:42 UTC (History)
CC List:	9 users (show)
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2011-01-13 22:42:07 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Output of 'lvconvert --repair --use-policies vg/lv -vvvv' (559.94 KB, text/plain) 2010-10-13 17:53 UTC, Jonathan Earl Brassow	no flags	Details
Patch that was sent to lvm-devel (3.88 KB, patch) 2010-10-14 19:10 UTC, Jonathan Earl Brassow	no flags	Details \| Diff
View All

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2011:0052	0	normal	SHIPPED_LIVE	lvm2 bug fix and enhancement update	2011-01-12 17:15:25 UTC

Description Corey Marthaler 2010-09-27 19:35:04 UTC

Description of problem:
Scenario: Kill both logs of synced 2 leg redundant log mirror(s)

********* Mirror hash info for this scenario *********
* names:              syncd_both_logs_2legs_2logs_1
* sync:               1
* leg devices:        /dev/sdf1 /dev/sdd1
* log devices:        /dev/sdb1 /dev/sdg1
* failpv(s):          /dev/sdb1 /dev/sdg1
* failnode(s):        taft-01
* additional snap:    /dev/sdf1
* leg fault policy:   remove
* log fault policy:   allocate
******************************************************

Creating mirror(s) on taft-01...
taft-01: lvcreate --mirrorlog mirrored -m 1 -n syncd_both_logs_2legs_2logs_1 -L 600M helter_skelter /dev/sdf1:0-1000 /dev/sdd1:0-1000 /dev/sdb1:0-150 /dev/sdg1:0-150
Creating a snapshot volume of each of the mirrors

PV=/dev/sdg1
        syncd_both_logs_2legs_2logs_1_mlog_mimage_1: 1.4
PV=/dev/sdb1
        syncd_both_logs_2legs_2logs_1_mlog_mimage_0: 1.4
PV=/dev/sdg1
        syncd_both_logs_2legs_2logs_1_mlog_mimage_1: 1.4
PV=/dev/sdb1
        syncd_both_logs_2legs_2logs_1_mlog_mimage_0: 1.4

Waiting until all mirrors become fully syncd...
   0/1 mirror(s) are fully synced: ( 80.67% )
   1/1 mirror(s) are fully synced: ( 100.00% )

Creating ext on top of mirror(s) on taft-01...
mke2fs 1.39 (29-May-2006)
Mounting mirrored ext filesystems on taft-01...

Writing verification files (checkit) to mirror(s) on...
        ---- taft-01 ----
checkit starting with:
CREATE
Num files:          100
Random Seed:        7619
Verify XIOR Stream: /tmp/checkit_syncd_both_logs_2legs_2logs_1
Working dir:        /mnt/syncd_both_logs_2legs_2logs_1/checkit

<start name="taft-01_syncd_both_logs_2legs_2logs_1" pid="19564" time="Mon Sep 27 14:12:58 2010" type="cmd" />
Sleeping 10 seconds to get some outsanding EXT I/O locks before the failure 
Verifying files (checkit) on mirror(s) on...
        ---- taft-01 ----
checkit starting with:
VERIFY
Verify XIOR Stream: /tmp/checkit_syncd_both_logs_2legs_2logs_1
Working dir:        /mnt/syncd_both_logs_2legs_2logs_1/checkit


Disabling device sdb on taft-01
Disabling device sdg on taft-01
[DEADLOCK]


[root@taft-01 ~]# dmsetup ls
helter_skelter-snap0-cow        (253, 10)
helter_skelter-syncd_both_logs_2legs_2logs_1_mlog_mimage_0-missing_0_0  (253, 12)
helter_skelter-syncd_both_logs_2legs_2logs_1_mimage_1   (253, 6)
helter_skelter-syncd_both_logs_2legs_2logs_1_mlog_mimage_1      (253, 3)
helter_skelter-syncd_both_logs_2legs_2logs_1_mimage_0   (253, 5)
helter_skelter-syncd_both_logs_2legs_2logs_1_mlog_mimage_0      (253, 2)
helter_skelter-snap0    (253, 8)
helter_skelter-syncd_both_logs_2legs_2logs_1    (253, 7)
helter_skelter-syncd_both_logs_2legs_2logs_1_mlog       (253, 4)
helter_skelter-syncd_both_logs_2legs_2logs_1-real       (253, 9)
helter_skelter-syncd_both_logs_2legs_2logs_1_mlog_mimage_1-missing_0_0  (253, 11)


Log from taft-01:
 taft-01 qarshd[7632]: Running cmdline: echo offline > /sys/block/sdb/device/state &
 taft-01 qarshd[7635]: Talking to peer 10.15.80.47:44602
 taft-01 qarshd[7635]: Running cmdline: echo offline > /sys/block/sdg/device/state &
 taft-01 kernel: sd 1:0:0:6: rejecting I/O to offline device
 [...]
 taft-01 kernel: sd 1:0:0:1: rejecting I/O to offline device
 taft-01 kernel: device-mapper: raid1: All sides of mirror have failed.
 taft-01 lvm[6327]: Another thread is handling an event. Waiting...
 taft-01 lvm[6327]: Log device 253:4 has failed (D).
 taft-01 lvm[6327]: Device failure in helter_skelter-syncd_both_logs_2legs_2logs_1-real.
 taft-01 lvm[6327]: Couldn't find device with uuid xe95w7-0iAN-RE2G-mPaw-mXXY-eqzP-WDTdjq.
 taft-01 kernel: sd 1:0:0:1: rejecting I/O to offline device
 [...]
 taft-01 kernel: sd 1:0:0:1: rejecting I/O to offline device
 taft-01 qarshd[7639]: Running cmdline: pvs -a
 taft-01 kernel: sd 1:0:0:1: rejecting I/O to offline device
 [...]
 taft-01 kernel: sd 1:0:0:6: rejecting I/O to offline device
 taft-01 lvm[6327]: Mirror log status: 2 of 2 images failed - switching to core
 taft-01 kernel: device-mapper: raid1: All sides of mirror have failed.
 taft-01 lvm[6327]: Monitoring mirror device helter_skelter-syncd_both_logs_2legs_2logs_1_mlog for events.
 taft-01 lvm[6327]: Another thread is handling an event. Waiting...
 taft-01 lvm[6327]: Another thread is handling an event. Waiting...
 taft-01 kernel: INFO: task pdflush:301 blocked for more than 120 seconds.
 taft-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 taft-01 kernel: pdflush       D ffffffff80150839     0   301     87           302   300 (L-TLB)
 taft-01 kernel:  ffff81021ec41b30 0000000000000046 ffff81021504d380 ffff81021fb73200
 taft-01 kernel:  ffff81021fb73200 0000000000000009 ffff81021eccc820 ffff8101fff15100
 taft-01 kernel:  00000053c6a5ed6c 000000000004fa30 ffff81021eccca08 0000000100000000
 taft-01 kernel: Call Trace:
 taft-01 kernel:  [<ffffffff8006e1db>] do_gettimeofday+0x40/0x90
 taft-01 kernel:  [<ffffffff80028b14>] sync_page+0x0/0x43
 taft-01 kernel:  [<ffffffff800637ea>] io_schedule+0x3f/0x67
 taft-01 kernel:  [<ffffffff80028b52>] sync_page+0x3e/0x43
 taft-01 kernel:  [<ffffffff8006392e>] __wait_on_bit_lock+0x36/0x66
 taft-01 kernel:  [<ffffffff8003fc56>] __lock_page+0x5e/0x64
 taft-01 kernel:  [<ffffffff800a0a06>] wake_bit_function+0x0/0x23
 taft-01 kernel:  [<ffffffff8001cf60>] mpage_writepages+0x14f/0x37d
 taft-01 kernel:  [<ffffffff88050479>] :ext3:ext3_ordered_writepage+0x0/0x198
 taft-01 kernel:  [<ffffffff8005ac6f>] do_writepages+0x29/0x2f
 taft-01 kernel:  [<ffffffff8002fc33>] __writeback_single_inode+0x1ae/0x328
 taft-01 kernel:  [<ffffffff80020ff4>] sync_sb_inodes+0x1b5/0x26f
 taft-01 kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
 taft-01 kernel:  [<ffffffff80050ffd>] writeback_inodes+0x82/0xd8
 taft-01 kernel:  [<ffffffff800c9714>] wb_kupdate+0xd4/0x14e
 taft-01 kernel:  [<ffffffff8005663c>] pdflush+0x0/0x1fb
 taft-01 kernel:  [<ffffffff8005678d>] pdflush+0x151/0x1fb
 taft-01 kernel:  [<ffffffff800c9640>] wb_kupdate+0x0/0x14e
 taft-01 kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
 taft-01 kernel:  [<ffffffff8009d64e>] request_module+0x0/0x14d
 taft-01 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
 taft-01 kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
 taft-01 kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
 taft-01 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11
 taft-01 kernel:
 taft-01 kernel: INFO: task dmeventd:7545 blocked for more than 120 seconds.
 taft-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 taft-01 kernel: dmeventd      D ffffffff80150839     0  7545      1          7678  7544 (NOTLB)
 taft-01 kernel:  ffff81020a3dbbd8 0000000000000086 0000000100000000 ffff8101ffd13080
 taft-01 kernel:  0000000000000000 0000000000000008 ffff810209ca07e0 ffff8101fff15100
 taft-01 kernel:  0000005247e193ae 000000000008e6ed ffff810209ca09c8 0000000100000000
 taft-01 kernel: Call Trace:
 taft-01 kernel:  [<ffffffff80129317>] avc_alloc_node+0x3a/0x187
 taft-01 kernel:  [<ffffffff880eb530>] :dm_mod:dev_suspend+0x0/0x18c
 taft-01 kernel:  [<ffffffff8009daa3>] flush_cpu_workqueue+0x7f/0xad
 taft-01 kernel:  [<ffffffff800a09d8>] autoremove_wake_function+0x0/0x2e
 taft-01 kernel:  [<ffffffff8009dadf>] flush_workqueue+0xe/0x87
 taft-01 kernel:  [<ffffffff8810d63c>] :dm_mirror:mirror_presuspend+0x11d/0x126
 taft-01 kernel:  [<ffffffff880e8bb4>] :dm_mod:suspend_targets+0x33/0x43
 taft-01 kernel:  [<ffffffff880e85e6>] :dm_mod:dm_suspend+0x89/0x2db
 taft-01 kernel:  [<ffffffff8008cfa1>] default_wake_function+0x0/0xe
 taft-01 kernel:  [<ffffffff880eb58d>] :dm_mod:dev_suspend+0x5d/0x18c
 taft-01 kernel:  [<ffffffff880ebed8>] :dm_mod:ctl_ioctl+0x210/0x25b
 taft-01 kernel:  [<ffffffff80042181>] do_ioctl+0x55/0x6b
 taft-01 kernel:  [<ffffffff80030204>] vfs_ioctl+0x457/0x4b9
 taft-01 kernel:  [<ffffffff8004c633>] sys_ioctl+0x59/0x78
 taft-01 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
 taft-01 kernel:
 taft-01 kernel: INFO: task kmirrord:7483 blocked for more than 120 seconds.
 taft-01 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 taft-01 kernel: kmirrord      D ffffffff80150839     0  7483     87          7484  6075 (L-TLB)
 taft-01 kernel:  ffff81020a455bb0 0000000000000046 ffff81020b08d668 ffff81020a455be0
 taft-01 kernel:  0000000000000011 000000000000000a ffff8101ffdcf820 ffff8101ffdd87a0
 taft-01 kernel:  00000051fdc32143 0000000000023daa ffff8101ffdcfa08 0000000300000000
 taft-01 kernel: Call Trace:
 taft-01 kernel:  [<ffffffff8006e1db>] do_gettimeofday+0x40/0x90
 taft-01 kernel:  [<ffffffff8005a7d6>] getnstimeofday+0x10/0x28
 taft-01 kernel:  [<ffffffff800637ea>] io_schedule+0x3f/0x67
 taft-01 kernel:  [<ffffffff880ec4f8>] :dm_mod:sync_io+0xb6/0xf3
 taft-01 kernel:  [<ffffffff880ec76d>] :dm_mod:dm_io+0xb7/0xf1
 taft-01 kernel:  [<ffffffff880ec17d>] :dm_mod:vm_get_page+0x0/0x42
 taft-01 kernel:  [<ffffffff880ec0ce>] :dm_mod:vm_next_page+0x0/0x17
 taft-01 kernel:  [<ffffffff881016e3>] :dm_log:disk_flush+0x43/0x6a
 taft-01 kernel:  [<ffffffff8810ef4e>] :dm_mirror:do_mirror+0x93a/0xcbb
 taft-01 kernel:  [<ffffffff8008bae4>] find_busiest_group+0x20d/0x621
 taft-01 kernel:  [<ffffffff80062ff8>] thread_return+0x62/0xfe
 taft-01 kernel:  [<ffffffff8002e23f>] __wake_up+0x38/0x4f
 taft-01 kernel:  [<ffffffff8810e614>] :dm_mirror:do_mirror+0x0/0xcbb
 taft-01 kernel:  [<ffffffff8004d6b3>] run_workqueue+0x94/0xe4
 taft-01 kernel:  [<ffffffff80049eee>] worker_thread+0x0/0x122
 taft-01 kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
 taft-01 kernel:  [<ffffffff80049fde>] worker_thread+0xf0/0x122
 taft-01 kernel:  [<ffffffff8008cfa1>] default_wake_function+0x0/0xe
 taft-01 kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
 taft-01 kernel:  [<ffffffff8003290a>] kthread+0xfe/0x132
 taft-01 kernel:  [<ffffffff8005dfb1>] child_rip+0xa/0x11
 taft-01 kernel:  [<ffffffff800a07c0>] keventd_create_kthread+0x0/0xc4
 taft-01 kernel:  [<ffffffff8003280c>] kthread+0x0/0x132
 taft-01 kernel:  [<ffffffff8005dfa7>] child_rip+0x0/0x11


Version-Release number of selected component (if applicable):
2.6.18-194.11.3.el5

lvm2-2.02.73-2.el5    BUILT: Mon Aug 30 06:36:20 CDT 2010
lvm2-cluster-2.02.73-2.el5    BUILT: Mon Aug 30 06:38:05 CDT 2010
device-mapper-1.02.54-2.el5    BUILT: Fri Sep 10 12:00:05 CDT 2010
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009

How reproducible:
Everytime

Comment 1 Corey Marthaler 2010-09-27 19:44:41 UTC

[root@taft-01 ~]# lvs -a -o +devices
LV                                            VG             Attr   LSize   Origin                        Snap%  Log                                Copy%  Convert Devices
snap0                                         helter_skelter swi-a- 252.00M syncd_both_logs_2legs_2logs_1   0.00                                                   /dev/sdf1(150)
syncd_both_logs_2legs_2logs_1                 helter_skelter owi-a- 600.00M                                      syncd_both_logs_2legs_2logs_1_mlog 100.00         syncd_both_logs_2legs_2logs_1_mimage_0(0),syncd_both_logs_2legs_2logs_1_mimage_1(0)
[syncd_both_logs_2legs_2logs_1_mimage_0]      helter_skelter iwi-ao 600.00M                                                                                        /dev/sdf1(0)
[syncd_both_logs_2legs_2logs_1_mimage_1]      helter_skelter iwi-ao 600.00M                                                                                        /dev/sdd1(0)
[syncd_both_logs_2legs_2logs_1_mlog]          helter_skelter mwi-ao   4.00M                                                                         100.00         syncd_both_logs_2legs_2logs_1_mlog_mimage_0(0),syncd_both_logs_2legs_2logs_1_mlog_mimage_1(0)
[syncd_both_logs_2legs_2logs_1_mlog_mimage_0] helter_skelter iwi-ao   4.00M                                                                                        /dev/sdb1(0)
[syncd_both_logs_2legs_2logs_1_mlog_mimage_1] helter_skelter iwi-ao   4.00M                                                                                        /dev/sdg1(0)

Comment 2 Corey Marthaler 2010-09-27 19:46:46 UTC

Just a note that the mirror in comment #0 & #1 has a snapshot volume associated with it, however this issue has been reproduced without there being a snapshot involved.

Comment 4 Jonathan Earl Brassow 2010-10-13 15:46:54 UTC

Please identify test, I'm not sure which one it is:
        kill_primary_synced_2_legs
        kill_primary_synced_3_legs
        kill_primary_synced_core_log_2_legs
        kill_secondary_synced_2_legs
        kill_secondary_synced_3_legs
        kill_secondary_synced_core_log_2_legs
        kill_secondary_non_synced_2_legs
        kill_secondary_non_synced_3_legs
        kill_secondary_non_synced_core_log_2_legs
        kill_log_synced_2_legs
        kill_log_synced_3_legs
        kill_log_non_synced_2_legs
        kill_log_non_synced_3_legs
        kill_primary_synced_4_legs
        kill_primary_synced_core_log_4_legs
        kill_secondary_synced_4_legs
        kill_secondary_synced_core_log_4_legs
        kill_secondary_non_synced_4_legs
        kill_log_synced_4_legs
        kill_log_non_synced_4_legs
        kill_primary_and_log_synced_2_legs
        kill_primary_and_log_synced_3_legs
        kill_secondary_and_log_synced_2_legs
        kill_secondary_and_log_synced_4_legs
        kill_multiple_legs_synced_3_legs
        kill_multiple_legs_synced_4_legs
[jbrassow@silver sts-root]$ grep syncd_both_logs_2legs_2logs_1 -r lvm2
[jbrassow@silver sts-root]$

Comment 6 Jonathan Earl Brassow 2010-10-13 17:53:43 UTC

Created attachment 453256 [details]
Output of 'lvconvert --repair --use-policies vg/lv -vvvv'

***
[root@bp-01 ~]# devices
  LV                 Copy%  Devices
  LogVol00                  /dev/sda2(0)
  LogVol01                  /dev/sda2(4451)
  lv                 100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]             /dev/sdb1(0)
  [lv_mimage_1]             /dev/sdc1(0)
  [lv_mlog]          100.00 lv_mlog_mimage_0(0),lv_mlog_mimage_1(0)
  [lv_mlog_mimage_0]        /dev/sdf1(0)
  [lv_mlog_mimage_1]        /dev/sdg1(0)
[root@bp-01 ~]# ps -C dmeventd
  PID TTY          TIME CMD
[root@bp-01 ~]# off.sh sd[fg]
Turning off sdf
Turning off sdg
[root@bp-01 ~]# dmsetup status
vg-lv: 0 10485760 mirror 2 253:5 253:6 10240/10240 1 AA 3 disk 253:4 D
vg-lv_mimage_1: 0 10485760 linear
vg-lv_mimage_0: 0 10485760 linear
vg-lv_mlog_mimage_1: 0 8192 linear
vg-lv_mlog: 0 8192 mirror 2 253:2 253:3 7/8 1 DD 1 core
vg-lv_mlog_mimage_0: 0 8192 linear
VolGroup00-LogVol01: 0 20578304 linear
VolGroup00-LogVol00: 0 291700736 linear
[root@bp-01 ~]# lvconvert --repair --use-policies vg/lv -vvvv >& output.txt
***

lvconvert seems to be trying to suspend vg/lv - which will fail because the kernel target is blocking I/O (because it's log has failed).  I thought we had this logic right.  Did something go missing in the rebase?

Comment 7 Jonathan Earl Brassow 2010-10-13 18:28:27 UTC

In '_lvconvert_mirrors_repair', after removing the entire log, I pause (in gdb) before going on to attempt re-adding the mirrored log and I see the following:

[root@bp-01 ~]# dmsetup table
vg-lv: 0 10485760 mirror core 3 1024 nosync block_on_error 2 253:5 0 253:6 0
vg-lv_mimage_1: 0 10485760 linear 8:33 2048
vg-lv_mimage_0: 0 10485760 linear 8:17 2048
vg-lv_mlog_mimage_1-missing_0_0: 0 8192 error
vg-lv_mlog_mimage_1: 0 8192 linear 253:8 0
vg-lv_mlog_mimage_0: 0 8192 linear 253:9 0
vg-lv_mlog_mimage_0-missing_0_0: 0 8192 error

The devices there seem to prevent us from adding the mirror log.  Allowed to continue, we see the log is replace; but it is a linear (not mirrored) log:

[root@bp-01 ~]# devices
  /dev/sdf1: open failed: No such device or address
  /dev/sdg1: open failed: No such device or address
  LV            Copy%  Devices
  LogVol00             /dev/sda2(0)
  LogVol01             /dev/sda2(4451)
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mlog]            /dev/sde1(0)
[root@bp-01 ~]# dmsetup ls
vg-lv   (253, 7)
vg-lv_mimage_1  (253, 6)
vg-lv_mimage_0  (253, 5)
vg-lv_mlog_mimage_1-missing_0_0 (253, 8)
vg-lv_mlog_mimage_1     (253, 3)
vg-lv_mlog      (253, 4)
vg-lv_mlog_mimage_0     (253, 2)
vg-lv_mlog_mimage_0-missing_0_0 (253, 9)
VolGroup00-LogVol01     (253, 1)
VolGroup00-LogVol00     (253, 0)

... and we still have the old mlog devices left around.

This seems to be just one possible outcome.  There seem to be other ways of failing.  I'm not sure if they are all related to DM devices that are not being removed.

Comment 8 Jonathan Earl Brassow 2010-10-13 21:03:28 UTC

gdb is bouncing all over the place... I can correlate the line numbers, but it isn't doing them in the proper sequence:

(gdb) n
703             if (!log_count)
(gdb)
1250            if (!(lp->failed_pvs = _failed_pv_list(lv->vg)))
(gdb)
703             if (!log_count)
(gdb)
1250            if (!(lp->failed_pvs = _failed_pv_list(lv->vg)))
(gdb)
703             if (!log_count)
(gdb)
1265            if (failed_mirrors) {
(gdb)
1262            if (lp->mirrors == 1)
(gdb)
1265            if (failed_mirrors) {
(gdb)
1272            if (!_lv_update_log_type(cmd, lp, lv, lp->failed_pvs,

The _lv_update_log_type at 1272 is already too late... it skips over the one at 1259.  Probably have to compile w/o optimization.

Comment 9 Jonathan Earl Brassow 2010-10-13 21:45:26 UTC

After attempting to remove the mirrored log
_lvconvert_mirrors_repair 
-> _lv_update_log_type
-> remove_mirror_log
-> _remove_mirror_images
-> replace_lv_with_error_segment - followed by suspend

Why is there no new table loaded for the _mlog, which should be an error target?
[from dmsetup ls]
vg-lv_mlog_mimage_1     (253, 3)
vg-lv_mlog      (253, 4)
vg-lv_mlog_mimage_0     (253, 2)

[from dmsetup info]
Name:              vg-lv_mlog
State:             SUSPENDED
Read Ahead:        256
Tables present:    LIVE

And after the resume, why do I have -missing_ devices?  What possible good are they?  Especially since they aren't used?
[from dmsetup ls]
vg-lv_mlog_mimage_1-missing_0_0 (253, 8)
vg-lv_mlog_mimage_1     (253, 3)
vg-lv_mlog      (253, 4)
vg-lv_mlog_mimage_0     (253, 2)
vg-lv_mlog_mimage_0-missing_0_0 (253, 9)

[from dmsetup table]
vg-lv_mlog_mimage_1-missing_0_0: 0 8192 error
vg-lv_mlog_mimage_1: 0 8192 linear 253:8 0
vg-lv_mlog: 0 8192 mirror core 3 1024 nosync block_on_error 2 253:2 0 253:3 0
vg-lv_mlog_mimage_0: 0 8192 linear 253:9 0
vg-lv_mlog_mimage_0-missing_0_0: 0 8192 error

Where did this behavior come from?

Comment 10 Petr Rockai 2010-10-14 08:54:35 UTC

Jon, about gdb jumping around: you need to compile with -O0 instead of -O2 to get the runs straight. Optimized code does not usually follow the control flow of the program straightforwardly.

I will have a look at replace_lv_with_error_segment. Since you can reproduce this problem, do you think it would be possible to script this down as a regression test? It would be very valuable, both for hunting this down and for making sure it does not regress in the future... It seems you don't even need dmeventd to reproduce the problem, is that right?

Comment 11 Petr Rockai 2010-10-14 12:41:20 UTC

Jon, what version are you using here? I can kill a mirrored log and have lvconvert --repair clean it up (and even replace the legs) here, without hitting any bugs. That is with the current CVS, although nothing has changed recently in this respect. The ...-missing_0_0 mappings do not appear (they may appear during the repair temporarily, but after it is done, there's no debris left).

Comment 12 Jonathan Earl Brassow 2010-10-14 13:42:40 UTC

I know about gdb (it's why I mentioned recompiling), but I'm trying to stick with the rpms.  I'll abandon the RPMs soon in favor of CVS head.

Versions:
lvm2-2.02.73-2.el5
lvm2-cluster-2.02.73-2.el5
device-mapper-multipath-0.4.7-39.el5
device-mapper-1.02.54-2.el5
device-mapper-event-1.02.54-2.el5

The components necessary to reproduce are:
1) mirror with a mirrored log [from comment #11 you have this]
2) I/O must be occurring at time of failure [I mount and do loops of kernel untars]
3) kill both devices containing the mirrored log

I know you are doing 1 & 3 - perhaps not 2?

Ok, I won't worry about the -missing_* mappings for now.  Even w/o that, there is still no new table for _mlog - which should be error after the above mentioned suspend.

Comment 13 Jonathan Earl Brassow 2010-10-14 14:04:42 UTC

Yes, the bug can be hit without going through dmeventd.  I move dmeventd so I can run 'lvconvert --repair' directly.

Comment 14 Jonathan Earl Brassow 2010-10-14 14:06:20 UTC

I should mention too that if you are going to run 'lvconvert --repair', I typically wait for 'dmsetup status' to show that the devices have failed before running the command (note the 'D's):
[root@bp-01 ~]# dmsetup status
vg-lv: 0 10485760 mirror 2 253:5 253:6 10240/10240 1 AA 3 disk 253:4 D
vg-lv_mimage_1: 0 10485760 linear
vg-lv_mimage_0: 0 10485760 linear
vg-lv_mlog_mimage_1-missing_0_0: 0 8192 error
vg-lv_mlog_mimage_1: 0 8192 linear
vg-lv_mlog: 0 8192 mirror 2 253:2 253:3 7/8 1 DD 1 core
vg-lv_mlog_mimage_0: 0 8192 linear
vg-lv_mlog_mimage_0-missing_0_0: 0 8192 error

Comment 15 Petr Rockai 2010-10-14 16:42:36 UTC

The following change fixes the problem:

--- old-upstream/lib/metadata/mirror.c  2010-10-14 18:38:37.000000000 +0200
+++ new-upstream/lib/metadata/mirror.c  2010-10-14 18:38:37.000000000 +0200
@@ -905,6 +905,11 @@ static int _remove_mirror_images(struct 
                        return 0;
                }
 
+               if (!vg_write(detached_log_lv->vg)) {
+                       log_error("intermediate VG write failed.");
+                       return 0;
+               }
+
                /*
                 * Flush all I/Os held by mirrored log.
                 */
@@ -915,6 +920,12 @@ static int _remove_mirror_images(struct 
                        return 0;
                }
 
+               if (!vg_commit(detached_log_lv->vg)) {
+                       if (!resume_lv(detached_log_lv->vg->cmd, detached_log_lv))
+                               stack;
+                       return_0;
+               }
+
                if (!resume_lv(detached_log_lv->vg->cmd,
                               detached_log_lv)) {
                        log_error("Failed to resume %s",

Comment 16 Corey Marthaler 2010-10-14 18:25:18 UTC

Jon,

The H_S test case is "kill_both_logs_2_legs_2_logs". It doesn't show up because it's been commented out in lvm2/lib/helter_skelter/HS_Scenarios.pm due to this issue.

Comment 17 Jonathan Earl Brassow 2010-10-14 19:08:41 UTC

New patch created, tested, and sent to lvm-devel

Comment 18 Jonathan Earl Brassow 2010-10-14 19:10:07 UTC

Created attachment 453534 [details]
Patch that was sent to lvm-devel

Comment 20 Milan Broz 2010-10-15 15:09:39 UTC

Fix in lvm2-2.02.74-1.el5.

Comment 22 Corey Marthaler 2010-10-18 20:56:08 UTC

Fix verified in lvm2-2.02.74-1.el5.

Comment 24 errata-xmlrpc 2011-01-13 22:42:07 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-0052.html

Note You need to log in before you can comment on or make changes to this bug.