Bug 203739 - [RHEL4U4] LVM2 mirror: 'vgchange -an' sometimes fails to remove mirror corresponding maps.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Version: 4.4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jonathan Earl Brassow
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 236328
Reported: 2006-08-23 15:07 UTC by Kiyoshi Ueda
Modified: 2013-04-02 23:51 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-01-26 19:09:36 UTC
Target Upstream Version:
Embargoed:


Attachments:
Test script (1.95 KB, text/plain)
2006-08-23 15:07 UTC, Kiyoshi Ueda
New test script (4.41 KB, text/plain)
2006-11-30 19:51 UTC, Jonathan Earl Brassow

Description Kiyoshi Ueda 2006-08-23 15:07:40 UTC
Description of problem:
After 'vgchange -an' is executed, some of the device-mapper maps that
belong to mirror LVs sometimes remain, even though vgchange reports success.


Version-Release number of selected component:
lvm2-2.02.06-6.0.RHEL4
device-mapper-1.02.07-4.0.RHEL4
kernel-2.6.9-42.EL


How reproducible:
Sometimes


Steps to Reproduce:
 1. If you have modified /etc/lvm/lvm.conf, reset it to the default settings.
 2. Run the attached script.
      # ./vgchange-fail.sh

 The attached script does the following:
    o creates a file and sets up a loop-back device on it
    o creates PVs as linear mappings onto the loop-back device
    o creates VGs and mirror LVs from the PVs
    o runs 'vgchange -an' and 'vgchange -ay' repeatedly until a problem occurs
      (e.g. a map remains after 'vgchange -an', or 'vgchange -an' fails)
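
The attached script itself is not reproduced in this report. As a rough sketch of the PV layout step only, the Python below regenerates the linear tables that appear in the 'dmsetup table' output under Additional info. The constants are read off that output (20 PVs, 40960 sectors each, loop device 7:0); the function names are hypothetical and are not taken from the script.

```python
# Sketch only: rebuilds the per-PV linear tables seen in this report.
# Values come from the report's output; helper names are illustrative.
SECTORS_PER_PV = 40960   # length of each pvN map in 'dmsetup table'
NR_PV = 20               # "nr_pv=20" in the script output
LOOP_DEV = "7:0"         # major:minor of /dev/loop0 in the report


def pv_table(index, sectors=SECTORS_PER_PV, backing=LOOP_DEV):
    """Return the device-mapper table line for linear PV number `index`."""
    # Each PV starts where the previous one ends on the loop device.
    return f"0 {sectors} linear {backing} {index * sectors}"


def pv_tables(nr_pv=NR_PV):
    """Map each PV name (pv0, pv1, ...) to its table line."""
    return {f"pv{i}": pv_table(i) for i in range(nr_pv)}


if __name__ == "__main__":
    for name, table in pv_tables().items():
        # A table like this can be loaded by piping it to
        # 'dmsetup create <name>', which reads the table from stdin.
        print(f"{name}: {table}")
```

The 20 PVs cover 20 * 40960 = 819200 sectors, which matches the 819200 records written when the backing file is created.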


Actual results:
The script stops with the message 'Exit test.' (this takes a few minutes).
At that point, some of the maps that belong to mirror LVs remain.
(Example output is included under Additional info below.)


Expected results:
The script keeps running without stopping (for at least 30 minutes).


Additional info:
This is not a regression.
The problem seems more likely to occur when many mirror LVs exist.

Example output of vgchange-fail.sh is below.
--------------------------------------------------------------------
[root@nec-em4 ~]# ./vgchange-fail.sh
INFO: Creating a file for loop setup. file=tmpfile
819200+0 records in
819200+0 records out
INFO: Setting up loop device. dev=/dev/loop0 file=tmpfile
INFO: Creating PV. nr_pv=20
  Physical volume "/dev/mapper/pv0" successfully created
  Physical volume "/dev/mapper/pv1" successfully created
  Physical volume "/dev/mapper/pv2" successfully created
  Physical volume "/dev/mapper/pv3" successfully created
  Physical volume "/dev/mapper/pv4" successfully created
  Physical volume "/dev/mapper/pv5" successfully created
  Physical volume "/dev/mapper/pv6" successfully created
  Physical volume "/dev/mapper/pv7" successfully created
  Physical volume "/dev/mapper/pv8" successfully created
  Physical volume "/dev/mapper/pv9" successfully created
  Physical volume "/dev/mapper/pv10" successfully created
  Physical volume "/dev/mapper/pv11" successfully created
  Physical volume "/dev/mapper/pv12" successfully created
  Physical volume "/dev/mapper/pv13" successfully created
  Physical volume "/dev/mapper/pv14" successfully created
  Physical volume "/dev/mapper/pv15" successfully created
  Physical volume "/dev/mapper/pv16" successfully created
  Physical volume "/dev/mapper/pv17" successfully created
  Physical volume "/dev/mapper/pv18" successfully created
  Physical volume "/dev/mapper/pv19" successfully created
INFO: Creating VG. nr_vg=4. each pv in a vg: 5
INFO: vgname=testvg0
  Volume group "testvg0" successfully created
INFO: vgname=testvg1
  Volume group "testvg1" successfully created
INFO: vgname=testvg2
  Volume group "testvg2" successfully created
INFO: vgname=testvg3
  Volume group "testvg3" successfully created
INFO: Creating LV.  each lv in a vg: 4
INFO: lvname=testvg0/lv0
  Logical volume "lv0" created
INFO: lvname=testvg0/lv1
  Logical volume "lv1" created
INFO: lvname=testvg0/lv2
  Logical volume "lv2" created
INFO: lvname=testvg0/lv3
  Logical volume "lv3" created
INFO: lvname=testvg1/lv0
  Logical volume "lv0" created
INFO: lvname=testvg1/lv1
  Logical volume "lv1" created
INFO: lvname=testvg1/lv2
  Logical volume "lv2" created
INFO: lvname=testvg1/lv3
  Logical volume "lv3" created
INFO: lvname=testvg2/lv0
  Logical volume "lv0" created
INFO: lvname=testvg2/lv1
  Logical volume "lv1" created
INFO: lvname=testvg2/lv2
  Logical volume "lv2" created
INFO: lvname=testvg2/lv3
  Logical volume "lv3" created
INFO: lvname=testvg3/lv0
  Logical volume "lv0" created
INFO: lvname=testvg3/lv1
  Logical volume "lv1" created
INFO: lvname=testvg3/lv2
  Logical volume "lv2" created
INFO: lvname=testvg3/lv3
  Logical volume "lv3" created
INFO: Start activate/deactivate testing.
INFO: This test stops when a problem occurs.
INFO: trials=0. deactivating all vgs.
  0 logical volume(s) in volume group "testvg3" now active
  0 logical volume(s) in volume group "testvg2" now active
  0 logical volume(s) in volume group "testvg1" now active
  0 logical volume(s) in volume group "testvg0" now active
INFO: trials=0. activating all vgs.
  4 logical volume(s) in volume group "testvg3" now active
  4 logical volume(s) in volume group "testvg2" now active
  4 logical volume(s) in volume group "testvg1" now active
  4 logical volume(s) in volume group "testvg0" now active
INFO: trials=1. deactivating all vgs.
  0 logical volume(s) in volume group "testvg3" now active
  0 logical volume(s) in volume group "testvg2" now active
  0 logical volume(s) in volume group "testvg1" now active
  0 logical volume(s) in volume group "testvg0" now active
INFO: The problem occurs. Exit test.
[root@nec-em4 ~]#
[root@nec-em4 ~]# dmsetup table
pv18: 0 40960 linear 7:0 737280
pv7: 0 40960 linear 7:0 286720
testvg0-lv1_mlog: 0 8192 linear 253:4 8576
testvg1-lv2_mlog: 0 8192 linear 253:9 16768
pv17: 0 40960 linear 7:0 696320
pv6: 0 40960 linear 7:0 245760
pv16: 0 40960 linear 7:0 655360
pv5: 0 40960 linear 7:0 204800
testvg0-lv2_mlog: 0 8192 linear 253:4 16768
pv15: 0 40960 linear 7:0 614400
pv4: 0 40960 linear 7:0 163840
testvg0-lv3_mlog: 0 8192 linear 253:4 24960
pv14: 0 40960 linear 7:0 573440
pv3: 0 40960 linear 7:0 122880
pv13: 0 40960 linear 7:0 532480
pv2: 0 40960 linear 7:0 81920
pv12: 0 40960 linear 7:0 491520
pv1: 0 40960 linear 7:0 40960
pv11: 0 40960 linear 7:0 450560
pv0: 0 40960 linear 7:0 0
pv10: 0 40960 linear 7:0 409600
pv9: 0 40960 linear 7:0 368640
testvg0-lv0_mlog: 0 8192 linear 253:4 384
pv19: 0 40960 linear 7:0 778240
pv8: 0 40960 linear 7:0 327680
[root@nec-em4 ~]#
[root@nec-em4 ~]# dmsetup ls --tree
pv18 (253:18)
 `- (7:0)
pv7 (253:7)
 `- (7:0)
testvg0-lv1_mlog (253:72)
 `-pv4 (253:4)
    `- (7:0)
testvg1-lv2_mlog (253:60)
 `-pv9 (253:9)
    `- (7:0)
pv17 (253:17)
 `- (7:0)
pv6 (253:6)
 `- (7:0)
pv16 (253:16)
 `- (7:0)
pv5 (253:5)
 `- (7:0)
testvg0-lv2_mlog (253:76)
 `-pv4 (253:4)
    `- (7:0)
pv15 (253:15)
 `- (7:0)
testvg0-lv3_mlog (253:80)
 `-pv4 (253:4)
    `- (7:0)
pv14 (253:14)
 `- (7:0)
pv3 (253:3)
 `- (7:0)
pv13 (253:13)
 `- (7:0)
pv2 (253:2)
 `- (7:0)
pv12 (253:12)
 `- (7:0)
pv1 (253:1)
 `- (7:0)
pv11 (253:11)
 `- (7:0)
pv0 (253:0)
 `- (7:0)
pv10 (253:10)
 `- (7:0)
testvg0-lv0_mlog (253:68)
 `-pv4 (253:4)
    `- (7:0)
pv19 (253:19)
 `- (7:0)
pv8 (253:8)
 `- (7:0)
[root@nec-em4 ~]#
--------------------------------------------------------------------
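
For anyone checking for the same symptom by hand, here is a minimal sketch that picks the leftover maps out of 'dmsetup table' output. It assumes the "name: table" line format shown above and matches on the "<vg>-" name prefix; the function name and the sample constant are hypothetical, and the sample lines are copied from the output in this report.

```python
# Sketch only: finds maps that still belong to the test VGs after
# 'vgchange -an'. Simplification: LVM escapes hyphens inside VG/LV names
# as "--", which this prefix match does not handle.
SAMPLE = """\
pv18: 0 40960 linear 7:0 737280
pv7: 0 40960 linear 7:0 286720
testvg0-lv1_mlog: 0 8192 linear 253:4 8576
testvg1-lv2_mlog: 0 8192 linear 253:9 16768
testvg0-lv2_mlog: 0 8192 linear 253:4 16768
testvg0-lv3_mlog: 0 8192 linear 253:4 24960
testvg0-lv0_mlog: 0 8192 linear 253:4 384
pv8: 0 40960 linear 7:0 327680
"""


def leftover_maps(dmsetup_table_output, vg_names):
    """Return the sorted names of maps whose name starts with '<vg>-'."""
    leftovers = []
    for line in dmsetup_table_output.splitlines():
        name, sep, _table = line.partition(":")
        if not sep:
            continue  # not a "name: table" line
        name = name.strip()
        if any(name.startswith(vg + "-") for vg in vg_names):
            leftovers.append(name)
    return sorted(leftovers)


if __name__ == "__main__":
    vgs = [f"testvg{i}" for i in range(4)]
    for name in leftover_maps(SAMPLE, vgs):
        print(name)
```

On the output above, only the _mlog (mirror log) sub-devices of testvg0 and testvg1 are reported; the pvN maps are the test's own linear devices and are expected to stay.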

Comment 1 Kiyoshi Ueda 2006-08-23 15:07:47 UTC
Created attachment 134722
Test script

Comment 2 Jonathan Earl Brassow 2006-11-30 19:51:52 UTC
Created attachment 142513
New test script

Updated the test script so it's more useful to QA (and so they can add it to
their regression suite):

1) when creating VGs, use the '-c n' option
2) only run vgchange on the VGs that the test creates
3) complete with success after 30 min (or after the time given with '-t <min>')
4) add proper cleanup
...
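
The new script is attached rather than shown here. Purely as an illustration of item 3, a time-boxed test loop could look like the Python sketch below: it repeats one activate/deactivate cycle until the cycle reports a problem or the time limit passes. The function name, the `step` callback, and the `clock` parameter are assumptions made for this sketch and do not come from the attachment.

```python
import time


def run_until(step, minutes=30, clock=time.monotonic):
    """Call step() until it returns False or `minutes` have elapsed.

    Returns (ok, iterations): ok is True when the time limit was reached
    without step() reporting a problem.
    """
    deadline = clock() + minutes * 60
    iterations = 0
    while clock() < deadline:
        if not step():
            return False, iterations  # problem seen: stop the test
        iterations += 1
    return True, iterations


if __name__ == "__main__":
    # Placeholder cycle; a real test would run 'vgchange -an' and
    # 'vgchange -ay' on its own VGs and check for leftover maps.
    ok, n = run_until(lambda: True, minutes=0)
    print("success" if ok else "failure", n)
```

Passing the clock in as a parameter is only a convenience so the loop can be exercised without waiting for real time to pass.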

Comment 3 Jonathan Earl Brassow 2006-11-30 20:47:52 UTC
The test clears 45 min / 35 iterations without a failure.


Comment 4 Jun'ichi NOMURA 2006-11-30 20:56:06 UTC
Did you find the cause of the problem?
As shown in comment #3 of BZ#203745 (the same bug for RHEL5),
the problem has become very rare but still happens with
recent versions of lvm2.


Comment 5 Jonathan Earl Brassow 2006-11-30 21:15:52 UTC
Hmmm, I know we had a problem where dmeventd was still waiting on the mirror
device when it got removed (causing the sub-devices to stick around).  That's
been fixed.

If this is still happening in recent versions, we can re-mark this bug as ASSIGNED.

I don't have a machine to dedicate to this effort for "days" though...

Run the above script with '-t 10080' to run it for a week.


Comment 6 Jonathan Earl Brassow 2006-11-30 21:17:39 UTC
I do have 3 machines that I can run this on overnight though...


Comment 7 Jonathan Earl Brassow 2006-12-01 15:59:01 UTC
3 machines, 1000 min apiece, no problem.


Comment 8 Corey Marthaler 2006-12-21 16:47:18 UTC
Marking verified. I have been running creation/deletion and activation/deactivation
tests for quite a while and haven't seen any issues.

