Bug 203745

Summary: [RHEL5 Alpha] LVM2 mirror: 'vgchange -an' sometimes fails to remove mirror corresponding maps.
Product: Red Hat Enterprise Linux 5
Reporter: Kiyoshi Ueda <kueda>
Component: lvm2
Assignee: Petr Rockai <prockai>
Status: CLOSED ERRATA
QA Contact:
Severity: medium
Docs Contact:
Priority: medium
Version: 5.0
CC: agk, coughlan, dwysocha, i-kitayama, jbrassow, jnomura, junichi.nomura, kueda, kueda, mbroz
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: RHBA-2007-0516
Doc Type: Bug Fix
Last Closed: 2007-11-07 16:48:43 UTC
Bug Depends On:    
Bug Blocks: 228988    
Attachments:
    Test script (flags: none)

Description Kiyoshi Ueda 2006-08-23 15:25:18 UTC
Description of problem:
After 'vgchange -an' is executed, some of the device-mapper maps
belonging to the mirrors sometimes remain even though vgchange succeeds.


Version-Release number of selected component:
lvm2-2.02.09-1.0.RHEL5
device-mapper-1.02.09-1.0.RHEL5
kernel-2.6.17-1.2519.4.5.el5


How reproducible:
Sometimes


Steps to Reproduce:
 1. If you have modified /etc/lvm/lvm.conf, reset it to the default settings.
 2. Run attached script.
      # ./vgchange-fail.sh

 The attached script does the following:
    o creates a file and sets up a loop-back device on it
    o creates PVs as linear mappings onto the loop-back device
    o creates VGs and mirror LVs from the PVs
    o runs 'vgchange -an' and 'vgchange -ay' repeatedly until a problem occurs
      (e.g. a map remains after 'vgchange -an', or 'vgchange -an' fails)
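
 The last step could be sketched roughly as follows. This is not the
 attached script, only an illustration of the loop it describes: the VGS
 variable and the function names maps_remain and run_trials are
 assumptions, and the VG names are taken from the output example.

```shell
#!/bin/bash
# Sketch of the activate/deactivate loop; not the attached script itself.
# VGS and the function names are assumptions made for illustration.
VGS="testvg0 testvg1 testvg2 testvg3"

# Succeeds (exit 0) if stdin, in `dmsetup table` format, still lists
# any map whose name starts with a test VG name.
maps_remain() {
    grep -Eq '^testvg[0-9]+-'
}

# Deactivate and reactivate the test VGs until a command fails or
# a test-VG map is still present after deactivation.
run_trials() {
    local trial=0
    while :; do
        echo "INFO: trials=$trial. deactivating all vgs."
        vgchange -an $VGS || return 1
        if dmsetup table | maps_remain; then
            echo "INFO: The problem occurs. Exit test."
            return 1
        fi
        echo "INFO: trials=$trial. activating all vgs."
        vgchange -ay $VGS || return 1
        trial=$((trial + 1))
    done
}
```

 Running run_trials as root, once the PVs, VGs and mirror LVs exist, would
 loop until a deactivation leaves a testvg map behind or a vgchange fails.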


Actual results:
The script stops with the 'Exit test.' message. (This takes a few minutes.)
At that point, some of the maps belonging to the mirrors remain.
(An example of the output is included under Additional info below.)


Expected results:
The script keeps running without stopping (for at least 30 minutes).


Additional info:
This problem seems more likely to occur when many mirror LVs exist.

Example output of vgchange-fail.sh is below.
--------------------------------------------------------------------
[root@nec-em4 ~]# ./vgchange-fail.sh
INFO: Creating a file for loop setup. file=tmpfile
819200+0 records in
819200+0 records out
419430400 bytes (419 MB) copied, 22.9046 seconds, 18.3 MB/s
INFO: Setting up loop device. dev=/dev/loop0 file=tmpfile
INFO: Creating PV. nr_pv=20
  Physical volume "/dev/mapper/pv0" successfully created
  Physical volume "/dev/mapper/pv1" successfully created
  Physical volume "/dev/mapper/pv2" successfully created
  Physical volume "/dev/mapper/pv3" successfully created
  Physical volume "/dev/mapper/pv4" successfully created
  Physical volume "/dev/mapper/pv5" successfully created
  Physical volume "/dev/mapper/pv6" successfully created
  Physical volume "/dev/mapper/pv7" successfully created
  Physical volume "/dev/mapper/pv8" successfully created
  Physical volume "/dev/mapper/pv9" successfully created
  Physical volume "/dev/mapper/pv10" successfully created
  Physical volume "/dev/mapper/pv11" successfully created
  Physical volume "/dev/mapper/pv12" successfully created
  Physical volume "/dev/mapper/pv13" successfully created
  Physical volume "/dev/mapper/pv14" successfully created
  Physical volume "/dev/mapper/pv15" successfully created
  Physical volume "/dev/mapper/pv16" successfully created
  Physical volume "/dev/mapper/pv17" successfully created
  Physical volume "/dev/mapper/pv18" successfully created
  Physical volume "/dev/mapper/pv19" successfully created
INFO: Creating VG. nr_vg=4. each pv in a vg: 5
INFO: vgname=testvg0
  Volume group "testvg0" successfully created
INFO: vgname=testvg1
  Volume group "testvg1" successfully created
INFO: vgname=testvg2
  Volume group "testvg2" successfully created
INFO: vgname=testvg3
  Volume group "testvg3" successfully created
INFO: Creating LV.  each lv in a vg: 4
INFO: lvname=testvg0/lv0
  Logical volume "lv0" created
INFO: lvname=testvg0/lv1
  Logical volume "lv1" created
INFO: lvname=testvg0/lv2
  Logical volume "lv2" created
INFO: lvname=testvg0/lv3
  Logical volume "lv3" created
INFO: lvname=testvg1/lv0
  Logical volume "lv0" created
INFO: lvname=testvg1/lv1
  Logical volume "lv1" created
INFO: lvname=testvg1/lv2
  Logical volume "lv2" created
INFO: lvname=testvg1/lv3
  Logical volume "lv3" created
INFO: lvname=testvg2/lv0
  Logical volume "lv0" created
INFO: lvname=testvg2/lv1
  Logical volume "lv1" created
INFO: lvname=testvg2/lv2
  Logical volume "lv2" created
INFO: lvname=testvg2/lv3
  Logical volume "lv3" created
INFO: lvname=testvg3/lv0
  Logical volume "lv0" created
INFO: lvname=testvg3/lv1
  Logical volume "lv1" created
INFO: lvname=testvg3/lv2
  Logical volume "lv2" created
INFO: lvname=testvg3/lv3
  Logical volume "lv3" created
INFO: Start activate/deactivate testing.
INFO: This test stops when a problem occurs.
INFO: trials=0. deactivating all vgs.
  0 logical volume(s) in volume group "testvg3" now active
  0 logical volume(s) in volume group "testvg2" now active
  0 logical volume(s) in volume group "testvg1" now active
  0 logical volume(s) in volume group "testvg0" now active
INFO: trials=0. activating all vgs.
  4 logical volume(s) in volume group "testvg3" now active
  4 logical volume(s) in volume group "testvg2" now active
  4 logical volume(s) in volume group "testvg1" now active
  4 logical volume(s) in volume group "testvg0" now active
INFO: trials=1. deactivating all vgs.
  0 logical volume(s) in volume group "testvg3" now active
  0 logical volume(s) in volume group "testvg2" now active
  0 logical volume(s) in volume group "testvg1" now active
  0 logical volume(s) in volume group "testvg0" now active
INFO: The problem occurs. Exit test.
[root@nec-em4 ~]#
[root@nec-em4 ~]# dmsetup table
pv18: 0 40960 linear 7:0 737280
pv7: 0 40960 linear 7:0 286720
testvg0-lv1_mlog: 0 8192 linear 253:4 8576
testvg1-lv2_mlog: 0 8192 linear 253:9 16768
pv17: 0 40960 linear 7:0 696320
pv6: 0 40960 linear 7:0 245760
pv16: 0 40960 linear 7:0 655360
pv5: 0 40960 linear 7:0 204800
testvg0-lv2_mlog: 0 8192 linear 253:4 16768
pv15: 0 40960 linear 7:0 614400
pv4: 0 40960 linear 7:0 163840
testvg0-lv3_mlog: 0 8192 linear 253:4 24960
pv14: 0 40960 linear 7:0 573440
pv3: 0 40960 linear 7:0 122880
pv13: 0 40960 linear 7:0 532480
pv2: 0 40960 linear 7:0 81920
pv12: 0 40960 linear 7:0 491520
pv1: 0 40960 linear 7:0 40960
pv11: 0 40960 linear 7:0 450560
pv0: 0 40960 linear 7:0 0
pv10: 0 40960 linear 7:0 409600
pv9: 0 40960 linear 7:0 368640
testvg0-lv0_mlog: 0 8192 linear 253:4 384
pv19: 0 40960 linear 7:0 778240
pv8: 0 40960 linear 7:0 327680
[root@nec-em4 ~]#
[root@nec-em4 ~]# dmsetup ls --tree
pv18 (253:18)
 `- (7:0)
pv7 (253:7)
 `- (7:0)
testvg0-lv1_mlog (253:72)
 `-pv4 (253:4)
    `- (7:0)
testvg1-lv2_mlog (253:60)
 `-pv9 (253:9)
    `- (7:0)
pv17 (253:17)
 `- (7:0)
pv6 (253:6)
 `- (7:0)
pv16 (253:16)
 `- (7:0)
pv5 (253:5)
 `- (7:0)
testvg0-lv2_mlog (253:76)
 `-pv4 (253:4)
    `- (7:0)
pv15 (253:15)
 `- (7:0)
testvg0-lv3_mlog (253:80)
 `-pv4 (253:4)
    `- (7:0)
pv14 (253:14)
 `- (7:0)
pv3 (253:3)
 `- (7:0)
pv13 (253:13)
 `- (7:0)
pv2 (253:2)
 `- (7:0)
pv12 (253:12)
 `- (7:0)
pv1 (253:1)
 `- (7:0)
pv11 (253:11)
 `- (7:0)
pv0 (253:0)
 `- (7:0)
pv10 (253:10)
 `- (7:0)
testvg0-lv0_mlog (253:68)
 `-pv4 (253:4)
    `- (7:0)
pv19 (253:19)
 `- (7:0)
pv8 (253:8)
 `- (7:0)
[root@nec-em4 ~]#
--------------------------------------------------------------------
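
In the 'dmsetup table' output above, every remaining testvg map has a name
ending in _mlog, i.e. it is a mirror log sub-LV. A small filter like the
following could summarise such leftovers by type. It is only a sketch: the
function name classify_leftovers is hypothetical, and it assumes the usual
LVM2 sub-LV suffixes (_mlog for the log, _mimage_N for the mirror legs).

```shell
#!/bin/bash
# Sketch: summarise leftover test-VG maps by LVM sub-LV type.
# Reads `dmsetup table` output on stdin; the function name is hypothetical.
classify_leftovers() {
    awk -F': ' '
        $1 !~ /^testvg[0-9]+-/ { next }           # skip the pvN linear maps
        $1 ~ /_mlog$/          { mlog++;   next } # mirror log sub-LV
        $1 ~ /_mimage_[0-9]+$/ { mimage++; next } # mirror leg sub-LV
                               { top++ }          # the mirror LV itself
        END { printf "mlog=%d mimage=%d top=%d\n", mlog, mimage, top }
    '
}
```

It would be used as 'dmsetup table | classify_leftovers' right after a
failed trial.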

Comment 1 Kiyoshi Ueda 2006-08-23 15:25:24 UTC
Created attachment 134725 [details]
Test script

Comment 2 Jun'ichi NOMURA 2006-09-01 23:05:17 UTC
The problem reproduces even if udev is disabled.
With dmeventd disabled, it did not reproduce
in about 400 trials.
(It usually reproduces within 100 trials.)

However, with dmeventd disabled,
I observed 'vgchange -ay' stall about once in 500 trials.
In this case, a '[dmeventd]' process exists.
Both dmeventd and vgchange seem to be in select()
on the dmeventd socket.
# As dmeventd is disabled by setting mirror_library = "none",
# the daemon is invoked.
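
To confirm which mirror_library value is in effect for a test run, the
setting can be read back from the configuration file. The helper below is
a sketch only: the function name is hypothetical, and it assumes the
setting is written in the usual key = "value" form (normally inside the
dmeventd section of /etc/lvm/lvm.conf).

```shell
#!/bin/bash
# Sketch: print the mirror_library value from lvm.conf-style text on stdin.
# Assumes the usual `key = "value"` syntax; the function name is hypothetical.
# Commented-out lines (starting with '#') are not matched.
mirror_library_setting() {
    sed -n 's/^[[:space:]]*mirror_library[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

For example, 'mirror_library_setting < /etc/lvm/lvm.conf' would print
none when monitoring has been disabled as described above.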

Comment 3 Kiyoshi Ueda 2006-11-03 19:43:21 UTC
This is a testing result update.

The problem still occurs in:
    kernel-2.6.18-1.2732.el5
    device-mapper-1.02.12-2.el5
    lvm2-2.02.12-3.el5
    udev-095-14

It now takes several days to reproduce, though it used to
take a few hours.


Comment 4 RHEL Program Management 2007-03-21 23:55:39 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 8 Corey Marthaler 2007-06-28 20:25:35 UTC
fix verified in lvm2-2.02.26-1.el5.

Comment 10 errata-xmlrpc 2007-11-07 16:48:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0516.html