Bug 748065 - temporarily "missing" devices cause clvm change operations to fail
Summary: temporarily "missing" devices cause clvm change operations to fail
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.2
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Target Release: ---
Assignee: LVM and device-mapper development team
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 756082
 
Reported: 2011-10-21 21:27 UTC by Corey Marthaler
Modified: 2012-03-01 20:33 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-03-01 20:30:27 UTC
Target Upstream Version:



Description Corey Marthaler 2011-10-21 21:27:40 UTC
Description of problem:
Although this problem is not reliably reproducible, it has been seen on many clusters during 6.2 regression testing. During change operations, devices appear to be missing, which causes the following errors:

   Couldn't find device with uuid H019sC-nSGg-iM1p-vcTw-BSfB-SfeT-bwwLg9.
   Cannot change VG mirror_sanity while PVs are missing.
   Consider vgreduce --removemissing.

Upon further investigation, however, all the devices are present and the VG remains fine.

 SCENARIO - [open_fsadm_resize_attempt]
 Create mirror, add fs, and then attempt to resize it while it's mounted
 grant-03: lvcreate -m 1 -n open_fsadm_resize -L 4G --nosync mirror_sanity
   WARNING: New mirror won't be synchronised. Don't read what you didn't write!
 Placing an ext4 on open_fsadm_resize volume
 mke2fs 1.41.12 (17-May-2010)
 Attempt to resize the open mirrored filesystem multiple times with lvextend/fsadm on grant-03
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
 resize2fs 1.41.12 (17-May-2010)
 (lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize)
   Couldn't find device with uuid H019sC-nSGg-iM1p-vcTw-BSfB-SfeT-bwwLg9.
   Cannot change VG mirror_sanity while PVs are missing.
   Consider vgreduce --removemissing.
 couldn't resize mirror and filesystem on grant-03

Oct 21 15:03:09 grant-03 qarshd[29601]: Running cmdline: lvextend -L +3G -r /dev/mirror_sanity/open_fsadm_resize
Oct 21 15:03:10 grant-03 xinetd[5684]: EXIT: qarsh status=0 pid=29601 duration=1(sec)
Oct 21 15:04:23 grant-03 lvm[1092]: mirror_sanity-open_fsadm_resize is now in-sync.
                                    

[root@grant-03 ~]# lvs -a -o +devices
 LV                           Attr   LSize  Log                    Copy%  Devices
 open_fsadm_resize            Mwi-ao 28.00g open_fsadm_resize_mlog 100.00 open_fsadm_resize_mimage_0(0),open_fsadm_resize_mimage_1(0)
 [open_fsadm_resize_mimage_0] iwi-ao 28.00g                               /dev/sdb1(0)
 [open_fsadm_resize_mimage_1] iwi-ao 28.00g                               /dev/sdb2(0)
 [open_fsadm_resize_mlog]     lwi-ao  4.00m                               /dev/sdc6(0)


Version-Release number of selected component (if applicable):
2.6.32-209.el6.x86_64

lvm2-2.02.87-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
lvm2-libs-2.02.87-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
lvm2-cluster-2.02.87-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.66-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
device-mapper-libs-1.02.66-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
device-mapper-event-1.02.66-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
device-mapper-event-libs-1.02.66-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011
cmirror-2.02.87-6.el6    BUILT: Wed Oct 19 06:46:31 CDT 2011


How reproducible:
Often during extended regression testing
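
Because the failure is intermittent, it may help to scan captured test output for the error signature and record which UUID and VG were involved in each occurrence. The following is only a minimal sketch, not part of the test suite: the helper name find_missing_pv_errors is hypothetical, and the two patterns are assumed from the exact error lines quoted in this report.

```python
import re

# Patterns assumed from the error text quoted in this report.
MISSING_PV_RE = re.compile(r"Couldn't find device with uuid (\S+)\.")
VG_BLOCKED_RE = re.compile(r"Cannot change VG (\S+) while PVs are missing\.")


def find_missing_pv_errors(output):
    """Return (uuid, vg) pairs for each missing-PV failure in captured output.

    Hypothetical helper: pairs a "Couldn't find device" line with the
    "Cannot change VG" line that follows it.
    """
    hits = []
    pending_uuid = None
    for line in output.splitlines():
        match = MISSING_PV_RE.search(line)
        if match:
            # Remember the UUID until the VG line that follows it.
            pending_uuid = match.group(1)
            continue
        match = VG_BLOCKED_RE.search(line)
        if match and pending_uuid is not None:
            hits.append((pending_uuid, match.group(1)))
            pending_uuid = None
    return hits
```

Run against the output of each lvextend invocation, a non-empty result would mark the iteration where the transient failure happened, which could then be matched to timestamps in the system log.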

Comment 2 Alasdair Kergon 2012-01-04 20:21:25 UTC
I wonder if you can pin down one instance of this occurring with the exact sequence of commands that the script issued.  How long has it been doing this?  Is it just a few test scripts or many different ones?

LVM is supposed to take responsibility for ensuring its own data is updated on disk, visible to all nodes, at the crucial places - not stuck in buffers.

We should probably review the code to check none of the recent changes broke the guarantees, or some other logic bug has crept in.  Equally it's possible the test scripts themselves aren't providing the necessary guarantees in everything they do.

So basically, more investigation needed to try to narrow down the circumstances/versions/variations when it does happen and when it doesn't.

Comment 4 Peter Rajnoha 2012-02-16 11:05:47 UTC
Corey, is this still seen with the latest test build?

