Bug 520913 - pvmove fails, leading to total loss of logical volumes
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: lvm2
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Milan Broz
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-09-02 20:20 UTC by James Ralston
Modified: 2013-03-01 04:07 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2012-03-22 16:21:54 UTC
Type: ---
Embargoed:


Attachments
output of "lvmdump -m" on problem system (65.60 KB, application/x-gzip)
2009-09-12 19:51 UTC, James Ralston

Description James Ralston 2009-09-02 20:20:05 UTC
I wish to use pvmove to move logical volumes from dm-0 (a LUKS partition on an external USB hard drive) to dm-1 (a LUKS partition on an internal ATA hard drive).

The following happens:

$ lvs --options lv_name,vg_name,lv_attr,lv_size,devices
  LV     VG   Attr   LSize   Devices        
  conf   os   -wi-ao   1.00G /dev/dm-0(768) 
  dva    os   -wi-ao   5.00G /dev/dm-0(0)   
  home   os   -wi-ao   8.00G /dev/dm-0(480) 
  images os   -wi-ao  10.00G /dev/dm-0(160) 
  log    os   -wi-ao   1.00G /dev/dm-0(736) 
  root   os   -wi-ao   1.00G /dev/dm-1(0)   
  rpms   os   -wi-ao   5.00G /dev/dm-0(800) 
  tmp    os   -wi-ao   1.00G /dev/dm-0(1768)
  usr    os   -wi-ao   6.00G /dev/dm-1(32)  
  var    os   -wi-ao   1.00G /dev/dm-1(224) 
  wapps  os   -wi-ao  16.00G /dev/dm-0(1248)
  whome  os   -wi-ao 256.00M /dev/dm-0(1760)

$ pvmove -i 60 -v -n tmp /dev/dm-0 /dev/dm-1
    Finding volume group "os"
    Archiving volume group "os" metadata (seqno 114).
    Creating logical volume pvmove0
    Moving 32 extents of logical volume os/tmp
    Found volume group "os"
    Updating volume group metadata
    Found volume group "os"
    Found volume group "os"
    Suspending os-tmp (253:10) with device flush
    Found volume group "os"
    Found volume group "os"
    Creating os-pvmove0
  device-mapper: create ioctl failed: Device or resource busy
    Loading os-tmp table
  device-mapper: reload ioctl failed: Invalid argument
    Creating volume group backup "/etc/lvm/backup/os" (seqno 115).
    Checking progress every 60 seconds
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
    Found volume group "os"
    Found volume group "os"
    Found volume group "os"
    Found volume group "os"
    Suspending os-pvmove0 (253:14) with device flush
    Found volume group "os"
    Creating os-pvmove0
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"
    Found volume group "os"
    Loading os-tmp table
    Resuming os-tmp (253:10)
    Found volume group "os"
    Removing os-pvmove0 (253:14)
    Found volume group "os"
    Removing temporary pvmove LV
    Writing out final volume group after pvmove
    Creating volume group backup "/etc/lvm/backup/os" (seqno 117).

$ dmesg
...
device-mapper: table: 253:10: linear: dm-linear: Device lookup failed
device-mapper: ioctl: error adding target to table

$ lvs --options lv_name,vg_name,lv_attr,lv_size,devices
  LV     VG   Attr   LSize   Devices        
  conf   os   -wi-ao   1.00G /dev/dm-0(768) 
  dva    os   -wi-ao   5.00G /dev/dm-0(0)   
  home   os   -wi-ao   8.00G /dev/dm-0(480) 
  images os   -wi-ao  10.00G /dev/dm-0(160) 
  log    os   -wi-ao   1.00G /dev/dm-0(736) 
  root   os   -wi-ao   1.00G /dev/dm-1(0)   
  rpms   os   -wi-ao   5.00G /dev/dm-0(800) 
  tmp    os   -wi-ao   1.00G /dev/dm-1(256) 
  usr    os   -wi-ao   6.00G /dev/dm-1(32)  
  var    os   -wi-ao   1.00G /dev/dm-1(224) 
  wapps  os   -wi-ao  16.00G /dev/dm-0(1248)
  whome  os   -wi-ao 256.00M /dev/dm-0(1760)

$ umount /tmp
$ e2fsck -p /dev/os/tmp 
/dev/os/tmp: Resize inode not valid.  

/dev/os/tmp: UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.
        (i.e., without -a or -p options)

$ e2fsck -b 32768 -p /dev/os/tmp
e2fsck: Invalid argument while trying to open /dev/os/tmp
/dev/os/tmp: 
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

What effectively happened here is that pvmove attempted to migrate the "tmp" logical volume from dm-0 to dm-1, failed for some reason (without actually copying any data)... but then updated the LVM metadata to indicate that the volume was successfully moved!

This effectively *DESTROYS* the contents of the filesystem, because the filesystem's actual contents are back on dm-0; now that LVM thinks it's on dm-1, it's backed by garbage.
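
Because the transcript above ends with "Moved: 100.0%" despite the failed ioctls, the progress line alone cannot be trusted as a success signal. As a minimal sketch (the report does not show pvmove's exit status, so this inspects the log text instead), a saved pvmove log could be scanned for the two error messages seen above. The helper name pvmove_log_failed is hypothetical and not part of lvm2.

```shell
# Succeeds (exit 0) if the pvmove output on stdin contains one of the
# device-mapper failures seen in this report, even when the same output
# also reports "Moved: 100.0%".
# pvmove_log_failed is a hypothetical helper name, not an lvm2 command.
pvmove_log_failed() {
    grep -Eq 'device-mapper: .* ioctl failed|Unable to reactivate logical volume'
}

# Possible usage (pvmove.log is an arbitrary file name):
#   pvmove -v -n tmp /dev/dm-0 /dev/dm-1 2>&1 | tee pvmove.log
#   if pvmove_log_failed < pvmove.log; then
#       echo "pvmove logged device-mapper errors; do not trust this move"
#   fi
```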

In this particular case, because of how the logical volumes are packed on dm-0, I know that if I create a new 1GB logical volume on dm-0, it will fall directly upon the boundaries where /dev/os/tmp used to be. I can use this to recover:

$ lvremove /dev/os/tmp 
Do you really want to remove active logical volume tmp? [y/n]: y
  Logical volume "tmp" successfully removed

$ lvs --options lv_name,vg_name,lv_attr,lv_size,devices
  LV     VG   Attr   LSize   Devices        
  conf   os   -wi-ao   1.00G /dev/dm-0(768) 
  dva    os   -wi-ao   5.00G /dev/dm-0(0)   
  home   os   -wi-ao   8.00G /dev/dm-0(480) 
  images os   -wi-ao  10.00G /dev/dm-0(160) 
  log    os   -wi-ao   1.00G /dev/dm-0(736) 
  root   os   -wi-ao   1.00G /dev/dm-1(0)   
  rpms   os   -wi-ao   5.00G /dev/dm-0(800) 
  usr    os   -wi-ao   6.00G /dev/dm-1(32)  
  var    os   -wi-ao   1.00G /dev/dm-1(224) 
  wapps  os   -wi-ao  16.00G /dev/dm-0(1248)
  whome  os   -wi-ao 256.00M /dev/dm-0(1760)

$ lvcreate -L 1G -n tmp os /dev/dm-0
  Logical volume "tmp" created

$ e2fsck -p /dev/os/tmp
e2fsck: Bad magic number in super-block while trying to open /dev/os/tmp
/dev/os/tmp: 
The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>

$ e2fsck -b 32768 -p /dev/os/tmp
/dev/os/tmp: Superblock needs_recovery flag is clear, but journal has data.
/dev/os/tmp: Recovery flag not set in backup superblock, so running journal anyway.
/dev/os/tmp: recovering journal
/dev/os/tmp: Group descriptor 0 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 1 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 2 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 3 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 4 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 5 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 6 checksum is invalid.  FIXED.
/dev/os/tmp: Group descriptor 7 checksum is invalid.  FIXED.
/dev/os/tmp: 16/65536 files (0.0% non-contiguous), 12638/262144 blocks

$ e2fsck -p /dev/os/tmp
/dev/os/tmp: clean, 16/65536 files, 12638/262144 blocks

$ e2fsck -C 0 -f /dev/os/tmp
e2fsck 1.41.4 (27-Jan-2009)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure                                           
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/os/tmp: 16/65536 files (0.0% non-contiguous), 12638/262144 blocks         

$ mount /dev/os/tmp
$ cd /tmp
$ ls
lost+found/  ssh-MXpLCq2978/

However, if this had been my / or /usr partition, the system would be toast. (And that is indeed what happened earlier when I attempted to move the /usr partition: LVM destroyed it, and I had to reload the entire system.) Furthermore, depending on where the logical volume fell on dm-0, creating a new volume with the same size wouldn't necessarily cover the same physical space.
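
The recovery above only works if the recreated volume lands on the extents the old one occupied. Before the move, tmp started at extent 1768 of dm-0, and at the 32 MB extent size reported by lvcreate a 1 GB volume spans 32 extents, i.e. 1768 through 1799. A minimal sketch for checking this, assuming the "lvs" listing was saved before the move and that the volume has a single segment: extract the Devices column for one LV and compare it before and after recreating the volume. The helper name lv_devices and the file names are hypothetical.

```shell
# Print the Devices column for one LV, reading the output of
#   lvs --options lv_name,vg_name,lv_attr,lv_size,devices
# on stdin. Assumes a single-segment LV (one row per LV).
# lv_devices is a hypothetical helper name.
lv_devices() {
    awk -v lv="$1" '$1 == lv { print $5 }'
}

# Possible usage (before.txt and after.txt are arbitrary file names):
#   lvs --options lv_name,vg_name,lv_attr,lv_size,devices > before.txt
#   ... failed pvmove, lvremove, lvcreate ...
#   lvs --options lv_name,vg_name,lv_attr,lv_size,devices > after.txt
#   [ "$(lv_devices tmp < before.txt)" = "$(lv_devices tmp < after.txt)" ] \
#       || echo "recreated LV does not start at the original extent"
```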

The bottom line: pvmove destroyed the logical volume it attempted to move. This is a very, very, very bad bug, and it needs to be tracked down and fixed ASAP.

(Within the past few days, I have been using pvmove extensively on Fedora 9 and Fedora 10 systems, without incident. I only started seeing this problem after I reloaded the Fedora 9 system with Fedora 11.)

Versions:

0:kernel-PAE-2.6.29.6-217.2.16.fc11.i686
0:lvm2-2.02.48-2.fc11.i586

Comment 1 Alasdair Kergon 2009-09-03 00:13:36 UTC
1) Try to track down why the ioctls failed.  Mismatched devices/sizes?  'dmsetup info -c' gives a little more info.  Devices left behind by earlier pvmoves not cleaned up properly for some reason?  udev rules interfering?

2) Need to audit the failure path to see why it proceeded despite the ioctls failing.  (A similar bug was fixed in 2.02.46.)
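
One way to look for devices left behind by earlier pvmoves, as a rough sketch: filter the 'dmsetup info -c' listing for names ending in -pvmove followed by a number, which is the naming seen in the transcript above (os-pvmove0). The helper name leftover_pvmove_devices is hypothetical, and it assumes the device name is the first column, as in the default columns output.

```shell
# Print the names of device-mapper devices that look like temporary
# pvmove devices, reading "dmsetup info -c" output on stdin.
# leftover_pvmove_devices is a hypothetical helper name.
leftover_pvmove_devices() {
    awk '$1 ~ /-pvmove[0-9]+$/ { print $1 }'
}

# Possible usage:
#   dmsetup info -c | leftover_pvmove_devices
```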

Comment 2 James Ralston 2009-09-03 01:26:36 UTC
> Mismatched devices/sizes?

Shouldn't be; the target physical device has plenty of space:

$ pvs
  PV         VG   Fmt  Attr PSize   PFree  
  /dev/dm-0  os   lvm2 a-   596.16G 548.91G
  /dev/dm-1  os   lvm2 a-    48.97G  40.97G

I don't see anything unusual about the devices:

$ dmsetup info -c
Name                                      Maj Min Stat Open Targ Event  UUID                                                                
os-dva                                    253   4 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpU50pXpp6mBRWx7jq5bQ3UuHfC85UexVD
luks-ae54d1db-a914-4a56-9361-b8cbaa3880a0 253   1 L--w    3    1      0                                                                     
os-tmp                                    253  10 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpVRMMYtXJvW7qLVaXPPsf9W416BxtKkB3
luks-57af346e-986e-4ba6-9122-4d9b3d330f37 253   0 L--w    9    1      0                                                                     
os-log                                    253   7 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpNreyyEjOVP9jtfAO35DqyKWaux7nIBcq
os-usr                                    253  13 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpS0qQzeTNby0TT23T5FcdeZg2cUfuptX9
os-var                                    253   3 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qp0n6V3BwcqPYomv8fRRHPuV2Bi59Ku45G
os-home                                   253   6 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpZ7N1E0TTUrZ1ux8qqVjj1dXeIcPFeZpw
os-wapps                                  253  11 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpP8CKtED01mlfztagNVKKRX62r2ZVsx64
os-images                                 253   5 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpHBmqrXbOipQBDzNoAAI2WJGjoga3pQhF
os-root                                   253   2 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpYa203KRuBqHfI0QvspeYyUkH9hian0xW
os-whome                                  253  12 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qp2D33YVoXIfiETgrXx0sHlBVRTtO5dsQH
os-conf                                   253   8 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qpx9yERn0mmmhr8QZEVxLmq5B10vwZX48A
os-rpms                                   253   9 L--w    1    1      0 LVM-lj8ttaOvblrd45G7WsFB3bXAGeWPe6qp1lw8e8zWR1YXniOr6glzmTQ3H17a4uFx

> Devices left behind by earlier pvmoves not cleaned up properly for
> some reason?

Running "vgdisplay --verbose" shows only the expected physical volumes, volume groups, and logical volumes.

> udev rules interfering?

I have created no custom udev rules.

Comment 3 James Ralston 2009-09-03 02:02:57 UTC
BTW, I can *easily* reproduce this at will. Here's a sequence of consecutive commands:

$ lvcreate -L 1M -n test os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test" created

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ lvremove /dev/os/test
  /dev/dm-15: open failed: No such file or directory
Do you really want to remove active logical volume test? [y/n]: y
  Logical volume "test" successfully removed

$ lvcreate -L 1M -n test os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test" created

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  /dev/dm-1: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ lvremove /dev/os/test
Do you really want to remove active logical volume test? [y/n]: y
  Logical volume "test" successfully removed

$ lvcreate -L 1M -n test os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test" created

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ lvremove /dev/os/test
Do you really want to remove active logical volume test? [y/n]: y
  Logical volume "test" successfully removed

$ lvcreate -L 1M -n test os /dev/dm-0
  /dev/dm-14: open failed: No such file or directory
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test" created

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ lvremove /dev/os/test
  /dev/dm-15: open failed: No such device or address
Do you really want to remove active logical volume test? [y/n]: y
  Logical volume "test" successfully removed

$ lvcreate -L 1M -n test os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test" created

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-15: open failed: No such device or address
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-15: open failed: No such device or address
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ pvmove -n test /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

The only pattern I can see is that the *first* move never seems to fail, but subsequent moves might:

$ lvcreate -L 1M -n test1 os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test1" created

$ lvcreate -L 1M -n test2 os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test2" created

$ lvcreate -L 1M -n test3 os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test3" created

$ lvcreate -L 1M -n test4 os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test4" created

$ lvcreate -L 1M -n test5 os /dev/dm-0
  Rounding up size to full physical extent 32.00 MB
  Logical volume "test5" created

$ pvmove -n test1 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test2 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test3 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test4 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test5 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test1 /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test2 /dev/dm-1 /dev/dm-0
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  /dev/dm-1: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ pvmove -n test3 /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test4 /dev/dm-1 /dev/dm-0
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test5 /dev/dm-1 /dev/dm-0
  /dev/dm-19: open failed: No such device or address
  /dev/dm-1: Moved: 100.0%

$ pvmove -n test1 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test2 /dev/dm-0 /dev/dm-1
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  WARNING: dev_open(/dev/dm-0) called while suspended
  WARNING: dev_open(/dev/dm-1) called while suspended
  /dev/dm-0: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"

$ pvmove -n test3 /dev/dm-0 /dev/dm-1
  /dev/dm-19: open failed: No such device or address
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test4 /dev/dm-0 /dev/dm-1
  /dev/dm-0: Moved: 100.0%

$ pvmove -n test5 /dev/dm-0 /dev/dm-1
  /dev/dm-19: open failed: No such device or address
  /dev/dm-0: Moved: 100.0%

I'm also not sure what to make of the "open failed: No such device or address" error on the *successful* moves.
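
The back-and-forth moves above could be scripted roughly as follows. This is only a sketch: it moves one LV between two PVs, alternating direction, and stops at the first move whose output contains the device-mapper errors shown in this comment. The names pvmove_pingpong and PVMOVE_CMD are hypothetical; PVMOVE_CMD exists only so the loop can be exercised with a stub instead of the real pvmove. On an affected system it should only ever be pointed at a scratch LV like "test".

```shell
# Move an LV back and forth between two PVs until a move logs a
# device-mapper error or the given number of moves is reached.
# Usage: pvmove_pingpong LV PV_A PV_B MOVES
# pvmove_pingpong and PVMOVE_CMD are hypothetical names for this sketch.
pvmove_pingpong() {
    lv=$1 src=$2 dst=$3 moves=$4
    i=1
    while [ "$i" -le "$moves" ]; do
        out=$("${PVMOVE_CMD:-pvmove}" -n "$lv" "$src" "$dst" 2>&1)
        if printf '%s\n' "$out" | grep -Eq 'ioctl failed|Unable to reactivate'; then
            echo "move $i ($src -> $dst) failed"
            return 1
        fi
        # Swap direction for the next move.
        swap=$src; src=$dst; dst=$swap
        i=$((i + 1))
    done
    echo "all $moves moves completed without device-mapper errors"
}

# Possible usage on a scratch LV:
#   lvcreate -L 1M -n test os /dev/dm-0
#   pvmove_pingpong test /dev/dm-0 /dev/dm-1 10
```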

One other point, if it matters: I originally pvmove'd the (root, usr, var) volumes from the internal drive (dm-1) to the USB drive (dm-0) while the box was still running Fedora 9; all other logical volumes I created as new on dm-0 and rsync'ed the contents (because I wanted to "update" the filesystems on those volumes from ext3 to ext4). When I reinstalled the box with Fedora 11, all logical volumes were housed on the USB drive. The pain only began when I attempted to pvmove the logical volumes from the USB drive back to the (repartitioned) internal drive.

Is there any other information I could gather that would be useful to help troubleshoot this problem?

Comment 4 Milan Broz 2009-09-03 05:59:32 UTC
(In reply to comment #3)
>   device-mapper: create ioctl failed: Device or resource busy
>   Unable to reactivate logical volume "pvmove0"

This is strange and really should not happen. Maybe, again, some udev rule is scanning the device in the wrong place?

Can you please attach the tarball created with "lvmdump -m" so we can check all the system information?

Comment 5 Milan Broz 2009-09-04 14:22:47 UTC
I tried that on rawhide; everything works OK.

I am not sure whether your problem is the same as the following, but this is what I get when
I run a script which intentionally opens the *-pvmove0 device (to lock it):

# pvmove -i 1 -n lv1 /dev/dm-3 /dev/dm-2
  device-mapper: create ioctl failed: Device or resource busy
  device-mapper: reload ioctl failed: Invalid argument
  WARNING: dev_open(/dev/dm-3) called while suspended
  WARNING: dev_open(/dev/dm-2) called while suspended
  WARNING: dev_open(/dev/dm-3) called while suspended
  WARNING: dev_open(/dev/dm-2) called while suspended
  WARNING: dev_open(/dev/dm-3) called while suspended
  WARNING: dev_open(/dev/dm-2) called while suspended
  /dev/dm-3: Moved: 100.0%
  device-mapper: create ioctl failed: Device or resource busy
  Unable to reactivate logical volume "pvmove0"
  device-mapper: create ioctl failed: Device or resource busy
  ABORTING: Unable to deactivate temporary logical volume "pvmove0"

# dmsetup info
Name:              vg_test-lv1
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        0
Event number:      0
Major, minor:      253, 4
Number of targets: 1
UUID: LVM-fJ7aJYQKyIXno1RacEZtHwvU2LfO8lRHvnIt89QJeZccbJIIBmW4OszVVDLsb119

Name:              vg_test-pvmove0
State:             SUSPENDED
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      1
Major, minor:      253, 5
Number of targets: 1
UUID: LVM-fJ7aJYQKyIXno1RacEZtHwvU2LfO8lRHET8LQS9mOh4QVtA2JolP1GYZpTmURPI1

...


So there are two problems:
- something (probably a wrong udev rule) opens the device at the wrong moment, and this causes the pvmove operation to fail.

- pvmove aborts, but the temporary device is left suspended and pvmove --abort doesn't work.

# lvm version
  LVM version:     2.02.51(1) (2009-08-06)
  Library version: 1.02.36 (2009-08-06)
  Driver version:  4.15.0
# rpm -q lvm2
lvm2-2.02.51-3.fc12.x86_64
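
A device left suspended like vg_test-pvmove0 above can be spotted in the long-format "dmsetup info" output by pairing each "Name:" line with the "State:" line that follows it. A minimal sketch, with suspended_devices as a hypothetical helper name:

```shell
# Print the names of device-mapper devices whose State is SUSPENDED,
# reading long-format "dmsetup info" output on stdin.
# suspended_devices is a hypothetical helper name.
suspended_devices() {
    awk '$1 == "Name:"  { name = $2 }
         $1 == "State:" && $2 == "SUSPENDED" { print name }'
}

# Possible usage:
#   dmsetup info | suspended_devices
```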

Comment 6 James Ralston 2009-09-12 19:51:40 UTC
Created attachment 360811
output of "lvmdump -m" on problem system

Here's the information requested in comment 4.

Note that I'm about to take this system down to replace most of its hardware (motherboard, processor, memory, power supply, hard drive) and reload the OS (with Fedora 11 again, but x86_64 this time). When I get the system reloaded and back up, I'll try to reproduce this problem again.

Comment 7 Milan Broz 2009-09-13 08:20:03 UTC
Thanks,
I think I have already found a possible problem in pvmove that occurs when something unexpectedly opens the private temporary device (the traceback is very similar to the reported one, so it is probably part of the problem).

I am still not sure whether this can lead to data corruption, but it is on my todo list for next week.

Comment 8 James Ralston 2009-12-09 04:01:31 UTC
Milan, do you have a status update? Do you believe the problem with pvmove is fixed in Fedora 12?

I ask because I need to run some pvmove commands on some boxes I just upgraded to Fedora 12, but I don't trust pvmove right now. :(

Comment 9 Milan Broz 2009-12-09 09:37:55 UTC
Ah, sorry, I forgot to update this.

The basic problem here is that lvm was designed in such a way that it does not expect any application, except lvm itself, to touch internal lvm devices (here the pvmove temporary mirror), and the error handling was not perfect when that happens. Unfortunately, this changed with DevKit-disks and similar utilities, which scan every device in the system.

I proposed three patches. One is in the recent stable release (since lvm2 2.02.54); the remaining two were rejected (they were basically just workarounds for udev rules problems, which should be fixed in those udev rules).

I think the udev rules were updated in F12 final (there were similar problems with cryptsetup). But because lvm in rawhide has switched to using udev directly, I am now really not sure that all the problems were addressed in the F12 code too. I hope so; see also bug 528909.

Comment 10 Bug Zapper 2010-04-28 10:08:43 UTC
This message is a reminder that Fedora 11 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 11.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '11'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 11's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 11 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 11 Milan Broz 2010-04-28 10:21:25 UTC
I'll keep this open; it needs retesting (there are a lot of changes relating to udev).

Comment 12 James Ralston 2010-05-25 23:47:27 UTC
Milan, once you think you have this fixed, let me know, and I'll try to reproduce the problem again.

In the meantime, I'm still afraid of running pvmove operations. :(

Comment 13 Alasdair Kergon 2010-05-26 00:05:25 UTC
"What effectively happened here is that pvmove attempted to migrate the "tmp"
logical volume from dm-0 to dm-1, failed for some reason (without actually
copying any data)... but then updated the LVM metadata to indicate that the
volume was successfully moved!"

We fixed that.  (Use version 2.02.61 or later.)  I'd still like us to add more sanity checks though.
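
A quick way to check an installed version against 2.02.61, as a sketch: compare the version strings with a version-aware sort. This assumes GNU sort (for the -V option) and strips the "(N)" suffix that appears in "lvm version" output such as "2.02.51(1)". The helper name lvm_version_at_least is hypothetical.

```shell
# Succeeds (exit 0) if version HAVE is at least version WANT.
# Usage: lvm_version_at_least HAVE WANT
# Assumes GNU sort, which provides -V (version sort).
# lvm_version_at_least is a hypothetical helper name.
lvm_version_at_least() {
    have=$1 want=$2
    # Drop a trailing "(N)" suffix such as the one in "2.02.51(1)".
    have=${have%%\(*}
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

# Possible usage:
#   have=$(lvm version | awk '/LVM version:/ { print $3 }')
#   lvm_version_at_least "$have" 2.02.61 || echo "lvm2 $have predates the fix"
```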

Comment 14 Bug Zapper 2010-07-30 10:43:41 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 14 development cycle.
Changing version to '14'.

More information and reason for this action is here:
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 15 Milan Broz 2012-03-22 16:21:54 UTC
Closing this; lvm should now cope with situations where some other tool opens the device. I have not seen such pvmove problems for a long time.

If you want to retest it (though this bug has been open for a really long time), try rawhide or F17 for the most recent code (I think updates for older releases will follow if possible).

