Bug 573240 - pvcreate fails to create a physical volume on large RAID1 volume
Summary: pvcreate fails to create a physical volume on large RAID1 volume
Status: CLOSED NEXTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: lvm2
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: LVM and device-mapper development team
QA Contact: Fedora Extras Quality Assurance
Reported: 2010-03-13 16:23 UTC by Jerry Feldman
Modified: 2010-06-23 15:43 UTC

Doc Type: Bug Fix
Last Closed: 2010-06-23 15:43:05 UTC

Description Jerry Feldman 2010-03-13 16:23:01 UTC
Description of problem:
pvcreate fails on a large RAID1 volume
[root@gaf gaf]# pvcreate -v -f /dev/md1
  /dev/md1: pe_align (128 sectors) must not be less than pe_align_offset (36028797018963967 sectors)
  /dev/md1: Format-specific setup of physical volume failed.
  Failed to setup physical volume "/dev/md1"

In this case I was migrating my system to a new HD, but I also tried to install Fedora 12 from scratch using /dev/sda and /dev/sdc, which are both 1TB, and pvcreate failed in the write-to-disk phase. I was unable to save the details.

Version-Release number of selected component (if applicable):
Fedora 12 with kernel 2.6.32.9-70.fc12.x86_64 
lvm2-2.02.53-2.fc12.x86_64
lvm2-libs-2.02.53-2.fc12.x86_64

How reproducible:
On a 1TB or larger drive, create a large primary partition and allocate it as a RAID1.


Steps to Reproduce:
1. Use fdisk to create a RAID partition.
2. Create the RAID1: mdadm --create /dev/md1 --verbose --level=1 --raid-devices=2 missing /dev/sdc2
3. pvcreate -v /dev/md1
  
Actual results:
[root@gaf gaf]# pvcreate -v -f /dev/md1
  /dev/md1: pe_align (128 sectors) must not be less than pe_align_offset (36028797018963967 sectors)
  /dev/md1: Format-specific setup of physical volume failed.
  Failed to setup physical volume "/dev/md1"

Expected results:
The LVM physical volume would be created. 

Additional info:
Please note that /dev/sda is the existing 1TB LVM system running Fedora 12, and /dev/sdc is the new drive. /dev/sdb is used for backups.
http://pastebin.com/4AtMzEjr
Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000448a0

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          26      204800   fd  Linux raid autodetect
Partition 1 does not end on cylinder boundary.
/dev/sda2              26      121601   976555201   8e  Linux LVM
[root@gaf gaf]# fdisk -l /dev/sdc

Disk /dev/sdc: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x0003a96c

   Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1          25      200781   fd  Linux raid autodetect
/dev/sdc2              26      121601   976559220   fd  Linux raid autodetect
[root@gaf gaf]# cat /proc/partitions
major minor  #blocks  name

   8        0  976762584 sda
   8        1     204800 sda1
   8        2  976555201 sda2
   8       16  156290904 sdb
   8       17  156290903 sdb1
   8       32  976762584 sdc
   8       33     200781 sdc1
   8       34  976559220 sdc2
 253        0    8290304 dm-0
 253        1   95109120 dm-1
 253        2   10240000 dm-2
 253        3   83648512 dm-3
 253        4   57962496 dm-4
 253        5    5120000 dm-5
 253        6   50331648 dm-6
   9        0     200704 md0
   9        1  976559104 md1
[root@gaf gaf]# mdadm --detail /dev/md0
/dev/md0:
        Version : 0.90
  Creation Time : Sun Feb 28 10:44:44 2010
     Raid Level : raid1
     Array Size : 200704 (196.03 MiB 205.52 MB)
  Used Dev Size : 200704 (196.03 MiB 205.52 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Tue Mar  2 07:26:32 2010
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           UUID : 4420140e:d8a9d5b5:91e28e81:d7bbf71d (local to host gaf.blu.org)
         Events : 0.159

    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       33        1      active sync   /dev/sdc1
[root@gaf gaf]# mdadm --detail /dev/md1
/dev/md1:
        Version : 0.90
  Creation Time : Sun Feb 28 12:40:13 2010
     Raid Level : raid1
     Array Size : 976559104 (931.32 GiB 1000.00 GB)
  Used Dev Size : 976559104 (931.32 GiB 1000.00 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1
    Persistence : Superblock is persistent

    Update Time : Sun Feb 28 15:54:28 2010
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0

           UUID : 05a09bd8:f968bb0e:91e28e81:d7bbf71d (local to host gaf.blu.org)
         Events : 0.19

    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       34        1      active sync   /dev/sdc2

[root@gaf gaf]# mdadm --examine /dev/sda1
/dev/sda1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4420140e:d8a9d5b5:91e28e81:d7bbf71d (local to host gaf.blu.org)
  Creation Time : Sun Feb 28 10:44:44 2010
     Raid Level : raid1
  Used Dev Size : 200704 (196.03 MiB 205.52 MB)
     Array Size : 200704 (196.03 MiB 205.52 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Wed Mar  3 09:02:41 2010
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : c6afcbc2 - correct
         Events : 161


      Number   Major   Minor   RaidDevice State
this     0       8        1        0      active sync   /dev/sda1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       33        1      active sync   /dev/sdc1
[root@gaf gaf]# mdadm --examine /dev/sdc1
/dev/sdc1:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 4420140e:d8a9d5b5:91e28e81:d7bbf71d (local to host gaf.blu.org)
  Creation Time : Sun Feb 28 10:44:44 2010
     Raid Level : raid1
  Used Dev Size : 200704 (196.03 MiB 205.52 MB)
     Array Size : 200704 (196.03 MiB 205.52 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0

    Update Time : Wed Mar  3 09:02:41 2010
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0
       Checksum : c6afcbe4 - correct
         Events : 161


      Number   Major   Minor   RaidDevice State
this     1       8       33        1      active sync   /dev/sdc1

   0     0       8        1        0      active sync   /dev/sda1
   1     1       8       33        1      active sync   /dev/sdc1
[root@gaf gaf]# mdadm --examine /dev/sdc2
/dev/sdc2:
          Magic : a92b4efc
        Version : 0.90.00
           UUID : 05a09bd8:f968bb0e:91e28e81:d7bbf71d (local to host gaf.blu.org)
  Creation Time : Sun Feb 28 12:40:13 2010
     Raid Level : raid1
  Used Dev Size : 976559104 (931.32 GiB 1000.00 GB)
     Array Size : 976559104 (931.32 GiB 1000.00 GB)
   Raid Devices : 2
  Total Devices : 1
Preferred Minor : 1

    Update Time : Wed Mar  3 09:02:44 2010
          State : clean
 Active Devices : 1
Working Devices : 1
 Failed Devices : 0
  Spare Devices : 0
       Checksum : e3215efc - correct
         Events : 21


      Number   Major   Minor   RaidDevice State
this     1       8       34        1      active sync   /dev/sdc2

   0     0       0        0        0      removed
   1     1       8       34        1      active sync   /dev/sdc2
[root@gaf gaf]# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda2  vg_gaf lvm2 a-   931.31G 635.00G
[root@gaf gaf]# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  vg_gaf   1   7   0 wz--n- 931.31G 635.00G
[root@gaf gaf]# lvs
  LV         VG     Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  LogDown    vg_gaf -wi-ao 55.28G
  LogHome    vg_gaf -wi-ao 90.70G
  LogLocal   vg_gaf -wi-ao  4.88G
  LogRoot    vg_gaf -wi-ao 79.77G
  LogSwap    vg_gaf -wi-ao  7.91G
  LogTemp    vg_gaf -wi-ao  9.77G
  LogVirtual vg_gaf -wi-ao 48.00G
[root@gaf gaf]# lvmdiskscan
  /dev/ram0  [       16.00 MB]
  /dev/md0   [      196.00 MB]
  /dev/dm-0  [        7.91 GB]
  /dev/ram1  [       16.00 MB]
  /dev/md1   [      931.32 GB]
  /dev/dm-1  [       90.70 GB]
  /dev/ram2  [       16.00 MB]
  /dev/sda2  [      931.32 GB] LVM physical volume
  /dev/dm-2  [        9.77 GB]
  /dev/ram3  [       16.00 MB]
  /dev/root  [       79.77 GB]
  /dev/ram4  [       16.00 MB]
  /dev/dm-4  [       55.28 GB]
  /dev/ram5  [       16.00 MB]
  /dev/dm-5  [        4.88 GB]
  /dev/ram6  [       16.00 MB]
  /dev/dm-6  [       48.00 GB]
  /dev/ram7  [       16.00 MB]
  /dev/ram8  [       16.00 MB]
  /dev/ram9  [       16.00 MB]
  /dev/ram10 [       16.00 MB]
  /dev/ram11 [       16.00 MB]
  /dev/ram12 [       16.00 MB]
  /dev/ram13 [       16.00 MB]
  /dev/ram14 [       16.00 MB]
  /dev/ram15 [       16.00 MB]
  /dev/sdb1  [      149.05 GB]
  1 disk
  25 partitions
  0 LVM physical volume whole disks
  1 LVM physical volume
[root@gaf gaf]# pvscan
  PV /dev/sda2   VG vg_gaf   lvm2 [931.31 GB / 635.00 GB free]
  Total: 1 [931.31 GB] / in use: 1 [931.31 GB] / in no VG: 0 [0   ]
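
As a sanity check on the output above, the md0 and md1 array sizes can be reproduced from the partition sizes. This is only a sketch: it assumes the version-0.90 superblock layout, in which each component is rounded down to a 64 KiB boundary and a further 64 KiB is reserved at the end for the superblock, and that a RAID1 array takes the size of its smallest component. The function names are made up for illustration.

```python
# Sizes are in 1 KiB blocks, as printed by fdisk and /proc/partitions.
MD_RESERVED_KIB = 64  # assumed: 64 KiB reserved for the 0.90 superblock


def md090_usable_kib(component_kib: int) -> int:
    """Usable size of one component under the assumed 0.90 layout."""
    # Round down to a 64 KiB boundary, then subtract the reserved area.
    aligned = component_kib & ~(MD_RESERVED_KIB - 1)
    return aligned - MD_RESERVED_KIB


def raid1_array_kib(*component_kib: int) -> int:
    """A RAID1 array is limited by its smallest component."""
    return min(md090_usable_kib(k) for k in component_kib)


# md0 is built from sda1 (204800) and sdc1 (200781).
print("md0:", raid1_array_kib(204800, 200781))
# md1 currently has only sdc2 (976559220); the other slot is "missing".
print("md1:", raid1_array_kib(976559220))
```

Under those assumptions the results agree with the 200704 and 976559104 block sizes reported by mdadm and /proc/partitions, so the array sizes themselves look normal.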

Comment 1 Mike Snitzer 2010-03-13 21:16:07 UTC
(In reply to comment #0)
> Description of problem:
> pvcreate fails on a large RAID1 volume
> [root@gaf gaf]# pvcreate -v -f /dev/md1
>   /dev/md1: pe_align (128 sectors) must not be less than pe_align_offset (36028797018963967 sectors)
>   /dev/md1: Format-specific setup of physical volume failed.
>   Failed to setup physical volume "/dev/md1"

I believe there is a combination of failures here.

1) LVM2 had an issue where pvcreate would fail if the underlying device (e.g. /dev/md1) was misaligned (i.e. it reported alignment_offset=-1).  This has since been fixed, and the fix is available in LVM2 2.02.62, see:
http://sources.redhat.com/git/gitweb.cgi?p=lvm2.git;a=commit;h=8cb8f65010c

(NOTE: please verify that /sys/block/md1/alignment_offset is -1)
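
The check requested in the note can also be scripted. The following is a minimal sketch, not part of any LVM2 tooling; the helper names and the `sysfs_root` parameter are made up here, the latter so the function can be pointed at a test directory.

```python
from pathlib import Path


def read_alignment_offset(device: str, sysfs_root: str = "/sys/block") -> int:
    """Return the alignment_offset value that sysfs reports for a block device."""
    path = Path(sysfs_root) / device / "alignment_offset"
    return int(path.read_text().strip())


def is_misaligned(offset: int) -> bool:
    """Per the comment above, -1 marks a device the kernel considers misaligned."""
    return offset < 0


if __name__ == "__main__":
    try:
        offset = read_alignment_offset("md1")
    except OSError as exc:
        # The device or the sysfs attribute may not exist on this machine.
        print(f"could not read alignment_offset: {exc}")
    else:
        print(f"md1: alignment_offset={offset} misaligned={is_misaligned(offset)}")
```

Running `cat /sys/block/md1/alignment_offset` from a shell gives the same information; the script form is only convenient when several devices have to be checked.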

2) The kernel (2.6.32.9-70.fc12.x86_64) does not contain the latest upstream fixes that were made to blk_stack_limits().  Without these fixes the stacking of limits (by MD) is prone to failure (resulting in alignment_offset=-1), see:
http://git.kernel.org/linus/81744ee44ab284
http://git.kernel.org/linus/fe0b393f2c0a0d
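
The odd pe_align_offset figure in the error message is consistent with alignment_offset=-1. One plausible reading (an assumption, not confirmed anywhere in this report) is that the -1 was treated as an unsigned 64-bit byte count and then converted to 512-byte sectors. The sketch below only reproduces that arithmetic; the function name is hypothetical and this is not the LVM2 code.

```python
SECTOR_SHIFT = 9            # 512-byte sectors: bytes >> 9 == sectors
U64_MASK = (1 << 64) - 1    # emulate unsigned 64-bit wraparound


def offset_to_sectors_unsigned(offset_bytes: int) -> int:
    """Convert a byte offset to sectors as if it were stored in an unsigned 64-bit integer."""
    as_u64 = offset_bytes & U64_MASK  # -1 becomes 2**64 - 1
    return as_u64 >> SECTOR_SHIFT


print(offset_to_sectors_unsigned(-1))
```

For an input of -1 this yields 2**55 - 1 = 36028797018963967, which is exactly the sector count shown in the pvcreate error.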

Comment 2 Milan Broz 2010-06-23 15:43:05 UTC
This should be fixed in the new kernel and lvm2.

