463431 – [RHEL5.3] Excessive LVM volume alignment for MD device

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	lvm2
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	high
Severity:	high
Target Milestone:	beta
Target Release:	---
Assignee:	Milan Broz
QA Contact:	Cluster QE
Docs Contact:
URL:	http://rhts.redhat.com/cgi-bin/rhts/t...
Whiteboard:
Duplicates (1):	460602
Depends On:
Blocks:

Reported:	2008-09-23 13:03 UTC by Jeff Burke
Modified:	2013-03-01 04:06 UTC
CC List:	22 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-01-20 21:34:48 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2009:0164	0	normal	SHIPPED_LIVE	anaconda bug fix and enhancement update	2009-01-20 16:05:24 UTC
Red Hat Product Errata	RHBA-2009:0179	0	normal	SHIPPED_LIVE	lvm2 bug-fix and enhancement update	2009-01-20 16:05:45 UTC

Description Jeff Burke 2008-09-23 13:03:19 UTC

Description of problem:
* RHEL5.3-Server-20080919.1
* Error appears through automated RHTS testing
* Behavior appears arch independent

Version-Release number of selected component (if applicable):
RHEL5.3-Server-20080917.nightly

How reproducible:
Always with 

Steps to Reproduce:
1. Try an NFS install from PXE with a ks.cfg file.
  
Actual results:
<snip>
anaconda(826): unaligned access to 0x2000000001956b44, ip=0x2000000000018880
anaconda(826): unaligned access to 0x2000000001956b44, ip=0x2000000000018890
anaconda(826): unaligned access to 0x2000000001956b5c, ip=0x2000000000018880
anaconda(826): unaligned access to 0x2000000001956b5c, ip=0x2000000000018890
anaconda(826): unaligned access to 0x2000000001956b74, ip=0x2000000000018880
Probing for video card:   ATI Technologies Inc ES1000
ATATI9ATRunning pre-install scripts
Retrieving installation information...
In progress...    Completed Completed 
Retrieving installation information...
In progress...    Completed Completed 
Retrieving installation information...
In progress...    Completed Completed 
Retrieving installation information...
In progress...    Completed Completed 
Checking dependencies in packages selected for installation...
In progress...    
Can't have a question in command line mode!
LVM operation failed
lvcreate failed for swap0

The installer will now exit...
</snip>

Expected results:
Install on ia64 should work

Additional info:
Link to kickstart file
 http://rhts.redhat.com/testlogs/29744/108331/925143/ks.cfg

Link to Anaconda log file
http://rhts.redhat.com/testlogs/29744/108331/925143/anaconda.log

Link to Sys log file
http://rhts.redhat.com/testlogs/29744/108331/925143/sys.log

Comment 1 Chris Lumens 2008-09-23 14:05:20 UTC

Can you remove cmdline from your kickstart file, re-run the test, and see what question it's trying to ask you?

Comment 2 Jeff Burke 2008-09-23 14:17:07 UTC

Chris,
  Unfortunately I can't. These systems are part of RHTS. To modify the kickstart template files you need root privileges, which I do not have.

  I think the only thing we can do at this point is grab the kickstart and try and reproduce it locally in the anaconda test network. Unless you can work with the engineering operations folks to hack up the kickstart on the RHTS server.

Comment 3 Bill Peck 2008-09-25 19:04:45 UTC

Chris,

Does this log help?

http://rhts.redhat.com/testlogs/30544/110317/937822/anaconda.log

Looking at that, I see a lot of the following repeated:

18:52:26 DEBUG   : self.driveList(): ['hda', 'sda', 'sdb']
18:52:26 DEBUG   : DiskSet.skippedDisks: []
18:52:26 DEBUG   : DiskSet.skippedDisks: []
18:52:26 DEBUG   : done starting mpaths.  Drivelist: ['hda', 'sda', 'sdb']
18:52:26 DEBUG   : adding drive hda to disk list
18:52:26 DEBUG   : adding drive sda to disk list
18:52:26 DEBUG   : adding drive sdb to disk list
18:52:26 DEBUG   : no preexisting size for volume group VolGroup00
18:52:26 DEBUG   :   got pv.size of 7.84423828125, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7.8134765625, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7.8134765625, clamped to 0
18:52:26 DEBUG   :   total space: 0
18:52:26 DEBUG   : no preexisting size for volume group VolGroup00
18:52:26 DEBUG   :   got pv.size of 7.84423828125, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7.8134765625, clamped to 0
18:52:26 DEBUG   :   got pv.size of 5122.25683594, clamped to 5120
18:52:26 DEBUG   :   total space: 5120
18:52:26 DEBUG   : no preexisting size for volume group VolGroup00
18:52:26 DEBUG   :   got pv.size of 7.84423828125, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7.8134765625, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7679.47851562, clamped to 7648
18:52:26 DEBUG   :   total space: 7648
18:52:26 DEBUG   : no preexisting size for volume group VolGroup00
18:52:26 DEBUG   :   got pv.size of 7.84423828125, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7.8134765625, clamped to 0
18:52:26 DEBUG   :   got pv.size of 8958.08935547, clamped to 8928
18:52:26 DEBUG   :   total space: 8928
18:52:26 DEBUG   : no preexisting size for volume group VolGroup00
18:52:26 DEBUG   :   got pv.size of 7.84423828125, clamped to 0
18:52:26 DEBUG   :   got pv.size of 7.8134765625, clamped to 0
18:52:26 DEBUG   :   got pv.size of 9593.47265625, clamped to 9568
18:52:26 DEBUG   :   total space: 9568
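
The "clamped to" values in this log are consistent with rounding each PV size down to a whole number of physical extents. A minimal Python sketch of that rounding, assuming the 32 MB extent size implied by --pesize=32768 in the kickstart file from comment 14 (the function name clamp_to_pe is hypothetical and is not anaconda's actual code):

```python
def clamp_to_pe(size_mb, pe_size_mb=32):
    """Round a size in MB down to a whole number of physical extents.

    pe_size_mb=32 is an assumption taken from --pesize=32768 (KB).
    """
    return int(size_mb // pe_size_mb) * pe_size_mb


# PV sizes copied from the DEBUG lines above.
for pv_size in (7.84423828125, 5122.25683594, 7679.47851562,
                8958.08935547, 9593.47265625):
    print("got pv.size of %s, clamped to %d" % (pv_size, clamp_to_pe(pv_size)))
```

Any PV smaller than one extent clamps to 0, which is why the two roughly 7.8 MB entries contribute nothing to the total.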

Comment 4 Chris Lumens 2008-09-25 19:18:28 UTC

Well, the real error here is found in the lvmout.log file:

  Physical volume '/dev/sdb1' listed more than once.
  Unable to add physical volume '/dev/sdb1' to volume group 'VolGroup00'.

Then when we hit that error, we usually bring up a messageWindow.  In cmdline mode, messageWindow just displays the error message and then says "You can't have a question in command line mode!" because there's nothing more we can do from that situation.

So the question is why we're seeing that lvm error.

Comment 5 Bill Peck 2008-09-25 19:39:31 UTC

> Then when we hit that error, we usually bring up a messageWindow.  In cmdline
> mode, messageWindow just displays the error message and then says "You can't
> have a question in command line mode!" because there's nothing more we can do
> from that situation.

Don't you think you could print the error even in cmdline mode?

Comment 6 Chris Lumens 2008-09-25 19:43:53 UTC

We did:

LVM operation failed
lvcreate failed for swap0

That's the same information you would get in the usual graphical installer too.  The other information to be found is in anaconda.log and lvmout.log.

Comment 7 Bill Peck 2008-09-26 19:22:24 UTC

OK,

After further investigation, I think we have two bugs here.

The lvmout.log for the lvcreate of swap0 shows this:
  Insufficient free extents (60) in volume group VolGroup00: 64 required


The problem referenced in comment 3 has to do with multipath. Both sda and sdb
are the same disk, and anaconda does not handle this correctly.

Comment 8 Chris Lumens 2008-09-26 19:45:50 UTC

*** Bug 460602 has been marked as a duplicate of this bug. ***

Comment 13 Hans de Goede 2008-09-30 14:18:22 UTC

The link to the ks.cfg file in the original description is no longer working (none of the links are). Next time, please attach files instead of linking to volatile locations.

Can you attach ks.cfg? I would like to take a look at what the ks is doing with regard to partition creation.

I think that it is trying to fit more on the disk than will fit. Assuming this worked before, I guess we got stricter about this. Maybe you can even do another test run with the same ks with a somewhat smaller swap0?

Comment 14 Bill Peck 2008-09-30 14:27:39 UTC

Sorry about that. To save space, a cron job gzips all the log files.

zerombr
clearpart --all --initlabel
#PART_DETAILS#
part /boot/efi --fstype vfat --size=100 --ondisk=sda --asprimary
part raid.9 --size=100 --grow --ondisk=sdb
part raid.8 --size=100 --grow --ondisk=sda
raid pv.10 --fstype "physical volume (LVM)" --level=RAID0 --device=md0 raid.8 raid.9
volgroup VolGroup00 --pesize=32768 pv.10
logvol swap --fstype swap --name=swap0 --vgname=VolGroup00 --size=2048
logvol / --fstype ext3 --name=LogVol00 --vgname=VolGroup00 --size=1024 --grow


That's from the ks.cfg file.

<5>SCSI device sda: 71132960 512-byte hdwr sectors (36420 MB)
<5>SCSI device sdb: 71132960 512-byte hdwr sectors (36420 MB)
<6>HP CISS Driver (v 3.6.20-RH2)
<6>ACPI: PCI Interrupt 0000:08:00.0[A] -> GSI 63 (level, low) -> IRQ 55
<6>cciss0: <0x3230> at PCI 0000:08:00.0 IRQ 69 using DAC
<6>      blocks= 234281760 block_size= 512
<6>      heads= 255, sectors= 32, cylinders= 28711
<4>
<6>      blocks= 143305920 block_size= 512
<6>      heads= 255, sectors= 32, cylinders= 17562
<4>
<6>      blocks= 234281760 block_size= 512
<6>      heads= 255, sectors= 32, cylinders= 28711
<4>
<6> cciss/c0d0:
<6>      blocks= 143305920 block_size= 512
<6>      heads= 255, sectors= 32, cylinders= 17562
<4>
<6> cciss/c0d1:

Comment 15 Chris Lumens 2008-09-30 18:36:49 UTC

Hm, removing RAID from the kickstart file makes it work fine.

Comment 16 Chris Lumens 2008-09-30 18:59:48 UTC

Can I get a list of which nightlies worked and which failed this test?  Trying to narrow down the anaconda versions where it changed.

Comment 17 Jeff Burke 2008-09-30 19:10:03 UTC

Chris,
   Unfortunately I cannot give you an exact list. We had other issues that kept us from getting even this far with some other distros.

  Also, the RHTS scheduler may have selected a machine that did not reproduce this issue, i.e. a machine that did not have RAID in its kickstart. In that case the distro would be reported as passed even though it still had the issue.

   I think you will have to do some testing with system(s) that are "known" to fail, binary searching the nightly trees until we find the tree where it started.

Comment 18 Bill Peck 2008-09-30 19:11:08 UTC

Job 31001 is in process

http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=31001

RHEL5.3-Server-20080917.nightly 
RHEL5.3-Server-20080918.nightly 
RHEL5.3-Server-20080919.nightly 
RHEL5.3-Server-20080924.nightly 
RHEL5.3-Server-20080925.nightly 
RHEL5.3-Server-20080926.nightly 
RHEL5.3-Server-20080922.0 
RHEL5.3-Server-20080919.1 
RHEL5.3-Server-20080912.1

Comment 19 Radek Vykydal 2008-10-01 13:04:12 UTC

I reproduced the bug, using the ks below (similar to that from comment #14
only using one physical drive):

zerombr
clearpart --all --initlabel
#PART_DETAILS#
part /boot --ondisk hda --fstype ext3 --size=00100 --asprimary
part raid.9 --size=100 --grow --ondisk=hda
part raid.8 --size=100 --grow --ondisk=hda
raid pv.10 --fstype "physical volume (LVM)" --level=RAID0 --device=md0 raid.8 raid.9
volgroup VolGroup00 --pesize=32768 pv.10
logvol swap --fstype swap --name=swap0 --vgname=VolGroup00 --size=2048
logvol / --fstype ext3 --name=LogVol00 --vgname=VolGroup00 --size=1024 --grow

Which gave the same result:
  Insufficient free extents (60) in volume group VolGroup00: 64 required

... 4 PE missing, which I found in pvdisplay output as "not usable":

  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               VolGroup00
  PV Size               9.90 GB / not usable 150.50 MB
  Allocatable           yes
  PE Size (KByte)       32768
  Total PE              312
  Free PE               60
  Allocated PE          252
  PV UUID               V2EVjW-zrwZ-ksk1-pCGl-GHfi-nFGh-pPr6GX

I wonder what the "not usable" means and why is that, should
our getActualSize take it into account?
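
The extent counts in this comment can be checked with a little arithmetic. The sketch below uses only numbers from the kickstart file and the pvdisplay output above. The reading that the grown root LV was sized for a 316-extent volume group is an inference from those numbers, not something the logs state directly.

```python
PE_SIZE_MB = 32        # volgroup --pesize=32768 (KB)
SWAP_SIZE_MB = 2048    # logvol swap --size=2048


def extents_needed(size_mb, pe_size_mb=PE_SIZE_MB):
    """Number of whole extents needed to hold size_mb (ceiling division)."""
    return -(-size_mb // pe_size_mb)


total_pe = 312         # "Total PE" from pvdisplay
allocated_pe = 252     # "Allocated PE" (the grown root LV, created first)
free_pe = total_pe - allocated_pe

swap_pe = extents_needed(SWAP_SIZE_MB)
shortfall = swap_pe - free_pe

# Inference: root (252) plus swap (64) only fit if the VG had this many extents.
assumed_total_pe = allocated_pe + swap_pe

print("free:", free_pe, "needed for swap0:", swap_pe, "short by:", shortfall)
print("VG size the layout appears to assume:", assumed_total_pe, "PE")
```

The 4-extent shortfall is 128 MB at 32 MB per extent, which, together with the roughly 22.5 MB lost to extent rounding, accounts for the 150.50 MB reported as "not usable".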

Comment 20 Bill Peck 2008-10-01 13:24:29 UTC

According to RHTS job 31001

RHEL5.3-Server-20080917.nightly   InProcess  (I expect it to pass or SEGV)
RHEL5.3-Server-20080918.nightly   Success
RHEL5.3-Server-20080919.nightly   Success
RHEL5.3-Server-20080924.nightly   Fails
RHEL5.3-Server-20080925.nightly   Fails
RHEL5.3-Server-20080926.nightly   Fails
RHEL5.3-Server-20080922.0         Fails
RHEL5.3-Server-20080919.1         Fails
RHEL5.3-Server-20080912.1         Fails - Different known failure SEGV

So 0919.nightly works but 0919.1 does not. Should be easy to tell what changed there?

Comment 21 Chris Lumens 2008-10-01 14:26:51 UTC

RHEL5.3-Server-20080919.nightly has lvm2-2.02.32-4.el5.i386.rpm, whereas RHEL5.3-Server-20080919.1 has lvm2-2.02.40-2.el5.i386.rpm.  These two trees also have different versions of anaconda, but the only thing we did was move from a per-device encryption passphrase to a system-wide one and nothing in that patch looks suspect.

I can reproduce this problem after stripping all the RAID out of the original kickstart file, so this could be a problem with the rebase of the LVM tools.  Thoughts?

Comment 22 Hans de Goede 2008-10-01 14:33:46 UTC

Maybe it's not so much a problem with the new LVM tools as a problem in the interaction between anaconda and those tools?

IOW maybe the output of the lvm command has changed (subtly) and that is biting us?

Comment 23 Radek Vykydal 2008-10-01 15:17:45 UTC

(In reply to comment #19)

I am attaching some more log info. I don't see anything significant
here, but someone else might.

----------------------------------------------

After running
  lvm pvcreate -ff -y -v /dev/md0
by anaconda:

stdout:
  Physical volume "/dev/md0" successfully created
stderr:
  WARNING: Forcing physical volume creation on /dev/md0 of volume group "VolGroup00"
    Set up physical volume for "/dev/md0" with 20755456 available sectors
    Zeroing start of device /dev/md0

pvdisplay output:
  "/dev/md0" is a new physical volume of "9.90 GB"
  --- NEW Physical volume ---
  PV Name               /dev/md0
  VG Name
  PV Size               9.90 GB
  Allocatable           NO
  PE Size (KByte)       0
  Total PE              0
  Free PE               0
  Allocated PE          0
  PV UUID               O7G0bY-8JxY-90Tb-uSxV-gO8N-3OPu-SsTmkc


-----------------------------------------------

After running
  lvm vgcreate -v -An -s 32768 VolGroup00
by anaconda:

stdout:
  Volume group "VolGroup00" successfully created
stderr:
    Wiping cache of LVM-capable devices
    Adding physical volume '/dev/md0' to volume group 'VolGroup00'
  WARNING: This metadata update is NOT backed up

pvdisplay output:
  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               VolGroup00
  PV Size               9.90 GB / not usable 150.50 MB
  Allocatable           yes
  PE Size (KByte)       32768
  Total PE              312
  Free PE               312
  Allocated PE          0
  PV UUID               O7G0bY-8JxY-90Tb-uSxV-gO8N-3OPu-SsTmkc

vgdisplay output:
  --- Volume group ---
  VG Name               VolGroup00
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               9.75 GB
  PE Size               32.00 MB
  Total PE              312
  Alloc PE / Size       0 / 0
  Free  PE / Size       312 / 9.75 GB
  VG UUID               tQJGOU-FCJS-J0iN-9OUg-QSmA-i2gH-aTwbqW


----------------------------------------------------------

fdisk -l output:

Disk /dev/hda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/hda1   *           1          13      104391   83  Linux
/dev/hda2              14         659     5188995   fd  Linux raid autodetect
/dev/hda3             660        1305     5188995   fd  Linux raid autodetect

Disk /dev/md0: 10.6 GB, 10626793472 bytes
2 heads, 4 sectors/track, 2594432 cylinders
Units = cylinders of 8 * 512 = 4096 bytes

Disk /dev/md0 doesn't contain a valid partition table

Comment 24 Alasdair Kergon 2008-10-01 22:57:25 UTC

Try adding --config 'device { md_chunk_alignment = 0 }' to the vgcreate command as a temporary workaround.

We added a performance tweak when LVM volumes are above MD volumes to align the I/O through the stack.  This may involve a small effective size reduction to achieve better alignment.

The above lvm.conf setting disables the tweak.

Worth investigating further though - 128MB lost sounds rather a lot. (It may be that a smaller extent size should be chosen in the kickstart file).

Comment 26 Radek Vykydal 2008-10-02 09:54:38 UTC

to comment #24:

Adding --config 'devices { md_chunk_alignment = 0 }' to vgcreate
(note that the option is 'devices', not 'device' as in comment #24)
worked (installed successfully), with "not usable" reduced to 22.50 MB:

  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               VolGroup00
  PV Size               9.90 GB / not usable 22.50 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              316
  Free PE               0
  Allocated PE          316
  PV UUID               BpbZGx-IDY0-ff4g-UT78-OY4W-EpIc-ue0C8N

to comment #21:

> I can reproduce this problem after stripping all the RAID out of the original
> kickstart file,
I can't, without RAID the ks works for me.
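
For anyone scripting the workaround, here is a hedged Python sketch that assembles the vgcreate command line with the corrected 'devices' section name. The base options are the ones anaconda logged in comment 23. The helper name vgcreate_argv, the position of --config among the options, and passing /dev/md0 as the PV argument are assumptions made for illustration.

```python
import shlex


def vgcreate_argv(vg_name, pe_size_kb, pvs, disable_md_alignment=True):
    """Build an argv list for 'lvm vgcreate' (hypothetical helper)."""
    argv = ["lvm", "vgcreate", "-v", "-An", "-s", str(pe_size_kb)]
    if disable_md_alignment:
        # Section name is 'devices' (plural), as corrected in this comment.
        argv += ["--config", "devices { md_chunk_alignment = 0 }"]
    argv.append(vg_name)
    argv.extend(pvs)
    return argv


argv = vgcreate_argv("VolGroup00", 32768, ["/dev/md0"])
# Print a shell-quoted form for inspection; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing the list to a process launcher without a shell keeps the braces and spaces of the --config value in a single argument.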

Comment 27 Radek Vykydal 2008-10-02 10:12:36 UTC

(In reply to comment #26)

The "not usable" 22.50MB seems just due to extent size now,
and it is clamped accordingly during anaconda actual pv size computations.

Comment 28 Alasdair Kergon 2008-10-02 11:23:24 UTC

What's the value of /sys/block/md0/md/chunk_size ?
(and what do the md tools say the chunk size is i.e. to confirm lvm2 gets the units right)

Comment 29 Radek Vykydal 2008-10-02 12:22:50 UTC

chunk size is 256K

/sys/block/md0/md/chunk_size = 262144

/dev/md0:
        Version : 00.90.03
  Creation Time : Thu Oct  2 13:49:49 2008
     Raid Level : raid0
     Array Size : 10377728 (9.90 GiB 10.63 GB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Thu Oct  2 13:49:49 2008
          State : clean
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

     Chunk Size : 256K

           UUID : d91ac49c:17c55c1c:d2df1b05:4f6ae343
         Events : 0.1

    Number   Major   Minor   RaidDevice State
       0       3        2        0      active sync   /dev/hda2
       1       3        3        1      active sync   /dev/hda3

Comment 30 Hans de Goede 2008-10-02 14:41:11 UTC

Note that LVM is using extents of 32 *Megs* each though, which perfectly explains the wasted 22.5 MB when using the lvm cmdline option to revert to the old behavior.

These 32 MB extents may also be the cause of the "large" loss of 150 MB of usable space with the new lvm behavior. I think the biggest problem here, though, is that anaconda gets the volgroup size wrong with the new lvm behavior.

Comment 32 Chris Lumens 2008-10-02 21:16:36 UTC

I've added that to our lvm.conf blurb that we write out, so anaconda-11.1.2.135-1 should have this workaround included.  That should help out testing while we figure out what we should do for real.

Comment 33 Alasdair Kergon 2008-10-03 01:03:28 UTC

Well I reckon there's a conversion from bytes to sectors missing, so the alignment boundary is 512 times larger than intended.  Need to test the patch I have for this then build a new lvm2 package.

Note to lvm2 developers:

  By default all sizes in lvm2 code are in sectors - exceptions to that should be obvious e.g. if the variable name says bytes.

  A variable should never be sometimes size-in-sectors and sometimes size-in-bytes, depending where you are in the function or calculation e.g. avoid that by using two variables.
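
The missing conversion can be checked against the numbers reported earlier in this bug. The Python sketch below is a reconstruction, not lvm2 code. It assumes pe_start is the default offset rounded up to the alignment boundary, takes the default of 384 sectors from the example in comment 36, and assumes "not usable" is the PV size minus the space covered by whole extents. Function names are hypothetical.

```python
SECTOR_BYTES = 512


def align_up(value, boundary):
    """Round value up to the next multiple of boundary."""
    return -(-value // boundary) * boundary


def layout(pv_sectors, alignment_sectors, extent_sectors, default_pe_start=384):
    """Return (pe_start, total_pe, unusable_mb) for one PV (all assumptions)."""
    pe_start = align_up(default_pe_start, alignment_sectors)
    total_pe = (pv_sectors - pe_start) // extent_sectors
    unusable_mb = (pv_sectors - total_pe * extent_sectors) * SECTOR_BYTES / 2**20
    return pe_start, total_pe, unusable_mb


pv_sectors = 20755456                           # pvcreate output, comment 23
extent_sectors = 32768 * 1024 // SECTOR_BYTES   # 32 MB extents in sectors
chunk_bytes = 262144                            # /sys/block/md0/md/chunk_size

# Bug: the byte value is used as if it were already a sector count.
buggy = layout(pv_sectors, chunk_bytes, extent_sectors)
# Fix: convert bytes to sectors before aligning.
fixed = layout(pv_sectors, chunk_bytes // SECTOR_BYTES, extent_sectors)

print("buggy (pe_start, total_pe, unusable_mb):", buggy)
print("fixed (pe_start, total_pe, unusable_mb):", fixed)
```

Under these assumptions the buggy path gives 312 extents with 150.5 MB unusable, and the fixed path gives 316 extents with 22.5 MB unusable, matching the pvdisplay figures in comments 19 and 26.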

Comment 35 Milan Broz 2008-10-06 10:28:10 UTC

Fix in
lvm2-2.02.40-4.el5
lvm2-cluster-2.02.40-4.el5

Comment 36 Milan Broz 2008-10-06 12:48:26 UTC

I suggest we need a release note like this:

For performance reasons, LVM2 Logical Volumes are now aligned to the MD (Multiple Device) chunk size.

This means that a Logical Volume will always start at an offset which is a multiple of the MD chunk size.

To use the previous mode of alignment, set the md_chunk_alignment variable to 0 in lvm.conf.

---
More info for discussion:
The previous LVM2 version aligned Logical Volumes to 64k (or to the page size if the page size is greater than 64k; also note that there is a metadata area at the beginning of the PV).

Example:
   /dev/md0 has chunk size 512k

   Without md alignment, pe_start is at sector 384
   (see in metadata or using "dmsetup table")

   # dmsetup table
   vg_test-lv: 0 204800 linear 9:0 384

   With md alignment on, it changes pe_start to 1024
   (iow 512k - value is in 512 byte sectors)

   # dmsetup table
   vg_test-lv: 0 204800 linear 9:0 1024

This means that in some situations there can be some unused space (up to one MD chunk size per LV), and creating a volume which previously was misaligned (but fit into the space) can fail (at most 1 extent missing because of the offset increase).

Usually the LVM extent size is a multiple of the MD chunk size, so the real problem is the offset of the first volume.
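
The point about the first offset can be illustrated with the 512k example above. This is a Python sketch only: the 32 MB extent size is borrowed from this bug's kickstart file rather than from the example, and the helper names are made up.

```python
SECTOR_BYTES = 512


def align_up(value, boundary):
    """Round value up to the next multiple of boundary."""
    return -(-value // boundary) * boundary


chunk_sectors = 512 * 1024 // SECTOR_BYTES          # 512k MD chunk
extent_sectors = 32 * 1024 * 1024 // SECTOR_BYTES   # assumed 32 MB extent

unaligned_pe_start = 384
aligned_pe_start = align_up(unaligned_pe_start, chunk_sectors)


def extent_offsets(pe_start, count):
    """Start sector on the PV of the first `count` extents."""
    return [pe_start + n * extent_sectors for n in range(count)]


def all_chunk_aligned(offsets):
    return all(offset % chunk_sectors == 0 for offset in offsets)


print("aligned pe_start:", aligned_pe_start)
print("old layout aligned:", all_chunk_aligned(extent_offsets(unaligned_pe_start, 4)))
print("new layout aligned:", all_chunk_aligned(extent_offsets(aligned_pe_start, 4)))
```

Because the extent size is a multiple of the chunk size, aligning pe_start once aligns every extent boundary, and leaving it at 384 misaligns all of them.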

The importance of alignment is that higher-level code (like ext3) usually optimizes writes according to the MD chunk size, and if the LVM layer doesn't respect this alignment, IO requests are split into pieces, and the underlying RAID has to compute another XOR for the next chunk and runs more IO requests than necessary.

The question is whether anaconda needs to change its partitioning code (which computes the size itself) or will disable this behaviour for the next release...

Comment 37 Radek Vykydal 2008-10-06 14:01:48 UTC

With fixed lvm2 package (lvm2-2.02.40-4.el5) used with ks
from comment #19, I got expected results.
With chunks of size 256k, and chunk alignment on, the offset
is 512 sectors (256k).

# dmsetup table
VolGroup00-LogVol00: 0 16515072 linear 9:0 512
VolGroup00-swap0: 0 4194304 linear 9:0 16515584

# vgdisplay
  --- Volume group ---
  VG Name               VolGroup00
  System ID             
  Format                lvm2 
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               9.88 GB
  PE Size               32.00 MB
  Total PE              316
  Alloc PE / Size       316 / 9.88 GB
  Free  PE / Size       0 / 0   
  VG UUID               ATATgq-XxRg-QD3V-sX0n-5SqA-MRtx-7qMcKJ

# pvdisplay
  --- Physical volume ---
  PV Name               /dev/md0
  VG Name               VolGroup00
  PV Size               9.90 GB / not usable 22.50 MB
  Allocatable           yes (but full)
  PE Size (KByte)       32768
  Total PE              316
  Free PE               0
  Allocated PE          316
  PV UUID               xCGS42-x1Gr-BrBX-dAt9-cggS-IXBV-fPo1gL
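
The dmsetup lines above can be cross-checked against the extent counts with a few lines of Python. This is a sketch that handles only the simple single-segment "linear" lines shown here. The parser name is hypothetical and the 65536-sector extent size follows from the 32 MB PE size.

```python
EXTENT_SECTORS = 32 * 1024 * 1024 // 512   # 32 MB PE size in 512-byte sectors


def parse_linear(line):
    """Parse 'name: start length linear major:minor offset' into a dict."""
    name, rest = line.split(":", 1)
    start, length, target, device, offset = rest.split()
    if target != "linear":
        raise ValueError("only linear targets are handled in this sketch")
    return {"name": name, "start": int(start), "length": int(length),
            "device": device, "offset": int(offset)}


table = """\
VolGroup00-LogVol00: 0 16515072 linear 9:0 512
VolGroup00-swap0: 0 4194304 linear 9:0 16515584
"""

root, swap = [parse_linear(line) for line in table.splitlines()]

for lv in (root, swap):
    print(lv["name"], "offset", lv["offset"],
          "extents", lv["length"] // EXTENT_SECTORS)

# swap0 should begin exactly where LogVol00 ends on the PV.
print("contiguous:", root["offset"] + root["length"] == swap["offset"])
```

The two volumes come to 252 and 64 extents, which add up to the 316 "Total PE" shown by vgdisplay, and the first one starts at sector 512, i.e. one 256k chunk.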

Comment 38 Radek Vykydal 2008-10-06 14:04:09 UTC

(In reply to comment #36)

> The question is whether anaconda needs to change its partitioning code (which
> computes the size itself) or will disable this behaviour for the next release...

I think that if the change to the anaconda code only needs to be something like
accounting for the chunk size when computing the actual (available) size of a
PV above RAID, it should be easy to make.

Comment 39 Hans de Goede 2008-10-06 14:14:22 UTC

(In reply to comment #38)
> (In reply to comment #36)
> 
> > The question is whether anaconda needs to change its partitioning code (which
> > computes the size itself) or will disable this behaviour for the next release...
> 
> I think that if the change to the anaconda code only needs to be something like
> accounting for the chunk size when computing the actual (available) size of a
> PV above RAID, it should be easy to make.

I think it can be as easy as just subtracting 1 LVM extent size from the computed VG size if it's on top of RAID, at least if Milan Broz is correct that we lose at most 1 extent. Milan, what happens if I have 4 disks and create 2 raid0 pairs using these 4 disks and then do one volume group over those 2 raid "arrays"? Can we then still lose at most 1 extent compared to the old situation?

Comment 41 Milan Broz 2008-10-06 14:28:45 UTC

(In reply to comment #39)
> I think it can be as easy as just subtracting 1 LVM extent size from the computed
> VG size if it's on top of RAID, at least if Milan Broz is correct that we lose
> at most 1 extent. Milan, what happens if I have 4 disks and create 2 raid0 pairs
> using these 4 disks and then do one volume group over those 2 raid "arrays"? Can
> we then still lose at most 1 extent compared to the old situation?

Well, I expect that the LVM extent size is a multiple of the MD chunk size, so the problem is *only* with the first offset on the PV (all subsequent LVs are aligned automatically, no space lost).

pe_start is now a property of every PV, so if there are multiple underlying MD PVs, each of them can have an aligned offset.
So if I count correctly, in the worst case we can lose at most 1 LVM extent per underlying MD device.

Comment 42 Radek Vykydal 2008-10-06 15:18:00 UTC

(In reply to comment #39)
> (In reply to comment #38)
> > (In reply to comment #36)
> > 
> > > The question is whether anaconda needs to change its partitioning code (which
> > > computes the size itself) or will disable this behaviour for the next release...
> > 
> > I think that if the change to the anaconda code only needs to be something like
> > accounting for the chunk size when computing the actual (available) size of a
> > PV above RAID, it should be easy to make.
> 
> I think it can be as easy as just subtracting 1 LVM extent size from the computed
> VG size if it's on top of RAID, at least if Milan Broz is correct that we lose
> at most 1 extent.

Comparing the default PE size and chunk size, the case where we lose
1 PE due to chunk alignment shouldn't be too frequent, so perhaps it is
overkill to always reduce the available space by one PE. We can just subtract
1 chunk size (perhaps with some reserve) before aligning
(clamping) the available space to PE size. That is easy too.

> Milan, what happens if I have 4 disks and create 2 raid0 pairs
> using these 4 disks and then do one volume group over those 2 raid "arrays"? Can
> we then still lose at most 1 extent compared to the old situation?

In anaconda we compute the actual available size of a VG
(e.g. with reductions due to PE size) by computing it
for each PV of the VG, and we would do the same for chunk alignment
(which is a property of the physical volume, as Milan Broz said above).
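
The two adjustments discussed in this exchange can be compared side by side. The Python sketch below is an illustration under stated assumptions and not anaconda's real code: sizes are in MB, the function names are invented, and the sample PV is the 20755456-sector /dev/md0 from comment 23 with a 256k chunk and 32 MB extents.

```python
def clamp_to_pe(size_mb, pe_mb):
    """Round a size down to a whole number of physical extents."""
    return int(size_mb // pe_mb) * pe_mb


def pv_usable_chunk_reserve(pv_size_mb, pe_mb, chunk_mb, on_md):
    """Comment 42 idea: subtract one chunk on MD PVs, then clamp."""
    reserve = chunk_mb if on_md else 0
    return clamp_to_pe(pv_size_mb - reserve, pe_mb)


def pv_usable_pe_reserve(pv_size_mb, pe_mb, on_md):
    """Comment 39 idea: clamp, then give up one whole extent on MD PVs."""
    usable = clamp_to_pe(pv_size_mb, pe_mb)
    return max(usable - pe_mb, 0) if on_md else usable


def vg_usable(pvs, pe_mb, chunk_mb):
    """Sum the per-PV result; pvs is a list of (size_mb, on_md) pairs."""
    return sum(pv_usable_chunk_reserve(size, pe_mb, chunk_mb, on_md)
               for size, on_md in pvs)


md0_mb = 20755456 * 512 / 2**20   # PV size in MB
print("chunk reserve:", pv_usable_chunk_reserve(md0_mb, 32, 0.25, True))
print("PE reserve:   ", pv_usable_pe_reserve(md0_mb, 32, True))
print("two MD PVs:   ", vg_usable([(md0_mb, True), (md0_mb, True)], 32, 0.25))
```

For this PV the chunk reserve keeps all 316 extents that the fixed lvm2 reports in comment 37, while the one-extent reserve would give up one of them. Because the adjustment is applied per PV, a volume group over two MD arrays is handled by the same sum.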

Comment 46 errata-xmlrpc 2009-01-20 21:34:48 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0164.html

Comment 48 Vijay N. Majagaonkar 2010-09-27 11:37:07 UTC

Hi all,
I am new to anaconda. Can somebody please tell me what the problem is or point me to a way to fix this? Here is my .ks file:

anaconda version : anaconda-11.1.2.209-1.el5
lvm2 version : lvm2-2.02.56-8.el5.x86_64

[ks]

# zerombr removes invalid partition tables which may exist
zerombr
clearpart --all --initlabel
# /maint is the maintenance partition
partition /maint --asprimary --fstype=ext3 --size=5120
partition /boot --asprimary --fstype=ext3 --size=128
partition pv.01 --size=1 --grow
volgroup system_vg pv.01
logvol / --vgname=system_vg --fstype=ext3 --size=2048 --name=root_vol
logvol /tmp --vgname=system_vg --fstype=ext3 --size=2048 --name=tmp_vol
logvol /var --vgname=system_vg --fstype=ext3 --size=2048 --name=var_vol
logvol swap --vgname=system_vg --fstype=swap --recommended --name=swap_vol
logvol /opt --vgname=system_vg --fstype=ext3 --size=2048 --name=opt_vol

[/ks]

[LOG]

Running pre-install scripts
Retrieving installation information...
In progress...    Completed Completed 
Checking dependencies in packages selected for installation...
In progress...    
Can't have a question in command line mode!
LVM operation failed
lvcreate failed for tmp_vol

The installer will now exit...
custom ['_Reboot']

[/LOG]

Comment 49 Vijay N. Majagaonkar 2010-09-29 06:07:27 UTC

I was not able to find the root cause of this issue, but it was solved by increasing the HDD size for the VM.

Note: the same size worked with a lower version of anaconda.
