Bug 213208

Summary: Unable to reload kernel partition table while other partitions are in use
Product: Red Hat Enterprise Linux 4
Reporter: Shmuel Protter NDS Israel <hpuxconsulting>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
Status: CLOSED WONTFIX
Severity: medium
Priority: medium
Version: 4.4
CC: agk, dwysocha, hpuxconsulting, mbroz
Hardware: All
OS: Linux
URL: http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1072302
Last Closed: 2010-05-14 22:32:12 UTC

Description Shmuel Protter NDS Israel 2006-10-31 08:21:20 UTC
Description of problem:


Version-Release number of selected component (if applicable):
lvm latest

How reproducible:
Use fdisk/parted to create a partition of type 8E (Linux LVM) on a disk.
pvcreate
vgcreate
lvcreate
Then use fdisk to add a partition to the same disk.
fdisk fails with error 16; LVM fails to lock.

Normal partitioning with type 83 (Linux) is still possible.


Actual results:
A reboot is required before the kernel picks up the new partition table.

Expected results:
I expect LVM to lock the disk and update the partition table.


Additional info:
See ITRC thread
Make requests for additional information there.

Note: This is a critical problem and requires a fix. We have 400 support
licenses from RH and this is nearly impossible to work around.

Comment 1 Shmuel Protter NDS Israel 2006-10-31 08:36:49 UTC
Environment: RHEL 4.4, RHCS 4.4, GFS with clvmd, stock SMP kernel.
I use fdisk to create a new partition on shared storage.
fdisk reports that the kernel has not picked up the new table.

No problem, right? Just run partprobe.

partprobe does nothing.

pvcreate and vgcreate work without protest, but lvcreate fails:

lvcreate -L +1G vgsep

Error locking on node golan2: Internal lvm error, check syslog
Failed to activate new LV.

This is what the lvscan looks like:

inactive '/dev/vgsep/lvol0' [1.00 GB] inherit
inactive '/dev/vgsep/lvol1' [1.00 GB] inherit
ACTIVE '/dev/vgsch/lvol0' [1.00 GB] inherit
ACTIVE '/dev/vgssr/lvol0' [1.00 GB] inherit
ACTIVE '/dev/vg00/vg00' [4.00 GB] inherit
ACTIVE '/dev/vg00/lvol4' [1.00 GB] inherit
ACTIVE '/dev/vg00/lvol3' [4.00 GB] inherit
ACTIVE '/dev/vg00/lvol5' [4.00 GB] inherit
ACTIVE '/dev/vg00/lv02' [8.00 GB] inherit

If I reboot, the logical volumes are marked ACTIVE and all is well.
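
The lvscan output above can be filtered to pick out the volumes that failed to activate, which may be useful in a script that has to decide whether to carry on. A minimal sketch, assuming the output format shown above; the function name inactive_lvs is made up for illustration:

```shell
# Print the device path of every LV that lvscan reports as inactive.
# Reads lvscan-style output on stdin. The function name is illustrative only.
inactive_lvs() {
    # q holds a single quote so the awk program can strip it from the path
    awk -v q="'" '$1 == "inactive" { gsub(q, "", $2); print $2 }'
}

# Two lines copied from the output above; on a live system: lvscan | inactive_lvs
printf '%s\n' \
    "inactive '/dev/vgsep/lvol0' [1.00 GB] inherit" \
    "ACTIVE '/dev/vgsch/lvol0' [1.00 GB] inherit" | inactive_lvs
```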

The issue is that I'm trying to help with quality control on a script that creates
the partitions, volume groups and logical volumes. We don't want to reboot
because the script loses control. Yes, we could script it to finish after boot,
but that's disruptive.

How do I kick the kernel in the can and get it to accept the update?

What I've tried:
pvscan
re-running fdisk

 Device files are not created in /dev/mapper

I want to force that process.

fdisk warned:

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.

partprobe did not help.
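
Whether the kernel has actually picked up a new partition can be checked against /proc/partitions, which lists the partitions the kernel currently knows about. A minimal sketch; the function name and the partition name sdX2 are placeholders, not taken from the report:

```shell
# Succeeds if the kernel's partition list contains the given name.
# $1 = partition name without /dev/ (placeholder: sdX2)
# $2 = file in /proc/partitions format (defaults to /proc/partitions)
kernel_knows_partition() {
    awk -v p="$1" '$4 == p { found = 1 } END { exit !found }' \
        "${2:-/proc/partitions}" 2>/dev/null
}

if kernel_knows_partition sdX2; then
    echo "kernel sees sdX2"
else
    echo "kernel does not see sdX2 yet; the old table is still in use"
fi
```

A script could run this check after fdisk and stop early instead of continuing into pvcreate on a partition the kernel has not seen.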

...
The fdisk error can be avoided by not mounting anything on the shared
storage until all fdisk work is complete. Mounting before that point apparently
triggers the bug.

Once the bug has occurred the kernel can't lock the lvm and we're pretty much done.

Testing the following plan:

Complete all fdisk work first (which totally breaks the script I'm doing QA on),
then see how well partitioning works after that.

I have confirmed that the problem does not occur with non-LVM partitions.

////
This is clearly an lvm/clvmd issue.

If you do all the parted/fdisk work at once instead of sequentially, as the
script does, it works out nicely.

Once you trigger a failed kernel read of the partition table, you are done.

This being a clustered installation we can't just upgrade the kernel because
this has been proven to destroy clustering functionality, especially on DL380
servers.

I see no patches available either.


Comment 2 Alasdair Kergon 2006-11-01 18:47:17 UTC
Unfortunately the kernel cannot handle changes to partition tables while
partitions are in use - you have to reboot.

If you're creating everything at once, you can add 'lvchange -an' after the
lvcreate steps to deactivate the logical volumes so they are not in use when the
next partition is created.
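
The ordering suggested above might look like the following in a provisioning script. This is a sketch under assumptions: the VG and LV names, the 1G size and the /dev/sdX1 device are placeholders, and the run helper is hypothetical. It only prints the commands unless DRY_RUN is set to 0.

```shell
# Sketch only. Names, size and device are placeholders; run() is a hypothetical helper.
DRY_RUN=1   # set to 0 to execute the commands (requires root)

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create PV, VG and LV on one partition, then deactivate the LV so it is
# not held open when the next partition is added to the same disk.
create_then_deactivate() {
    pv=$1; vg=$2; lv=$3
    run pvcreate "$pv" &&
    run vgcreate "$vg" "$pv" &&
    run lvcreate -L 1G -n "$lv" "$vg" &&
    run lvchange -an "$vg/$lv"
}

create_then_deactivate /dev/sdX1 vgsep lvol0
```

Anything else holding the disk open, such as a mounted filesystem on another partition, would presumably still keep the partition table busy.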

Workarounds might be: just have one big partition and use LVM to divide it into
LVs; investigate a temporary use of 'kpartx' (in the device-mapper-multipath
package) until the next reboot.
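
The kpartx idea could be tried along these lines. kpartx reads the partition table from the disk itself and creates device-mapper mappings for the partitions, so it does not depend on the kernel re-reading the table. A sketch only: /dev/sdX is a placeholder and the function name is hypothetical. By default it just prints the commands.

```shell
# Sketch only: /dev/sdX is a placeholder for the shared disk.
# Prints the commands by default; pass "apply" as the second argument
# to run them (requires root).
map_new_partitions() {
    disk=$1
    mode=${2:-print}
    # -l lists the mappings kpartx would create, -a adds them
    for opt in -l -a; do
        if [ "$mode" = "apply" ]; then
            kpartx "$opt" "$disk" || return 1
        else
            echo "kpartx $opt $disk"
        fi
    done
}

map_new_partitions /dev/sdX
```

The mappings appear under /dev/mapper rather than as the usual partition device nodes, so later pvcreate calls would need to point at those paths until the next reboot.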

Comment 3 Alasdair Kergon 2010-05-14 22:32:12 UTC
I hope you found a suitable workaround, but - years later - there still seems little appetite for fixing this in the upstream kernel.  It won't now get fixed in RHEL4, anyway, as a change like this would have to be accepted upstream first, so I'll close this.