Bug 783841
| Summary: | [RHEL6.2] System fails to install; hangs while formatting disks | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Jeff Burke <jburke> |
| Component: | anaconda | Assignee: | David Lehman <dlehman> |
| Status: | CLOSED ERRATA | QA Contact: | Release Test Team <release-test-team-automation> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 6.2 | CC: | agk, arozansk, atodorov, bpeck, coughlan, davids, dlehman, dwysocha, emcnabb, gozen, heinzm, jarod, jbrassow, jhutar, jpazdziora, jstancek, jstodola, matt, mbroz, mganisin, mgrigull, msnitzer, pbunyan, pcassaro, prajnoha, prockai, soft-linux-drv, thornber, yoguma, zkabelac |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | anaconda-13.21.158-1 | Doc Type: | Bug Fix |
| Doc Text: | Cause: The installer fails to remove old metadata from complex storage devices such as LVM and software RAID. The issue is triggered by either the kickstart "clearpart" command or one of the installer's automatic partitioning options that clear old data from the system's disks. Consequence: The bug manifests as a deadlock in the LVM tools, which causes the installation process to hang. Fix: Two measures were taken to address the issue. First, the LVM commands in the udev rules packaged with the installer were changed to use a less restrictive locking method. Second, when reinitializing a disk, the installer now explicitly removes partitions from the disk instead of simply creating a new partition table on top of the old contents. Result: The deadlock/freeze no longer occurs. | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-06-20 12:51:19 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 545868 | ||
Description
Jeff Burke
2012-01-22 17:59:00 UTC
Would like the LVM team to look into this one (based on comment #1), so reassigning for now.

(..you'll probably need to use "sshd asknetwork" on the kernel cmd line to have the network set up first and have the sshd ready in anaconda to get in there easily)

(In reply to comment #10)
> (..you'll probably need to use "sshd asknetwork" on kernel cmd line to have the
> network setup first and have the sshd ready in anaconda to get in there easily)

Were you successful in reproducing the issue?

(In reply to comment #11)
> Were you successful in reproducing the issue?

Unfortunately not...

OK, I'm quite curious what it is. We're already trying to reproduce this (and hopefully with the debug as well); Marian Csontos helped me with the Beaker/kickstart for anaconda. Running it just now...

The difference between F16 and RHEL6 is that F16 uses GPT. GPT requires that not only the beginning but also the end of the device be properly wiped for the clearpart command. If this does not happen, it is quite possible that the kernel sees the old partition table and the installation crashes. Adding dlehman just to verify that RHEL6 anaconda properly wipes GPT (I don't think so).

I was seeing something similar during development for F16. To try with the patch that fixed the deadlock/hang I was seeing, add the following to your boot command line:

updates=http://dlehman.fedorapeople.org/updates/updates-783841.0.img

That updates.img should work with any RHEL6 from 6.2 GA on.

Re-running the test that previously failed, with the updates image, allowed RHEL 6.2 to be installed after an F16 install: https://beaker.engineering.redhat.com/jobs/190017

David says that the updates image contains changed anaconda udev rules (/lib/udev/rules.d/70-anaconda.rules) that replace "--ignorelockingfailure" with "--config 'global {locking_type=4}'" for the LVM commands called within those rules.
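To make David's description concrete, the rule change would look roughly like this; this is a sketch, since the full rule in 70-anaconda.rules carries more options than shown here:

# before:
ENV{ID_FS_TYPE}=="LVM2_member", IMPORT{program}="$env{ANACBIN}/lvm pvs --ignorelockingfailure ..."
# after:
ENV{ID_FS_TYPE}=="LVM2_member", IMPORT{program}="$env{ANACBIN}/lvm pvs --config 'global {locking_type=4}' ..."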
I looked at the machine while the installation was hung and noticed that there's a duplicate VG there (installing RHEL 6.2 over F16):
-bash-4.1# vgs
WARNING: Duplicate VG name vg_hp8100e01: Existing VSddhS-FPZa-ZsGK-Ec50-M2aD-xec9-YnIMv5 (created here) takes precedence over wA8FWF-cKBb-oN2D-ZAEO-LKYn-5vdU-Fs3sIM
WARNING: Duplicate VG name vg_hp8100e01: Existing VSddhS-FPZa-ZsGK-Ec50-M2aD-xec9-YnIMv5 (created here) takes precedence over wA8FWF-cKBb-oN2D-ZAEO-LKYn-5vdU-Fs3sIM
WARNING: Duplicate VG name vg_hp8100e01: Existing wA8FWF-cKBb-oN2D-ZAEO-LKYn-5vdU-Fs3sIM (created here) takes precedence over VSddhS-FPZa-ZsGK-Ec50-M2aD-xec9-YnIMv5
WARNING: Duplicate VG name vg_hp8100e01: Existing VSddhS-FPZa-ZsGK-Ec50-M2aD-xec9-YnIMv5 (created here) takes precedence over wA8FWF-cKBb-oN2D-ZAEO-LKYn-5vdU-Fs3sIM
VG #PV #LV #SN Attr VSize VFree
vg_hp8100e01 1 1 0 wz--n- 232.39g 182.39g
vg_hp8100e01 1 3 0 wz--n- 232.38g 0
The VG with 3 LVs is the VG from F16 (containing the root LV, swap LV and home LV) and the VG with 1 LV is the new one created by RHEL installation (that's just in progress). And these two are mixed together.
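For context, the two same-named VGs can still be told apart by UUID using standard LVM reporting options (a sketch, not taken from the original report):

# list VGs with their UUIDs and LV counts, so the old F16 VG and the
# new RHEL 6.2 VG can be distinguished despite the identical name
vgs -o vg_name,vg_uuid,lv_count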
So it looks like the LVM command called from the udev rule tries to repair the VG it finds from the previous installation, and this seems to be the problem here. Using locking_type=4 avoids it ("read-only" locking - not changing any metadata).
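For reference, locking type 4 is LVM's read-only locking. Set globally in lvm.conf it would look like the snippet below, which is the same setting that the --config override quoted earlier applies on the command line:

global {
    # locking type 4: read-only locking; operations that would change
    # on-disk metadata are refused
    locking_type = 4
}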
I think anaconda already tries to erase some parts of the disk before installation to avoid such problems, but it seems that's not complete. It should probably use wipefs to do that reliably. I'm not quite sure what exactly anaconda uses today - that's a question for the anaconda team. David?
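As an example of the wipefs approach, assuming /dev/sda2 is the stale PV partition (the device name here is just a placeholder):

wipefs /dev/sda2       # list detected signatures and their offsets
wipefs -a /dev/sda2    # erase all detected signatures (LVM, RAID, filesystem)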
Anyway, I'll try to dig deeper to see what exactly happens there that makes it hang completely.
(I think the udev worker just fails to process any further rules if any command in the run queue fails, which also means it fails to send the notification about completing the udev rule processing, and so we end up waiting forever in that lvcreate on the semaphore...)
Another interesting part is this one (the VG uuids differ now, as I reinstalled it again):

-bash-4.1# pvs
WARNING: Duplicate VG name vg_hp8100e01: Existing 9JIPH2-jxQ8-S6ia-MipJ-QCS6-Tj9V-ZZINBn (created here) takes precedence over 1xKiNC-RbN3-zxTg-ZBxU-zsu8-SC9k-aZXW1m
WARNING: Duplicate VG name vg_hp8100e01: Existing 9JIPH2-jxQ8-S6ia-MipJ-QCS6-Tj9V-ZZINBn (created here) takes precedence over 1xKiNC-RbN3-zxTg-ZBxU-zsu8-SC9k-aZXW1m
PV         VG           Fmt  Attr PSize   PFree
/dev/dm-0  vg_hp8100e01 lvm2 a--  232.38g      0
/dev/sda2  vg_hp8100e01 lvm2 a--  232.39g 182.39g

The dm-0 is the newly created lv_root in RHEL 6.2, which happens to fit the PV created in F16 perfectly and thus unveils the old metadata through a new device (dm-0, the lv_root). So the lv_root acts like a PV that just appeared. Very nice bug!

F16 uses this partition layout by default:

Number  Start     End         Size        File system  Name  Flags
 1      2048s     4095s       2048s                          bios_grub
 2      4096s     1028095s    1024000s    ext4         ext4  boot
 3      1028096s  488396799s  487368704s                     lvm

RHEL 6.2 uses this one:

Number  Start     End         Size        Type     File system  Flags
 1      2048s     1026047s    1024000s    primary  ext4
 2      1026048s  488396799s  487370752s  primary               lvm

The "lvm" partition starts at 1028096s in F16 and at 1026048s in RHEL 6.2. That's a difference of 2048 sectors (2048 × 512 B = 1 MB), which is exactly the default PE start offset in LVM:

PV         1st PE
/dev/dm-0  1.00m
/dev/sda2  1.00m

A perfect fit :) The new lv_root's data area begins at 1026048s + 2048s = 1028096s, exactly where the old F16 PV label used to sit.

Though we wipe the first KB of a newly created LV (the '--zero y', which is used by default), this does not help, because we need to create the mapping/LV first (which generates a CHANGE udev event) and only after that do we wipe it. So there is a window in which udev sees the LV unwiped, and any lvm command run from within udev rules then sees the old PV at that offset, which is the source of the confusion. So it's even a race.

This has been a known issue for some time. The correct activation for an LV that should be zeroed is to activate it as a private device, zero it, deactivate it, and then activate it as a regular accessible device. The problem here is that it would be noticeably slower. The ideal solution is to zero the PV area directly, before activation of the LV, which is the most efficient approach. I guess this will be resolved with the 'ddlv' idea.

There is also an anaconda bug here. We normally run 'wipefs -a' explicitly to clear PVs before destroying the device they are on, but in the case of 'clearpart --all --initlabel' we do not even look at the partitions. This is an ill-advised shortcut and is the basic cause of this bug IMO. I propose we reassign this bug to anaconda and target it for 6.3.

Yes, those metadata remnants really need to be wiped properly. The reason it hangs lies in this line in 70-anaconda.rules:
# probe metadata of LVM2 physical volumes
ENV{ID_FS_TYPE}=="LVM2_member", IMPORT{program}="$env{ANACBIN}/lvm pvs ...
Now, just as an example, consider that we have /dev/sda and an LV on top of it.
When the LV is created, a CHANGE udev event is generated. That event fires a "blkid" call in the udev rules to scan for any metadata on the device. But since a new LV is normally clear, there is no metadata on it and hence ENV{ID_FS_TYPE} remains blank.
And so the "pvs" command is not run on the newly created LV.
But if we have such an unhappy offset, where the start of the newly created LV fits exactly the start of a PV label that was there some time before, ENV{ID_FS_TYPE} is (correctly) set to "LVM2_member" and so it fires the "pvs" command.
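In other words, the probe behind those rules effectively amounts to the following check (illustrative; the actual rules use udev's probing mechanism rather than this literal command):

blkid -o value -s TYPE /dev/mapper/vg-lvol0
# prints "LVM2_member" when the stale PV label lines up with the LV start;
# prints nothing when the LV start is clean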
So we end up with this sequence:
1. calling lvcreate, taking an lvm lock
2. lvcreate creating the vg-lvol0 mapping (to wipe the first KB of it), generating the CHANGE event
3. *before* wiping the first KBs of the new LV within lvcreate, those udev rules are processed
4. these rules see the old PV label on the newly created LV, firing the pvs rule
5. pvs tries to take the lvm lock, but it has to wait for lvcreate to release it first!
6. lvcreate continues until it hits the "sync_local_dev_names" call, which waits on a semaphore for the relevant udev rules to be processed (the notification through the semaphore is sent by one of the last udev rules to be processed)
7. lvcreate waits for the udev rules, pvs waits for lvcreate, and so the udev worker can't process any further rules - deadlock!
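Schematically, the two waiting parties look like this (an illustration of the sequence above, not literal output; the lvcreate arguments are made up):

lvcreate -n lv_root -L 50G vg_hp8100e01
    # holds the VG lock; blocks in sync_local_dev_names waiting on the
    # udev notification semaphore
$env{ANACBIN}/lvm pvs ...    # run from 70-anaconda.rules for the CHANGE event
    # blocks trying to take the same VG lock; neither side can proceed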
We could avoid this situation by directly wiping the start of the new LV without needing to activate it first (the idea Zdenek mentioned in comment #26). I'm not quite sure whether we can easily change the lvm locking logic here to avoid this...
Wiping those parts of the disk before making new LVs is the most straightforward way to avoid this (as is using locking_type=4); a sketch follows.
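A minimal sketch of that direct wipe, hard-coding the offsets from the layout above (hypothetical; real code would compute the data-area offset from the VG metadata rather than hard-coding it):

# zero the first 1 MiB of the new LV's data area directly on the PV,
# before the LV mapping is created and the CHANGE event fires
dd if=/dev/zero of=/dev/sda2 bs=512 seek=2048 count=2048 conv=fsync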
Reassigning to anaconda team...
(We should also consider the feasibility of the idea in comment #26 on lvm side as well!).
atodorov, Can we get a qa_ack+ on this?

This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.

*** Bug 745421 has been marked as a duplicate of this bug. ***

Here is an updated updates image, against anaconda-13.21.157-1:

http://dlehman.fedorapeople.org/updates/updates-783841.1.img

This one should explicitly destroy any devices and the metadata they contain, including lvm-on-md, luks, &c.

David, I want to make sure that I have this correct. This latest update is for RHEL 6.3, since it is built on top of anaconda-13.21.157-1. Will the next nightly tree we get from rel-eng have a version of anaconda with this fix applied? If so, we should not need an updates image for 6.3. Will we need a second updates image for RHEL 6.2, one based on anaconda-13.21.149-1? Regards, Jeff

Alright, I realize now that this is a problem that is impossible to solve without un-solving a different problem. The patch I added to the updates image in comment 44 breaks unattended kickstarts by prompting for LUKS passphrases. So it is impossible to have a completely unattended install and also guarantee all metadata has been removed from all disks, unless you want to literally zero the entire disks. We have several pieces from which we can build a solution:

1. lvm locking change in udev rules: prevents the problem but is considered more of a workaround than a real solution to the problem of stale metadata.

2. properly clearing disklabels, which allows us to also clear metadata from the partitions: this does not clear metadata from within the partitions, for example LV/VG metadata within a PV partition.

3. try to explicitly destroy all devices: the problem with this is that it breaks unattended installs on systems that contain encrypted block devices.

4. try to explicitly destroy all devices, except for those hidden by encryption: this gets quite close to a total solution but can still hit these lvm deadlocks if encryption was used in the previous install.

Personally, I am starting to think that the best solution is to keep the patch mentioned in comment 29 (item 2 above) and also add the lvm locking change (item 1 above) to cover any additional cases in which the lvm tools might get deadlocked.

(In reply to comment #46)
> 1. lvm locking change in udev rules
>
> prevents the problem but is considered more of a workaround
> than a real solution to the problem of stale metadata

Just FYI, we're tracking the problem of race-prone wiping with new lvm2 bug #796200, destined for 6.4 to provide a better solution.
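For reference, the unattended path discussed throughout this bug is driven by a kickstart fragment along these lines (illustrative; a real kickstart would carry many more directives):

clearpart --all --initlabel   # clear all disks and reinitialize disk labels
autopart                      # let the installer choose the partition layout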
Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.
New Contents:
Cause: The installer fails to remove old metadata from complex storage devices such as LVM and software RAID. The issue is triggered by either the kickstart "clearpart" command or one of the installer's automatic partitioning options that clear old data from the system's disks.
Consequence: The bug manifests as a deadlock in the LVM tools, which causes the installation process to hang.
Fix: Two measures were taken to address the issue. First, the LVM commands in the udev rules packaged with the installer were changed to use a less restrictive locking method. Second, when reinitializing a disk, the installer now explicitly removes partitions from the disk instead of simply creating a new partition table on top of the old contents.
Result: The deadlock/freeze no longer occurs.
*** Bug 801709 has been marked as a duplicate of this bug. ***

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0782.html

opening access...