Bug 1656424 - LVM2 after last update (EL7.6) used wrong device when activating VG on boot

Field | Value
---|---
Product | Red Hat Enterprise Linux 7
Component | lvm2
lvm2 sub component | Default / Unclassified
Status | CLOSED ERRATA
Severity | unspecified
Priority | high
Version | 7.6
Keywords | ZStream
Target Milestone | rc
Hardware | Unspecified
OS | Unspecified
Fixed In Version | lvm2-2.02.184-1.el7
Reporter | Milan Kerslager <milan.kerslager>
Assignee | LVM and device-mapper development team <lvm-team>
QA Contact | cluster-qe <cluster-qe>
CC | agk, cmarthal, gordon.messmer, heinzm, jbrassow, mcsontos, milan.kerslager, msnitzer, pasik, prajnoha, rbednar, rhandlin, teigland, zkabelac
Cloned to | 1657640 (view as bug list)
Bug Blocks | 1657640
Type | Bug
Last Closed | 2019-08-06 13:10:41 UTC
Description
Milan Kerslager
2018-12-05 13:49:04 UTC
Created attachment 1511777 [details]
Wrong VG Opteron
VG Opteron picked up /dev/sda1 instead of /dev/md2 after reboot.
Could you please confirm the md is version 0.9?

    mdadm --detail --scan -vvv

Created attachment 1511893 [details]
mdadm --detail --scan -vvv
The md2 array has metadata v1.0.
It is an old array that was converted RAID1 -> RAID5 -> RAID6 in the past.
It seems that simply mixing v0.9 and v1.0 md metadata does not reproduce the bug. Neither did creating two md RAID6 devices with v0.9 and v1.0 metadata and upgrading from 7.5 to 7.6 with a reboot. Any chance of providing a reliable reproducer here?

    # mdadm -v --detail --scan
    ARRAY /dev/md0 level=raid6 num-devices=4 metadata=0.90 UUID=35d3e125:eff7bc90:65c288b3:2c710f73
       devices=/dev/sdb,/dev/sdc,/dev/sdd,/dev/sde
    ARRAY /dev/md1 level=raid6 num-devices=4 metadata=1.0 name=1 UUID=43111dee:a292d73d:cb03a2dd:3ca6a979
       devices=/dev/sdf,/dev/sdg,/dev/sdh,/dev/sdi
    # pvs
      PV        VG            Fmt  Attr PSize  PFree
      /dev/md0  vg            lvm2 a--  <9.99g <9.99g
      /dev/md1  vg            lvm2 a--  <9.99g <9.99g
      /dev/sdj  vg            lvm2 a--   4.99g  4.99g
      /dev/sdk  vg            lvm2 a--   4.99g  4.99g
      /dev/sdl  vg            lvm2 a--   4.99g  4.99g
      /dev/sdm  vg            lvm2 a--   4.99g  4.99g
      /dev/sdn  vg            lvm2 a--   4.99g  4.99g
      /dev/sdo  vg            lvm2 a--   4.99g  4.99g
      /dev/vda2 rhel_host-085 lvm2 a--  <7.00g  1.40g
    # cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 7.5 (Maipo)
    # yum update -y    ### update from lvm2-2.02.177-4.el7 to lvm2-2.02.180-10.el7
    # reboot
    # cat /etc/redhat-release
    Red Hat Enterprise Linux Server release 7.6 (Maipo)
    # vgchange -an vg
      0 logical volume(s) in volume group "vg" now active
    # vgchange -ay vg
      1 logical volume(s) in volume group "vg" now active
    # vgs vg
      VG #PV #LV #SN Attr   VSize   VFree
      vg  8   1   0 wz--n- <49.93g <48.93g

I was hit by this bug as well. I believe the bug is that lvm2 no longer excludes devices with md metadata 0.90 when scanning for PVs. To reproduce the problem, a block device must carry both md metadata 0.90 and LVM PV metadata. This is easiest to reproduce with a RAID1 volume, where both component devices carry the metadata for both md 0.90 and the LVM PV.
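The "easiest with RAID1" point follows from where v0.90 stores its superblock: at the end of each component device, so the start of a RAID1 component is byte-for-byte identical to the start of the assembled array, including any LVM PV label. A minimal sketch (not lvm2's actual code) of the v0.90 superblock placement, using the formula from the kernel's MD_NEW_SIZE_SECTORS macro:

```python
# Sketch: compute where an md v0.90 superblock lives on a component
# device.  Mirrors the kernel's MD_NEW_SIZE_SECTORS macro: a 64 KiB
# superblock at a 64 KiB-aligned offset, at least 64 KiB before the
# end of the device -- i.e. nowhere near sector 0, where the PV label
# of a RAID1 component sits.
MD_RESERVED_SECTORS = 128  # 64 KiB in 512-byte sectors

def md090_sb_offset_sectors(dev_sectors: int) -> int:
    """Sector offset of the v0.90 superblock on a component device."""
    return (dev_sectors & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS

# /dev/sda3 in the v0.90 trace below reports 5858142208 sectors:
print(md090_sb_offset_sectors(5858142208))  # 5858142080
```

Because this offset depends on the device size, a scanner must know the size of every candidate device before it can even look for a v0.90 superblock, which is why cheap start-of-device checks miss it.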
On a system with a RAID1 device and md metadata 1.2, running "pvs" with verbose=7 set in lvm.conf includes this output:

    #device/dev-io.c:609    Opened /dev/sda3 RO O_DIRECT
    #device/dev-io.c:359    /dev/sda3: size is 1951133696 sectors
    #device/dev-io.c:658    Closed /dev/sda3
    #filters/filter-mpath.c:196    /dev/sda3: Device is a partition, using primary device sda for mpath component detection
    #device/dev-io.c:336    /dev/sda3: using cached size 1951133696 sectors
    #device/dev-md.c:163    Found md magic number at offset 4096 of /dev/sda3.
    #filters/filter-md.c:108    /dev/sda3: Skipping md component device

Here we can see that lvm2 finds the md magic number and skips examining the device for PV metadata. On a system with a RAID1 device and md metadata 0.90, running "pvs" with verbose=7 includes this output instead:

    #device/dev-io.c:609    Opened /dev/sda3 RO O_DIRECT
    #device/dev-io.c:359    /dev/sda3: size is 5858142208 sectors
    #device/dev-io.c:658    Closed /dev/sda3
    #filters/filter-mpath.c:196    /dev/sda3: Device is a partition, using primary device sda for mpath component detection
    #filters/filter-partitioned.c:30    filter partitioned deferred /dev/sda3
    #filters/filter-md.c:99    filter md deferred /dev/sda3
    #filters/filter-persistent.c:346    filter caching good /dev/sda3

(In reply to David Teigland from comment #4)
> Looking at the 2018-06-01-stable branch, these three commits are all related
> to improving identification of md components:
>
> scan: md metadata version 0.90 is at the end of disk
> https://sourceware.org/git/?p=lvm2.git;a=commit;h=0e42ebd6d4012d210084a9ccf8d76f853726de3c
>
> pvscan lvmetad: use full md filter when md 1.0 devices are present
> https://sourceware.org/git/?p=lvm2.git;a=commit;h=a01e1fec0fe7c2fa61577c0e636e907cde7279ea
>
> pvscan lvmetad: use udev info to improve md component detection
> https://sourceware.org/git/?p=lvm2.git;a=commit;h=a188b1e513ed5ca0f5f3702c823490f5610d4495

David, this last patch requires c527a0cb, which is broader in scope.
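The contrast between the two traces above can be illustrated with a short sketch (an illustration under stated assumptions, not lvm2's filter-md code). All md superblock versions share the magic 0xa92b4efc, but each version stores it at a different offset: v1.1 at the start of the device, v1.2 at 4 KiB in (the "offset 4096" line above), and v0.90/v1.0 near the end. A scanner that only probes start-of-device offsets therefore misses v0.90 components:

```python
import io
import struct

MD_SB_MAGIC = 0xa92b4efc  # magic shared by v0.90 and v1.x superblocks

def has_md_magic(dev, offset: int) -> bool:
    """True if a little-endian md magic is stored at byte `offset`."""
    if offset < 0:
        return False
    dev.seek(offset)
    raw = dev.read(4)
    return len(raw) == 4 and struct.unpack("<I", raw)[0] == MD_SB_MAGIC

def looks_like_md_component(dev, size_bytes: int) -> bool:
    """Probe every per-version superblock location, start and end."""
    offsets = [
        0,                                 # v1.1: start of device
        4096,                              # v1.2: 4 KiB from the start
        (size_bytes & ~0xFFFF) - 0x10000,  # v0.90: 64 KiB-aligned, near the end
        (size_bytes - 8192) & ~0xFFF,      # v1.0: 4 KiB-aligned, ~8 KiB from the end
    ]
    return any(has_md_magic(dev, off) for off in offsets)

# A fake 1 MiB "device" with a v0.90-style superblock near the end:
disk = bytearray(1 << 20)
disk[0xF0000:0xF0004] = struct.pack("<I", MD_SB_MAGIC)
print(looks_like_md_component(io.BytesIO(bytes(disk)), len(disk)))  # True
```

The end-of-device probes need the device size first, which is what the "scan: md metadata version 0.90 is at the end of disk" commit above restores for the full filter path.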
IIUC it is just an optimization, and is not needed to fix the issue, right?

A reliable reproducer was not discovered for this bug. Marking verified (SanityOnly).

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2253