Bug 1141398 - anaconda does not see existing Fedora 21 install to LVM-on-RAID
Summary: anaconda does not see existing Fedora 21 install to LVM-on-RAID
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: python-blivet
Version: 21
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: mulhern
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: AcceptedBlocker
Depends On:
Blocks: F21BetaBlocker
 
Reported: 2014-09-13 00:56 UTC by Adam Williamson
Modified: 2015-05-25 08:56 UTC
CC: 9 users

Fixed In Version: python-blivet-0.61.4-1
Doc Type: Bug Fix
Doc Text:
Clone Of:
Cloned To: 1178181
Environment:
Last Closed: 2014-10-15 21:27:49 UTC
Type: Bug
Embargoed:


Attachments
anaconda.log (12.85 KB, text/plain), 2014-09-13 01:15 UTC, Adam Williamson
ifcfg.log (2.74 KB, text/plain), 2014-09-13 01:15 UTC, Adam Williamson
journal output (461.47 KB, text/plain), 2014-09-13 01:16 UTC, Adam Williamson
program.log (29.54 KB, text/plain), 2014-09-13 01:17 UTC, Adam Williamson
storage.log (273.23 KB, text/plain), 2014-09-13 01:17 UTC, Adam Williamson
the storage.log (425.44 KB, text/plain), 2014-12-19 19:31 UTC, Oleg Samarin
anaconda.log (12.21 KB, text/plain), 2014-12-19 19:32 UTC, Oleg Samarin
pvdisplay result before starting anaconda (756 bytes, text/plain), 2014-12-22 20:06 UTC, Oleg Samarin
program.log of the first run (10.66 KB, text/plain), 2014-12-22 20:09 UTC, Oleg Samarin
storage.log of the first run (214.44 KB, text/plain), 2014-12-22 20:10 UTC, Oleg Samarin
pvdisplay result after the first (unsuccessful) anaconda run (753 bytes, text/plain), 2014-12-22 20:11 UTC, Oleg Samarin


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1145783 0 unspecified CLOSED F21 install crashes on Intel firmware RAID with "AttributeError: 'NoneType' object has no attribute 'startswith'" 2021-02-22 00:41:40 UTC
Red Hat Bugzilla 1147087 0 unspecified CLOSED Previous software raid disks not detected 2021-02-22 00:41:40 UTC

Internal Links: 1145783 1147087 1226305

Description Adam Williamson 2014-09-13 00:56:09 UTC
I installed Fedora 21 Alpha TC6 (from Server DVD, I think) to an LVM-on-RAID setup created in custom partitioning: I picked two disks, went to custom part, let it create partitions for me with LVM, then set the properties of the VG to RAID-1. This worked fine.

Then I booted Fedora 21 Alpha TC7 Workstation x86_64 live, and selected both disks and went to custom partitioning. The VG of the previous install is not visible at all; all I can see for existing OSes/volumes is the 500MiB ext4 /boot partition.

I'll attach logs. I did test up-/downgrading to blivet 0.61.1, but it doesn't help (the log is with that blivet, in fact).

The relevant devices - fedora--server-root and fedora--server-swap - are present in /dev/mapper (yes, with the double dashes).
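
The double dashes, by the way, are just device-mapper's name escaping: a '-' inside a VG or LV name is doubled, and the VG and LV parts are then joined by a single '-'. A minimal sketch of that mapping (illustrative only, not blivet code):

  def dm_name(vg, lv):
      # device-mapper doubles '-' within names, then joins VG and LV with
      # a single '-': ('fedora-server', 'root') -> 'fedora--server-root'
      return '%s-%s' % (vg.replace('-', '--'), lv.replace('-', '--'))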

Comment 1 Adam Williamson 2014-09-13 01:03:38 UTC
Not an Alpha blocker: if I boot with just one disk or the other selected, I can see the PV and delete it (and hence the Alpha criterion "The installer must be able to complete an installation to a single disk using automatic partitioning." is satisfied). But proposing as a Beta blocker - https://fedoraproject.org/wiki/Fedora_21_Beta_Release_Criteria#Custom_partitioning :

"Correctly interpret, and modify as described below, any disk with a valid ms-dos or gpt disk label and partition table containing ext4 partitions, LVM and/or btrfs volumes, and/or software RAID arrays at RAID levels 0, 1 and 5 containing ext4 partitions"

Comment 2 Adam Williamson 2014-09-13 01:15:10 UTC
Created attachment 937140 [details]
anaconda.log

Comment 3 Adam Williamson 2014-09-13 01:15:24 UTC
Created attachment 937141 [details]
ifcfg.log

Comment 4 Adam Williamson 2014-09-13 01:16:32 UTC
Created attachment 937142 [details]
journal output

Comment 5 Adam Williamson 2014-09-13 01:17:01 UTC
Created attachment 937143 [details]
program.log

Comment 6 Adam Williamson 2014-09-13 01:17:19 UTC
Created attachment 937144 [details]
storage.log

Comment 7 Adam Williamson 2014-09-13 01:21:26 UTC
# mdadm --detail /dev/md127

/dev/md127:
        Version : 1.2
  Creation Time : Wed Sep 10 23:59:25 2014
     Raid Level : raid1
     Array Size : 15719424 (14.99 GiB 16.10 GB)
  Used Dev Size : 15719424 (14.99 GiB 16.10 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Sep 12 21:14:13 2014
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost:pv00  (local to host localhost)
           UUID : dbc9631b:d5d77065:42c4a0f3:ae5a2870
         Events : 152

    Number   Major   Minor   RaidDevice State
       0     252        2        0      active sync   /dev/vda2
       1     252       17        1      active sync   /dev/vdb1

[root@localhost tmp]# lvdisplay -a
  --- Logical volume ---
  LV Path                /dev/fedora-server/root
  LV Name                root
  VG Name                fedora-server
  LV UUID                C70iud-55Tp-D8Le-NyIY-C0dZ-6kmx-FCW41P
  LV Write Access        read/write
  LV Creation host, time localhost, 2014-09-10 23:59:27 -0400
  LV Status              available
  # open                 0
  LV Size                12.91 GiB
  Current LE             3304
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
   
  --- Logical volume ---
  LV Path                /dev/fedora-server/swap
  LV Name                swap
  VG Name                fedora-server
  LV UUID                cqgBko-Rpa5-kJ4T-z0EN-V0uu-CpB4-ocO7iI
  LV Write Access        read/write
  LV Creation host, time localhost, 2014-09-10 23:59:33 -0400
  LV Status              available
  # open                 2
  LV Size                2.00 GiB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4

[root@localhost tmp]# parted /dev/vda print
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  525MB   524MB   primary  ext4         boot
 2      525MB   16.6GB  16.1GB  primary               raid

[root@localhost tmp]# parted /dev/vdb print
Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 16.1GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  16.1GB  16.1GB  primary               raid

Comment 8 David Lehman 2014-09-19 17:27:17 UTC
This is related to md UUIDs containing ':' instead of '-'. I know you've made some changes in this area -- do you think they could have caused this issue? I'm asking because this is a new problem and I don't think anything has changed in udev, lvm, or mdadm.
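
For illustration, here is the kind of normalization involved, as a minimal sketch (assuming the dashed 8-4-4-4-12 layout udev/blkid report; this is not necessarily blivet's exact code):

  import re

  def canonicalize_md_uuid(md_uuid):
      # mdadm --detail prints e.g. 'dbc9631b:d5d77065:42c4a0f3:ae5a2870',
      # while udev/blkid report the same 128 bits with dashes in the
      # standard 8-4-4-4-12 layout. Strip the separators and re-insert
      # dashes so the two spellings can be compared.
      digits = re.sub(r'[-:]', '', md_uuid.strip())
      if len(digits) != 32:
          raise ValueError('unexpected md UUID: %s' % md_uuid)
      return '-'.join((digits[0:8], digits[8:12], digits[12:16],
                       digits[16:20], digits[20:]))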

Comment 9 Adam Williamson 2014-09-19 18:14:49 UTC
Update: this affects live, but not netinst; something in live is screwing with things, presumably.

Bug reproduced with RC1, let me know if there's anything I can provide.

I note that on the RC1 boot, the systemd unit 'lvm2-pvscan@9:127.service' appears to fire during boot and do something:

Sep 19 13:41:46 localhost lvm[728]: 2 logical volume(s) in volume group "fedora-server" now active

then something more when anaconda starts up:

Sep 19 13:56:20 localhost lvm[2220]: No PV label found on /dev/md127.
Sep 19 13:56:23 localhost lvm[2355]: 2 logical volume(s) in volume group "fedora-server" now active

The F20 Final live works, and the logs from that service are rather different. On boot:

Sep 19 14:03:15 localhost systemd[1]: Starting LVM2 PV scan on device 9:127...
Sep 19 14:03:15 localhost pvscan[717]: 2 logical volume(s) in volume group "fedora-server" now active
Sep 19 14:03:15 localhost systemd[1]: Started LVM2 PV scan on device 9:127.

which perhaps indicates the scan never finishes on F21? I don't believe there's a matching "Started" message for the "Starting"; I'll go check again shortly. On F20, when anaconda starts up:

Sep 19 14:11:13 localhost systemd[1]: Stopping LVM2 PV scan on device 9:127...
Sep 19 14:11:13 localhost pvscan[2181]: Device 9:127 not found. Cleared from lvmetad cache.
Sep 19 14:11:13 localhost systemd[1]: Stopped LVM2 PV scan on device 9:127.
Sep 19 14:11:13 localhost systemd[1]: Starting LVM2 PV scan on device 9:127...
Sep 19 14:11:13 localhost pvscan[2241]: 2 logical volume(s) in volume group "fedora-server" now active
Sep 19 14:11:13 localhost systemd[1]: Started LVM2 PV scan on device 9:127.

Not sure if this is relevant, but it looked interesting.

Comment 10 mulhern 2014-10-01 21:51:23 UTC
We hope that these will all turn out to be dupes.

Comment 11 mulhern 2014-10-01 21:58:34 UTC
Although in this case we can see that the array actually was found, with the name pv00.

The problem is instead that when the status is checked, the UUIDs are verified against each other; one is in the wrong format, so they compare as unequal and status comes out False. This should be fixed by the patch for the other bugs, since we sanitize not just the results of mdadm methods but also the results of udev methods.
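
Concretely, the failing check amounts to a raw string comparison between the two spellings, e.g. 'dbc9631b:d5d77065:42c4a0f3:ae5a2870' versus 'dbc9631b-d5d7-7065-42c4-a0f3ae5a2870', which can never be equal. Sanitizing both sides to one canonical form (as in the sketch under comment 8; the names here are illustrative, not blivet's actual API) makes the check meaningful:

  def md_uuids_match(mdadm_uuid, udev_uuid):
      # Compare after normalizing both values, whichever source they
      # came from (mdadm output or udev data).
      return canonicalize_md_uuid(mdadm_uuid) == canonicalize_md_uuid(udev_uuid)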

Comment 12 Adam Williamson 2014-10-03 17:29:51 UTC
Discussed at 2014-10-03 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2014-10-03/f21-blocker-review.2014-10-03-15.58.log.txt . Accepted as a blocker per criteria cited in nomination.

Comment 13 mulhern 2014-10-06 16:37:45 UTC
Same as the bz#1147077 fix.

Comment 14 Adam Williamson 2014-10-08 17:23:27 UTC
Fix for this should be in 21 Beta TC3 (not TC2).

Comment 15 Adam Williamson 2014-10-15 21:26:50 UTC
Fix confirmed in TC3, and finally I can stop keeping that VM lying around!

Comment 16 Adam Williamson 2014-10-15 21:27:49 UTC
0.61.4 is now pending stable push, so let's just close this out.

Comment 17 Oleg Samarin 2014-12-19 18:29:28 UTC
I have the same issue with the Fedora 21 Live release.

I have an Intel IMSM RAID-1 array and a standalone SSD. Both contain partitions that belong to LVM PVs of two different VGs. There are other Fedora installations on them. I want to install F21 onto some existing LVM partitions.

When I run "install to the hard disk", anaconda proposes me selection with two devices "Raid volume" and "OCZ SSD". I select both with "custom layout". On the next screen I can see the existing fedora installation with the logical volumes only on SSD. 

I cannot see any volumes on top of the RAID volume, neither under the existing Fedora installation nor under "Unknown". Instead there is /dev/md126p2, marked as an unknown partition with a filesystem of "LVM PV".

Comment 18 Oleg Samarin 2014-12-19 19:31:59 UTC
Created attachment 971299 [details]
the storage.log

Comment 19 Oleg Samarin 2014-12-19 19:32:45 UTC
Created attachment 971300 [details]
anaconda.log

Comment 20 David Shea 2014-12-19 19:49:14 UTC
(In reply to Oleg Samarin from comment #17)
> I cannot see any volumes on top of the RAID volume, neither under the
> existing Fedora installation nor under "Unknown". Instead there is
> /dev/md126p2, marked as an unknown partition with a filesystem of "LVM PV".

It looks like the RAID contains an unformatted LVM PV partition:

14:07:18,962 WARN blivet: PV md126p2 has no vg_name
14:07:18,962 WARN blivet: PV md126p2 has no vg_uuid
14:07:18,962 WARN blivet: PV md126p2 has no pe_start

If there is a volume group on it, even a partial volume group, we should have gotten back some information about it from udev.
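
One quick way to see what udev itself reports for that device (a hedged suggestion, in the same spirit as the pvs request in the next comments) is to switch to a terminal and run:

udevadm info --query=property --name=/dev/md126p2

and check whether ID_FS_TYPE comes back as LVM2_member.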

Comment 21 Oleg Samarin 2014-12-20 15:54:31 UTC
But lvm recognises this partition successfully.

How can I debug this from a working system, without running anaconda?

Comment 22 mulhern 2014-12-22 12:57:45 UTC
Hi!

When you are running anaconda and you get to the screen you describe in comment 17, you can try switching to a terminal and running the pvs command, like:

pvs --unit=k --nosuffix --nameprefixes --unquoted --noheadings -opv_name,pv_uuid,pe_start,vg_name,vg_uuid,vg_size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,pv_count

This is the precise command blivet uses to gather information about the pvs on any visible device.

Please report the result.

Please also run the same command with the additional flag --all and report that result as well.

This will allow us to discover what information pvs is reporting for the pv at that time.
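
For context, with --nameprefixes --unquoted --noheadings each output line is a series of LVM2_<FIELD>=<value> pairs, one line per PV. A minimal sketch of parsing such a line into a dict (illustrative, not blivet's exact code):

  def parse_pvs_line(line):
      # e.g. 'LVM2_PV_NAME=/dev/md127 LVM2_VG_NAME=fedora-server ...'
      # Caveat: --unquoted means a value containing spaces would break
      # this split; the fields requested above do not contain spaces.
      info = {}
      for pair in line.split():
          key, _, value = pair.partition('=')
          info[key] = value
      return info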

Another helpful thing to do is to include all the logs. Especially useful in this context would be program.log, which would tell us more about the pvs output than is available from storage.log.

Comment 23 Oleg Samarin 2014-12-22 20:05:32 UTC
Today I did some more research. I need to start anaconda twice to get to the installer screen. Anaconda does nothing visible on the first start.

But something happens at this time: 

before: lvm has two PVs: md126p2 and sdc2
after: lvm has different PVs: sdb2 and sdc2

(the md raid consists of two disks sda and sdb).

So blivet correctly indicates, the second time, that md126p2 is no longer a PV.

The remaining question is: what does anaconda do, and why, that makes lvm use sdb instead of md126p2?

Comment 24 Oleg Samarin 2014-12-22 20:06:48 UTC
Created attachment 972148 [details]
pvdisplay result before starting anaconda

Comment 25 Oleg Samarin 2014-12-22 20:09:20 UTC
Created attachment 972150 [details]
program.log of the first run

Comment 26 Oleg Samarin 2014-12-22 20:10:08 UTC
Created attachment 972151 [details]
storage.log of the first run

Comment 27 Oleg Samarin 2014-12-22 20:11:37 UTC
Created attachment 972152 [details]
pvdisplay result after the first (unsuccessful) anaconda run

I cannot attach anaconda.log because it has zero length.

Comment 28 mulhern 2015-01-02 17:52:52 UTC
Starting from Comment #17 this is a whole new bug, so I went ahead and cloned it.

All further comments about that problem should go on bug 1178181.

