Bug 1178181 - likely udev name and name yielded by pvs do not agree so lookup in pvInfo fails
Summary: likely udev name and name yielded by pvs do not agree so lookup in pvInfo fails
Keywords:
Status: CLOSED DUPLICATE of bug 1226305
Alias: None
Product: Fedora
Classification: Fedora
Component: python-blivet
Version: 21
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: David Lehman
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2015-01-02 17:48 UTC by mulhern
Modified: 2015-06-23 13:52 UTC
CC List: 10 users

Fixed In Version:
Clone Of: 1141398
Environment:
Last Closed: 2015-06-23 13:52:29 UTC
Type: Bug
Embargoed:


Attachments
pvdisplay out before starting anaconda (756 bytes, text/plain) - 2015-05-28 07:10 UTC, Oleg Samarin
journalctl output before starting anaconda (305.36 KB, text/plain) - 2015-05-28 07:11 UTC, Oleg Samarin
pvdisplay output after starting anaconda (753 bytes, text/plain) - 2015-05-28 07:12 UTC, Oleg Samarin
journalctl output after starting anaconda (317.85 KB, text/plain) - 2015-05-28 07:13 UTC, Oleg Samarin
program.log from anaconda (35.37 KB, text/plain) - 2015-05-28 07:14 UTC, Oleg Samarin
storage.log from anaconda (231.80 KB, text/plain) - 2015-05-28 07:15 UTC, Oleg Samarin

Description mulhern 2015-01-02 17:48:33 UTC
+++ This bug was initially created as a clone of Bug #1141398 +++

I installed Fedora 21 Alpha TC6 (from Server DVD, I think) to an LVM-on-RAID setup created in custom partitioning: I picked two disks, went to custom part, let it create partitions for me with LVM, then set the properties of the VG to RAID-1. This worked fine.

Then I booted Fedora 21 Alpha TC7 Workstation x86_64 live, and selected both disks and went to custom partitioning. The VG of the previous install is not visible at all; all I can see for existing OSes/volumes is the 500MiB ext4 /boot partition.

I'll attach logs. I did test up/downgrading to blivet 0.61.1, but it doesn't help (the log is with that blivet, in fact).

The relevant devices - fedora--server-root and fedora--server-swap - are present in /dev/mapper (yes, with the double dashes).

--- Additional comment from Adam Williamson (Red Hat) on 2014-09-12 21:03:38 EDT ---

Not an Alpha blocker: if I boot with just one disk or the other selected, I can see the PV and delete it (and hence the Alpha criterion "The installer must be able to complete an installation to a single disk using automatic partitioning." is satisfied). But proposing as a Beta blocker - https://fedoraproject.org/wiki/Fedora_21_Beta_Release_Criteria#Custom_partitioning :

"Correctly interpret, and modify as described below, any disk with a valid ms-dos or gpt disk label and partition table containing ext4 partitions, LVM and/or btrfs volumes, and/or software RAID arrays at RAID levels 0, 1 and 5 containing ext4 partitions"

--- Additional comment from Adam Williamson (Red Hat) on 2014-09-12 21:21:26 EDT ---

# mdadm --detail /dev/md127

/dev/md127:
        Version : 1.2
  Creation Time : Wed Sep 10 23:59:25 2014
     Raid Level : raid1
     Array Size : 15719424 (14.99 GiB 16.10 GB)
  Used Dev Size : 15719424 (14.99 GiB 16.10 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

  Intent Bitmap : Internal

    Update Time : Fri Sep 12 21:14:13 2014
          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

           Name : localhost:pv00  (local to host localhost)
           UUID : dbc9631b:d5d77065:42c4a0f3:ae5a2870
         Events : 152

    Number   Major   Minor   RaidDevice State
       0     252        2        0      active sync   /dev/vda2
       1     252       17        1      active sync   /dev/vdb1

[root@localhost tmp]# lvdisplay -a
  --- Logical volume ---
  LV Path                /dev/fedora-server/root
  LV Name                root
  VG Name                fedora-server
  LV UUID                C70iud-55Tp-D8Le-NyIY-C0dZ-6kmx-FCW41P
  LV Write Access        read/write
  LV Creation host, time localhost, 2014-09-10 23:59:27 -0400
  LV Status              available
  # open                 0
  LV Size                12.91 GiB
  Current LE             3304
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:3
   
  --- Logical volume ---
  LV Path                /dev/fedora-server/swap
  LV Name                swap
  VG Name                fedora-server
  LV UUID                cqgBko-Rpa5-kJ4T-z0EN-V0uu-CpB4-ocO7iI
  LV Write Access        read/write
  LV Creation host, time localhost, 2014-09-10 23:59:33 -0400
  LV Status              available
  # open                 2
  LV Size                2.00 GiB
  Current LE             512
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:4

[root@localhost tmp]# parted /dev/vda print
Model: Virtio Block Device (virtblk)
Disk /dev/vda: 21.5GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  525MB   524MB   primary  ext4         boot
 2      525MB   16.6GB  16.1GB  primary               raid

[root@localhost tmp]# parted /dev/vdb print
Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 16.1GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  16.1GB  16.1GB  primary               raid

--- Additional comment from David Lehman on 2014-09-19 13:27:17 EDT ---

This is related to md UUIDs containing ':' instead of '-'. I know you've made some changes in this area -- do you think they could have caused this issue? I'm asking because this is a new problem and I don't think anything has changed in udev, lvm, or mdadm.

--- Additional comment from Adam Williamson (Red Hat) on 2014-09-19 14:14:49 EDT ---

Update: this affects live but not netinst; presumably something in live is screwing with things.

Bug reproduced with RC1, let me know if there's anything I can provide.

I note that on the RC1 boot, the systemd unit 'lvm2-pvscan@9:127.service' appears to fire during boot and do something:

Sep 19 13:41:46 localhost lvm[728]: 2 logical volume(s) in volume group "fedora-server" now active

then something more when anaconda starts up:

Sep 19 13:56:20 localhost lvm[2220]: No PV label found on /dev/md127.
Sep 19 13:56:23 localhost lvm[2355]: 2 logical volume(s) in volume group "fedora-server" now active

The F20 Final live works, and the logs from that service are rather different. On boot:

Sep 19 14:03:15 localhost systemd[1]: Starting LVM2 PV scan on device 9:127...
Sep 19 14:03:15 localhost pvscan[717]: 2 logical volume(s) in volume group "fedora-server" now active
Sep 19 14:03:15 localhost systemd[1]: Started LVM2 PV scan on device 9:127.

which perhaps indicates the scan never finishes on F21? I don't believe there's a matching "Started" message for the "Starting"; I'll go check again shortly. On F20, when anaconda starts up:

Sep 19 14:11:13 localhost systemd[1]: Stopping LVM2 PV scan on device 9:127...
Sep 19 14:11:13 localhost pvscan[2181]: Device 9:127 not found. Cleared from lvmetad cache.
Sep 19 14:11:13 localhost systemd[1]: Stopped LVM2 PV scan on device 9:127.
Sep 19 14:11:13 localhost systemd[1]: Starting LVM2 PV scan on device 9:127...
Sep 19 14:11:13 localhost pvscan[2241]: 2 logical volume(s) in volume group "fedora-server" now active
Sep 19 14:11:13 localhost systemd[1]: Started LVM2 PV scan on device 9:127.

Not sure if this is relevant, but it looked interesting.

--- Additional comment from mulhern on 2014-10-01 17:51:23 EDT ---

We hope that these will all turn out to be dupes.

--- Additional comment from mulhern on 2014-10-01 17:58:34 EDT ---

Although in this case we can see that the array was actually found with name pv00.

The problem, instead, is that when the status is checked the UUIDs are compared against each other; one is in the wrong format, so they compare as unequal and the status is False. This should be fixed by the patch for the other bugs, since we sanitize not just the results of mdadm methods but also the results of udev methods.
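
As a rough illustration (a minimal sketch, not blivet's actual code), "sanitizing" here means reducing both spellings of the UUID to one canonical form before comparing them:

import uuid

def canonicalize_md_uuid(raw):
    # Strip mdadm-style ':' and dash separators, then let the standard
    # library re-emit the canonical dashed form.
    return str(uuid.UUID(raw.replace(":", "").replace("-", "")))

a = canonicalize_md_uuid("dbc9631b:d5d77065:42c4a0f3:ae5a2870")   # mdadm format
b = canonicalize_md_uuid("dbc9631b-d5d77065-42c4a0f3-ae5a2870")   # dashed format
assert a == b   # both become 'dbc9631b-d5d7-7065-42c4-a0f3ae5a2870'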

--- Additional comment from Adam Williamson (Red Hat) on 2014-10-03 13:29:51 EDT ---

Discussed at 2014-10-03 blocker review meeting: http://meetbot.fedoraproject.org/fedora-blocker-review/2014-10-03/f21-blocker-review.2014-10-03-15.58.log.txt . Accepted as a blocker per criteria cited in nomination.

--- Additional comment from mulhern on 2014-10-06 12:37:45 EDT ---

Same fix as bz#1147077.

--- Additional comment from Adam Williamson (Red Hat) on 2014-10-08 13:23:27 EDT ---

Fix for this should be in 21 Beta TC3 (not TC2).

--- Additional comment from Adam Williamson (Red Hat) on 2014-10-15 17:26:50 EDT ---

Fix confirmed in TC3, and finally I can stop keeping that VM lying around!

--- Additional comment from Adam Williamson (Red Hat) on 2014-10-15 17:27:49 EDT ---

0.61.4 is now pending stable push, so let's just close this out.

--- Additional comment from Oleg Samarin on 2014-12-19 13:29:28 EST ---

I have the same issues with Fedora 21 Live release.

I have an Intel IMSM RAID-1 and a standalone SSD. Both contain partitions that belong to LVM PVs of two different VGs. There are other Fedora installations on them. I want to install F21 into some existing LVM partitions.

When I run "install to the hard disk", anaconda offers me a selection with two devices, "Raid volume" and "OCZ SSD". I select both with "custom layout". On the next screen I can see the existing Fedora installation with logical volumes only on the SSD.

I cannot see any volumes on top of the Raid Volume, either under the existing Fedora installation or under "Unknown". Instead there is /dev/md126p2, marked as an unknown partition with a filesystem of "LVM PV".

--- Additional comment from David Shea on 2014-12-19 14:49:14 EST ---

(In reply to Oleg Samarin from comment #17)
> I cannot see any volumes on top of the Raid Volume, either under the existing
> Fedora installation or under "Unknown". Instead there is /dev/md126p2, marked
> as an unknown partition with a filesystem of "LVM PV".

It looks like the RAID contains an unformatted LVM PV partition:

14:07:18,962 WARN blivet: PV md126p2 has no vg_name
14:07:18,962 WARN blivet: PV md126p2 has no vg_uuid
14:07:18,962 WARN blivet: PV md126p2 has no pe_start

If there is a volume group on it, even a partial volume group, we should have gotten back some information about it from udev.
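
One way to check what udev itself knows about the partition, outside of anaconda (a sketch assuming the pyudev Python bindings are available; the properties shown are the standard blkid-derived ones, not blivet internals):

import pyudev

context = pyudev.Context()
dev = pyudev.Devices.from_device_file(context, "/dev/md126p2")
# For an intact PV label, blkid normally reports ID_FS_TYPE=LVM2_member.
print(dev.get("ID_FS_TYPE"))
print(dev.get("ID_FS_UUID"))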

--- Additional comment from Oleg Samarin on 2014-12-20 10:54:31 EST ---

But lvm recognises this partition successfully.

How can I debug this on a working system, without running anaconda?

--- Additional comment from mulhern on 2014-12-22 07:57:45 EST ---

Hi!

When you are running anaconda and you get to the screen you describe in comment 17, you can try switching to a terminal and running the pvs command, like:

pvs --unit=k --nosuffix --nameprefixes --unquoted --noheadings -opv_name,pv_uuid,pe_start,vg_name,vg_uuid,vg_size,vg_free,vg_extent_size,vg_extent_count,vg_free_count,pv_count

This is the precise command blivet uses to gather information about the PVs on any visible device.

Please report the result.

Please also run the same command with the additional flag --all and report that result as well.

This will allow us to discover what information pvs is reporting for the pv at that time.
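
For reference, each output line in that form looks like "LVM2_PV_NAME=/dev/sda2 LVM2_PV_UUID=... LVM2_VG_NAME=fedora-server ...". Below is a minimal sketch (not blivet's actual code; the helper name is made up) of how such output becomes a lookup table keyed by pv_name, which is essentially the pvInfo table this bug's summary refers to:

import subprocess

PVS_CMD = ["pvs", "--unit=k", "--nosuffix", "--nameprefixes", "--unquoted",
           "--noheadings",
           "-opv_name,pv_uuid,pe_start,vg_name,vg_uuid,vg_size,vg_free,"
           "vg_extent_size,vg_extent_count,vg_free_count,pv_count"]

def get_pv_info():
    # One line per PV; each field is printed as LVM2_<NAME>=<value>.
    out = subprocess.check_output(PVS_CMD, universal_newlines=True)
    info = {}
    for line in out.splitlines():
        fields = dict(item.split("=", 1) for item in line.split())
        if "LVM2_PV_NAME" in fields:
            info[fields["LVM2_PV_NAME"]] = fields
    return info

# Anything later looked up under a different device name will simply miss:
# get_pv_info().get("/dev/md126p2") can be None even though the same PV is
# present under another name.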

Another helpful thing to do is to include all the logs. Especially useful in this context would be program.log, which tells us more about the pvs output than is available from storage.log.

--- Additional comment from Oleg Samarin on 2014-12-22 15:05:32 EST ---

Today I've done more research. I need to start anaconda twice to get to the installer screen; anaconda does nothing visible on the first start.

But something happens at this time: 

before: lvm has two PVs: md126p2 and sdc2
after: lvm has different PVs: sdb2 and sdc2

(the md RAID consists of the two disks sda and sdb).

So blivet correctly indicates, on the second attempt, that md126p2 is no longer a PV.

The remaining question is: what does anaconda do, and why, that makes lvm use sdb2 instead of md126p2?

--- Additional comment from Oleg Samarin on 2014-12-22 15:11:37 EST ---

I cannot attach anaconda.log because it has zero length.

Comment 1 mulhern 2015-01-02 17:50:50 UTC
Starting with the comment

--- Additional comment from Oleg Samarin on 2014-12-19 13:29:28 EST ---

this is a very different problem, so treat it that way.

Comment 2 mulhern 2015-01-02 19:33:36 UTC
The working hypothesis right now is that pvs is preferring a different name/path from the one that udev reports, so the lookup fails.

This seems likely, since the udev info gets pulled just once.
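
Concretely, and with hypothetical values based on the observations above (pvs reporting the member disk sdb2 while udev yields the md device node), the failure would look like this sketch:

# pvInfo as built from pvs output (the VG names here are made up):
pv_info = {"/dev/sdb2": {"LVM2_VG_NAME": "vg_on_raid"},
           "/dev/sdc2": {"LVM2_VG_NAME": "vg_on_ssd"}}

udev_name = "/dev/md126p2"          # name derived from the udev data
print(pv_info.get(udev_name))       # None: the lookup misses, so blivet sees
                                    # a PV with no vg_name / vg_uuid / pe_start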

Comment 3 Oleg Samarin 2015-01-05 06:23:32 UTC
Which logs should I attach for further research?

Comment 4 mulhern 2015-01-20 14:31:31 UTC
(In reply to Oleg Samarin from comment #3)
> Which logs should I attach for further research?

Hi!

I've placed an updates image at http://mulhern.fedorapeople.org/1178181.img.

It's not a fix; it just adds some extra information to try to narrow down where the disagreement between the lvm and udev tools might be arising.
The new information will be tagged with this bz and should pop up in storage.log.
Could you try again using the image, and attach storage.log and program.log when done?

Thanks,

- mulhern

Comment 5 Oleg Samarin 2015-02-07 14:26:26 UTC
How should I use this img-file?

Comment 6 mulhern 2015-02-13 22:28:27 UTC
(In reply to Oleg Samarin from comment #5)
> How should I use this img-file?

You can use it on the boot options command line as:

inst.updates=<url to location of updates file>
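
For example, with the image from comment 4 the boot line would carry:

inst.updates=http://mulhern.fedorapeople.org/1178181.img

(assuming the image is still served from that URL).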

Comment 7 Oleg Samarin 2015-05-27 20:25:37 UTC
I have the same issue when installing Fedora 22 on two machines.

They both have an Intel IMSM RAID-1, md126, made of sda and sdb, with an LVM partition on md126p2. Before starting anaconda, lvm sees md126p2 as a PV, but after starting anaconda, lvm loses md126p2 and starts using sdb2 instead of md126p2. So the python-blivet lookup fails.

Comment 8 Oleg Samarin 2015-05-28 07:10:12 UTC
Created attachment 1031043 [details]
pvdisplay out before starting anaconda

Comment 9 Oleg Samarin 2015-05-28 07:11:20 UTC
Created attachment 1031044 [details]
journalctl output before starting anaconda

Comment 10 Oleg Samarin 2015-05-28 07:12:27 UTC
Created attachment 1031045 [details]
pvdisplay output after starting anaconda

Comment 11 Oleg Samarin 2015-05-28 07:13:35 UTC
Created attachment 1031047 [details]
journalctl output after starting anaconda

Comment 12 Oleg Samarin 2015-05-28 07:14:31 UTC
Created attachment 1031060 [details]
program.log from anaconda

Comment 13 Oleg Samarin 2015-05-28 07:15:07 UTC
Created attachment 1031061 [details]
storage.log from anaconda

Comment 14 Oleg Samarin 2015-05-30 17:31:41 UTC
The patch in https://bugzilla.redhat.com/show_bug.cgi?id=1226305 solves the problem in Fedora 22.

Comment 15 mulhern 2015-06-19 16:39:09 UTC
Unfortunately, the log yielded no useful information. Reassigning...

Comment 16 Oleg Samarin 2015-06-19 18:54:35 UTC
I've already done the research and prepared a patch. See https://bugzilla.redhat.com/show_bug.cgi?id=1226305

This ticket may be closed as a duplicate of 1226305.

Comment 17 Oleg Samarin 2015-06-19 19:01:49 UTC
But 1226305 is still unassigned.

Comment 18 David Lehman 2015-06-23 13:52:29 UTC

*** This bug has been marked as a duplicate of bug 1226305 ***

