Bug 1768498 - Fedora-Live-31 fails to boot: A start job is running for Monitoring of LVM2 Mirrors...
Keywords:
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: 31
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: dracut-maint-list
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-11-04 15:28 UTC by Enrique Gomezdelcampo
Modified: 2020-05-26 18:38 UTC (History)
CC: 9 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug


Attachments (Terms of Use)
picture from the screen showing when the live boot hangs (1.29 MB, image/jpeg)
2019-11-21 17:23 UTC, Enrique Gomezdelcampo

Description Enrique Gomezdelcampo 2019-11-04 15:28:31 UTC
Description of problem:
When booting from a live image on a USB drive created with Fedora Media Writer, the boot process hangs at a line saying: A start job is running for Monitoring of LVM2 Mirrors, snapshots, etc. using dmeventd or progress polling ( min / no limit). It never gets past that point regardless of how long I leave it there. Eventually, I have to turn off the computer.

Version-Release number of selected component (if applicable):
Fedora-Live-31

How reproducible:
Every time, on two different computers: a Dell Precision M6400 and an Acer Aspire Z24-890. It happens regardless of whether the image is written to different USB drives or created with Fedora Media Writer on Fedora 30 or Windows 10. However, the live image does boot correctly on a Dell Latitude E7450.


Steps to Reproduce:
1. Create a live image with Fedora Media Writer
2. Boot the computer from the USB drive
3. The boot hangs on the lvm2-monitor line

Actual results:
The Fedora 31 live image never finishes booting

Expected results:
Boot to live image

Additional info:
This has never been a problem before on the same PC (Dell M6400), and it was not a problem with the live images of Fedora 30, 29, 28, etc.

Comment 1 Zbigniew Jędrzejewski-Szmek 2019-11-04 15:49:52 UTC
You'll need to provide additional debugging information. Right now, there just isn't enough
info to say what the problem is.

When you're booting from the live image, you should be able to switch to the text console
using ctrl-alt-f2 (or -f3, etc). Please attach the logs.
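A minimal sketch of collecting those logs from the text console, assuming you can reach a shell there (on Fedora live images, log in as liveuser with an empty password, then `sudo -i`; the output directory below is just an example):

```shell
# Dump the current boot's journal and the list of hung start jobs so they
# can be copied to a USB stick or another machine and attached to the bug.
out=/tmp/live-debug
mkdir -p "$out"
journalctl -b --no-pager > "$out/journal.txt" || true   # || true: keep going on a partial collection
systemctl list-jobs --no-pager > "$out/jobs.txt" || true
ls "$out"
```

`systemctl list-jobs` is the quickest way to see exactly which unit the boot is waiting on.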

Comment 2 Enrique Gomezdelcampo 2019-11-21 17:23:55 UTC
Created attachment 1638545 [details]
picture from the screen showing when the live boot hangs

I apologize for taking so long to respond. I am not a complete newbie, but I cannot switch to the text console. I would need a little more detailed information on how to do that. I tried as requested, but nothing happened. I am attaching a picture from the screen showing when the live boot hangs.

Comment 3 Alex Vincent 2019-12-21 01:32:20 UTC
I've started hitting this as well. Could you specify which log files you're referring to, and how I can capture them for attaching here? I can't remember how to reach an emergency command-line terminal.

Comment 4 Alex Vincent 2019-12-21 03:25:13 UTC
Correction: my problem is similar but not exactly the same. I had been running Fedora 30 Workstation for months, and just attempted an upgrade to Fedora 31 from within the current OS.

Comment 5 Alex Vincent 2019-12-21 03:47:24 UTC
OK, this page fixed my problem:
https://ask.fedoraproject.org/t/lvm2-monitor-service-runs-endlessly-on-startup/4370
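For reference, the workaround on that page amounts to masking the service so systemd never waits on it (a sketch, assuming that thread's usual fix; run as root on the installed system):

```shell
# Masking symlinks the unit to /dev/null so systemd can never start it,
# and boot no longer blocks on it.
systemctl mask lvm2-monitor.service
# To revert once a real fix lands:
#   systemctl unmask lvm2-monitor.service
```

Note this hides the hang rather than fixing its cause; LVM monitoring of mirrors/snapshots stays off while the unit is masked.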

Comment 6 Eduardo 2020-01-04 03:42:34 UTC
I was also having this problem after upgrading to Fedora 31. The first reboot went fine; the problem started after rebooting again. The problem was definitely the raid array. I did not mask the service, but I have been able to boot fine after a few changes in my configuration. I am not sure what fixed it, so I am going to describe the different configurations, as it may help diagnose the problem.

I was able to boot with runlevel 1 from grub
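For anyone else stuck at the same point, "runlevel 1" can be requested from the GRUB menu (a sketch; the exact menu entry varies):

```shell
# At the GRUB menu, highlight the normal entry, press `e`, and append one
# of these to the end of the line starting with `linux`:
#   systemd.unit=rescue.target    # the systemd spelling of runlevel 1
#   single                        # legacy alias, still honored by systemd
# Then press Ctrl+X (or F10) to boot the edited entry.
```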

My original configuration was a couple of disks in RAID 1 with BIOS raid (IMSM). In this configuration I observed the following:
$ ls /dev/sdb* /dev/sdc* /dev/md12*
/dev/sdb
/dev/sdb1
/dev/sdc
[ not sure if /dev/sdc1  was there]
/dev/md127
/dev/md126
/dev/md126p1

$ fdisk -l /dev/md126
$ fdisk -l /dev/sdb
$ fdisk -l /dev/sdc
All three showed one visible partition

I had luks on /dev/md126p1 and ext4 on that

After that:
- I stopped the raid array
- Deleted the partition on each disk
- Recreated the IMSM raid
- Created a partition on /dev/md126
- Created luks container on md126p1
- Created ext4 on the luks container
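A hedged sketch of those IMSM steps with mdadm (sdX/sdY are hypothetical member-disk names, and the md126/md127 numbering typically reappears after recreation; this DESTROYS the disks' contents):

```shell
mdadm --stop /dev/md126                    # stop the RAID volume
mdadm --stop /dev/md127                    # stop the IMSM container
wipefs -a /dev/sdX /dev/sdY                # clear old partition tables/metadata
mdadm --create /dev/md/imsm0 -e imsm -n 2 /dev/sdX /dev/sdY   # new IMSM container
mdadm --create /dev/md/vol0 -l 1 -n 2 /dev/md/imsm0           # RAID1 volume inside it
parted -s /dev/md126 mklabel gpt mkpart data 1MiB 100%        # one partition
cryptsetup luksFormat /dev/md126p1         # LUKS on the partition
cryptsetup open /dev/md126p1 cryptdata
mkfs.ext4 /dev/mapper/cryptdata            # ext4 inside the LUKS container
```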

The problem persisted on the next reboot. This time I could see the same devices and partitions as before, except that I am sure /dev/sdc1 was not in the system this time, though fdisk -l /dev/sdc still showed the partition.

Then I tried:
- I stopped and destroyed the raid array, zeroing out the first 10 MB of /dev/sd[bc] (I know far less was needed to clear any partition tables and metadata, but it was quick and easy to do)
- Created a SW raid md1 (internal bitmap with 1.2 metadata)
- Created a partition on /dev/md1
- Created luks container on md1p1
- Created ext4 on the luks container
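The software-raid variant, sketched the same way (sdX/sdY again hypothetical; destructive):

```shell
dd if=/dev/zero of=/dev/sdX bs=1M count=10    # crude metadata wipe, as above
dd if=/dev/zero of=/dev/sdY bs=1M count=10
mdadm --create /dev/md1 --level=1 --raid-devices=2 \
      --metadata=1.2 --bitmap=internal /dev/sdX /dev/sdY
parted -s /dev/md1 mklabel gpt mkpart data 1MiB 100%
cryptsetup luksFormat /dev/md1p1              # LUKS on md1p1
cryptsetup open /dev/md1p1 raid-data
mkfs.ext4 -L raid-data /dev/mapper/raid-data  # ext4 inside the LUKS container
```

With native 1.2 metadata the member disks carry the array superblock themselves, which is why no /dev/sd[bc]1 partitions show up afterwards.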

So far the problem has not happened again. This time I see something very different:
$ ls /dev/sdb* /dev/sdc* /dev/md1*
/dev/sdb
/dev/sdc
/dev/md1
/dev/md1p1

( no /dev/sd[bc]1 shows up )


$ fdisk -l /dev/md1
Shows the created partition /dev/md1p1

$ fdisk -l /dev/sdb
$ fdisk -l /dev/sdc
Neither shows any partition at all

lsblk -fs returns the following:

raid-data         ext4                                        [ UUID 1 ]      1.2T    11% /mnt/raid
└─md1p1           crypto_LUKS                                 [ UUID 2 ]
  └─md1                                                                                                             
    ├─sdb         linux_raid_membe hostname:1 [ UUID 3 ]
    └─sdc         linux_raid_membe hostname:1 [ UUID 3, same as sdb ]

I am not sure whether clearing out the disks and recreating the BIOS raid would have worked. BIOS raids used to be unstable in older Fedora versions, though not so much in recent versions until now. My only reason for having BIOS raid was to share it with a Windows boot, but that is no longer the case for me, so this was a good pretext to migrate to SW raid.

Comment 7 Alex Vincent 2020-05-08 07:25:36 UTC
This bug bit me again when I tried to upgrade to Fedora 32 from Fedora 31, only this time I wasn't able to figure it out.

That said, I have already ordered a newer custom machine (it should be here in a few weeks), so if I can find a Red Hat engineer to borrow the old box to diagnose this before I have it recycled...

Comment 8 ss_various_email 2020-05-26 12:51:32 UTC
I upgraded from the command line from Fedora 30 to 31 and hit the same boot freeze as the others above.

I found a temporary fix: manually edit the kernel command line during boot and add:
systemd.mask=lvm2-monitor.service

That booted, so I made it persistent with:
grubby --args=systemd.mask=lvm2-monitor.service --update-kernel /boot/vmlinuz-5.6.13-200.fc31.x86_64

Then I found the report below. I have a very similar setup involving Intel RAID, which seems to be the common factor triggering the problem for several people:

https://bugs.centos.org/view.php?id=16869
CR lvm2 2.03.05-5.el8.x86_64 crashes if there are no LVM volumes

in /etc/lvm/lvm.conf I changed, as suggested:
external_device_info_source = "udev"
to
external_device_info_source = "none"

fw_raid_component_detection = 0 
to
fw_raid_component_detection = 1

then
grubby --remove-args=systemd.mask=lvm2-monitor.service --update-kernel /boot/vmlinuz-5.6.13-200.fc31.x86_64

reboot, no errors, and so far so good. Obviously I cannot be sure of the implications/safety of these changes, but perhaps the info will be useful to others. Hopefully the lvm.conf changes will persist when I upgrade to 32, but it's not a big problem if they don't.

Comment 9 Michael Riss 2020-05-26 18:38:04 UTC
A follow-up to the previous post:

I didn't completely understand where the "suggested" values come from, so I searched and found this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1787013

There 

  external_device_info_source = "udev"
  fw_raid_component_detection = 1

is suggested. I tried it, and in my case it solved the problem.
My affected machine(s) all have Intel VROC BIOS/firmware based RAID 1 volumes and no LVM volumes.
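For anyone applying this: both keys live in the devices { } section of /etc/lvm/lvm.conf, so the fragment looks like:

```
devices {
    # Ask udev about devices instead of having LVM probe them directly.
    external_device_info_source = "udev"
    # Detect firmware (BIOS) RAID members so LVM skips the raw member disks.
    fw_raid_component_detection = 1
}
```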

Thanks for the hint ss_various!

