Bug 1357247

Summary: rhvh 4: reboot after install shows "4m[terminated]" and takes long to reboot
Product: Red Hat Enterprise Virtualization Manager
Reporter: daniel <dmoessne>
Component: ovirt-node-ng
Assignee: Yuval Turgeman <yturgema>
Status: CLOSED ERRATA
QA Contact: Qin Yuan <qiyuan>
Severity: high
Docs Contact:
Priority: medium
Version: 4.0.0
CC: cshao, dfediuck, dougsland, huzhao, jkortus, leiwang, msekleta, sbueno, weiwang, yaniwang, ycui, yturgema
Target Milestone: ovirt-4.2.0
Target Release: 4.2.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-15 17:57:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Node
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
screenshot from installer screen (flags: none)
screenshot from term4 showing error (flags: none)
sosreport from system (flags: none)

Description daniel 2016-07-17 08:03:02 UTC
Created attachment 1180667 [details]
screenshot from installer screen

Description of problem:
After finishing the installation in Anaconda and pressing "reboot", the system seems to stall and shows "4m[terminated]".
On console 4 I can see:
ERR lvm: There are still devices being monitored
ERR lvm: Refusing to exit
ERR lvm: There are still devices being monitored
ERR lvm: Refusing to exit
(see attached screenshots)

Version-Release number of selected component (if applicable): RHEV-H-7.2-20160627.2-RHVH-x86_64-dvd1.iso
 

How reproducible: 


Steps to Reproduce:
1. Install RHVH 4.0 from the above-mentioned ISO
2. Follow the installation
3. At the end of the installation process, press reboot
4. See "4m[terminated]" and wait quite a long time until the system really reboots

Actual results:

"4m[terminated]" is shown on screen and, in my opinion, the reboot takes too long

Expected results:
just reboot without waiting

Additional info:

on console (4) I can find:
ERR lvm: There are still devices being monitored
ERR lvm: Refusing to exit
ERR lvm: There are still devices being monitored
ERR lvm: Refusing to exit

Comment 1 daniel 2016-07-17 08:03:54 UTC
Created attachment 1180668 [details]
screenshot from term4 showing error

Comment 2 daniel 2016-07-17 08:05:11 UTC
Created attachment 1180669 [details]
sosreport from system

Comment 3 Wei Wang 2016-07-22 03:30:06 UTC
Test Version
redhat-virtualization-host-4.0-20160714.3.x86_64

Test Steps:
1. Install RHVH 4.0
2. At the end of the install process press reboot, see "4m[terminated]" and wait quite a long time until the system really reboots


Result:
"4m[terminated]" is shown on screen and the reboot takes too long


The bug can be reproduced by QE.

Comment 4 Fabian Deutsch 2016-07-22 12:52:51 UTC
Vratislav, have you got an idea how we can debug this problem?
How to debug what is blocking the shutdown/reboot?

Comment 5 Ryan Barry 2016-07-22 13:50:15 UTC
[  OK  ] Started Restore /run/initramfs.
[ ***  ] A stop job is running for Anaconda (47s/ 1min30s)

So we know that Anaconda is blocking it, at least. But there's no CPU usage.

Most of the VTs are dead, but there's one still alive, and it looks like lvm2-monitor:

ERR lvm: There are still devices being monitored.
ERR lvm: Refusing to exit.

bz#681582 is an old bug that references similar behavior, but I'm surprised to see this come up again. Some reports indicate that it's not visible without snapshots, so perhaps that's the reason.

Comment 6 Fabian Deutsch 2016-07-22 16:41:50 UTC
Good catch, this reminds me that I also saw this.

The question now is how we can see what's blocking lvm.

Comment 7 Vratislav Podzimek 2016-07-27 14:50:57 UTC
(In reply to Fabian Deutsch from comment #4)
> Vratislav, have you got an idea how we can debug this problem?
> How to debug what is blocking the shutdown/reboot?

No idea, that would be a good question for systemd/dracut guys.

Comment 8 Fabian Deutsch 2016-09-02 10:23:10 UTC
Michal, can you give us a hint how we can investigate this shutdown issue?

Comment 9 Michal Sekletar 2016-09-02 11:30:01 UTC
(In reply to Fabian Deutsch from comment #8)
> Michal, can you give us a hint how we can investigate this shutdown issue?

systemctl start debug-shell.service

The above command should start a root shell on tty9. This bash process won't be killed by systemd on shutdown and should stay up until systemd-shutdown does its final killing spree before the switch-root to the initrd. If the hang is happening before the switch-root, you should be able to switch to vt9 and inspect the system from the debug shell.
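
A minimal sketch of how this could be applied before pressing "reboot" in the installer. The function names and the SYSTEMCTL variable are illustrative assumptions (the variable exists only so the snippet can be dry-run without touching a real system); "systemctl list-jobs" is one way to see which stop jobs are still pending once you are on vt9.

```shell
#!/bin/sh
# SYSTEMCTL is a hypothetical override for dry runs, not a systemd convention.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Run this BEFORE triggering the reboot: spawns a root shell on tty9
# that survives most of the shutdown sequence.
start_debug_shell() {
    "$SYSTEMCTL" start debug-shell.service
}

# Run this from tty9 while the reboot appears to hang: lists the jobs
# (for example, stop jobs) that systemd is still waiting on.
list_pending_jobs() {
    "$SYSTEMCTL" list-jobs --no-pager
}
```

Typical use would be to call start_debug_shell from a shell in the installer environment, press "reboot", then switch to vt9 (Ctrl+Alt+F9) and call list_pending_jobs.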

Comment 10 Yuval Turgeman 2016-12-28 09:40:07 UTC
This bug is caused by the device mapper event daemon (dmeventd) missing from the ISO's rootfs.  When anaconda runs the postinstall script in the chrooted environment (imgbase layout --init), dmeventd from the new fs gets executed, and its process keeps that mount point busy.  The solution here would be to add dmeventd, which is actually a requirement for lvm2-monitor, to the ISO's rootfs.  I added it manually to an existing ISO as a test, and it took ~5 seconds to reboot instead of 1.5 minutes.
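
One way to confirm this kind of diagnosis from a debug shell is to look for processes whose root, working directory or executable lives under the target mount point. The following is only a sketch: the function name holders_of and its optional second argument (an alternative proc directory, useful for testing) are assumptions for illustration, and /mnt/sysimage is the mount point named later in this report.

```shell
#!/bin/sh
# holders_of MOUNTPOINT [PROCDIR]
# Prints "PID NAME link=target" for every process whose root, cwd or exe
# symlink points at MOUNTPOINT or somewhere below it.
holders_of() {
    mnt="${1%/}"
    procdir="${2:-/proc}"
    for d in "$procdir"/[0-9]*; do
        [ -d "$d" ] || continue
        for link in root cwd exe; do
            # Unreadable or missing links are skipped silently.
            target=$(readlink "$d/$link" 2>/dev/null) || continue
            case "$target" in
                "$mnt"|"$mnt"/*)
                    name=$(cat "$d/comm" 2>/dev/null)
                    printf '%s %s %s=%s\n' "${d##*/}" "$name" "$link" "$target"
                    break   # one line per process is enough
                    ;;
            esac
        done
    done
}
```

Running "holders_of /mnt/sysimage" as root while the reboot hangs would be expected to list a chrooted dmeventd if the explanation above applies; an empty result suggests that something else is keeping the mount busy.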

Comment 11 Yuval Turgeman 2017-01-04 10:12:01 UTC
Samantha, any idea why dmeventd is missing from the anaconda rootfs? It's a requirement for lvm2-monitor...

Comment 12 David Lehman 2017-02-03 15:33:48 UTC
dmeventd has not been needed by the installer under any known circumstances until now, so we don't include it. It was seen as a waste of space.

Comment 13 Sandro Bonazzola 2017-02-10 16:07:12 UTC
Yuval, David, Samantha, if dmeventd is required, let's open a bug about it.
In the meantime, I merged Yuval's patch, which should work around this.

Comment 14 Douglas Schilling Landgraf 2017-02-10 16:17:39 UTC
(In reply to Sandro Bonazzola from comment #13)
> Yuval, David, Samantha, if dmeventd is required, let's open a bug about it.
> In the meantime, I merged Yuval's patch, which should work around this.

+1

Comment 15 Yuval Turgeman 2017-02-12 09:36:19 UTC
(In reply to Sandro Bonazzola from comment #13)
> Yuval, David, Samantha, if dmeventd is required, let's open a bug about it.
> In the meantime, I merged Yuval's patch, which should work around this.

With anaconda's new snapshot feature, I think they must add it; otherwise they'll encounter this bug as well.  If lvcreate doesn't find a running dmeventd, it tries to run it.  If it isn't installed and running from anaconda's root fs, it will get executed from the new image, and its process will keep /mnt/sysimage busy, which leads to this bug.
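
As a rough illustration of the precondition described here, a check like the following could be run in the installer environment before any chrooted LVM operation. It is a sketch under stated assumptions: both function names are hypothetical, and the check only looks for a dmeventd binary on PATH; it does not verify that the daemon is actually running.

```shell
#!/bin/sh
# Returns 0 if a dmeventd binary is reachable via PATH in the current root.
dmeventd_available() {
    command -v dmeventd >/dev/null 2>&1
}

# Reports whether the current (installer) rootfs ships dmeventd.
check_installer_rootfs() {
    if dmeventd_available; then
        echo "dmeventd found in installer rootfs"
    else
        echo "dmeventd missing: LVM may start it from the target image instead"
    fi
}
```

If the second message is printed, the scenario from this comment is possible: dmeventd would be started from the new image and keep its mount point busy.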

Comment 16 Qin Yuan 2017-11-09 03:59:13 UTC
Verify Versions:
RHVH-4.2-20171105.2-RHVH-x86_64-dvd1.iso

Verify Steps:
1. Install RHVH iso
2. Follow the installation
3. At the end of installation process press reboot
4. Wait for reboot, and check console 4

Results:
1. "4m[terminated]" was shown on the screen (the same behaviour as with RHEL)
2. There was no "ERR lvm:" shown on console 4.
3. It took about 20 seconds to reboot. (The waiting time depends on the machine, but 20 seconds is more reasonable than 1.5 minutes)

According to the results, this bug is fixed; setting the status to VERIFIED.

Comment 20 errata-xmlrpc 2018-05-15 17:57:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:1524

Comment 21 Franta Kust 2019-05-16 13:09:38 UTC
BZ<2>Jira Resync