Bug 1169364 - After rebooting the host, vdsm does not run and the host is stuck in Non Responsive status
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-webadmin-portal
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Oved Ourfali
QA Contact: sefi litmanovich
URL:
Whiteboard: infra
Depends On:
Blocks: rhev35rcblocker rhev35gablocker
 
Reported: 2014-12-01 12:53 UTC by lkuchlan
Modified: 2016-02-10 19:46 UTC
CC List: 12 users

Fixed In Version: v13
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 17:12:55 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
Engine and vdsm logs (770.39 KB, application/x-gzip), 2014-12-01 12:53 UTC, lkuchlan
logs (1.54 MB, application/x-gzip), 2014-12-04 09:06 UTC, lkuchlan

Description lkuchlan 2014-12-01 12:53:35 UTC
Created attachment 963266 [details]
Engine and vdsm logs

Description of problem:
After rebooting the host, vdsm does not run and the host is stuck in Non Responsive status.

Version-Release number of selected component (if applicable):
3.5 vt11

How reproducible:
100%

Steps to Reproduce:
1. Reboot the host
2. In the Hosts tab, confirm 'Host has been Rebooted'

Actual results:
vdsm does not run and the host remains Non Responsive

Expected results:
vdsm should start automatically after the reboot and the host should return to Up
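
A minimal check sketch, assuming the vdsmd service name used in this report; the first command of each pair applies to a sysvinit (RHEL 6) host, the second to a systemd (RHEL 7) host:

# is vdsm running after the reboot?
service vdsmd status        # sysvinit
systemctl status vdsmd      # systemd

# is vdsmd registered to start at boot?
chkconfig --list vdsmd      # sysvinit
systemctl is-enabled vdsmd  # systemd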

Comment 1 Oved Ourfali 2014-12-03 07:09:19 UTC
According to the engine log, the host was down, but then was up again.
Then, it had some failures until it was elected to be the SPM.
For how long was the host in Non Responsive state?
Didn't it move to Up?
The log shows that InitVdsOnUp procedure was called, which is usually triggered when a host moves to Up.

Comment 2 Oved Ourfali 2014-12-03 07:11:30 UTC
I also see in the vdsm log that it was started on December 1st at 10:54, which matches the time at which InitVdsOnUp was called.
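
A small sketch of how these timestamps can be cross-checked, assuming the standard log locations (/var/log/ovirt-engine/engine.log on the engine, /var/log/vdsm/vdsm.log on the host) and vdsm's usual startup banner; neither string is quoted from this bug's attachments:

# on the engine: when was InitVdsOnUp invoked for the host?
grep 'InitVdsOnUp' /var/log/ovirt-engine/engine.log

# on the host: when did vdsm start? ('I am the actual vdsm' is the usual
# startup banner; assumed here, not taken from the attached logs)
grep 'I am the actual vdsm' /var/log/vdsm/vdsm.log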

Comment 3 lkuchlan 2014-12-04 09:06:06 UTC
(In reply to Oved Ourfali from comment #1)
> According to the engine log, the host was down, but then was up again.
> Then, it had some failures until it was elected to be the SPM.
> For how long was the host in Non Responsive state?
> Didn't it move to Up?
> The log shows that InitVdsOnUp procedure was called, which is usually
> triggered when a host moves to Up.
The host was stuck in Non Responsive until I started vdsm manually with "service vdsmd start".
Please find attached additional logs from another reproduction.

Comment 4 lkuchlan 2014-12-04 09:06:56 UTC
Created attachment 964487 [details]
logs

Comment 5 Yaniv Bronhaim 2014-12-04 09:21:34 UTC
This is more or less a duplicate of Bug 1168689, although the output is different.
After the installation hits an exception, vdsmd itself can still run fine, but the installation skipped the chkconfig step due to the error.
This was fixed as part of the duplicate bug.

*** This bug has been marked as a duplicate of bug 1168689 ***
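
For hosts already affected, a hedged sketch of the manual recovery, assuming the root cause described above (the installer skipped registering vdsmd for autostart); use the chkconfig form on a sysvinit host and the systemctl form on a systemd host:

# register vdsmd to start at boot, then start it now (sysvinit)
chkconfig vdsmd on
service vdsmd start

# systemd equivalent
systemctl enable vdsmd
systemctl start vdsmd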

Comment 6 Yaniv Bronhaim 2014-12-04 09:22:47 UTC
Keeping it open for verification; the fix is already merged.

Comment 7 sefi litmanovich 2014-12-17 16:27:27 UTC
I currently cannot verify this bug, as another bz (https://bugzilla.redhat.com/show_bug.cgi?id=1149832), which I had to re-open, occurs during this scenario.
Although that bug occurs, this bug itself appears to be resolved, since vdsm is up after the reboot.
Let me know if I should verify this or mark 1149832 as a blocker.

Comment 8 sefi litmanovich 2014-12-31 11:52:42 UTC
Verified with rhevm-3.5.0-0.27.el6ev.noarch.
On the rebooted host: vdsm-4.16.8.1-4.el7ev.x86_64.

1) Have 2 hosts in 2 clusters.
2) Stop one of the hosts manually.
3) The host becomes Non Responsive.
4) Power up the host.
5) Choose 'Confirm Host has been Rebooted' in RHEV-M.
6) The host starts.
7) The host state goes back to 'Up' in RHEV-M.
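
A short sketch of complementary host-side checks, assuming the el7ev (systemd) host from this comment; the vdsClient call assumes the vdsm-cli package is installed and is not part of the verification steps above:

# confirm vdsmd is enabled and came up on its own after the reboot
systemctl is-enabled vdsmd
systemctl status vdsmd

# optionally confirm the vdsm API answers locally (requires vdsm-cli)
vdsClient -s 0 getVdsCaps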

Comment 9 Eyal Edri 2015-02-17 17:12:55 UTC
RHEV 3.5.0 was released. Closing.

