Bug 894231 - Failed to move a host to 'Maintenance' mode when it is in 'Install Failed' state
Summary: Failed to move a host to 'Maintenance' mode when it is in 'Install Failed' state
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine-webadmin-portal
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.2.0
Assignee: Yaniv Bronhaim
QA Contact: Pavel Stehlik
URL:
Whiteboard: infra
Depends On:
Blocks:
 
Reported: 2013-01-11 06:08 UTC by Shruti Sampat
Modified: 2016-02-10 19:01 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2013-02-14 09:34:11 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
engine logs (3.93 KB, text/x-log)
2013-01-11 06:08 UTC, Shruti Sampat
full-engine-logs (543.98 KB, text/x-log)
2013-01-21 04:26 UTC, Shruti Sampat

Description Shruti Sampat 2013-01-11 06:08:11 UTC
Created attachment 676665
engine logs

Description of problem:
---------------------------------------
When the installation of a host fails and the host is in the 'Install Failed' state, trying to move it to 'Maintenance' mode fails on the first attempt.

After clicking the 'Maintenance' button, a confirmation prompt appears. Clicking 'OK' results in the host going to the 'Maintenance' state briefly and then immediately going to the 'Non-responsive' state. Moving it to 'Maintenance' mode a second time works fine.

Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.2.0-4.el6ev 

How reproducible:
Always

Steps to Reproduce:
1. For a host that is in the 'Install Failed' state, click the 'Maintenance' button. In the confirmation dialog box that appears, click 'OK'.
  
Actual results:
The host goes to the 'Maintenance' state briefly and then goes to the 'Non-responsive' state.

Expected results:
The host should remain in 'Maintenance' mode.

Additional info:

Comment 1 Yaniv Bronhaim 2013-01-20 12:07:00 UTC
Please attach the full engine.log. I can't see the reason for the installation failure or the exact flow that got you to the non-responsive state.

Comment 2 Shruti Sampat 2013-01-21 04:26:28 UTC
Created attachment 684041
full-engine-logs

Comment 3 Yaniv Bronhaim 2013-01-21 16:43:44 UTC
The bug happens because of the way we handle exceptions in vdsManager.
When setting the status to maintenance while the vds status is "install failed", we first set the vds status to PreparingForMaintenance. This causes vdsManager.isMonitoringNeeded to return true, which in turn causes calls to VdsUpdateRunTimeInfo.refreshVdsRunTimeInfo.

In refreshVdsRunTimeInfo we call VdsUpdateRunTimeInfo.refreshVdsStats when the vds status is PreparingForMaintenance. There we initiate GetStatsVDSCommand, which fails with an exception because vdsm is not installed. We assume that an exception raised in this section means we have a connection error and need to turn the host to non-responsive, so we call vdsManager.handleNetworkException.
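
The flow described above can be modelled in a few lines. This is only an illustrative sketch in Python, not the engine's Java code: the names `VdsStatus`, `VdsmNotReachable`, `get_stats`, `is_monitoring_needed` and `refresh` are hypothetical stand-ins, and the model assumes that a successful stats call lets the host proceed to maintenance, which is a simplification.

```python
from enum import Enum, auto


class VdsStatus(Enum):
    INSTALL_FAILED = auto()
    PREPARING_FOR_MAINTENANCE = auto()
    MAINTENANCE = auto()
    NON_RESPONSIVE = auto()


class VdsmNotReachable(Exception):
    """Hypothetical stand-in for the failure raised by GetStatsVDSCommand."""


def get_stats(vdsm_installed: bool) -> dict:
    # Stand-in for GetStatsVDSCommand: it cannot succeed without vdsm.
    if not vdsm_installed:
        raise VdsmNotReachable("vdsm is not installed on the host")
    return {"cpu_load": 0.0}


def is_monitoring_needed(status: VdsStatus) -> bool:
    # Simplified: only the status relevant to this bug is modelled.
    return status is VdsStatus.PREPARING_FOR_MAINTENANCE


def refresh(status: VdsStatus, vdsm_installed: bool) -> VdsStatus:
    """Run one monitoring cycle and return the resulting status."""
    if not is_monitoring_needed(status):
        return status
    try:
        get_stats(vdsm_installed)
    except VdsmNotReachable:
        # Any failure here is treated as a network error,
        # so the host is marked non-responsive.
        return VdsStatus.NON_RESPONSIVE
    # Assumption: with working stats the host can go on to maintenance.
    return VdsStatus.MAINTENANCE
```

In this model, a host whose installation failed (so `vdsm_installed` is false) that is put into `PREPARING_FOR_MAINTENANCE` ends the cycle in `NON_RESPONSIVE`, which matches the behaviour reported in the description.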

As I see it, isMonitoringNeeded should return true when the vds status is PreparingForMaintenance, so that the PreparingForMaintenance flow is processed.

This could be handled by checking vdsStatus before calling handleNetworkException, but even there, if communication with the vds fails while we are in PreparingForMaintenance, we should turn the status to non-responsive. In any case, the PreparingForMaintenance flow is redundant after an install failure.

So the only thing I can think of is to skip the PreparingForMaintenance status when the previous status was 'install failed'. The status is set to PreparingForMaintenance in MaintananceNumberOfVdssCommand.setVdsStatusToPrepareForMaintaice.

This is my suggestion: http://gerrit.ovirt.org/11272

Please correct me if I missed something.
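
The idea of skipping the intermediate status can be illustrated with a small sketch. It is written in Python purely as a model and is not the content of the Gerrit change; `VdsStatus` and `status_on_maintenance_request` are hypothetical names, and the `UP` status is included only as an example of a host that still needs the normal flow.

```python
from enum import Enum, auto


class VdsStatus(Enum):
    UP = auto()
    INSTALL_FAILED = auto()
    PREPARING_FOR_MAINTENANCE = auto()
    MAINTENANCE = auto()


def status_on_maintenance_request(current: VdsStatus) -> VdsStatus:
    """Return the status a host should get when maintenance is requested."""
    if current is VdsStatus.INSTALL_FAILED:
        # Nothing runs on a host whose installation failed, so there is
        # nothing to prepare: go straight to maintenance and avoid the
        # monitoring cycle that would query the missing vdsm.
        return VdsStatus.MAINTENANCE
    # Every other host goes through the normal preparation flow.
    return VdsStatus.PREPARING_FOR_MAINTENANCE
```

Because the host never enters `PREPARING_FOR_MAINTENANCE` in the 'install failed' case, monitoring is not triggered and the status is not overwritten with non-responsive.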

Comment 4 Shruti Sampat 2013-01-22 04:59:22 UTC
Your suggestion of skipping the 'PreparingForMaintenance' state for a host that is in the 'InstallFailed' state looks fine to me.

Comment 5 Yaniv Bronhaim 2013-01-24 14:09:53 UTC
Please see the comments on https://bugzilla.redhat.com/show_bug.cgi?id=702914 and decide whether you want to fix this issue.

