Bug 1414311

Summary: After failed host-deploy and vdsmd, moving to maintenance hangs
Product: [oVirt] ovirt-engine Reporter: Yedidyah Bar David <didi>
Component: Backend.Core Assignee: Moti Asayag <masayag>
Status: CLOSED INSUFFICIENT_DATA QA Contact: meital avital <mavital>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: future CC: bugs, didi, mperina, oourfali
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-02-20 09:12:49 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine.log none

Description Yedidyah Bar David 2017-01-18 09:41:10 UTC
Created attachment 1242102
engine.log

Description of problem:

If host-deploy reinstall fails and leaves vdsmd down, moving the host to Maintenance gets stuck with status 'Preparing for Maintenance'.

A quick search in the code finds HostPreparingForMaintenanceIdleTime, which is set to 300 seconds (I also verified this with engine-config). However, after more than 5 minutes nothing had changed and nothing had been written to engine.log.

Restarting the engine solved this: the host was still non-responsive, but moving it to Maintenance then worked immediately.

Version-Release number of selected component (if applicable):

Current master snapshot

How reproducible:

Not sure. It happens every time on my current system.

Steps to Reproduce:
1. Deploy a host successfully.
2. Reinstall it, and make the reinstall fail in a way that also leaves vdsmd down. Manually stopping vdsmd might be enough, but I didn't try that.
3. Move the host to Maintenance.

Actual results:

Host is stuck in 'Preparing for Maintenance'.

Expected results:

Host moves immediately, or at most after some timeout.
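
The expected result can be written down as a small decision rule. The following is only an illustrative sketch, not code from ovirt-engine: the Host class, the status strings and next_status() are hypothetical names, and using the 300 second value reported for HostPreparingForMaintenanceIdleTime as the timeout is an assumption about how that option might apply.

```python
from dataclasses import dataclass

# Hypothetical status names, chosen for illustration only.
PREPARING = "PreparingForMaintenance"
MAINTENANCE = "Maintenance"


@dataclass
class Host:
    status: str               # current host status
    responsive: bool          # whether vdsmd on the host answers the engine
    seconds_in_status: float  # time spent in the current status


def next_status(host: Host, idle_timeout: float = 300.0) -> str:
    """Return the status expected after one monitoring pass.

    idle_timeout defaults to 300 seconds, the value reported above;
    treating it as a hard upper bound is an assumption of this sketch.
    """
    if host.status != PREPARING:
        # Only hosts waiting to enter maintenance are affected.
        return host.status
    if not host.responsive:
        # Expected result: a host with vdsmd down moves immediately.
        return MAINTENANCE
    if host.seconds_in_status >= idle_timeout:
        # Expected result: otherwise, at most after some timeout.
        return MAINTENANCE
    return PREPARING
```

Under this rule, a host like the one in this report (status 'Preparing for Maintenance', vdsmd down) would never remain in that status after a monitoring pass.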

Additional info:

Restarting the engine solves this.

Comment 1 Martin Perina 2017-02-07 09:43:13 UTC
Didi, is this error reproducible? If not, I'd close this as WORKSFORME, as we haven't found any clue why it happened and all our attempts to reproduce the issue failed (the host was always moved to Maintenance in the end).

Comment 2 Red Hat Bugzilla 2023-09-14 03:52:24 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days