Bug 1147988

Summary: [engine-backend] Cannot put host in maintenance and remove it after a failed installation while the host is non-responsive
Product: Red Hat Enterprise Virtualization Manager
Reporter: Elad <ebenahar>
Component: ovirt-engine
Assignee: Eli Mesika <emesika>
Status: CLOSED CURRENTRELEASE
QA Contact: Pavel Stehlik <pstehlik>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: ecohen, gklein, iheim, lpeer, lsurette, oourfali, pstehlik, rbalakri, Rhev-m-bugs, sherold, yeylon
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: x86_64   
OS: Unspecified   
Whiteboard: infra
Fixed In Version: org.ovirt.engine-root-3.5.0-14
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-02-17 17:16:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
logs from host and engine (flags: none)

Description Elad 2014-09-30 13:27:35 UTC
Created attachment 942712
logs from host and engine

Description of problem:
I added a new host to the setup. The host installation failed with:
Host green-vdsc installation failed. Network error during communication with the host.
I tried to put the host in maintenance in order to remove it, and nothing happened. The host cannot be removed or reinstalled; it is stuck in the setup.

Version-Release number of selected component (if applicable):
rhev3.5 vt4
rhevm-3.5.0-0.13.beta.el6ev.noarch

How reproducible:
Unknown

Steps to Reproduce:
1. Hit a network error while installing a new host. The host enters the non-responsive state
2. Put the host in maintenance
3. Also try clicking 'confirm host has been rebooted'


Actual results:
Host installation failure:

e75f, Call Stack: null, Custom Event ID: -1, Message: Host green-vdsc installation failed. Network error during communication with the host.
2014-09-30 15:49:30,176 INFO  [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-7-thread-3) [52598db7] Lock freed to object EngineLock [exclusiveLocks= key: b17b6647-ba94-4d4d-a45f-5a819da40dd4 value: VDS
, sharedLocks= ]


Setting the host to maintenance doesn't work.
I also tried 'confirm host has been rebooted' and then maintenance. It didn't help.
The SetVdsStatusVDSCommand task doesn't seem to finish:

2014-09-30 15:56:53,021 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-42) [5720dd7b] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:56:53,128 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-42) [5720dd7b] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:56:53,132 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-42) [5720dd7b] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 5a3954aa
2014-09-30 15:57:00,425 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-34) [c0cd8a] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:00,448 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-34) [c0cd8a] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:00,452 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-34) [c0cd8a] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 13c82083
2014-09-30 15:57:12,176 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (ajp-/127.0.0.1:8702-3) [2c4c8ea7] Lock Acquired to object EngineLock [exclusiveLocks= key: b17b6647-ba94-4d4d-a45f-5a819da40dd4 value: VDS_FENCE
, sharedLocks= ]
2014-09-30 15:57:12,178 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (org.ovirt.thread.pool-7-thread-5) [2c4c8ea7] Running command: FenceVdsManualyCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:12,187 INFO  [org.ovirt.engine.core.bll.ClearNonResponsiveVdsVmsCommand] (org.ovirt.thread.pool-7-thread-5) [5b0eb0af] Running command: ClearNonResponsiveVdsVmsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:12,193 INFO  [org.ovirt.engine.core.vdsbroker.UpdateVdsVMsClearedVDSCommand] (org.ovirt.thread.pool-7-thread-5) [5b0eb0af] START, UpdateVdsVMsClearedVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4), log id: 905c4ff
2014-09-30 15:57:13,492 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-40) [190595b3] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:13,571 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-40) [190595b3] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:13,575 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-40) [190595b3] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 19f8834b
2014-09-30 15:57:22,838 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-30) [33e47126] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:22,919 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-30) [33e47126] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:22,923 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-30) [33e47126] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 4b08dd6b
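
The pattern in the excerpt above (a START entry with a log id and no later completion entry) can be checked mechanically. Below is a minimal Python sketch, not an engine tool. It assumes that a completed VDS command logs a line of the form "FINISH, <CommandName>, log id: <id>" carrying the same log id as its START line; that FINISH shape, the regular expressions, and the function name unfinished_commands are assumptions made for illustration.

```python
import re

# Assumed line shapes:
#   "... START, SetVdsStatusVDSCommand(...), log id: 5a3954aa"   (as in the excerpt)
#   "... FINISH, SetVdsStatusVDSCommand, log id: 5a3954aa"       (assumed)
START_RE = re.compile(r"START, (\w+)\(.*log id: ([0-9a-f]+)\s*$")
FINISH_RE = re.compile(r"FINISH, (\w+).*log id: ([0-9a-f]+)\s*$")


def unfinished_commands(lines):
    """Return (command, log_id) pairs that have a START but no FINISH."""
    pending = {}  # log id -> command name, in order of first appearance
    for line in lines:
        start = START_RE.search(line)
        if start:
            pending[start.group(2)] = start.group(1)
            continue
        finish = FINISH_RE.search(line)
        if finish:
            # A FINISH with a known log id closes the matching START.
            pending.pop(finish.group(2), None)
    return [(command, log_id) for log_id, command in pending.items()]
```

Passing the lines of the engine log to it (for example, an open file object) lists the commands that never completed. For the excerpt above, every SetVdsStatusVDSCommand log id would be reported, since no completion entry follows any of them.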
 


Expected results:
It should be possible to put a host whose installation failed into maintenance in order to remove or reinstall it.

Additional info: logs from host and engine

Comment 1 Eyal Edri 2014-10-07 07:13:16 UTC
This bug's status was moved to MODIFIED before engine vt5 was built,
hence moving to ON_QA. If this was a mistake and the fix isn't in,
please contact rhev-integ.

Comment 2 Petr Beňas 2014-10-13 13:02:17 UTC
In vt5 it's possible to move a host whose installation failed to maintenance.

Comment 4 Eyal Edri 2015-02-17 17:16:59 UTC
RHEV 3.5.0 was released. Closing.