Bug 1147988 - [engine-backend] Cannot put host in maintenance and remove it after a failed installation while the host is non-responsive
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.5.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Eli Mesika
QA Contact: Pavel Stehlik
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2014-09-30 13:27 UTC by Elad
Modified: 2016-02-10 19:35 UTC

Fixed In Version: org.ovirt.engine-root-3.5.0-14
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 17:16:59 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
logs from host and engine (1.31 MB, application/x-gzip)
2014-09-30 13:27 UTC, Elad


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 33613 0 master MERGED core: allow to maintenance host if install failed 2020-03-02 23:17:27 UTC
oVirt gerrit 33616 0 ovirt-engine-3.5 MERGED core: allow to maintenance host if install failed 2020-03-02 23:17:27 UTC

Description Elad 2014-09-30 13:27:35 UTC
Created attachment 942712
logs from host and engine

Description of problem:
I've added a new host to the setup. The host installation failed with:
Host green-vdsc installation failed. Network error during communication with the host.
I tried to put the host in maintenance in order to remove it, and nothing happened. The host cannot be removed or reinstalled; it is stuck in the setup.

Version-Release number of selected component (if applicable):
rhev3.5 vt4
rhevm-3.5.0-0.13.beta.el6ev.noarch

How reproducible:
Unknown

Steps to Reproduce:
1. Hit a network error while installing a new host. The host enters the non-responsive state.
2. Put the host in maintenance.
3. Also try clicking 'confirm host has been rebooted'.
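
For scripted reproduction, step 2 can also be driven through the engine REST API instead of the web UI. The sketch below only builds the request for the host `deactivate` action (the REST counterpart of "Maintenance"); it does not send it. The engine URL, the credentials, the function name, and the `/api` root are assumptions for illustration and were not part of the original report; adjust the root if your engine serves the API elsewhere.

```python
import base64
import urllib.request


def build_deactivate_request(engine_url, host_id, user, password, api_root="/api"):
    """Build (but do not send) a POST for the host 'deactivate' action.

    engine_url, user, password and api_root are placeholders/assumptions;
    host_id is the host UUID as it appears in the engine log.
    """
    url = f"{engine_url.rstrip('/')}{api_root}/hosts/{host_id}/deactivate"
    # HTTP Basic authentication header built from the supplied credentials.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=b"<action/>",  # empty action body
        method="POST",
        headers={
            "Content-Type": "application/xml",
            "Accept": "application/xml",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Hypothetical engine address and credentials; the host ID is the one from this report.
    req = build_deactivate_request(
        "https://engine.example.com",
        "b17b6647-ba94-4d4d-a45f-5a819da40dd4",
        "admin@internal",
        "secret",
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would submit the action; TLS verification settings depend on how the engine certificate is deployed.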


Actual results:
Host installation failure:

e75f, Call Stack: null, Custom Event ID: -1, Message: Host green-vdsc installation failed. Network error during communication with the host.
2014-09-30 15:49:30,176 INFO  [org.ovirt.engine.core.bll.InstallVdsInternalCommand] (org.ovirt.thread.pool-7-thread-3) [52598db7] Lock freed to object EngineLock [exclusiveLocks= key: b17b6647-ba94-4d4d-a45f-5a819da40dd4 value: VDS
, sharedLocks= ]


Setting the host to maintenance doesn't work.
I also tried 'confirm host has been rebooted' and then maintenance. It didn't help.
The SetVdsStatusVDSCommand task doesn't seem to finish:

2014-09-30 15:56:53,021 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-42) [5720dd7b] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:56:53,128 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-42) [5720dd7b] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:56:53,132 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-42) [5720dd7b] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 5a3954aa
2014-09-30 15:57:00,425 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-34) [c0cd8a] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:00,448 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-34) [c0cd8a] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:00,452 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-34) [c0cd8a] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 13c82083
2014-09-30 15:57:12,176 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (ajp-/127.0.0.1:8702-3) [2c4c8ea7] Lock Acquired to object EngineLock [exclusiveLocks= key: b17b6647-ba94-4d4d-a45f-5a819da40dd4 value: VDS_FENCE
, sharedLocks= ]
2014-09-30 15:57:12,178 INFO  [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (org.ovirt.thread.pool-7-thread-5) [2c4c8ea7] Running command: FenceVdsManualyCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:12,187 INFO  [org.ovirt.engine.core.bll.ClearNonResponsiveVdsVmsCommand] (org.ovirt.thread.pool-7-thread-5) [5b0eb0af] Running command: ClearNonResponsiveVdsVmsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:12,193 INFO  [org.ovirt.engine.core.vdsbroker.UpdateVdsVMsClearedVDSCommand] (org.ovirt.thread.pool-7-thread-5) [5b0eb0af] START, UpdateVdsVMsClearedVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4), log id: 905c4ff
2014-09-30 15:57:13,492 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-40) [190595b3] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:13,571 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-40) [190595b3] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:13,575 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-40) [190595b3] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 19f8834b
2014-09-30 15:57:22,838 INFO  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (org.ovirt.thread.pool-7-thread-30) [33e47126] Running command: MaintenanceNumberOfVdssCommand internal: false. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2014-09-30 15:57:22,919 INFO  [org.ovirt.engine.core.bll.MaintenanceVdsCommand] (org.ovirt.thread.pool-7-thread-30) [33e47126] Running command: MaintenanceVdsCommand internal: true. Entities affected :  ID: b17b6647-ba94-4d4d-a45f-5a819da40dd4 Type: VDS
2014-09-30 15:57:22,923 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-7-thread-30) [33e47126] START, SetVdsStatusVDSCommand(HostName = green-vdsc, HostId = b17b6647-ba94-4d4d-a45f-5a819da40dd4, status=Maintenance, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 4b08dd6b
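
The excerpt above shows several `START, SetVdsStatusVDSCommand(...)` entries with no matching completion entry. A minimal sketch for spotting such commands in a longer engine.log follows. It pairs entries by their `log id` value and assumes that a completed command is logged as `FINISH, <CommandName>, ... log id: <id>`; that completion format is not shown in this report, and the function name is hypothetical.

```python
import re
from typing import Dict, Iterable, List

# "START, <Command>(...), log id: <hex>" as seen in the excerpt above.
START_RE = re.compile(r"\bSTART, (?P<cmd>\w+)\(.*log id: (?P<id>[0-9a-f]+)\s*$")
# Assumed shape of the completion entry (not present in this report).
FINISH_RE = re.compile(r"\bFINISH, (?P<cmd>\w+).*log id: (?P<id>[0-9a-f]+)\s*$")


def unfinished_commands(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Return the START entries that never get a FINISH with the same log id."""
    pending: Dict[str, Dict[str, str]] = {}
    for line in lines:
        start = START_RE.search(line)
        if start:
            pending[start.group("id")] = {
                "command": start.group("cmd"),
                "log_id": start.group("id"),
                # The first 23 characters hold "YYYY-MM-DD HH:MM:SS,mmm".
                "timestamp": line[:23],
            }
            continue
        finish = FINISH_RE.search(line)
        if finish:
            # The command completed, so it is no longer pending.
            pending.pop(finish.group("id"), None)
    return list(pending.values())


if __name__ == "__main__":
    import sys

    # Usage: python unfinished.py /path/to/engine.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for entry in unfinished_commands(log):
            print(entry["timestamp"], entry["command"], entry["log_id"])
```

Run against the attached engine log, each of the `SetVdsStatusVDSCommand` calls quoted above would be expected to show up as pending if no completion entry was ever written for it.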
 


Expected results:
It should be possible to put a host whose installation has failed into maintenance in order to remove or re-install it.

Additional info: logs from host and engine

Comment 1 Eyal Edri 2014-10-07 07:13:16 UTC
This bug's status was moved to MODIFIED before engine vt5 was built,
hence moving to ON_QA. If this was a mistake and the fix isn't in,
please contact rhev-integ.

Comment 2 Petr Beňas 2014-10-13 13:02:17 UTC
In vt5 it's possible to move a host whose installation failed to maintenance.

Comment 4 Eyal Edri 2015-02-17 17:16:59 UTC
rhev 3.5.0 was released. closing.

