Bug 991729 - [engine-backend] host cannot be activated after it had been updated to maintenance in DB, while engine has never got the response for DisconnectStoragePool
Summary: [engine-backend] host cannot be activated after it had been updated to mainte...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assignee: Martin Perina
QA Contact: Tareq Alayan
URL:
Whiteboard: infra
Depends On:
Blocks: 994608
Reported: 2013-08-04 06:28 UTC by Elad
Modified: 2016-02-10 19:21 UTC (History)

Fixed In Version: is13
Doc Type: Bug Fix
Doc Text:
Clone Of:
Clones: 994608
Environment:
Last Closed: 2014-01-21 22:18:30 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
logs (2.66 MB, application/x-gzip)
2013-08-04 06:28 UTC, Elad


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 18691 0 None None None Never

Description Elad 2013-08-04 06:28:38 UTC
Created attachment 782415
logs

Description of problem:
The engine cannot activate a host that was updated to Maintenance status in the DB when the engine never received a response to its DisconnectStoragePool request.

Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.11.master.el6ev.noarch
vdsm-4.12.0-rc3.12.git139ec2f.el6ev.x86_64


How reproducible:
100%

Steps to Reproduce:
On a 2-host cluster with an active storage pool:
- Set the SPM host to maintenance.
- Block connectivity between the host and RHEVM with iptables right after the engine sets the host to Maintenance in the DB.


Actual results:

engine sets host to maintenance in DB:

2013-08-03 17:28:03,299 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-6) Updated vds status from Preparing for Maintenance to Maintenance in database,  vds = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb : nott-vds2


DisconnectStoragePool is requested:

2013-08-03 17:28:03,307 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-6) START, DisconnectStoragePoolVDSCommand(HostName = nott-vds2, HostId = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb, storagePoolId = aa047779-f7a9-4888-bd9c-fcf9d2f76e7e, vds_spm_id = 1), log id: 6ea687fd
2013-08-03 17:31:03,308 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-6) Command DisconnectStoragePoolVDS execution failed. Exception: VDSNetworkException: java.util.concurrent.TimeoutException
2013-08-03 17:31:03,308 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-6) FINISH, DisconnectStoragePoolVDSCommand, log id: 6ea687fd
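
The START and ERROR entries above are about three minutes apart, which is consistent with the command waiting until a network timeout rather than being rejected by the host. A minimal sketch for checking that gap from the two timestamps, assuming the timestamp layout seen in these excerpts (the helper name seconds_between is made up for illustration):

```python
from datetime import datetime

# Timestamp layout as it appears in the engine log excerpts in this report.
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def seconds_between(start_ts, end_ts):
    """Return the elapsed seconds between two engine log timestamps."""
    start = datetime.strptime(start_ts, LOG_TS_FORMAT)
    end = datetime.strptime(end_ts, LOG_TS_FORMAT)
    return (end - start).total_seconds()


# START and ERROR timestamps of DisconnectStoragePoolVDSCommand, log id 6ea687fd.
elapsed = seconds_between("2013-08-03 17:28:03,307", "2013-08-03 17:31:03,308")
print(elapsed)  # 180.001
```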


engine reports a problem with DisconnectStoragePool:

2013-08-03 17:31:03,333 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-6) Host encounter a problem moving to maintenance mode, probably error during disconnecting it from pool org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException (Failed with VDSM error VDS_NETWORK_ERROR and code 5022). The Host will stay in Maintenance

engine fails to activate host:

2013-08-03 17:32:02,152 INFO  [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (pool-5-thread-42) [5bdce1af] START, ActivateVdsVDSCommand(HostName = nott-vds2, HostId = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb), log id: 5afe05a3
2013-08-03 17:32:02,152 INFO  [org.ovirt.engine.core.vdsbroker.VdsManager] (pool-5-thread-42) [5bdce1af] Failed to activate VDS = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb with error: null.


engine fails to set host to maintenance because it is already marked as Maintenance in the DB:

2013-08-03 17:35:19,675 WARN  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (ajp-/127.0.0.1:8702-10) [7052eb92] CanDoAction of action MaintenanceNumberOfVdss failed. Reasons:VAR__TYPE__HOST,VAR__ACTION__MAINTENANCE,VDS_CANNOT_MAINTENANCE_VDS_IS_IN_MAINTENANCE
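
For triage, the sequence in the excerpts above (host updated to Maintenance in the DB, then DisconnectStoragePoolVDS timing out on the same worker thread) can be searched for in engine.log. The following is only a sketch: the regular expressions are derived from the log lines quoted in this report and assume each entry is on a single line, and the function name hosts_with_lost_disconnect is made up for illustration, not part of ovirt-engine.

```python
import re

# Patterns are assumptions based on the log lines quoted in this report.
LINE_RE = re.compile(
    r"^\S+ \S+ +\w+ +\[[^\]]+\] \((?P<thread>[^)]+)\) (?P<msg>.*)$"
)
MAINT_RE = re.compile(
    r"to Maintenance in database,\s+vds = (?P<host>[0-9a-f-]+)"
)
START_RE = re.compile(
    r"START, DisconnectStoragePoolVDSCommand\("
    r"HostName = [^,]+, HostId = (?P<host>[0-9a-f-]+)"
)


def hosts_with_lost_disconnect(lines):
    """Return ids of hosts marked Maintenance whose disconnect then timed out."""
    in_maintenance = set()  # hosts updated to Maintenance in the DB
    pending = {}            # thread name -> host id of the in-flight disconnect
    affected = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        thread, msg = m.group("thread"), m.group("msg")
        moved = MAINT_RE.search(msg)
        started = START_RE.search(msg)
        if moved:
            in_maintenance.add(moved.group("host"))
        elif started:
            pending[thread] = started.group("host")
        elif ("DisconnectStoragePoolVDS execution failed" in msg
              and "TimeoutException" in msg):
            host = pending.pop(thread, None)
            if host in in_maintenance and host not in affected:
                affected.append(host)
        elif "FINISH, DisconnectStoragePoolVDSCommand" in msg:
            pending.pop(thread, None)
    return affected


# The four entries from this report, each joined onto one line.
PREFIX = "[org.ovirt.engine.core.vdsbroker."
WORKER = "(DefaultQuartzScheduler_Worker-6) "
HOST = "223b05cc-4797-4a4f-9f2a-c4be0fa232eb"
SAMPLE = [
    "2013-08-03 17:28:03,299 INFO  " + PREFIX + "VdsUpdateRunTimeInfo] " + WORKER
    + "Updated vds status from Preparing for Maintenance to Maintenance "
    + "in database,  vds = " + HOST + " : nott-vds2",
    "2013-08-03 17:28:03,307 INFO  " + PREFIX
    + "vdsbroker.DisconnectStoragePoolVDSCommand] " + WORKER
    + "START, DisconnectStoragePoolVDSCommand(HostName = nott-vds2, HostId = "
    + HOST + ", storagePoolId = aa047779-f7a9-4888-bd9c-fcf9d2f76e7e, "
    + "vds_spm_id = 1), log id: 6ea687fd",
    "2013-08-03 17:31:03,308 ERROR " + PREFIX
    + "vdsbroker.DisconnectStoragePoolVDSCommand] " + WORKER
    + "Command DisconnectStoragePoolVDS execution failed. Exception: "
    + "VDSNetworkException: java.util.concurrent.TimeoutException",
    "2013-08-03 17:31:03,308 INFO  " + PREFIX
    + "vdsbroker.DisconnectStoragePoolVDSCommand] " + WORKER
    + "FINISH, DisconnectStoragePoolVDSCommand, log id: 6ea687fd",
]

print(hosts_with_lost_disconnect(SAMPLE))
```

With a real log, the same function can be fed an open file object, for example hosts_with_lost_disconnect(open(path)) where path points at the engine.log taken from the attached archive.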




Additional info: logs

Comment 2 Elad 2013-08-04 12:54:53 UTC
The host is stuck in the 'Unassigned' state. There is nothing the user can do to activate or remove the host.

Comment 4 Tareq Alayan 2013-09-03 11:57:41 UTC
Verified in is12.
The host is back to Up again.

Comment 5 Itamar Heim 2014-01-21 22:18:30 UTC
Closing - RHEV 3.3 Released

Comment 6 Itamar Heim 2014-01-21 22:24:53 UTC
Closing - RHEV 3.3 Released

