Bug 991729 - [engine-backend] host cannot be activated after it had been updated to maintenance in DB, while engine has never got the response for DisconnectStoragePool
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.3.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.3.0
Assigned To: Martin Perina
QA Contact: Tareq Alayan
Whiteboard: infra
Keywords: Regression
Depends On:
Blocks: 994608
Reported: 2013-08-04 02:28 EDT by Elad
Modified: 2016-02-10 14:21 EST
CC List: 8 users

See Also:
Fixed In Version: is13
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 994608
Environment:
Last Closed: 2014-01-21 17:18:30 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
logs (2.66 MB, application/x-gzip)
2013-08-04 02:28 EDT, Elad


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 18691 None None None Never

Description Elad 2013-08-04 02:28:38 EDT
Created attachment 782415: logs

Description of problem:
The engine cannot activate a host once the host has been set to Maintenance status in the DB while the engine never received a response to its DisconnectStoragePool request.

Version-Release number of selected component (if applicable):
rhevm-3.3.0-0.11.master.el6ev.noarch
vdsm-4.12.0-rc3.12.git139ec2f.el6ev.x86_64


How reproducible:
100%

Steps to Reproduce:
On a 2-host cluster with an active storage pool:
- Set the SPM host to maintenance.
- Block connectivity between the host and RHEVM with iptables right after the engine sets the host to Maintenance in the DB.
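
The connectivity block in the second step could look roughly like the sketch below. This is an illustration under assumptions only: the report does not give the exact rule that was used, so dropping all inbound traffic from the engine on the host's INPUT chain, the helper names, and the placeholder address 192.0.2.10 are all hypothetical. The helpers only build the command lines; actually running them requires root on the host.

```python
import shlex
from typing import List

# Hypothetical placeholder for the RHEVM engine address (documentation range).
ENGINE_IP = "192.0.2.10"


def block_engine_cmd(engine_ip: str) -> List[str]:
    """Build an iptables command that drops inbound packets from the engine."""
    return ["iptables", "-A", "INPUT", "-s", engine_ip, "-j", "DROP"]


def unblock_engine_cmd(engine_ip: str) -> List[str]:
    """Build the matching command that deletes that rule again."""
    return ["iptables", "-D", "INPUT", "-s", engine_ip, "-j", "DROP"]


if __name__ == "__main__":
    # Print the commands instead of executing them, so the sketch is safe to run.
    print(shlex.join(block_engine_cmd(ENGINE_IP)))
    print(shlex.join(unblock_engine_cmd(ENGINE_IP)))
```

The timing matters more than the exact rule: the block has to land after the status update in the DB and before the host answers DisconnectStoragePool.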


Actual results:

engine sets host to maintenance in DB:

2013-08-03 17:28:03,299 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-6) Updated vds status from Preparing for Maintenance to Maintenance in database,  vds = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb : nott-vds2


DisconnectStoragePool is requested:

2013-08-03 17:28:03,307 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-6) START, DisconnectStoragePoolVDSCommand(HostName = nott-vds2, HostId = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb, storagePoolId = aa047779-f7a9-4888-bd9c-fcf9d2f76e7e, vds_spm_id = 1), log id: 6ea687fd
2013-08-03 17:31:03,308 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-6) Command DisconnectStoragePoolVDS execution failed. Exception: VDSNetworkException: java.util.concurrent.TimeoutException
2013-08-03 17:31:03,308 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (DefaultQuartzScheduler_Worker-6) FINISH, DisconnectStoragePoolVDSCommand, log id: 6ea687fd
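
The START and ERROR entries above are about three minutes apart, which is consistent with the engine waiting out a full network timeout before giving up on the request. A minimal sketch for checking that gap from the log timestamps follows; the timestamps are copied from the two entries, while the helper name and format constant are illustrative and not part of the engine.

```python
from datetime import datetime

# Layout of the timestamp prefix in engine.log lines: "YYYY-MM-DD HH:MM:SS,mmm".
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"


def parse_ts(log_line: str) -> datetime:
    """Parse the 23-character timestamp prefix of an engine.log line."""
    return datetime.strptime(log_line[:23], LOG_TS_FORMAT)


# Timestamps copied from the START and ERROR entries for log id 6ea687fd.
START_TS = "2013-08-03 17:28:03,307"
ERROR_TS = "2013-08-03 17:31:03,308"

elapsed_seconds = (parse_ts(ERROR_TS) - parse_ts(START_TS)).total_seconds()
print(f"DisconnectStoragePool wait: {elapsed_seconds:.3f} s")
```

Because `parse_ts` reads only the first 23 characters, whole log lines can be passed to it as well.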


engine reports a problem with DisconnectStoragePool:

2013-08-03 17:31:03,333 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-6) Host encounter a problem moving to maintenance mode, probably error during disconnecting it from pool org.ovirt.engine.core.common.errors.VdcBLLException: VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.util.concurrent.TimeoutException (Failed with VDSM error VDS_NETWORK_ERROR and code 5022). The Host will stay in Maintenance

engine fails to activate host:

2013-08-03 17:32:02,152 INFO  [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (pool-5-thread-42) [5bdce1af] START, ActivateVdsVDSCommand(HostName = nott-vds2, HostId = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb), log id: 5afe05a3
2013-08-03 17:32:02,152 INFO  [org.ovirt.engine.core.vdsbroker.VdsManager] (pool-5-thread-42) [5bdce1af] Failed to activate VDS = 223b05cc-4797-4a4f-9f2a-c4be0fa232eb with error: null.


engine fails to set host to maintenance because it is already updated as maintenance in DB:

2013-08-03 17:35:19,675 WARN  [org.ovirt.engine.core.bll.MaintenanceNumberOfVdssCommand] (ajp-/127.0.0.1:8702-10) [7052eb92] CanDoAction of action MaintenanceNumberOfVdss failed. Reasons:VAR__TYPE__HOST,VAR__ACTION__MAINTENANCE,VDS_CANNOT_MAINTENANCE_VDS_IS_IN_MAINTENANCE




Additional info: logs
Comment 2 Elad 2013-08-04 08:54:53 EDT
The host is stuck in the 'Unassigned' state. There is nothing the user can do to activate or remove the host.
Comment 4 Tareq Alayan 2013-09-03 07:57:41 EDT
Verified in is12.
The host is back to Up again.
Comment 5 Itamar Heim 2014-01-21 17:18:30 EST
Closing - RHEV 3.3 Released
Comment 6 Itamar Heim 2014-01-21 17:24:53 EST
Closing - RHEV 3.3 Released
