(public comment) Some of the VMs failed to migrate, and the host has been in the Recovering state for a long time. I also tried to stop all the running VMs on the host manually with the 'kill' command, but the engine did not refresh the VM monitor and it still shows 54 VMs on the host. I tried "confirm host rebooted" and also tried to move the host to maintenance, but both fail; it seems the host state cannot be released.

2016-11-09 09:37:48,084 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-33) [3ce4c3ec] START, SetVdsStatusVDSCommand(HostName = , SetVdsStatusVDSCommandParameters:{runAsync='true', hostId='7c2a9b0b-4fca-4cd1-8950-01ad5af9ea68', status='PreparingForMaintenance', nonOperationalReason='NONE', stopSpmFailureLogged='true', maintenanceReason='null'}), log id: 5b20d170
2016-11-09 09:37:49,940 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-33) [7bb7776] Correlation ID: 3ce4c3ec, Job ID: 86167ccb-08a7-4706-be4a-956c84b00df1, Call Stack: null, Custom Event ID: -1, Message: Host cannot change into maintenance mode - not all Vms have been migrated successfully. Consider manual intervention: stopping/migrating Vms:

It seems this problem depends on BZ https://bugzilla.redhat.com/show_bug.cgi?id=1390296
Created attachment 1218871 [details] Thread dumps
Please provide the VDSM logs; it's hard to say what happened and why the host has been recovering for a long time. It would also help to have engine.log from the time the host recovering issue started to appear.
(In reply to Martin Perina from comment #4)
> Please provide the VDSM logs; it's hard to say what happened and why the
> host has been recovering for a long time. It would also help to have
> engine.log from the time the host recovering issue started to appear.

It is hard to say when it appears: we have a lot of hosts and the problem occurs sporadically. Is there another approach we can take?
The host ID should be part of the log, so please provide the logs for that host once this happens.
Please re-open if it happens again and provide all details.
This bug might be related to the monitoring lock issue. AFAIK, we agreed to keep it pending until https://bugzilla.redhat.com/show_bug.cgi?id=1364791 is fixed. That bug generates lots of side effects, mainly around VDS.