| Summary: | Host remains locked in the system after manual fencing | ||
|---|---|---|---|
| Product: | [oVirt] vdsm-jsonrpc-java | Reporter: | Moti Asayag <masayag> |
| Component: | Core | Assignee: | Piotr Kliczewski <pkliczew> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Pavol Brilla <pbrilla> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 1.2 | CC: | bugs, inetkach, mgoldboi, oourfali, pbrilla, pkliczew, pzhukov |
| Target Milestone: | ovirt-3.6.5 | Keywords: | WorkAround, ZStream |
| Target Release: | 1.1.9 | Flags: | rule-engine: ovirt-3.6.z+, mgoldboi: planning_ack+, masayag: devel_ack+, pstehlik: testing_ack+ |
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-04-21 14:40:57 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |

Restarting the engine works around the issue, so this is not a blocker for 3.6.3. Moving to 3.6.5.

*** Bug 1273754 has been marked as a duplicate of this bug. ***

Verified on rhevm-3.6.5.3-0.1.el6.noarch and vdsm-jsonrpc-java-1.1.9-1.el6ev.noarch.

Description of problem:
After a manual reboot of the host, the host was set to maintenance mode. The engine reports that there are still VMs running on that host. The 'Confirm host was rebooted' button was clicked, but the action never completes. A second click on 'Confirm host was rebooted' produces 'The host is locked. Other action is already in progress'. The logs show that the thread of 'Confirm host was rebooted' never completed. The thread dump reveals a thread that holds the lock on the host's monitoring object and never releases it.

Version-Release number of selected component (if applicable):
ovirt-engine-3.6

How reproducible:
Sometimes

Steps to Reproduce:
There are no exact steps to reproduce this issue. The scenario occurred after a data-center power outage. Two of the hosts were manually rebooted and remained stuck as described above. The rest of the hosts in the data center did not hit the same issue.

Actual results:
The host cannot be confirmed as rebooted, and no other action can be invoked on it.

Expected results:
The host state should be recoverable: "Confirm host was rebooted" should clear the running VMs from the host.

Additional info:
The specific thread which holds the monitoring object:

"DefaultQuartzScheduler_Worker-36" prio=10 tid=0x00007f9c054ed800 nid=0xac9 waiting on condition [0x00007f9bea6e4000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x000000078ff5c530> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:236)
    at org.ovirt.vdsm.jsonrpc.client.utils.OneTimeCallback.await(OneTimeCallback.java:27)
    at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.connect(ReactorClient.java:94)
    at org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.getClient(JsonRpcClient.java:114)
    at org.ovirt.vdsm.jsonrpc.client.JsonRpcClient.call(JsonRpcClient.java:73)
    at org.ovirt.engine.core.vdsbroker.jsonrpc.FutureMap.<init>(FutureMap.java:68)
    at org.ovirt.engine.core.vdsbroker.jsonrpc.JsonRpcVdsServer.getCapabilities(JsonRpcVdsServer.java:268)
    at org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand.executeVdsBrokerCommand(GetCapabilitiesVDSCommand.java:15)
    at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:110)
    at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65)
    at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33)
    at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467)
    at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:647)
    at org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:119)
    at org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84)
    at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:227)
    - locked <0x000000078ff9cfb8> (a java.lang.Object)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81)
    at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:213)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557)
    - locked <0x000000078e22a8f8> (a org.quartz.simpl.SimpleThreadPool$WorkerThread)
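
The dump shows the monitoring thread parked in CountDownLatch.await() with no timeout (under ReactorClient.connect) while it still holds the host's monitoring lock taken in VdsManager.onTimer. If the latch is never counted down, the lock is never released. The following is a minimal, self-contained sketch of that pattern and of one common mitigation, a bounded wait. It is an illustration only and not necessarily how vdsm-jsonrpc-java 1.1.9 addressed the bug: the class BoundedConnectSketch, the methods awaitConnected and refreshHost, and the use of a ReentrantLock in place of the synchronized monitor are all assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch; none of these names come from ovirt-engine or vdsm-jsonrpc-java.
public class BoundedConnectSketch {

    // Stand-in for the per-host monitoring lock seen as "locked <0x...>" in the dump.
    static final ReentrantLock monitoringLock = new ReentrantLock();

    // Waits for the connection latch, but gives up after timeoutMillis
    // instead of parking indefinitely as CountDownLatch.await() would.
    static boolean awaitConnected(CountDownLatch connected, long timeoutMillis)
            throws InterruptedException {
        return connected.await(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Simulates one monitoring cycle: take the lock, wait for the connection,
    // and always release the lock, whether or not the connection completed.
    static boolean refreshHost(CountDownLatch connected, long timeoutMillis)
            throws InterruptedException {
        monitoringLock.lock();
        try {
            return awaitConnected(connected, timeoutMillis);
        } finally {
            monitoringLock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        // A latch that nobody counts down models a connect callback that never fires.
        CountDownLatch neverCompletes = new CountDownLatch(1);
        boolean connected = refreshHost(neverCompletes, 100);
        System.out.println("connected=" + connected
                + ", lockHeld=" + monitoringLock.isLocked());
        // prints: connected=false, lockHeld=false
    }
}
```

With the bounded wait, a connection attempt that never completes returns false after the timeout and the lock is released, so a later action on the same host (such as 'Confirm host was rebooted') would not be blocked by it.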