Description of problem: Hosts are non responsive and power management/fencing does not work. When I check the hosts they appear connected and registered themselves to the engine. Pinging from the nodes works as well as telnet/curl to and from the engine and the other node. So I don't know if power management doesn't work because they are non responsive or they are non responsive because power management doesn't work. Version-Release number of selected component (if applicable): 3.4.0beta2 Manual fencing works from the node: fence_ipmilan -A password -p ************* -a xxxx01.xxxxx.xxxx -P -l power -o status Getting status of IPMI:xxxx01.xxxxx.xxxx...Chassis power = On Done fence_ipmilan -A password -p ************* -a xxxx01.xxxxx.xxxx -P -l power -o reboot Rebooting machine @ IPMI:xxxx01.xxxxx.xxxx...Done When I leave out "-P" (-P Use Lanplus to improve security of connection) it fails. fence_ipmilan -A password -p *********** -a xxxx01.xxxxx.xxxx -l power -o status Getting status of IPMI:xxxx01.xxxxx.xxxx...Chassis power = Unknown Failed from the engine.log: 014-02-11 22:24:28,551 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-1) Command GetCapabilitiesVDSCommand(HostName = xxxx02.xxxxx.xxxxx, HostId = 9067a99f-992b-41cf-89b5-fbf385b5473d, vds=Host[xxxx02.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:29,338 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (ajp--127.0.0.1-8702-8) [52012893] Lock Acquired to object EngineLock [exclusiveLocks= key: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 value: VDS , sharedLocks= ] 2014-02-11 22:24:29,420 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] Running command: ActivateVdsCommand internal: false. Entities affected : ID: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 Type: VDS 2014-02-11 22:24:29,421 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] Before acquiring lock in order to prevent monitoring for host xxxx01.xxxxx.xxxxx from data-center Default 2014-02-11 22:24:29,421 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] Lock acquired, from now a monitoring of host will be skipped for host xxxx01.xxxxx.xxxxx from data-center Default 2014-02-11 22:24:29,482 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] START, SetVdsStatusVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, status=Unassigned, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 2e0ef762 2014-02-11 22:24:29,497 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] FINISH, SetVdsStatusVDSCommand, log id: 2e0ef762 2014-02-11 22:24:29,523 INFO [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] START, ActivateVdsVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00), log id: 7ce5d852 2014-02-11 22:24:29,545 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (org.ovirt.thread.pool-6-thread-11) [52012893] Command GetCapabilitiesVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, vds=Host[xxxx01.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:29,546 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (org.ovirt.thread.pool-6-thread-11) [52012893] Server failed to respond, vds_id = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, vds_name = xxxx01.xxxxx.xxxxx, vm_count = 0, spm_status = None, non-responsive_timeout (seconds) = 120, error = java.net.ConnectException: Connection refused 2014-02-11 22:24:29,576 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-11) [52012893] Correlation ID: null, Call Stack: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.ConnectException: Connection refused at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:102) at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56) at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:28) at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:537) at org.ovirt.engine.core.vdsbroker.VdsManager.refreshHost(VdsManager.java:403) at org.ovirt.engine.core.vdsbroker.VdsManager.activate(VdsManager.java:389) at org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand.executeVdsIdCommand(ActivateVdsVDSCommand.java:13) at org.ovirt.engine.core.vdsbroker.VdsIdVDSCommandBase.executeVDSCommand(VdsIdVDSCommandBase.java:42) at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:56) at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:28) at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:399) at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) at org.ovirt.engine.core.bll.CommandBase.runVdsCommand(CommandBase.java:1971) at org.ovirt.engine.core.bll.ActivateVdsCommand.executeCommand(ActivateVdsCommand.java:53) at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1123) at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1208) at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1884) at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1228) at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:351) at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommand(MultipleActionsRunner.java:179) at org.ovirt.engine.core.bll.MultipleActionsRunner.runCommands(MultipleActionsRunner.java:151) at org.ovirt.engine.core.bll.MultipleActionsRunner$2.run(MultipleActionsRunner.java:160) at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:97) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:744) Caused by: java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:579) at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:618) at org.ovirt.engine.core.utils.ssl.AuthSSLProtocolSocketFactory.createSocket(AuthSSLProtocolSocketFactory.java:288) at org.apache.commons.httpclient.HttpConnection.open(Unknown Source) at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$HttpConnectionAdapter.open(Unknown Source) at org.apache.commons.httpclient.HttpMethodDirector.executeWithRetry(Unknown Source) at org.apache.commons.httpclient.HttpMethodDirector.executeMethod(Unknown Source) at org.apache.commons.httpclient.HttpClient.executeMethod(Unknown Source) at org.apache.commons.httpclient.HttpClient.executeMethod(Unknown Source) at org.apache.xmlrpc.client.XmlRpcCommonsTransport.writeRequest(XmlRpcCommonsTransport.java:227) at org.apache.xmlrpc.client.XmlRpcStreamTransport.sendRequest(XmlRpcStreamTransport.java:151) at org.apache.xmlrpc.client.XmlRpcHttpTransport.sendRequest(XmlRpcHttpTransport.java:143) at org.apache.xmlrpc.client.XmlRpcClientWorker.execute(XmlRpcClientWorker.java:56) at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:167) at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:137) at org.apache.xmlrpc.client.XmlRpcClient.execute(XmlRpcClient.java:126) at org.apache.xmlrpc.client.util.ClientFactory$1.invoke(ClientFactory.java:140) at com.sun.proxy.$Proxy54.getVdsCapabilities(Unknown Source) at sun.reflect.GeneratedMethodAccessor39.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.ovirt.engine.core.vdsbroker.xmlrpc.XmlRpcUtils$AsyncProxy$InternalCallable.call(XmlRpcUtils.java:225) at java.util.concurrent.FutureTask.run(FutureTask.java:262) ... 5 more , Custom Event ID: -1, Message: Host xxxx01.xxxxx.xxxxx is non responsive. 2014-02-11 22:24:29,609 INFO [org.ovirt.engine.core.bll.VdsEventListener] (org.ovirt.thread.pool-6-thread-8) [52012893] ResourceManager::vdsNotResponding entered for Host b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, 10.50.50.111 2014-02-11 22:24:29,676 INFO [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (org.ovirt.thread.pool-6-thread-8) [14d591cc] Lock Acquired to object EngineLock [exclusiveLocks= key: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 value: VDS_FENCE , sharedLocks= ] 2014-02-11 22:24:29,720 INFO [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (org.ovirt.thread.pool-6-thread-11) [228b1556] Running command: HandleVdsCpuFlagsOrClusterChangedCommand internal: true. Entities affected : ID: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 Type: VDS 2014-02-11 22:24:29,726 ERROR [org.ovirt.engine.core.bll.HandleVdsCpuFlagsOrClusterChangedCommand] (org.ovirt.thread.pool-6-thread-11) [228b1556] Could not find server cpu for server b08aba0a-ad0a-4960-a4ca-8b9abbd18c00:xxxx01.xxxxx.xxxxx, flags: null 2014-02-11 22:24:29,769 WARN [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (org.ovirt.thread.pool-6-thread-11) [558c5410] CanDoAction of action HandleVdsVersion failed. Reasons:VDS_CANNOT_CHECK_VERSION_HOST_NON_RESPONSIVE 2014-02-11 22:24:29,774 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (org.ovirt.thread.pool-6-thread-11) [558c5410] Failed to activate VDS = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 with error: java.net.ConnectException: Connection refused. 2014-02-11 22:24:29,775 INFO [org.ovirt.engine.core.vdsbroker.ActivateVdsVDSCommand] (org.ovirt.thread.pool-6-thread-11) [558c5410] FINISH, ActivateVdsVDSCommand, return: Host[xxxx01.xxxxx.xxxxx], log id: 7ce5d852 2014-02-11 22:24:29,887 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (org.ovirt.thread.pool-6-thread-11) [558c5410] Activate finished. Lock released. Monitoring can run now for host xxxx01.xxxxx.xxxxx from data-center Default 2014-02-11 22:24:29,898 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-11) [558c5410] Correlation ID: 52012893, Job ID: a1ffea0f-33dd-4850-aa1a-90f9d4ca7d23, Call Stack: null, Custom Event ID: -1, Message: Host xxxx01.xxxxx.xxxxx was activated by admin. 2014-02-11 22:24:29,909 INFO [org.ovirt.engine.core.bll.ActivateVdsCommand] (org.ovirt.thread.pool-6-thread-11) [558c5410] Lock freed to object EngineLock [exclusiveLocks= key: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 value: VDS , sharedLocks= ] 2014-02-11 22:24:29,954 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-8) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host xxxx02.xxxxx.xxxxx from cluster Default was chosen as a proxy to execute Restart command on Host xxxx01.xxxxx.xxxxx. 2014-02-11 22:24:29,954 INFO [org.ovirt.engine.core.bll.FenceExecutor] (org.ovirt.thread.pool-6-thread-8) Using Host xxxx02.xxxxx.xxxxx from cluster Default as proxy to execute Restart command on Host xxxx01.xxxxx.xxxxx 2014-02-11 22:24:29,976 INFO [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (org.ovirt.thread.pool-6-thread-8) Running command: VdsNotRespondingTreatmentCommand internal: true. Entities affected : ID: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 Type: VDS 2014-02-11 22:24:30,053 INFO [org.ovirt.engine.core.bll.StopVdsCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Running command: StopVdsCommand internal: true. Entities affected : ID: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 Type: VDS 2014-02-11 22:24:30,073 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] START, SetVdsStatusVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, status=Reboot, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 247e5902 2014-02-11 22:24:30,087 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] FINISH, SetVdsStatusVDSCommand, log id: 247e5902 2014-02-11 22:24:30,209 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host xxxx02.xxxxx.xxxxx from cluster Default was chosen as a proxy to execute Stop command on Host xxxx01.xxxxx.xxxxx. 2014-02-11 22:24:30,210 INFO [org.ovirt.engine.core.bll.FenceExecutor] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Using Host xxxx02.xxxxx.xxxxx from cluster Default as proxy to execute Stop command on Host xxxx01.xxxxx.xxxxx 2014-02-11 22:24:30,228 INFO [org.ovirt.engine.core.bll.FenceExecutor] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Executing <Stop> Power Management command, Proxy Host:xxxx02.xxxxx.xxxxx, Agent:ipmilan, Target Host:xxxx01.xxxxx.xxxxx, Management IP:xxxx01ipmi.xxxxx.xxxxx, User:power, Options:lanplus=yes 2014-02-11 22:24:30,250 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] START, FenceVdsVDSCommand(HostName = xxxx02.xxxxx.xxxxx, HostId = 9067a99f-992b-41cf-89b5-fbf385b5473d, targetVdsId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, action = Stop, ip = xxxx01ipmi.xxxxx.xxxxx, port = , type = ipmilan, user = power, password = ******, options = 'lanplus=yes'), log id: 9216d3c 2014-02-11 22:24:30,269 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Command FenceVdsVDSCommand(HostName = xxxx02.xxxxx.xxxxx, HostId = 9067a99f-992b-41cf-89b5-fbf385b5473d, targetVdsId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, action = Stop, ip = xxxx01ipmi.xxxxx.xxxxx, port = , type = ipmilan, user = power, password = ******, options = 'lanplus=yes') execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:30,270 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] FINISH, FenceVdsVDSCommand, log id: 9216d3c 2014-02-11 22:24:30,270 ERROR [org.ovirt.engine.core.bll.StopVdsCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Failed to stop VDS using Primary Power Management agent 2014-02-11 22:24:30,290 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host xxxx01.xxxxx.xxxxx became non responsive. It has no power management configured. Please check the host status, manually reboot it, and click "Confirm Host Has Been Rebooted" 2014-02-11 22:24:30,311 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to power fence host xxxx01.xxxxx.xxxxx. Please check the host status and it's power management settings, and then manually reboot it and click "Confirm Host Has Been Rebooted" 2014-02-11 22:24:30,312 ERROR [org.ovirt.engine.core.bll.StopVdsCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Command org.ovirt.engine.core.bll.StopVdsCommand throw Vdc Bll exception. With error message VdcBLLException: VDS_FENCE_OPERATION_FAILED (Failed with error VDS_FENCE_OPERATION_FAILED and code 5021) 2014-02-11 22:24:30,331 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Correlation ID: 4636cc31, Job ID: e60365f1-1fcb-49a8-8402-2e9d4c6e15b6, Call Stack: null, Custom Event ID: -1, Message: Failed to stop Host xxxx01.xxxxx.xxxxx, (User: Engine). 2014-02-11 22:24:30,342 WARN [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Restart host action failed, updating host b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 to NonResponsive 2014-02-11 22:24:30,394 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] START, SetVdsStatusVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, status=NonResponsive, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 25d0c0dc 2014-02-11 22:24:30,409 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] FINISH, SetVdsStatusVDSCommand, log id: 25d0c0dc 2014-02-11 22:24:30,420 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Correlation ID: 14d591cc, Job ID: 0cdb31d5-bb77-42d1-8764-9bb473b5ced3, Call Stack: null, Custom Event ID: -1, Message: Host xxxx01.xxxxx.xxxxx is rebooting. 2014-02-11 22:24:30,431 INFO [org.ovirt.engine.core.bll.VdsNotRespondingTreatmentCommand] (org.ovirt.thread.pool-6-thread-8) [4636cc31] Lock freed to object EngineLock [exclusiveLocks= key: b08aba0a-ad0a-4960-a4ca-8b9abbd18c00 value: VDS_FENCE , sharedLocks= ] 2014-02-11 22:24:31,862 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-61) Command GetCapabilitiesVDSCommand(HostName = xxxx02.xxxxx.xxxxx, HostId = 9067a99f-992b-41cf-89b5-fbf385b5473d, vds=Host[xxxx02.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:33,415 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-71) Command GetCapabilitiesVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, vds=Host[xxxx01.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:34,964 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-60) Command GetCapabilitiesVDSCommand(HostName = xxxx02.xxxxx.xxxxx, HostId = 9067a99f-992b-41cf-89b5-fbf385b5473d, vds=Host[xxxx02.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:36,770 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-100) Command GetCapabilitiesVDSCommand(HostName = xxxx01.xxxxx.xxxxx, HostId = b08aba0a-ad0a-4960-a4ca-8b9abbd18c00, vds=Host[xxxx01.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-02-11 22:24:38,085 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-91) Command GetCapabilitiesVDSCommand(HostName = xxxx02.xxxxx.xxxxx, HostId = 9067a99f-992b-41cf-89b5-fbf385b5473d, vds=Host[xxxx02.xxxxx.xxxxx]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
Created attachment 862013 [details] screenshot of node connection status
Setting target release to current version for consideration and review. please do not push non-RFE bugs to an undefined target release to make sure bugs are reviewed for relevancy, fix, closure, etc.
Can you please explain/show how did you configured the PM agent in the UI? Few comments : 1) you have to make sure that you have an additional host that can server as a proxy for fencing operations in your Cluster/DC 2) You can add/set the lanplus in the options field 3) What is happened after you do 1) and 2) and click the TEST button for the host that has PM configured ?
1) All the nodes are in an unresponding state so I don't know if they will be able to act as proxy. 2)I have "lanplus=yes¨ specified in the options field but it doesn't appear to help 3)In beta2 there were no options and now with the premature beta3 upgrade it at least gives me cluster and dc. The error I get is: Test Failed, VdcBLLException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.ConnectException: Connection refused (Failed with error VDS_NETWORK_ERROR and code 5022) It could be that something got broken during all the beta upgrades so I will do a clean redeployment of the entire system with the latest beta3 tomorrow. If I can reproduce, I will sniff the traffic to see what happens.
(In reply to Netbulae from comment #4) > 1) All the nodes are in an unresponding state so I don't know if they will > be able to act as proxy. This points to a deeper problem in your env , from the log I can see that there are lot of network errors > 2)I have "lanplus=yes¨ specified in the options field but it doesn't appear > to help BTW key=value is for non boolean keys , you should but only "lanplus" in the options field > 3)In beta2 there were no options and now with the premature beta3 upgrade it > at least gives me cluster and dc. The error I get is: > > Test Failed, VdcBLLException: > org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: > java.net.ConnectException: Connection refused (Failed with error > VDS_NETWORK_ERROR and code 5022) > > It could be that something got broken during all the beta upgrades so I will > do a clean redeployment of the entire system with the latest beta3 tomorrow. Good idea > > If I can reproduce, I will sniff the traffic to see what happens. Please keep us updated on that , Thanks
Created attachment 865921 [details] power management I redeployed everything from scratch with beta3, but still the same issue
I do have some problems installing the node: 2014-02-21 11:28:08,619 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (ajp--127.0.0.1-8702-5) [7c3ea78a] Initialize vdsBroker (xx.xx.xx.xx1,54,321) 2014-02-21 11:28:08,752 INFO [org.ovirt.engine.core.vdsbroker.ResourceManager] (ajp--127.0.0.1-8702-5) [7c3ea78a] VDS bec0ed78-d8c0-453c-a69e-eb4d337d8cda was added to the Resource Manager 2014-02-21 11:28:08,752 INFO [org.ovirt.engine.core.vdsbroker.AddVdsVDSCommand] (ajp--127.0.0.1-8702-5) [7c3ea78a] FINISH, AddVdsVDSCommand, log id: 78a939c0 2014-02-21 11:28:08,798 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-5) [7c3ea78a] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Failed to verify Power Management configuration for Host node1. 2014-02-21 11:28:09,279 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-5) [7c3ea78a] Correlation ID: 73d3c257, Call Stack: null, Custom Event ID: -1, Message: Host node1 was added by Non interactive user. 2014-02-21 11:28:09,279 INFO [org.ovirt.engine.core.bll.RegisterVdsQuery] (ajp--127.0.0.1-8702-5) [7c3ea78a] RegisterVdsQuery::Register - Registered a new VDS pending approval - Name: node1, Hostname: xx.xx.xx.xx1, UniqueID: 54443858-4E54-2500-904B-0025904B9C22 2014-02-21 11:28:09,289 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-5) [7c3ea78a] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Host node1 registered. 2014-02-21 11:28:25,249 INFO [org.ovirt.engine.core.services.RegisterServlet] (ajp--127.0.0.1-8702-1) Registration request: source='xx.xx.xx.xx1', secured='true', address='xx.xx.xx.xx1', vdsPort=54321, name='node1', uniqueId='54443858-4E54-2500-904B-0025904B9C22_00:25:90:01:d1:74' 2014-02-21 11:28:25,268 ERROR [org.ovirt.engine.core.bll.RegisterVdsQuery] (ajp--127.0.0.1-8702-1) Query execution failed due to invalid inputs. VDS_STATUS_NOT_VALID_FOR_UPDATE 2014-02-21 11:28:44,316 ERROR [org.ovirt.engine.core.bll.GetHostArchitectureQuery] (ajp--127.0.0.1-8702-8) Query GetHostArchitectureQuery failed. Exception message is null : java.lang.NullPointerException: java.lang.NullPointerException at org.ovirt.engine.core.bll.GetHostArchitectureQuery.executeQueryCommand(GetHostArchitectureQuery.java:21) [bll.jar:] at org.ovirt.engine.core.bll.QueriesCommandBase.executeCommand(QueriesCommandBase.java:66) [bll.jar:] at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:28) [dal.jar:] at org.ovirt.engine.core.bll.Backend.runQueryImpl(Backend.java:480) [bll.jar:] at org.ovirt.engine.core.bll.Backend.runQuery(Backend.java:457) [bll.jar:] at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) [:1.7.0_51] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_51] at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_51] at org.jboss.as.ee.component.ManagedReferenceMethodInterceptorFactory$ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptorFactory.java:72) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:374) [jboss-invocation.jar:1.1.1.Final] at org.ovirt.engine.core.bll.interceptors.ThreadLocalSessionCleanerInterceptor.injectWebContextToThreadLocal(ThreadLocalSessionCleanerInterceptor.java:13) [bll.jar:] at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) [:1.7.0_51] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_51] at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_51] at org.jboss.as.ee.component.ManagedReferenceLifecycleMethodInterceptorFactory$ManagedReferenceLifecycleMethodInterceptor.processInvocation(ManagedReferenceLifecycleMethodInterceptorFactory.java:123) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.WeavedInterceptor.processInvocation(WeavedInterceptor.java:53) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:36) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:21) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:53) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:211) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final] at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:363) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final] at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:194) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:59) [jboss-as-ejb3-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ee.component.TCCLInterceptor.processInvocation(TCCLInterceptor.java:45) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:165) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:173) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.1.Final] at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:61) [jboss-invocation.jar:1.1.1.Final] at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:72) [jboss-as-ee-7.1.1.Final.jar:7.1.1.Final] at org.ovirt.engine.core.common.interfaces.BackendLocal$$$view9.runQuery(Unknown Source) [common.jar:] at org.ovirt.engine.ui.frontend.server.gwt.GenericApiGWTServiceImpl.RunQuery(GenericApiGWTServiceImpl.java:59) at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source) [:1.7.0_51] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_51] at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_51] at com.google.gwt.rpc.server.RPC.invokeAndStreamResponse(RPC.java:196) at com.google.gwt.rpc.server.RpcServlet.processCall(RpcServlet.java:172) at com.google.gwt.rpc.server.RpcServlet.processPost(RpcServlet.java:233) at com.google.gwt.user.server.rpc.AbstractRemoteServiceServlet.doPost(AbstractRemoteServiceServlet.java:62) at javax.servlet.http.HttpServlet.service(HttpServlet.java:754) [jboss-servlet-3.0-api.jar:1.0.1.Final] at javax.servlet.http.HttpServlet.service(HttpServlet.java:847) [jboss-servlet-3.0-api.jar:1.0.1.Final] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:329) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) at org.ovirt.engine.ui.frontend.server.gwt.GwtCachingFilter.doFilter(GwtCachingFilter.java:132) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) at org.ovirt.engine.core.branding.BrandingFilter.doFilter(BrandingFilter.java:72) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) at org.ovirt.engine.core.utils.servlet.LocaleFilter.doFilter(LocaleFilter.java:64) [utils.jar:] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) at org.ovirt.engine.core.bll.AutomaticLoginFilter.doFilter(AutomaticLoginFilter.java:58) [bll.jar:] at org.ovirt.engine.core.bll.AutomaticLoginFilter.doFilter(AutomaticLoginFilter.java:49) [bll.jar:] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) at org.ovirt.engine.core.authentication.AuthenticationFilter.doFilter(AuthenticationFilter.java:80) [common.jar:] at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:280) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:248) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:275) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:161) at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:489) at org.jboss.as.web.security.SecurityContextAssociationValve.invoke(SecurityContextAssociationValve.java:153) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:155) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.rewrite.RewriteValve.invoke(RewriteValve.java:466) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:368) at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:505) at org.apache.coyote.ajp.AjpProtocol$AjpConnectionHandler.process(AjpProtocol.java:445) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:930) at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] 2014-02-21 11:29:13,760 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (ajp--127.0.0.1-8702-6) Unable to get value of property: vdsName for class org.ovirt.engine.core.common.businessentities.VdsStatic 2014-02-21 11:29:13,760 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (ajp--127.0.0.1-8702-6) Unable to get value of property: vdsName for class org.ovirt.engine.core.common.businessentities.VdsStatic 2014-02-21 11:29:13,761 INFO [org.ovirt.engine.core.bll.UpdateVdsCommand] (ajp--127.0.0.1-8702-6) [1f1fa082] Running command: UpdateVdsCommand internal: false. Entities affected : ID: bec0ed78-d8c0-453c-a69e-eb4d337d8cda Type: VDS 2014-02-21 11:29:14,101 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (ajp--127.0.0.1-8702-6) Unable to get value of property: vdsName for class org.ovirt.engine.core.common.businessentities.VdsStatic 2014-02-21 11:29:14,101 WARN [org.ovirt.engine.core.compat.backendcompat.PropertyInfo] (ajp--127.0.0.1-8702-6) Unable to get value of property: vdsName for class org.ovirt.engine.core.common.businessentities.VdsStatic 2014-02-21 11:29:14,499 ERROR [org.ovirt.engine.core.bll.FenceExecutor] (ajp--127.0.0.1-8702-6) [1f1fa082] Failed to run Power Management command on Host node1, no running proxy Host was found. 2014-02-21 11:29:14,845 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-6) [1f1fa082] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: Power Management test failed for Host node1.There is no other host in the data center that can be used to test the power management settings. 2014-02-21 11:29:15,012 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-6) [1f1fa082] Correlation ID: 1f1fa082, Call Stack: null, Custom Event ID: -1, Message: Host node1 configuration was updated by admin. 2014-02-21 11:29:15,390 INFO [org.ovirt.engine.core.bll.ApproveVdsCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Running command: ApproveVdsCommand internal: false. Entities affected : ID: bec0ed78-d8c0-453c-a69e-eb4d337d8cda Type: VDS 2014-02-21 11:29:15,423 INFO [org.ovirt.engine.core.bll.InstallVdsCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Before Installation host bec0ed78-d8c0-453c-a69e-eb4d337d8cda, node1 2014-02-21 11:29:15,485 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] START, SetVdsStatusVDSCommand(HostName = node1, HostId = bec0ed78-d8c0-453c-a69e-eb4d337d8cda, status=Installing, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 1fa6d77b 2014-02-21 11:29:15,518 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] FINISH, SetVdsStatusVDSCommand, log id: 1fa6d77b 2014-02-21 11:29:15,946 INFO [org.ovirt.engine.core.bll.InstallerMessages] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Installation xx.xx.xx.xx1: Connected to host xx.xx.xx.xx1 with SSH key fingerprint: e5:a4:07:da:3a:42:62:52:87:92:50:26:ea:2a:66:33 2014-02-21 11:29:15,979 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Correlation ID: 54b06ade, Call Stack: null, Custom Event ID: -1, Message: Installing Host node1. Connected to host xx.xx.xx.xx1 with SSH key fingerprint: e5:a4:07:da:3a:42:62:52:87:92:50:26:ea:2a:66:33. 2014-02-21 11:29:16,005 INFO [org.ovirt.engine.core.bll.VdsDeploy] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Installation of xx.xx.xx.xx1. Executing command via SSH umask 0077; MYTMP="$(mktemp -t ovirt-XXXXXXXXXX)"; trap "chmod -R u+rwX \"${MYTMP}\" > /dev/null 2>&1; rm -fr \"${MYTMP}\" > /dev/null 2>&1" 0; rm -fr "${MYTMP}" && mkdir "${MYTMP}" && tar --warning=no-timestamp -C "${MYTMP}" -x && "${MYTMP}"/setup DIALOG/dialect=str:machine DIALOG/customization=bool:True < /var/cache/ovirt-engine/ovirt-host-deploy.tar 2014-02-21 11:29:16,006 INFO [org.ovirt.engine.core.utils.archivers.tar.CachedTar] (org.ovirt.thread.pool-6-thread-4) Tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar' is missing, creating 2014-02-21 11:29:16,008 ERROR [org.ovirt.engine.core.utils.archivers.tar.CachedTar] (org.ovirt.thread.pool-6-thread-4) Cannot create tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar': java.io.FileNotFoundException: File or directory /usr/share/ovirt-host-deploy/interface-3 not found at org.ovirt.engine.core.utils.archivers.tar.Tar.doTar(Tar.java:97) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.create(CachedTar.java:41) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.ensure(CachedTar.java:105) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.getFile(CachedTar.java:151) [utils.jar:] at org.ovirt.engine.core.bll.VdsDeploy.execute(VdsDeploy.java:1045) [bll.jar:] at org.ovirt.engine.core.bll.InstallVdsCommand.installHost(InstallVdsCommand.java:225) [bll.jar:] at org.ovirt.engine.core.bll.InstallVdsCommand.executeCommand(InstallVdsCommand.java:140) [bll.jar:] at org.ovirt.engine.core.bll.ApproveVdsCommand.executeCommand(ApproveVdsCommand.java:45) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1123) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1208) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1884) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1228) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:351) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommand(MultipleActionsRunner.java:189) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner.runCommands(MultipleActionsRunner.java:156) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner$2.run(MultipleActionsRunner.java:165) [bll.jar:] at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:97) [utils.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] 2014-02-21 11:29:16,013 ERROR [org.ovirt.engine.core.bll.VdsDeploy] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Error during host xx.xx.xx.xx1 install: java.io.IOException: Cannot create tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar' at org.ovirt.engine.core.utils.archivers.tar.CachedTar.create(CachedTar.java:46) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.ensure(CachedTar.java:105) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.getFile(CachedTar.java:151) [utils.jar:] at org.ovirt.engine.core.bll.VdsDeploy.execute(VdsDeploy.java:1045) [bll.jar:] at org.ovirt.engine.core.bll.InstallVdsCommand.installHost(InstallVdsCommand.java:225) [bll.jar:] at org.ovirt.engine.core.bll.InstallVdsCommand.executeCommand(InstallVdsCommand.java:140) [bll.jar:] at org.ovirt.engine.core.bll.ApproveVdsCommand.executeCommand(ApproveVdsCommand.java:45) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1123) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1208) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1884) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1228) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:351) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommand(MultipleActionsRunner.java:189) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner.runCommands(MultipleActionsRunner.java:156) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner$2.run(MultipleActionsRunner.java:165) [bll.jar:] at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:97) [utils.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] Caused by: java.io.FileNotFoundException: File or directory /usr/share/ovirt-host-deploy/interface-3 not found at org.ovirt.engine.core.utils.archivers.tar.Tar.doTar(Tar.java:97) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.create(CachedTar.java:41) [utils.jar:] ... 22 more 2014-02-21 11:29:16,023 ERROR [org.ovirt.engine.core.bll.InstallVdsCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Host installation failed for host bec0ed78-d8c0-453c-a69e-eb4d337d8cda, node1.: java.io.IOException: Cannot create tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar' at org.ovirt.engine.core.utils.archivers.tar.CachedTar.create(CachedTar.java:46) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.ensure(CachedTar.java:105) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.getFile(CachedTar.java:151) [utils.jar:] at org.ovirt.engine.core.bll.VdsDeploy.execute(VdsDeploy.java:1045) [bll.jar:] at org.ovirt.engine.core.bll.InstallVdsCommand.installHost(InstallVdsCommand.java:225) [bll.jar:] at org.ovirt.engine.core.bll.InstallVdsCommand.executeCommand(InstallVdsCommand.java:140) [bll.jar:] at org.ovirt.engine.core.bll.ApproveVdsCommand.executeCommand(ApproveVdsCommand.java:45) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1123) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1208) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1884) [bll.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:] at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:] at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1228) [bll.jar:] at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:351) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner.executeValidatedCommand(MultipleActionsRunner.java:189) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner.runCommands(MultipleActionsRunner.java:156) [bll.jar:] at org.ovirt.engine.core.bll.MultipleActionsRunner$2.run(MultipleActionsRunner.java:165) [bll.jar:] at org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil$InternalWrapperRunnable.run(ThreadPoolUtil.java:97) [utils.jar:] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_51] at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_51] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_51] at java.lang.Thread.run(Thread.java:744) [rt.jar:1.7.0_51] Caused by: java.io.FileNotFoundException: File or directory /usr/share/ovirt-host-deploy/interface-3 not found at org.ovirt.engine.core.utils.archivers.tar.Tar.doTar(Tar.java:97) [utils.jar:] at org.ovirt.engine.core.utils.archivers.tar.CachedTar.create(CachedTar.java:41) [utils.jar:] ... 22 more 2014-02-21 11:29:16,071 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] START, SetVdsStatusVDSCommand(HostName = node1, HostId = bec0ed78-d8c0-453c-a69e-eb4d337d8cda, status=InstallFailed, nonOperationalReason=NONE, stopSpmFailureLogged=false), log id: 31165e3e 2014-02-21 11:29:16,112 INFO [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (org.ovirt.thread.pool-6-thread-4) [54b06ade] FINISH, SetVdsStatusVDSCommand, log id: 31165e3e 2014-02-21 11:29:16,134 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-4) [54b06ade] Correlation ID: 54b06ade, Job ID: 4a971c3d-4a79-4bde-98bb-ca4ee29d1d20, Call Stack: null, Custom Event ID: -1, Message: Host node1 installation failed. Cannot create tarball '/var/cache/ovirt-engine/ovirt-host-deploy.tar'. 2014-02-21 11:29:17,954 INFO [org.ovirt.engine.core.vdsbroker.VdsManager] (DefaultQuartzScheduler_Worker-67) Initializing Host: node1 2014-02-21 11:33:11,802 INFO [org.ovirt.engine.core.services.RegisterServlet] (ajp--127.0.0.1-8702-6) Registration request: source='xx.xx.xx.xx1', secured='true', address='xx.xx.xx.xx1', vdsPort=54321, name='node1', uniqueId='54443858-4E54-2500-904B-0025904B9C22_00:25:90:01:d1:74' 2014-02-21 11:33:11,842 ERROR [org.ovirt.engine.core.bll.RegisterVdsQuery] (ajp--127.0.0.1-8702-6) Query execution failed due to invalid inputs. VDS_STATUS_NOT_VALID_FOR_UPDATE
Vitor, see comment 7 , this seems to be a result of NPE on GetHostArchitectureQuery.java: ---snip--- ServerCpu sc = CpuFlagsManagerHandler.FindMaxServerCpuByFlags(host.getCpuFlags(), host.getVdsGroupCompatibilityVersion()); getQueryReturnValue().setReturnValue(sc.getArchitecture()); ---snip--- It seems that sc is NULL ... What can cause that ???
This can happen if the engine cannot retrieve the host cpu flags (which seems to be the case) or if the CPU is unsupported (I guess this is impossible in this context).
cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 44 model name : Intel(R) Xeon(R) CPU E5620 @ 2.40GHz stepping : 2 cpu MHz : 2400.045 cache size : 12288 KB physical id : 1 siblings : 8 core id : 10 cpu cores : 4 apicid : 53 initial apicid : 53 fpu : yes fpu_exception : yes cpuid level : 11 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 popcnt lahf_lm ida arat epb dts tpr_shadow vnmi flexpriority ept vpid bogomips : 4801.95 clflush size : 64 cache_alignment : 64 address sizes : 40 bits physical, 48 bits virtual power management:
Vitor, please look at info given by the BZ reporter on comment 10 and advise how to check/resolve Thanks
What is the output of: vdsClient -s 1.2.3.4 getVdsCaps (Replacing 1.2.3.4 by the ip address of the non-responsive host and calling it from the machine with the ovirt-engine installed first and then from the host itself later)
Sorry, I got frustrated and wiped the nodes to install centos 6 with VDSM on it. I only have limited time to test things between my other projects :-( Now I have working nodes but can't test with vdsClient properly anymore. So terribly sorry I cannot debug anymore and thanks for all your help. I will try testing another node image when released (RC?) but for now I closed the bug. I need to get on with testing other parts.
Hi, I´m having the same non-responsive issues again with 3.4.0 RC. When I install the nodes, login with admin, register the node and reboot right after registering. I see the node registering in the management console and the nodes become "Up". When I reboot again after that however, the nodes become unresponsive and I cannot get them Up again. It doesn't matter if I use power management to reboot or do it manually from the admin console/root. When I try vdsClient -s node1 getVdsCaps from the engine (after installing vdsm-cli), I get: "Exception: No permission to read file: /etc/pki/vdsm/keys/vdsmkey.pem" I attached the output when I try the vdsClient from node2. When I try the same from node1 to node2, I get: Connection to node2:54321 refused from engine.log: 2014-03-06 15:31:40,274 INFO [org.ovirt.engine.core.bll.ClearNonResponsiveVdsVmsCommand] (org.ovirt.thread.pool-6-thread-4) [2e8ff470] Running command: ClearNonResponsiveVdsVmsCommand internal: true. Entities affected : ID: 79f25466-7989-4bb1-9f8b-9b8657a64b7b Type: VDS 2014-03-06 15:31:40,292 INFO [org.ovirt.engine.core.vdsbroker.UpdateVdsVMsClearedVDSCommand] (org.ovirt.thread.pool-6-thread-4) [2e8ff470] START, UpdateVdsVMsClearedVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b), log id: 6eb1c2f4 2014-03-06 15:31:40,293 INFO [org.ovirt.engine.core.vdsbroker.UpdateVdsVMsClearedVDSCommand] (org.ovirt.thread.pool-6-thread-4) [2e8ff470] FINISH, UpdateVdsVMsClearedVDSCommand, log id: 6eb1c2f4 2014-03-06 15:31:40,692 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-4) [2e8ff470] Correlation ID: 2e8ff470, Call Stack: null, Custom Event ID: -1, Message: All VMs' status on Non Responsive Host node2 were changed to 'Down' by admin 2014-03-06 15:31:40,737 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-6-thread-4) [2e8ff470] Correlation ID: 304920df, Call Stack: null, Custom Event ID: -1, Message: Manual fence for host node2 was started. 2014-03-06 15:31:40,737 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (org.ovirt.thread.pool-6-thread-4) [2e8ff470] Lock freed to object EngineLock [exclusiveLocks= key: 79f25466-7989-4bb1-9f8b-9b8657a64b7b value: VDS_FENCE , sharedLocks= ] 2014-03-06 15:31:41,767 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-25) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:44,825 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-32) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:47,902 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-26) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused [root@manage01en01 ovirt-engine]# tail engine.log 2014-03-06 15:31:41,767 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-25) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:44,825 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-32) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:47,902 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-26) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:50,990 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-39) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:54,058 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-42) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:31:57,114 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-50) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:32:00,179 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-68) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:32:03,235 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-75) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:32:06,503 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-86) Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused 2014-03-06 15:32:09,869 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.GetCapabilitiesVDSCommand] (DefaultQuartzScheduler_Worker-80) [21b5b0a2] Command GetCapabilitiesVDSCommand(HostName = node2, HostId = 79f25466-7989-4bb1-9f8b-9b8657a64b7b, vds=Host[node2]) execution failed. Exception: VDSNetworkException: java.net.ConnectException: Connection refused
Created attachment 871499 [details] getVdsCaps output
please attach full engine.log and VDSM log
I redeployed a full setup because I've done a lot of messing about. I now did a wipe and reinstall of the nodes and the engine. I then logged in to the admin on the nodes, enabled dhcp and registered the management server. This time I didn't get the nodes connected even before reboot. In /var/log/messages, I see these errrors: Mar 11 12:43:33 localhost libvirtd: Could not find keytab file: /etc/libvirt/krb5.tab: No such file or directory Mar 11 12:43:48 localhost python: error in accessing vdsm log file Mar 11 12:43:48 localhost respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died too quickly, respawning slave Mar 11 12:43:48 localhost python: error in accessing vdsm log file Mar 11 12:43:48 localhost respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died too quickly, respawning slave Mar 11 12:43:48 localhost python: error in accessing vdsm log file Mar 11 12:43:48 localhost respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died too quickly, respawning slave <<<<snip>>>> Mar 11 12:44:19 localhost python: error in accessing vdsm log file Mar 11 12:44:19 localhost respawn: slave '/usr/share/vdsm/vdsm --pidfile /var/run/vdsm/vdsmd.pid' died too quickly for more than 30 seconds, master sleeping for 900 seconds I see the vdsm.log is owned by root and 0 bytes, shouldn't this be vdsm ? [root@node1 vdsm]# ls -lart total 24 drwxr-xr-x. 2 vdsm kvm 4096 2014-03-11 09:45 backup drwxr-xr-x. 13 root root 4096 2014-03-11 12:43 .. -rw-r--r--. 1 root root 5464 2014-03-11 12:43 supervdsm.log -rw-r--r--. 1 root root 0 2014-03-11 12:43 vdsm.log -rw-r--r--. 1 root root 0 2014-03-11 12:43 metadata.log drwxr-xr-x. 3 vdsm kvm 4096 2014-03-11 12:43 . -rw-r--r--. 1 vdsm kvm 352 2014-03-11 12:43 upgrade.log It appears vdsm is not running at all: [root@node1 vdsm]# /etc/init.d/vdsmd status VDS daemon is not running, and its watchdog is running When I chown /var/log/vdsm/* to user vdsm:kvm, I get vdsm started but throwing this errror: Mar 11 13:04:57 localhost vdsm vds WARNING Unable to load the json rpc server module. Please make sure it is installed. Mar 11 13:05:17 localhost vdsm vds WARNING Unable to load the json rpc server module. Please make sure it is installed. But the nodes are now up. So it appears to be a simple permission problem.
the json warning is not related, ignore it for now. the bug is that the log becomes owned by root. Douglas, is it refer to bug 1055153 ?
(In reply to Yaniv Bronhaim from comment #18) > the json warning is not related, ignore it for now. the bug is that the log > becomes owned by root. Douglas, is it refer to bug 1055153 ? Hello Yaniv, BZ#1055153 should resolves this bug on ovit 3.4 beta3.
I think we need a new node image from fabian then, I use the latest available ovirt-node-iso-3.0.4-1.0.201401291204.vdsm34.el6.iso and it's dated 11-Feb-2014... before the bugfix. There will probably be some more fixes that didn't make this node image.
may i close as duplicated?
Yes you may close it. I'd still like a new node image to test with though.... Thanks for the help.
*** This bug has been marked as a duplicate of bug 1055153 ***