Created attachment 561518: logs

Description of problem:
I have two hosts in the cluster with NFS storage and the even distribution policy set to 51. I ran 7 VMs on one host and generated a CPU load of 90. libvirt crashed because of bug 785789. The host was fenced instead of its VMs being migrated to another host.

2012-02-13 11:11:13,277 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-44) START, FenceSpmStorageVDSCommand(vdsId = 89575b00-50a9-11e1-8b40-001a4a169745, storagePoolId = cd84d709-d762-4df6-9667-a7d0981bd8ed, prevId=1, prevLVER=3), log id: 16593439
2012-02-13 11:11:13,332 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-44) Command org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand return value
 Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusOnlyReturnForXmlRpc
 mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
 mCode 654
 mMessage Not SPM
2012-02-13 11:11:13,332 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (pool-5-thread-44) Vds: zeus-vds3.qa.lab.tlv.redhat.com
2012-02-13 11:11:13,333 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (pool-5-thread-44) Command FenceSpmStorageVDS execution failed. Exception: IRSNonOperationalException: IRSGenericException: IRSErrorException: IRSNonOperationalException: Not SPM
2012-02-13 11:11:13,333 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] (pool-5-thread-44) FINISH, FenceSpmStorageVDSCommand, log id: 16593439
2012-02-13 11:11:13,333 WARN [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-44) Could not fence spm on vds zeus-vds3.qa.lab.tlv.redhat.com
2012-02-13 11:11:13,335 INFO [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] (pool-5-thread-44) Lock freed to object EngineLock [exclusiveLocks= key: org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand value: 99caa5ac-50a7-11e1-9b49-001a4a169745 , sharedLocks= ]
2012-02-13 11:11:13,348 ERROR [org.ovirt.engine.core.bll.CommandsFactory] (pool-5-thread-44) CommandsFactory [parameter: VdcActionParametersBase]: Failed to get type information using reflection for Action: StartVds: java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) [:1.6.0_22]
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) [:1.6.0_22]
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [:1.6.0_22]
 at java.lang.reflect.Constructor.newInstance(Constructor.java:532) [:1.6.0_22]
 at org.ovirt.engine.core.bll.CommandsFactory.CreateCommand(CommandsFactory.java:85) [engine-bll.jar:]
 at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:257) [engine-bll.jar:]
 at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:240) [engine-bll.jar:]
 at sun.reflect.GeneratedMethodAccessor287.invoke(Unknown Source) [:1.6.0_22]
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [:1.6.0_22]
 at java.lang.reflect.Method.invoke(Method.java:616) [:1.6.0_22]

Version-Release number of selected component (if applicable):
ovirt-engine-backend-3.0.0_0001-1.4.fc16.x86_64

How reproducible:
Hard to reproduce

Steps to Reproduce:
1. Have two hosts in the cluster and run 7 VMs on one of the hosts.
2. Generate CPU load on that host.
3.

Actual results:
An exception appears in the log and the host is fenced.

Expected results:

Additional info:
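To pick the relevant entries out of the attached engine log, a small filter along these lines may help. It is only a sketch: it assumes every entry follows the layout seen in the excerpt above (timestamp, level, bracketed logger, parenthesized thread, message), and the names `LOG_LINE` and `problem_entries` are illustrative, not part of oVirt.

```python
import re

# Assumed layout, inferred from the excerpt in this report:
#   "<date> <time>,<ms> <LEVEL> [<logger>] (<thread>) <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"(?P<message>.*)$"
)


def problem_entries(lines, levels=("WARN", "ERROR")):
    """Return parsed entries whose level is in `levels`.

    Lines that do not match the assumed layout (for example stack
    trace frames) are skipped.
    """
    entries = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            entries.append(match.groupdict())
    return entries


# Three entries copied from the excerpt in this report.
sample = [
    "2012-02-13 11:11:13,333 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] "
    "(pool-5-thread-44) Command FenceSpmStorageVDS execution failed.",
    "2012-02-13 11:11:13,333 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.FenceSpmStorageVDSCommand] "
    "(pool-5-thread-44) FINISH, FenceSpmStorageVDSCommand, log id: 16593439",
    "2012-02-13 11:11:13,333 WARN [org.ovirt.engine.core.bll.storage.FenceVdsManualyCommand] "
    "(pool-5-thread-44) Could not fence spm on vds zeus-vds3.qa.lab.tlv.redhat.com",
]

for entry in problem_entries(sample):
    print(entry["ts"], entry["level"], entry["message"])
```

Passing the lines of the real log file instead of `sample` should work the same way, provided the file uses the layout assumed above.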
I can reproduce the same issue with the following scenario. I have two hosts in the cluster/DC with PM configured.
1. Add a disk to a VM and, while that operation is running, block the connection from the SPM host to the backend.
After a while the host moves to non responsive status and is fenced. After the fence action the host stays in non operational status.
Logs attached.
Created attachment 562447: logs
Closing old bugs. If this issue is still relevant/important in current version, please re-open the bug.