Red Hat Bugzilla – Bug 981960
NullPointerException on Power Management command
Last modified: 2016-02-10 14:28:42 EST
Description of problem:
NullPointerException on Power Management command.
2013-07-07 11:51:45,083 - MainThread - plmanagement.error_fetcher - ERROR - Errors fetched from VDC(jenkins-automation-rpm-vm36.eng.lab.tlv.redhat.com): 2013-07-07 11:38:02,997 ERROR [org.ovirt.engine.core.bll.StartVdsCommand] (pool-4-thread-47) Failed to verify host cinteg35.ci.lab.tlv.redhat.com start status. Have retried 18 times with delay of 10 seconds between each retry.
2013-07-07 11:38:03,036 INFO [org.ovirt.engine.core.bll.FenceExecutor] (pool-4-thread-47) Executing <Start> Power Management command, Proxy Host:null, Agent:ipmilan, Target Host:cinteg35.ci.lab.tlv.redhat.com, Management IP:10.35.148.109, User:admin, Options:
2013-07-07 11:38:03,036 ERROR [org.ovirt.engine.core.vdsbroker.ResourceManager] (pool-4-thread-47) CreateCommand failed: java.lang.NullPointerException
at java.util.concurrent.ConcurrentHashMap.hash(ConcurrentHashMap.java:333) [rt.jar:1.7.0_25]
at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:988) [rt.jar:1.7.0_25]
at org.ovirt.engine.core.vdsbroker.ResourceManager.GetVdsManager(ResourceManager.java:192) [vdsbroker.jar:]
at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.initializeVdsBroker(VdsBrokerCommand.java:47) [vdsbroker.jar:]
at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.<init>(VdsBrokerCommand.java:28) [vdsbroker.jar:]
at org.ovirt.engine.core.vdsbroker.vdsbroker.FenceVdsVDSCommand.<init>(FenceVdsVDSCommand.java:18) [vdsbroker.jar:]
at sun.reflect.GeneratedConstructorAccessor113.newInstance(Unknown Source) [:1.7.0_25]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) [rt.jar:1.7.0_25]
at java.lang.reflect.Constructor.newInstance(Constructor.java:526) [rt.jar:1.7.0_25]
at org.ovirt.engine.core.vdsbroker.ResourceManager.CreateCommand(ResourceManager.java:319) [vdsbroker.jar:]
at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:356) [vdsbroker.jar:]
at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [bll.jar:]
at org.ovirt.engine.core.bll.FenceExecutor.runFenceAction(FenceExecutor.java:192) [bll.jar:]
at org.ovirt.engine.core.bll.FenceExecutor.Fence(FenceExecutor.java:166) [bll.jar:]
at org.ovirt.engine.core.bll.FenceVdsBaseCommand.handleWaitFailure(FenceVdsBaseCommand.java:367) [bll.jar:]
at org.ovirt.engine.core.bll.FenceVdsBaseCommand.handleSingleAgent(FenceVdsBaseCommand.java:192) [bll.jar:]
at org.ovirt.engine.core.bll.FenceVdsBaseCommand.executeCommand(FenceVdsBaseCommand.java:159) [bll.jar:]
at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1068) [bll.jar:]
at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1153) [bll.jar:]
at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1623) [bll.jar:]
at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:174) [utils.jar:]
at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:116) [utils.jar:]
at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1171) [bll.jar:]
at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:317) [bll.jar:]
Version-Release number of selected component (if applicable):
Steps to Reproduce:
I've noticed a strange thing in engine.log: SSH Soft Fencing was not executed prior to real fencing, even with these patches applied.
I have not been able to reproduce the NullPointerException. Are there precise steps to reproduce it?
I've tried 2 scenarios:
1) I stopped VDSM on the host; after the time interval, SSH Soft Fencing was executed and the host came up without any errors.
2) I set the SSH Soft Fencing command to a value that will not start VDSM ("echo 0"), then stopped VDSM. After the time interval, SSH Soft Fencing was executed,
but the host did not come up, so after another time interval real fencing was executed, and after the PM restart the host came up without any errors.
Kiril, could you please provide some info how to reproduce this bug?
(In reply to Martin Perina from comment #5)
> Kiril, could you please provide some info how to reproduce this bug?
You can take a look at the job (comment #2) and the flow. I have no idea how to reproduce it, and it seems this exception has disappeared; I can't see it in the job anymore.
To reproduce this error, enable the concurrent option under power management with two correctly configured power management agents. The error appears in versions sf18 and is6. For more details see bug: https://bugzilla.redhat.com/show_bug.cgi?id=977689
Since patch http://gerrit.ovirt.org/#/c/17206/ prevents executing power management commands with proxy host set to null, it should also fix this bug. Could you please retest with patch 17206 included?
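For illustration only, here is a minimal Java sketch of why a null proxy host could end in the NullPointerException from the stack trace, and what a guard of the kind described might look like. It assumes the null key reaching ConcurrentHashMap.get in GetVdsManager is the proxy host's ID, which the "Proxy Host:null" log line suggests but does not prove. The class, the registry map, and the method names below are hypothetical and are not taken from the engine code or from patch 17206.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

public class ProxyHostGuardSketch {
    // Hypothetical stand-in for a per-host manager registry keyed by host ID.
    private static final Map<String, String> MANAGERS = new ConcurrentHashMap<>();

    // Hypothetical helper: registers a manager for a host ID.
    static void register(String hostId, String manager) {
        MANAGERS.put(hostId, manager);
    }

    // Unguarded lookup: ConcurrentHashMap does not permit null keys,
    // so get(null) throws NullPointerException.
    static String lookupUnguarded(String proxyHostId) {
        return MANAGERS.get(proxyHostId);
    }

    // Guarded lookup: refuse to touch the map when no proxy host was selected.
    static Optional<String> lookupGuarded(String proxyHostId) {
        if (proxyHostId == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(MANAGERS.get(proxyHostId));
    }

    // Reports whether the unguarded lookup fails with a NullPointerException.
    static boolean throwsNpe(String proxyHostId) {
        try {
            lookupUnguarded(proxyHostId);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        register("host-1", "manager-for-host-1");
        System.out.println("unguarded null lookup throws NPE: " + throwsNpe(null));
        System.out.println("guarded null lookup present: " + lookupGuarded(null).isPresent());
        System.out.println("guarded lookup for host-1: " + lookupGuarded("host-1").orElse("none"));
    }
}
```

With a guard like this, the caller can report "no proxy host available" as a normal fencing failure instead of letting the command constructor fail.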
I can't check the patch because of bug https://bugzilla.redhat.com/show_bug.cgi?id=982266. I only have hosts with power management types apc_snmp and ipmilan, and I need a host with two power management agents to check this.
Checked with the same power management agent configured as both the first and second PM agent, with concurrent enabled; fencing worked fine. (I have no hosts with two different power management agents of the same type.)
The proxy host is no longer null and has the correct value.
Verified on is9.1
Closing - RHEV 3.3 Released