1293828 – NPE when referring to channel without checking status

Bug 1293828 - NPE when referring to channel without checking status

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	vdsm-jsonrpc-java
Classification:	oVirt
Component:	Core
Sub Component:
Version:	1.1.4
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	ovirt-3.6.2
Target Release:	1.1.6
Assignee:	Piotr Kliczewski
QA Contact:	Lukas Svaty
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2015-12-23 08:15 UTC by Oved Ourfali
Modified:	2016-02-18 11:20 UTC
CC List:	5 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2016-02-18 11:20:02 UTC
oVirt Team:	Infra
Embargoed:
Flags:	rule-engine: ovirt-3.6.z+ mgoldboi: planning_ack+ oourfali: devel_ack+ pstehlik: testing_ack+

Links
System	ID	Private	Priority	Status	Summary	Last Updated
oVirt gerrit	50927	0	master	MERGED	scheduled tasks do not check whether a channel is there	2016-01-08 12:35:04 UTC

Description Oved Ourfali 2015-12-23 08:15:14 UTC

Description of problem:
When we invoke the postConnect method, we schedule a task that registers a
channel with the selector. Before that task runs, the channel may already
have been closed due to issues, and the task then throws an NPE. This can be
mitigated by checking the actual state of the channel before using it.
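
The mitigation described above can be sketched roughly as follows. This is a minimal illustration, not the actual SSLClient code or the merged patch: the class GuardedClient, its channel field, and the registrationTask() method are hypothetical names, and the sketch assumes the channel reference is cleared when the connection is closed. The lambda also needs Java 8 or later.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.concurrent.Callable;

// Hypothetical stand-in for a reactor client; names are illustrative only.
class GuardedClient {
    private final Selector selector;
    // Assumption: the reference is cleared on close, which is what makes
    // an unguarded scheduled task fail with a NullPointerException.
    private volatile SocketChannel channel;

    GuardedClient(Selector selector, SocketChannel channel) {
        this.selector = selector;
        this.channel = channel;
    }

    // Simulates the connection being torn down before the scheduled task runs.
    void close() throws IOException {
        SocketChannel current = channel;
        channel = null;
        if (current != null) {
            current.close();
        }
    }

    // Task to hand to the scheduler; returns null when there is nothing to register.
    Callable<SelectionKey> registrationTask() {
        return () -> {
            SocketChannel current = channel; // read the field once
            if (current == null || !current.isOpen()) {
                return null; // channel is already gone: skip instead of throwing
            }
            try {
                return current.register(selector, SelectionKey.OP_READ);
            } catch (ClosedChannelException e) {
                return null; // closed between the check and the registration
            }
        };
    }
}
```

Reading the field into a local variable once keeps the null check and the register call on the same object, and catching ClosedChannelException covers the remaining window between the isOpen() check and the registration itself.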


Version-Release number of selected component (if applicable):


How reproducible:
Occasionally.

Steps to Reproduce:
Hard to reproduce.
Might happen on a slow network with a short heartbeat interval.

Additional info:
2015-12-21 18:21:41,230 INFO  [org.ovirt.engine.core.vdsbroker.PollVmStatsRefresher] (DefaultQuartzScheduler_Worker-3) [] Failed to fetch vms info for host 'buri05' - skipping VMs monitoring.
2015-12-21 18:21:41,229 ERROR [org.ovirt.engine.core.vdsbroker.HostMonitoring] (DefaultQuartzScheduler_Worker-58) [] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:634) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:119) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:226) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source) [:1.7.0_91]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_91]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_91]
        at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81) [scheduler.jar:]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
Caused by: java.lang.NullPointerException
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient$2.call(SSLClient.java:137) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient$2.call(SSLClient.java:133) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.utils.retry.Retryable.call(Retryable.java:27) [vdsm-jsonrpc-java-client.jar:]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_91]
        at org.ovirt.vdsm.jsonrpc.client.utils.ReactorScheduler.performPendingOperations(ReactorScheduler.java:28) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.run(Reactor.java:61) [vdsm-jsonrpc-java-client.jar:]

Comment 1 Lukas Svaty 2016-01-25 14:58:00 UTC

What flow do we need to cover for verification of this?

Comment 2 Red Hat Bugzilla Rules Engine 2016-01-25 14:58:01 UTC

Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 3 Lukas Svaty 2016-01-25 14:59:39 UTC

moving back to ON_QA (fixing my mistake from comment#1)

Comment 4 Red Hat Bugzilla Rules Engine 2016-01-25 14:59:41 UTC

Bug tickets that are moved to testing must have target release set to make sure the tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 5 Piotr Kliczewski 2016-01-25 15:40:42 UTC

To test this, check host connectivity over a slow network.

Comment 6 Lukas Svaty 2016-02-02 09:48:54 UTC

Anything special I should look for? (Specific VDSM/engine errors?)

Comment 7 Piotr Kliczewski 2016-02-02 10:34:44 UTC

There are no special errors. It is highly timing dependent: if we were slow enough, it could throw an NPE.

Comment 8 Lukas Svaty 2016-02-02 11:32:53 UTC

As this bug does not have a specific test and has low reproducibility, testing will be done over multiple runs (host operations) on a slow network. I'll provide information once I either hit the mentioned NPE or have enough runs to verify this functionality. For the slow connection, either the BRQ-TLV or the BRQ-BOSTON connection will be used.

Comment 9 Lukas Svaty 2016-02-04 11:58:26 UTC

Tested multiple scenarios (deploy, move to maintenance, activate, reinstall, PM actions) overnight. No NPE appeared in the logs. Moving to VERIFIED. If this issue reappears in the future, please re-open the bug.

Tested on rhevm-3.6.3-0.1.el6.noarch
