Bug 1293828 - NPE when referring to channel without checking status
Summary: NPE when referring to channel without checking status
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm-jsonrpc-java
Classification: oVirt
Component: Core
Version: 1.1.4
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-3.6.2
Target Release: 1.1.6
Assignee: Piotr Kliczewski
QA Contact: Lukas Svaty
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-12-23 08:15 UTC by Oved Ourfali
Modified: 2016-02-18 11:20 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-02-18 11:20:02 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-3.6.z+
mgoldboi: planning_ack+
oourfali: devel_ack+
pstehlik: testing_ack+


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 50927 0 master MERGED scheduled tasks do not check whether a channel is there 2016-01-08 12:35:04 UTC

Description Oved Ourfali 2015-12-23 08:15:14 UTC
Description of problem:
When the postConnect method is invoked, we schedule a task that registers
the channel with the selector. Before that task runs, the channel may
already have been closed due to connection issues. The task then throws
an NPE, which can be avoided by checking the actual state of the channel
before using it.
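
The kind of check described above can be sketched roughly as follows. This is only an illustration of the idea, not the actual SSLClient code or the merged patch: the class name GuardedClient, the method postConnectTask, and the assumption that close() nulls the channel field are all hypothetical. The task reads the channel reference once, skips registration if the channel is gone or closed, and also tolerates a close that lands between the check and the register call.

```java
import java.io.IOException;
import java.nio.channels.ClosedChannelException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

// Hypothetical stand-in for a reactor client; names are illustrative only.
class GuardedClient {
    private final Selector selector;
    // Assumed to be cleared on close, which is what makes an unchecked use throw an NPE.
    private volatile SocketChannel channel;

    GuardedClient(Selector selector, SocketChannel channel) {
        this.selector = selector;
        this.channel = channel;
    }

    // Closes the channel and drops the reference.
    void close() throws IOException {
        SocketChannel ch = channel;
        channel = null;
        if (ch != null) {
            ch.close();
        }
    }

    // Builds the deferred registration task; its result is null when the channel is no longer usable.
    FutureTask<SelectionKey> postConnectTask() {
        return new FutureTask<SelectionKey>(new Callable<SelectionKey>() {
            @Override
            public SelectionKey call() throws IOException {
                SocketChannel ch = channel; // read the field once
                if (ch == null || !ch.isOpen()) {
                    return null; // channel was closed before the task ran
                }
                try {
                    return ch.register(selector, SelectionKey.OP_READ);
                } catch (ClosedChannelException e) {
                    return null; // closed between the check and the registration
                }
            }
        });
    }
}
```

Returning null here is just one way to signal "nothing to register"; a real client might instead log the event or trigger a reconnect.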


Version-Release number of selected component (if applicable):


How reproducible:
Occasionally.

Steps to Reproduce:
Hard to reproduce.
Might happen on a slow network with a short heartbeat interval.

Additional info:
2015-12-21 18:21:41,230 INFO  [org.ovirt.engine.core.vdsbroker.PollVmStatsRefresher] (DefaultQuartzScheduler_Worker-3) [] Failed to fetch vms info for host 'buri05' - skipping VMs monitoring.
2015-12-21 18:21:41,229 ERROR [org.ovirt.engine.core.vdsbroker.HostMonitoring] (DefaultQuartzScheduler_Worker-58) [] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:634) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:119) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:226) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source) [:1.7.0_91]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_91]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_91]
        at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81) [scheduler.jar:]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
Caused by: java.lang.NullPointerException
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient$2.call(SSLClient.java:137) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient$2.call(SSLClient.java:133) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.utils.retry.Retryable.call(Retryable.java:27) [vdsm-jsonrpc-java-client.jar:]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_91]
        at org.ovirt.vdsm.jsonrpc.client.utils.ReactorScheduler.performPendingOperations(ReactorScheduler.java:28) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.run(Reactor.java:61) [vdsm-jsonrpc-java-client.jar:]

Comment 1 Lukas Svaty 2016-01-25 14:58:00 UTC
What flow do we need to cover for verification of this?

Comment 2 Red Hat Bugzilla Rules Engine 2016-01-25 14:58:01 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 3 Lukas Svaty 2016-01-25 14:59:39 UTC
moving back to ON_QA (fixing my mistake from comment#1)

Comment 4 Red Hat Bugzilla Rules Engine 2016-01-25 14:59:41 UTC
Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.

Comment 5 Piotr Kliczewski 2016-01-25 15:40:42 UTC
To test this, check host connectivity over a slow network.

Comment 6 Lukas Svaty 2016-02-02 09:48:54 UTC
Anything special I should look for? (Specific VDSM/engine errors?)

Comment 7 Piotr Kliczewski 2016-02-02 10:34:44 UTC
There are no special errors. It is highly timing dependent: if we are slow enough, it could throw an NPE.

Comment 8 Lukas Svaty 2016-02-02 11:32:53 UTC
As this bug has no specific test and low reproducibility, I will run host operations repeatedly over a slow network. I'll report back once I either hit the mentioned NPE or have enough runs to verify this functionality. For the slow connection I will use either the BRQ-TLV or the BRQ-BOSTON link.

Comment 9 Lukas Svaty 2016-02-04 11:58:26 UTC
Tested multiple scenarios (deploy, move to maintenance, activate, reinstall, PM actions) overnight. No NPE appeared in the logs. Moving to VERIFIED. If this issue reappears in the future, please reopen the bug.

Tested on rhevm-3.6.3-0.1.el6.noarch

