Bug 1293828 - NPE when referring to channel without checking status
Status: CLOSED CURRENTRELEASE
Product: vdsm-jsonrpc-java
Classification: oVirt
Component: Core
Version: 1.1.4
Hardware: Unspecified Unspecified
Priority: unspecified Severity: medium
Target Milestone: ovirt-3.6.2
Target Release: 1.1.6
Assigned To: Piotr Kliczewski
QA Contact: Lukas Svaty
Keywords: CodeChange
Depends On:
Blocks:
Reported: 2015-12-23 03:15 EST by Oved Ourfali
Modified: 2016-02-18 06:20 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-18 06:20:02 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt‑3.6.z+
mgoldboi: planning_ack+
oourfali: devel_ack+
pstehlik: testing_ack+




External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
oVirt gerrit | 50927 | master | MERGED | scheduled tasks do not check whether a channel is there | 2016-01-08 07:35 EST

Description Oved Ourfali 2015-12-23 03:15:14 EST
Description of problem:
When we invoke the postConnect method, we schedule a task that registers a
channel with the selector. Between scheduling and execution, the channel may
be closed due to connection issues. The task then dereferences a missing
channel and throws an NPE, which can be mitigated by checking the actual
state of the channel before using it.
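
A minimal sketch of the mitigation described above, assuming the channel reference is cleared (set to null) when the connection is closed. The class GuardedClientSketch and its methods are hypothetical stand-ins and not the vdsm-jsonrpc-java code; only the java.nio calls are real APIs.

```java
import java.io.IOException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.SocketChannel;
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Callable;

// Hypothetical stand-in for a reactor client; NOT the vdsm-jsonrpc-java implementation.
class GuardedClientSketch {
    private final Selector selector;
    private final Queue<Callable<Boolean>> pending = new ArrayDeque<>();
    // Assumption: the field is nulled on close, which is how an unguarded task could hit an NPE.
    private volatile SocketChannel channel;

    GuardedClientSketch(Selector selector, SocketChannel channel) {
        this.selector = selector;
        this.channel = channel;
    }

    // Queues the selector registration instead of performing it immediately.
    void postConnect() {
        pending.add(() -> {
            SocketChannel ch = channel; // read the field once so the check and the use agree
            if (ch == null || !ch.isOpen()) {
                return false; // channel went away while the task was queued; skip it
            }
            // If another thread closes the channel right here, register() throws
            // ClosedChannelException (a checked exception) rather than an NPE.
            ch.register(selector, SelectionKey.OP_READ);
            return true;
        });
    }

    void close() throws IOException {
        SocketChannel ch = channel;
        channel = null;
        if (ch != null) {
            ch.close();
        }
    }

    // Runs all queued tasks; returns how many actually registered a channel.
    int performPendingOperations() throws Exception {
        int registered = 0;
        Callable<Boolean> task;
        while ((task = pending.poll()) != null) {
            if (task.call()) {
                registered++;
            }
        }
        return registered;
    }

    // Replays the reported ordering deterministically: schedule, optionally close, then run.
    static int demo(boolean closeBeforeRun) throws Exception {
        try (Selector selector = Selector.open()) {
            SocketChannel ch = SocketChannel.open();
            ch.configureBlocking(false); // register() requires non-blocking mode
            GuardedClientSketch client = new GuardedClientSketch(selector, ch);
            client.postConnect();
            if (closeBeforeRun) {
                client.close();
            }
            int registered = client.performPendingOperations();
            client.close();
            return registered;
        }
    }
}
```

Calling GuardedClientSketch.demo(true) closes the channel between scheduling and execution, and the queued task skips registration instead of failing. Without the null and isOpen() check, the same ordering would dereference a null channel.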


Version-Release number of selected component (if applicable):


How reproducible:
Occasionally.

Steps to Reproduce:
Hard to reproduce.
May occur on a slow network with a short heartbeat interval.

Additional info:
2015-12-21 18:21:41,230 INFO  [org.ovirt.engine.core.vdsbroker.PollVmStatsRefresher] (DefaultQuartzScheduler_Worker-3) [] Failed to fetch vms info for host 'buri05' - skipping VMs monitoring.
2015-12-21 18:21:41,229 ERROR [org.ovirt.engine.core.vdsbroker.HostMonitoring] (DefaultQuartzScheduler_Worker-58) [] Exception: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.createNetworkException(VdsBrokerCommand.java:157) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand.executeVDSCommand(VdsBrokerCommand.java:120) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.executeCommand(VDSCommandBase.java:65) [vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.execute(VdcCommandBase.java:33) [dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:467) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.refreshCapabilities(VdsManager.java:634) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:119) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.HostMonitoring.refresh(HostMonitoring.java:84) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.onTimer(VdsManager.java:226) [vdsbroker.jar:]
        at sun.reflect.GeneratedMethodAccessor22.invoke(Unknown Source) [:1.7.0_91]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_91]
        at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_91]
        at org.ovirt.engine.core.utils.timer.JobWrapper.invokeMethod(JobWrapper.java:81) [scheduler.jar:]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:52) [scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz.jar:]
Caused by: java.lang.NullPointerException
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient$2.call(SSLClient.java:137) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient$2.call(SSLClient.java:133) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.utils.retry.Retryable.call(Retryable.java:27) [vdsm-jsonrpc-java-client.jar:]
        at java.util.concurrent.FutureTask.run(FutureTask.java:262) [rt.jar:1.7.0_91]
        at org.ovirt.vdsm.jsonrpc.client.utils.ReactorScheduler.performPendingOperations(ReactorScheduler.java:28) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.run(Reactor.java:61) [vdsm-jsonrpc-java-client.jar:]
Comment 1 Lukas Svaty 2016-01-25 09:58:00 EST
What flow do we need to cover for verification of this?
Comment 2 Red Hat Bugzilla Rules Engine 2016-01-25 09:58:01 EST
Target release should be set once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.
Comment 3 Lukas Svaty 2016-01-25 09:59:39 EST
moving back to ON_QA (fixing my mistake from comment#1)
Comment 4 Red Hat Bugzilla Rules Engine 2016-01-25 09:59:41 EST
Bug tickets that are moved to testing must have target release set to make sure tester knows what to test. Please set the correct target release before moving to ON_QA.
Comment 5 Piotr Kliczewski 2016-01-25 10:40:42 EST
To test this, check host connectivity over a slow network.
Comment 6 Lukas Svaty 2016-02-02 04:48:54 EST
Anything special I should look for? (Specific VDSM/engine errors?)
Comment 7 Piotr Kliczewski 2016-02-02 05:34:44 EST
There are no special errors. It is highly timing-dependent: if the connection is slow enough, it could throw an NPE.
Comment 8 Lukas Svaty 2016-02-02 06:32:53 EST
As this bug has no specific test and low reproducibility, I will run host operations repeatedly over a slow network. I'll report back once I either hit the mentioned NPE or have enough runs to verify the fix. For the slow connection, I will use either the BRQ-TLV or the BRQ-BOSTON link.
Comment 9 Lukas Svaty 2016-02-04 06:58:26 EST
Tested it on multiple scenarios (deploy, move to maintenance, activate, reinstall, PM actions) overnight. No NPE appeared in the logs. Moving to VERIFIED. If this issue reappears in the future, please re-open the bug.

Tested on rhevm-3.6.3-0.1.el6.noarch
