Bug 1532884 - NullPointerException after failure to refresh VDS
Summary: NullPointerException after failure to refresh VDS
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Infra
Version: 4.2.1
Hardware: x86_64
OS: Unspecified
unspecified
low
Target Milestone: ovirt-4.2.2
: 4.2.2
Assignee: Ravi Nori
QA Contact: Elad
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-01-09 23:59 UTC by Elad
Modified: 2018-04-18 12:25 UTC (History)
5 users (show)

Fixed In Version: ovirt-engine-4.2.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-18 12:25:27 UTC
oVirt Team: Infra
Embargoed:
rule-engine: ovirt-4.2+


Attachments (Terms of Use)
engine and vdsm logs (1.72 MB, application/x-gzip)
2018-01-09 23:59 UTC, Elad
no flags Details
engine.log-debug (17.48 MB, application/x-gzip)
2018-02-02 01:37 UTC, Elad
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 86337 0 'None' MERGED Engine: Misleading Log Message For Job Search 2020-12-18 14:45:59 UTC
oVirt gerrit 87073 0 'None' MERGED engine : NullPointerException after failure to refresh VDS 2020-12-18 14:46:01 UTC
oVirt gerrit 87615 0 'None' MERGED engine : NullPointerException after failure to refresh VDS 2020-12-18 14:46:31 UTC

Description Elad 2018-01-09 23:59:44 UTC
Created attachment 1379290 [details]
engine and vdsm logs

Description of problem:
The following exception is thrown after a failure to refresh VDS:

2018-01-10 01:23:51,025+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failure to refresh host 'host_mixed_2' runtime info: null
2018-01-10 01:23:51,043+02 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failed to refresh VDS , vds = 'host_mixed_2' : '60b8472d-3fc0-48a1-964e-a1364d1
b4607', error = 'null', continuing.
2018-01-10 01:23:51,043+02 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Exception: java.lang.NullPointerException

Version-Release number of selected component (if applicable):
rhvm-4.2.1-0.2.el7.noarch
vdsm-4.20.11-1.el7ev.x86_64

How reproducible:
Happens after failure to refresh host. The NPE doesn't seem to affect anything, hence, severity is set to low.


Steps to Reproduce:
The exact steps are unclear but it happens almost every time in RHV automation exectuion, for example, during execution of https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM3-5994


The only thing that might be related in vdsm.log is:

2018-01-10 01:23:42,315+0200 WARN  (vdsm.Scheduler) [Executor] executor state: count=5 workers=set([<Worker name=periodic/14 running <Task discardable <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor obj
ect at 0x7faf0bb8e950> at 0x7faf0bb294d0> timeout=7.5, duration=0 at 0x7faf0b7fb850> task#=66 at 0x7faf0b84c290>, <Worker name=periodic/4 running <Task discardable <Operation action=<VmDispatcher operation=<clas
s 'vdsm.virt.periodic.BlockjobMonitor'> at 0x7faf0bacaf10> at 0x7faf0bacaf50> timeout=7.5, duration=0 at 0x7faf0b8b74d0> task#=772 at 0x7faf0ba890d0>, <Worker name=periodic/13 running <Task discardable <Operatio
n action=<VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x7faf0bacaf90> at 0x7faf0bacafd0> timeout=1.0, duration=1 at 0x7faf0b70aed0> discarded task#=232 at 0x7faf0bb0ccd0>, <Worke
r name=periodic/16 waiting task#=0 at 0x7faf0ba11b10>, <Worker name=periodic/15 running <Task discardable <Operation action=<VmDispatcher operation=<class 'vdsm.virt.periodic.UpdateVolumes'> at 0x7faf0baf9f90> a
t 0x7faf0bacad90> timeout=30.0, duration=1 at 0x7faf0b540f50> task#=60 at 0x7faf0b7d5690>]) (executor:213)



Expected results:
No NPE

Additional info:
engine and vdsm logs

Comment 2 Martin Perina 2018-01-10 08:55:24 UTC
I see 2 errors in the log:

1. Ravi, please focus on the NPE during monitoring, which this bug is talking about

2. Ori, any idea about below error?
2018-01-09 21:52:45,608+02 ERROR [org.ovirt.engine.core.bll.SearchQuery] (default task-12) [8004bfc1-76c5-43eb-bfb9-491c610f7dc7] Search object type not handled: Job

Comment 3 Ori Liel 2018-01-15 09:48:55 UTC
As for (2), the reason is a missing break statement, which results in the code executing properly but still issuing the misleading log error. This patch fixes this: 

  https://gerrit.ovirt.org/86337

(1) is a different problem, so perhaps this bugzilla should be split.

Comment 5 Ravi Nori 2018-01-31 17:57:08 UTC
Can you attach debug logs

Comment 6 Elad 2018-02-02 01:37:22 UTC
Created attachment 1389839 [details]
engine.log-debug

In DEBUG:


2018-02-02 02:46:02,850+02 DEBUG [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Executed batch
2018-02-02 02:46:02,850+02 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Failed to refresh VDS , vds = 'host_mixed_1' : 'e00894e8-ac29-4b20-9d8b-b45e02
3b4be5', error = 'null', continuing.
2018-02-02 02:46:02,851+02 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Exception: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshCommitedMemory(HostMonitoring.java:687) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:128) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refresh(HostMonitoring.java:85) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.refresh(VdsManager.java:267) [vdsbroker.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_151]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_151]
        at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383) [javax.enterprise.concurrent.jar:1.0.
0.redhat-1]
        at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534) [javax.enterprise.concurrent.jar:1.0.0.redha
t-1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_151]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_151]
        at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
        at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]
        at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78)

2018-02-02 02:46:02,858+02 DEBUG [org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil] (EE-ManagedThreadFactory-engine-Thread-12382) [] Executing task: EE-ManagedThreadFactory-engine-Thread-12382


==============
ovirt-engine-4.2.1.4-0.1.el7.noarch

Comment 7 Elad 2018-04-12 14:15:16 UTC
No NPE was thrown during automation execution of TestCase5994 (were the bug was found on)

Used:
rhvm-4.2.2.6-0.1.el7.noarch
vdsm-4.20.23-1.el7ev.x86_64

Comment 8 Sandro Bonazzola 2018-04-18 12:25:27 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.