1532884 – NullPointerException after failure to refresh VDS

Bug 1532884 - NullPointerException after failure to refresh VDS

Summary: NullPointerException after failure to refresh VDS

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	ovirt-engine
Classification:	oVirt
Component:	BLL.Infra
Sub Component:
Version:	4.2.1
Hardware:	x86_64
OS:	Unspecified
Priority:	unspecified
Severity:	low
Target Milestone:	ovirt-4.2.2
Target Release:	4.2.2
Assignee:	Ravi Nori
QA Contact:	Elad
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+	depends on / blocked

Reported:	2018-01-09 23:59 UTC by Elad
Modified:	2018-04-18 12:25 UTC (History)
CC List:	5 users (show)
Fixed In Version:	ovirt-engine-4.2.2
Clone Of:
Environment:
Last Closed:	2018-04-18 12:25:27 UTC
oVirt Team:	Infra
Embargoed:
Dependent Products:
Flags:	rule-engine: ovirt-4.2+

Attachments	(Terms of Use)
engine and vdsm logs (1.72 MB, application/x-gzip) 2018-01-09 23:59 UTC, Elad	no flags	Details
engine.log-debug (17.48 MB, application/x-gzip) 2018-02-02 01:37 UTC, Elad	no flags	Details
View All

Links
System	ID	Priority	Status	Summary	Last Updated
oVirt gerrit	86337	'None'	MERGED	Engine: Misleading Log Message For Job Search	2020-12-18 14:45:59 UTC
oVirt gerrit	87073	'None'	MERGED	engine : NullPointerException after failure to refresh VDS	2020-12-18 14:46:01 UTC
oVirt gerrit	87615	'None'	MERGED	engine : NullPointerException after failure to refresh VDS	2020-12-18 14:46:31 UTC

Description Elad 2018-01-09 23:59:44 UTC

Created attachment 1379290 [details]
engine and vdsm logs

Description of problem:
The following exception is thrown after a failure to refresh VDS:

2018-01-10 01:23:51,025+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failure to refresh host 'host_mixed_2' runtime info: null
2018-01-10 01:23:51,043+02 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Failed to refresh VDS , vds = 'host_mixed_2' : '60b8472d-3fc0-48a1-964e-a1364d1
b4607', error = 'null', continuing.
2018-01-10 01:23:51,043+02 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-2) [] Exception: java.lang.NullPointerException

Version-Release number of selected component (if applicable):
rhvm-4.2.1-0.2.el7.noarch
vdsm-4.20.11-1.el7ev.x86_64

How reproducible:
Happens after failure to refresh host. The NPE doesn't seem to affect anything, hence, severity is set to low.


Steps to Reproduce:
The exact steps are unclear but it happens almost every time in RHV automation exectuion, for example, during execution of https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM3-5994


The only thing that might be related in vdsm.log is:

2018-01-10 01:23:42,315+0200 WARN  (vdsm.Scheduler) [Executor] executor state: count=5 workers=set([<Worker name=periodic/14 running <Task discardable <Operation action=<vdsm.virt.sampling.VMBulkstatsMonitor obj
ect at 0x7faf0bb8e950> at 0x7faf0bb294d0> timeout=7.5, duration=0 at 0x7faf0b7fb850> task#=66 at 0x7faf0b84c290>, <Worker name=periodic/4 running <Task discardable <Operation action=<VmDispatcher operation=<clas
s 'vdsm.virt.periodic.BlockjobMonitor'> at 0x7faf0bacaf10> at 0x7faf0bacaf50> timeout=7.5, duration=0 at 0x7faf0b8b74d0> task#=772 at 0x7faf0ba890d0>, <Worker name=periodic/13 running <Task discardable <Operatio
n action=<VmDispatcher operation=<class 'vdsm.virt.periodic.DriveWatermarkMonitor'> at 0x7faf0bacaf90> at 0x7faf0bacafd0> timeout=1.0, duration=1 at 0x7faf0b70aed0> discarded task#=232 at 0x7faf0bb0ccd0>, <Worke
r name=periodic/16 waiting task#=0 at 0x7faf0ba11b10>, <Worker name=periodic/15 running <Task discardable <Operation action=<VmDispatcher operation=<class 'vdsm.virt.periodic.UpdateVolumes'> at 0x7faf0baf9f90> a
t 0x7faf0bacad90> timeout=30.0, duration=1 at 0x7faf0b540f50> task#=60 at 0x7faf0b7d5690>]) (executor:213)



Expected results:
No NPE

Additional info:
engine and vdsm logs

Comment 2 Martin Perina 2018-01-10 08:55:24 UTC

I see 2 errors in the log:

1. Ravi, please focus on the NPE during monitoring, which this bug is talking about

2. Ori, any idea about below error?
2018-01-09 21:52:45,608+02 ERROR [org.ovirt.engine.core.bll.SearchQuery] (default task-12) [8004bfc1-76c5-43eb-bfb9-491c610f7dc7] Search object type not handled: Job

Comment 3 Ori Liel 2018-01-15 09:48:55 UTC

As for (2), the reason is a missing break statement, which results in the code executing properly but still issuing the misleading log error. This patch fixes this: 

  https://gerrit.ovirt.org/86337

(1) is a different problem, so perhaps this bugzilla should be split.

Comment 5 Ravi Nori 2018-01-31 17:57:08 UTC

Can you attach debug logs

Comment 6 Elad 2018-02-02 01:37:22 UTC

Created attachment 1389839 [details]
engine.log-debug

In DEBUG:


2018-02-02 02:46:02,850+02 DEBUG [org.ovirt.engine.core.dal.dbbroker.BatchProcedureExecutionConnectionCallback] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Executed batch
2018-02-02 02:46:02,850+02 WARN  [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Failed to refresh VDS , vds = 'host_mixed_1' : 'e00894e8-ac29-4b20-9d8b-b45e02
3b4be5', error = 'null', continuing.
2018-02-02 02:46:02,851+02 ERROR [org.ovirt.engine.core.vdsbroker.VdsManager] (EE-ManagedThreadFactory-engineScheduled-Thread-38) [] Exception: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshCommitedMemory(HostMonitoring.java:687) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refreshVdsRunTimeInfo(HostMonitoring.java:128) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.monitoring.HostMonitoring.refresh(HostMonitoring.java:85) [vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VdsManager.refresh(VdsManager.java:267) [vdsbroker.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_151]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_151]
        at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383) [javax.enterprise.concurrent.jar:1.0.
0.redhat-1]
        at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534) [javax.enterprise.concurrent.jar:1.0.0.redha
t-1]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_151]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_151]
        at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_151]
        at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent.jar:1.0.0.redhat-1]
        at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78)

2018-02-02 02:46:02,858+02 DEBUG [org.ovirt.engine.core.utils.threadpool.ThreadPoolUtil] (EE-ManagedThreadFactory-engine-Thread-12382) [] Executing task: EE-ManagedThreadFactory-engine-Thread-12382


==============
ovirt-engine-4.2.1.4-0.1.el7.noarch

Comment 7 Elad 2018-04-12 14:15:16 UTC

No NPE was thrown during automation execution of TestCase5994 (were the bug was found on)

Used:
rhvm-4.2.2.6-0.1.el7.noarch
vdsm-4.20.23-1.el7ev.x86_64

Comment 8 Sandro Bonazzola 2018-04-18 12:25:27 UTC

This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Note You need to log in before you can comment on or make changes to this bug.