Bug 1245088

Summary: failed DI causes NPE, due to VdcCommands invoked without ResourceManager
Product: [Retired] oVirt
Reporter: Marcin Mirecki <mmirecki>
Component: ovirt-engine-core
Assignee: Omer Frenkel <ofrenkel>
Status: CLOSED CURRENTRELEASE
QA Contact: Pavel Stehlik <pstehlik>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.6
CC: bugs, ecohen, gklein, istein, lsurette, michal.skrivanek, mmirecki, ofrenkel, rbalakri, yeylon
Target Milestone: ---
Keywords: CodeChange
Target Release: 3.6.0
Hardware: All
OS: All
Whiteboard: virt
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-08-04 16:06:13 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marcin Mirecki 2015-07-21 07:56:27 UTC
Description of problem:
Some VDS commands are created by calling the constructor directly instead of being created through the ResourceManager. As a result, some of their DI (dependency-injected) fields are never initialized, which leads to NullPointerExceptions.
For example:
MigrateVDSCommand (line 24):
MigrateBrokerVDSCommand<?> command = new MigrateBrokerVDSCommand<>(getParameters());
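
To illustrate the failure mode, here is a minimal, self-contained sketch. It does not use the real engine classes: BrokerCommand, AuditLogDirector and CommandFactory below are hypothetical stand-ins, and the "injection" is done by hand rather than by a DI container. It only shows why an object created with `new` ends up with a null injected field, while one obtained from the factory does not.

```java
public class DiBypassSketch {

    /** Hypothetical stand-in for the injected audit-log dependency. */
    static class AuditLogDirector {
        String log(String message) {
            return "AUDIT: " + message;
        }
    }

    /** Hypothetical stand-in for a broker command with an injected field. */
    static class BrokerCommand {
        // In a DI-managed object this field is filled in after construction.
        AuditLogDirector auditLogDirector;

        String logToAudit(String message) {
            // Throws NullPointerException if the field was never injected.
            return auditLogDirector.log(message);
        }
    }

    /** Hypothetical factory that constructs commands and injects their fields. */
    static class CommandFactory {
        private final AuditLogDirector director = new AuditLogDirector();

        BrokerCommand create() {
            BrokerCommand command = new BrokerCommand();
            command.auditLogDirector = director; // manual "injection"
            return command;
        }
    }

    public static void main(String[] args) {
        // Created through the factory: the dependency is present.
        BrokerCommand managed = new CommandFactory().create();
        System.out.println(managed.logToAudit("migration failed"));

        // Created with `new`: the factory is bypassed, the field stays null.
        BrokerCommand unmanaged = new BrokerCommand();
        try {
            unmanaged.logToAudit("migration failed");
        } catch (NullPointerException e) {
            System.out.println("NPE: auditLogDirector was never injected");
        }
    }
}
```

In the sketch, routing every creation through the factory is what removes the NPE; by analogy, the commands in this bug would need to be created through the ResourceManager rather than with `new`.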

How reproducible:


Steps to Reproduce:
1. Modify the migrate method in vdsm's API.py to return a non-zero status
2. Migrate a VM running on that VDS
3. An NPE is thrown because auditLogDirector is null.

Set a breakpoint at: BrokerCommandBase.logToAudit() to inspect.

Log message:
2015-07-21 09:53:27,246 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.MigrateBrokerVDSCommand] (org.ovirt.thread.pool-8-thread-30) [5cb6f5b0] Failed in 'MigrateBrokerVDS' method, for vds: 'f4'; host: '192.168.120.14': null
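
The trailing "null" in the message above is consistent with an exception whose message is null being appended to the log line. The sketch below is an assumption about how such a line could be formed, not the engine's actual logging code; formatFailure is a hypothetical helper. It uses an explicitly constructed NullPointerException, which carries no message.

```java
public class NullMessageSketch {

    /** Hypothetical formatter mimicking a "Failed in ... : <message>" log line. */
    static String formatFailure(String method, Exception e) {
        return "Failed in '" + method + "' method: " + e.getMessage();
    }

    public static void main(String[] args) {
        // An exception constructed without a message returns null from getMessage(),
        // so string concatenation renders it as the text "null".
        Exception npe = new NullPointerException();
        System.out.println(formatFailure("MigrateBrokerVDS", npe));
    }
}
```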

Actual results:
NullPointerException due to fields not populated
Expected results:
Fields populated properly

Comment 1 Ilanit Stein 2015-07-30 14:08:40 UTC
Hi Marcin,

Any idea how this problem can be reproduced by a user,
so that QE can verify the bug once it is fixed?

Thanks,
Ilanit.

Comment 2 Marcin Mirecki 2015-08-04 12:44:03 UTC
The exception is logged to the engine log right after the command is executed. The difficulty is in creating an error condition on vdsm. I created one artificially by modifying API.py so that migrate always returns an error.
You can do this as follows:

1. Log on to the host
2. cd /usr/share/vdsm
3. sed -i "/def migrate(self/a \ \ \ \ \ \ \ \ if True:\n\ \ \ \ \ \ \ \ \ \ \ \ return errCode['noVM']\n" API.py
4. systemctl restart vdsmd

To revert to the original state:
1. Log on to the host
2. cd /usr/share/vdsm
3. sed -i "/def migrate(self/{n;N;N;N;d}" API.py
4. systemctl restart vdsmd

With the changed code in place, try to migrate a VM running on that host.
The error should be displayed in the audit log, and there should be no 'null' error in the engine log.

Comment 3 Omer Frenkel 2015-08-04 16:06:13 UTC
since this is not a "user" use case,
there is no real way for qa to test this (changing the product code is not a valid test case)
marking as code change and closing.