Bug 853747
Summary: | [Log][engine] NullPointerException in spmStart in case storage is inaccessible or not connected (Command SpmStartVDS execution failed. Exception: NullPointerException) | | |
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Gadi Ickowicz <gickowic> |
Component: | ovirt-engine | Assignee: | Federico Simoncelli <fsimonce> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Gadi Ickowicz <gickowic> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | | |
Version: | 3.1.0 | CC: | abaron, amureini, dyasny, fsimonce, hateya, iheim, lpeer, nlevinki, Rhev-m-bugs, sgrinber, yeylon, ykaul |
Target Milestone: | --- | | |
Target Release: | 3.1.0 | | |
Hardware: | All | | |
OS: | Linux | | |
Whiteboard: | storage | | |
Fixed In Version: | SI20 | Doc Type: | Bug Fix |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2012-12-04 20:06:47 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Attachments: | | | |

I wonder why there is no stack trace for the NPE. Is a missing toString on StatusForXmlRpc hiding in the log?

    2012-09-02 18:10:38,903 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-39) [18c0d8a4] Command org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand return value
    Class Name: org.ovirt.engine.core.vdsbroker.irsbroker.OneUuidReturnForXmlRpc
    mUuid 0e4f76c6-051b-4eb1-9d5b-7a799db786f7
    mStatus Class Name: org.ovirt.engine.core.vdsbroker.vdsbroker.StatusForXmlRpc
    mCode 0
    mMessage OK

commit 8d4b56a2385fd71f2d0b4e152284371d3e519c05
Author: Federico Simoncelli <fsimonce>
Date: Wed Sep 19 17:48:37 2012 -0400

    core: trust the SpmStart task result during election

    After the spmStart task ended an additional getSpmStatus was issued
    to verify whether the host really became the SPM or not. This second
    command could fail on its own for several reasons (eg: temporary
    network failure, etc.) and its result wouldn't reflect the actual
    outcome of the spmStart task.

    Worst scenario: the SpmStart task succeeded and the getSpmStatus
    temporarily failed; this would cause the engine to proceed with the
    election on the next host.

    This patch is removing the additional getSpmStatus command trusting
    the spmStart task result.

    Bug-Url: https://bugzilla.redhat.com/show_bug.cgi?id=853747
    Signed-off-by: Federico Simoncelli <fsimonce>

Change-Id: I832957996226cf091b1b7fe8fa3cc7657507795a
http://gerrit.ovirt.org/#/c/8072/

Merged change id I832957996226cf091b1b7fe8fa3cc7657507795a

Verified on SI20 - no exception in the logs.
rhevm-3.1.0-20.el6ev.noarch
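
The election behaviour described in the commit message can be illustrated with a minimal sketch. This is not the ovirt-engine code: the class `SpmElectionSketch`, the `TaskResult` enum, the `electSpm` method, and the host names are all hypothetical, chosen only to show the idea of trusting the spmStart task result instead of issuing a second status query that may fail on its own.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

public class SpmElectionSketch {

    /** Hypothetical outcome of the spmStart task on a single host. */
    enum TaskResult { SUCCESS, FAILURE }

    /**
     * Walks the candidate hosts in order and returns the first host whose
     * spmStart task reports SUCCESS. No follow-up status query is issued,
     * so a transient failure of such a query cannot push the election on
     * to the next host after a successful spmStart.
     */
    static Optional<String> electSpm(List<String> hosts,
                                     Function<String, TaskResult> spmStart) {
        for (String host : hosts) {
            if (spmStart.apply(host) == TaskResult.SUCCESS) {
                return Optional.of(host);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> hosts = List.of("host-a", "host-b");
        // Assumption for the demo: spmStart succeeds only on host-a.
        Optional<String> spm = electSpm(hosts,
                h -> h.equals("host-a") ? TaskResult.SUCCESS : TaskResult.FAILURE);
        System.out.println(spm.orElse("none"));
    }
}
```

In this sketch the only input to the decision is the task result itself, which mirrors the reasoning in the commit message: a separate verification call adds a second point of failure without adding information about the task's outcome.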

Created attachment 609123 (engine logs)

Description of problem:
The following exception occurs when spmStart fails on a host because spmStart was sent while storage was not connected:

    2012-09-02 18:10:38,904 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-39) [18c0d8a4] Vds: green-vdsa.qa.lab.tlv.redhat.com
    2012-09-02 18:10:38,904 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.VdsBrokerCommand] (QuartzScheduler_Worker-39) [18c0d8a4] Failed in SpmStartVDS method, for vds: green-vdsa.qa.lab.tlv.redhat.com; host: 10.35.102.10
    2012-09-02 18:10:38,904 ERROR [org.ovirt.engine.core.vdsbroker.VDSCommandBase] (QuartzScheduler_Worker-39) [18c0d8a4] Command SpmStartVDS execution failed. Exception: NullPointerException:
    2012-09-02 18:10:38,904 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStartVDSCommand] (QuartzScheduler_Worker-39) [18c0d8a4] FINISH, SpmStartVDSCommand, log id: 658c9ed5
    2012-09-02 18:10:38,907 INFO [org.ovirt.engine.core.bll.storage.SetStoragePoolStatusCommand] (QuartzScheduler_Worker-39) [5c454588] Running command: SetStoragePoolStatusCommand internal: true. Entities affected : ID: bd560b80-c245-46b3-ad8c-b142a3460cf6 Type: StoragePool

Version-Release number of selected component (if applicable):
rhevm-3.1.0-14.el6ev.noarch

How reproducible:
?

Steps to Reproduce:
1. Block connection between spm and master (single) storage
2. Check engine logs during spmStart

Actual results:
The exception is visible in the logs. spmStart is automatically resent afterwards and succeeds.
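
The ERROR line in the description ends with "Exception: NullPointerException:" and nothing after the colon. One possible explanation, offered as an assumption and not as a reading of the engine source, is that the log statement prints only the exception's class name and message: an explicitly constructed `NullPointerException` has a null message, and no stack trace appears unless the `Throwable` itself is handed to the logger. The sketch below shows both behaviours; `NpeLogSketch` and `describe` are hypothetical names, and the logger is plain `java.util.logging`, not whatever logging framework the engine uses.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class NpeLogSketch {

    /**
     * Builds a one-line failure description from the exception's simple
     * class name and message only. The message of an explicitly created
     * NullPointerException is null, so nothing follows the final colon.
     */
    static String describe(String command, Throwable t) {
        String message = t.getMessage() == null ? "" : t.getMessage();
        return "Command " + command + " execution failed. Exception: "
                + t.getClass().getSimpleName() + ": " + message;
    }

    public static void main(String[] args) {
        Throwable npe = new NullPointerException();
        Logger log = Logger.getLogger(NpeLogSketch.class.getName());

        // Message-only logging: one line, no stack trace.
        log.severe(describe("SpmStartVDS", npe));

        // Passing the Throwable as well makes the handler print the stack trace.
        log.log(Level.SEVERE, describe("SpmStartVDS", npe), npe);
    }
}
```

If the engine's log statement does work this way, passing the exception object to the logger would be enough to surface the stack trace asked about in the comments.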