Bug 1575996
| Field | Value | Field | Value |
|---|---|---|---|
| Summary: | Hosted engine: ProcessOvfUpdateForStoragePoolCommand fails with NPE | | |
| Product: | [oVirt] ovirt-engine | Reporter: | Elad <ebenahar> |
| Component: | BLL.Storage | Assignee: | Tal Nisan <tnisan> |
| Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | meital avital <mavital> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.2.3.2 | CC: | aefrat, ahadas, bugs, cshao, dfediuck, ebenahar, huzhao, michal.skrivanek, mkalinin, qiyuan, ratamir, sfishbai, tnisan, usurse, weiwang, yaniwang, ycui |
| Target Milestone: | --- | Keywords: | Automation, AutomationBlocker, Regression, Reopened |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-10-28 16:12:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Virt | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
The failure is in writing the network interfaces to one of the OVFs; moving to Virt.

This occurs on hosted engine every time, for all storage domain types. Raising severity and marking as a regression.

This issue prevents OVF update upon domain deactivation. This is crucial for DR and should therefore be a blocker for the upcoming GA.

Removing blocker?, as the issue is not reproducible so far.

Reproduced on one HE environment for every triggered OVF update. As Raz mentioned above, so far we're unable to reproduce with a different environment.

Looks like the same issue as bug 1570349.

*** This bug has been marked as a duplicate of bug 1570349 ***

Michal, how exactly is it a dup of bug 1570349? I don't see the connection.

You likely have a setup from before 4.2.2 which you've been upgrading downstream (it doesn't matter if it's 4.1 or some early <4.2.2). So if you look at the VM devices you'd see unmanaged devices, which then fail on startup. It's the same serialization in OVF; it's just that other VMs were either not running during that past upgrade, or they were fixed manually already, or you haven't tried to run them since. It won't happen for a new VM or for an upgrade from 4.1 to 4.2.3+, except for the remaining case of having VMs running on the host while upgrading RHEL 7.4 to 7.5 and 4.1 to 4.2 at the same time. That is still pending a fix (blocking such an update). I can double-check all that if you provide access details.

This environment is a fresh install; it wasn't upgraded. More than that, the VMs used were newly created on this env as well. The environment does not exist anymore, so there is nothing to check there. I'm opening this bug, as this is not related to upgrading but only to OVF update.

I see. Well, there's nothing to check then if you cannot reproduce this, and the only thing in the provided log points to bug 1570349. I can keep it open for a week or two, but without updates it's going to end up with the same resolution anyway.
Sure, we'll try to reproduce it, and in case we can't we will take care of it. Thanks

*** Bug 1576766 has been marked as a duplicate of this bug. ***

Haven't seen this reproduced ever since.

Created attachment 1450511 [details]
Attachments

The following error appears an hour after the update was done on the engine:
2018-06-10 19:11:00,134+03 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-62) [21646704] Exception: java.lang.NullPointerException
The error has kept occurring on an hourly basis since it first appeared. (Logs attached)

Shir, while the issue reported in comment 16 leads to the same result (an NPE during an update of the OVF store), it is completely different, as it is related to memory that is saved within snapshots and not to writing the NICs to an OVF. Please file a new bug about that, and let's close this one if the original issue has not reproduced. The current status of this bug is misleading.

Tal, Arik's comment #18 sounds to me like the area of bug 1573600.

I don't think so; bug 1573600 is about registering a VM with memory images, and comments #16 and #18 are about the OVF update flow.

Hey, I have an SHE environment upgraded all the way from 3.6 to rhvm-4.2.4.5-0.1.el7_3.noarch, and it seems to have the same problem. From engine.log:
~~~
2018-07-26 18:57:37,535-04 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-86) [648e79cc] Command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand' failed: null
2018-07-26 18:57:37,535-04 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-86) [648e79cc] Exception: java.lang.NullPointerException
	at org.ovirt.engine.core.vdsbroker.builder.vminfo.LibvirtVmXmlBuilder.writeInterface(LibvirtVmXmlBuilder.java:2069) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.builder.vminfo.LibvirtVmXmlBuilder.lambda$writeInterfaces$25(LibvirtVmXmlBuilder.java:1120) [vdsbroker.jar:]
~~~
FWIW, I was trying to update the serial console in the HE VM properties and it would not do it. I would uncheck it and click save. Open the Edit dialog again and the option would still be checked. This is what brought me to check engine.log. I have the environment available, if you would like to ssh.

Arik, it's your call here, how would you like to proceed?

Can you rule out that it was ever running on 4.2.2? If not, do you have any logs from the upgrade from 4.1 to 4.2?
There were also some fixes in 4.2.5 around the HE serial console.

In general, HE issues would need to be handled by the Integration team. Old deployments have specific hw configuration the Virt team is not familiar with.

Closing for the lack of reproduction. Please re-open if available with all the relevant information.

(In reply to Michal Skrivanek from comment #24)
> can you rule out that it was ever running on 4.2.2? If not, do you have any
> logs from upgrade from 4.1 to 4.2?
> There were also some fixes in 4.2.5 around HE serial console.
>
> In general HE issues would need to be handled by Integration team. Old
> deployments have specific hw configuration Virt team is not familiar with.

Sorry, seems like I never replied to this question. I do not think so. Unfortunately, I do not recall what environment it was at this point.
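
One comment above reports the NPE recurring on an hourly basis. A minimal sketch for checking that against a copy of engine.log follows. It assumes the timestamp layout seen in the excerpts in this report (for example `2018-06-10 19:11:00,134+03`); the class and method names (`OvfNpeLogScan`, `npeTimestamps`, `gaps`) are hypothetical and not part of ovirt-engine.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not ovirt-engine code: lists when the OVF-update NPE was logged.
class OvfNpeLogScan {

    // Assumed timestamp layout, taken from the log excerpts in this report.
    private static final DateTimeFormatter TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSSX");

    // Matches only ERROR lines from ProcessOvfUpdateForStoragePoolCommand that carry the NPE.
    private static final Pattern NPE_LINE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}[+-]\\d{2}(?:\\d{2})?)\\s+ERROR\\s+"
            + "\\[org\\.ovirt\\.engine\\.core\\.bll\\.storage\\.ovfstore\\."
            + "ProcessOvfUpdateForStoragePoolCommand\\]"
            + ".*Exception: java\\.lang\\.NullPointerException");

    // Returns the timestamp of every matching line, in file order.
    static List<OffsetDateTime> npeTimestamps(List<String> lines) {
        List<OffsetDateTime> result = new ArrayList<>();
        for (String line : lines) {
            Matcher m = NPE_LINE.matcher(line);
            if (m.find()) {
                result.add(OffsetDateTime.parse(m.group(1), TS));
            }
        }
        return result;
    }

    // Returns the time elapsed between each pair of consecutive occurrences.
    static List<Duration> gaps(List<OffsetDateTime> timestamps) {
        List<Duration> result = new ArrayList<>();
        for (int i = 1; i < timestamps.size(); i++) {
            result.add(Duration.between(timestamps.get(i - 1), timestamps.get(i)));
        }
        return result;
    }

    // Usage: java OvfNpeLogScan /path/to/engine.log
    public static void main(String[] args) throws IOException {
        List<OffsetDateTime> hits = npeTimestamps(Files.readAllLines(Paths.get(args[0])));
        System.out.println("occurrences: " + hits);
        System.out.println("gaps: " + gaps(hits));
    }
}
```

Gaps close to one hour would be consistent with the hourly recurrence reported above; the sketch only measures the spacing and says nothing about the cause.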
Created attachment 1433220 [details]
logs

Description of problem:
On a hosted engine env, storage domain deactivation sometimes ends with a failure due to an NPE in ProcessOvfUpdateForStoragePoolCommand.

Version-Release number of selected component (if applicable):
ovirt-engine-4.2.3.4-0.1.el7.noarch

How reproducible:
Happened once

Steps to Reproduce:
Hosted engine env:
1. Create a VM with a disk on an iSCSI domain
2. Deactivate this domain

Actual results:
Right before the NPE is thrown, this message appears:
No host NUMA nodes found for vm HostedEngine

Stack trace:
2018-05-05 08:09:19,169+03 INFO [org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand] (default task-26) [storagedomains_syncAction_d13f1d4e-1] Running command: UpdateOvfStoreForStorageDomainCommand internal: true. Entities affected : ID: 9507864d-92e1-496c-a060-6bf935b5663c Type: StorageAction group MANIPULATE_STORAGE_DOMAIN with role type ADMIN
2018-05-05 08:09:19,176+03 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (default task-26) [65e247a3] Running command: ProcessOvfUpdateForStoragePoolCommand internal: true. Entities affected : ID: d1d01ef4-4f0c-11e8-a408-00163e7be007 Type: StoragePool
2018-05-05 08:09:19,187+03 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (default task-26) [65e247a3] Attempting to update VM OVFs in Data Center 'golden_env_mixed'
2018-05-05 08:09:19,266+03 WARN [org.ovirt.engine.core.vdsbroker.builder.vminfo.VmInfoBuildUtils] (default task-26) [65e247a3] No host NUMA nodes found for vm HostedEngine (1d7f6b2b-3657-4780-8275-b249e63a5a81)
2018-05-05 08:09:19,273+03 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (default task-26) [65e247a3] Command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand' failed: null
2018-05-05 08:09:19,273+03 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (default task-26) [65e247a3] Exception: java.lang.NullPointerException
	at org.ovirt.engine.core.vdsbroker.builder.vminfo.LibvirtVmXmlBuilder.writeInterface(LibvirtVmXmlBuilder.java:2045) [vdsbroker.jar:]
	at org.ovirt.engine.core.vdsbroker.builder.vminfo.LibvirtVmXmlBuilder.lambda$writeInterfaces$24(LibvirtVmXmlBuilder.java:1096) [vdsbroker.jar:]
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184) [rt.jar:1.8.0_171]
	at java.util.stream.SortedOps$SizedRefSortingSink.end(SortedOps.java:352) [rt.jar:1.8.0_171]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) [rt.jar:1.8.0_171]

Additional info:
logs
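
The stack trace above shows the NPE escaping from a lambda run by `forEach` on a sorted stream of interfaces, which is why a single bad entry fails the whole command. The following is a minimal, self-contained sketch of that failure shape and of one possible guard. It is not the ovirt-engine code: `Nic`, `write`, `writeAll` and `writeAllGuarded` are hypothetical names, the sample values are made up, and `macAddress` is only a stand-in, since this report does not establish which value was null.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch, not ovirt-engine code. Which value was actually null
// at LibvirtVmXmlBuilder.writeInterface is not established in this report.
class OvfNicSketch {

    static final class Nic {
        final String name;
        final String macAddress; // arbitrary stand-in for "a field that may be null"

        Nic(String name, String macAddress) {
            this.name = name;
            this.macAddress = macAddress;
        }
    }

    // Dereferences the field without a check, so a null value throws an NPE.
    static String write(Nic nic) {
        return nic.name + "=" + nic.macAddress.toLowerCase(Locale.ROOT);
    }

    // Unguarded: the exception escapes forEach and the caller gets no output at all.
    static List<String> writeAll(List<Nic> nics) {
        List<String> out = new ArrayList<>();
        nics.stream()
                .sorted(Comparator.comparing((Nic n) -> n.name))
                .forEach(n -> out.add(write(n)));
        return out;
    }

    // Guarded: entries with the missing field are recorded in 'skipped', the rest are written.
    static List<String> writeAllGuarded(List<Nic> nics, List<String> skipped) {
        List<String> out = new ArrayList<>();
        nics.stream()
                .sorted(Comparator.comparing((Nic n) -> n.name))
                .forEach(n -> {
                    if (n.macAddress == null) {
                        skipped.add(n.name);
                    } else {
                        out.add(write(n));
                    }
                });
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration only.
        List<Nic> nics = new ArrayList<>();
        nics.add(new Nic("nic1", "00:1A:4A:16:01:51"));
        nics.add(new Nic("nic2", null));

        try {
            writeAll(nics);
        } catch (NullPointerException e) {
            System.out.println("unguarded: whole batch failed with " + e);
        }

        List<String> skipped = new ArrayList<>();
        System.out.println("guarded: " + writeAllGuarded(nics, skipped) + ", skipped: " + skipped);
    }
}
```

Whether skipping an incomplete interface is acceptable for an OVF is a separate design question; the sketch only shows why one null value is enough to abort the entire update.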