Bug 1694116 - NPE in OvfVmWriter.writeSnapshotsSection
Summary: NPE in OvfVmWriter.writeSnapshotsSection
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Michal Skrivanek
QA Contact: meital avital
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-03-29 14:47 UTC by André Liebe
Modified: 2020-04-01 14:51 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-04-01 14:47:26 UTC
oVirt Team: Virt
Embargoed:


Attachments
engine logs (756.55 KB, application/zip)
2019-04-10 13:36 UTC, André Liebe
engine log while deleting a snapshot (with missing img-file) (248.69 KB, text/plain)
2019-06-25 11:49 UTC, André Liebe

Description André Liebe 2019-03-29 14:47:31 UTC
Description of problem:
hosted-engine deploy isn't able to restore a hosted engine (from hyperconverged glusterfs to a different storage).


Version-Release number of selected component (if applicable):
ovirt-ansible-hosted-engine-setup-1.0.13-1.el7.noarch
ovirt-hosted-engine-setup-2.3.6-1.el7.noarch

How reproducible: always
The environment has 4 hosts, lvh1..lvh4; the HE gluster storage runs on lvh2..lvh4, and HE runs on lvh4.


Steps to Reproduce:
1. Put lvh3 into maintenance (without stopping glusterfs on this host)
2. Run engine-backup
3. Put HE into global maintenance mode
4. Power down he-engine from within the engine
5. On lvh3, run hosted-engine --deploy --restore-from-file=ovirt-engine-backup.tar.gz --config-append=answers.conf

Actual results:
[ INFO  ] TASK [ovirt.hosted_engine_setup : Wait until OVF update finishes]
[ ERROR ] fatal: [localhost]: FAILED! => {"ansible_facts": {"ovirt_storage_domains": [{"available": 6570226221056, "backup": false, "block_size": 512, "comment": "", "committed": 57982058496, "critical_space_action_blocker": 5, "data_centers": [{"href": "/ovirt-engine/api/datacenters/57ff41c2-00f4-00e2-02e7-0000000000f7", "id": "57ff41c2-00f4-00e2-02e7-0000000000f7"}], "description": "", "discard_after_delete": false, "disk_profiles": [{"id": "08bd03d0-3f08-49ef-9633-1bbef7b34be9", "name": "hosted_storage"}], "disk_snapshots": [], "disks": [{"id": "d9669327-15ca-4e75-8a52-f504e35240db", "image_id": "f29c7adc-8cba-40f6-9969-0ebd5142d89c", "name": "he_metadata"}, {"id": "1b499bd2-5c50-4906-9e5c-0d46c90c03bc", "image_id": "efb9d6fa-4b38-4245-b2af-411a639fe263", "name": "HostedEngineConfigurationImage"}, {"id": "0c6a2d8a-90dc-4ce7-ae32-a11fddcbc8d7", "image_id": "e5d6e871-e2a1-472a-bcda-e16f1ee8cb02", "name": "he_sanlock"}, {"id": "62572233-3e53-4aac-8733-0303983a208e", "image_id": "6d42f0fb-8c65-4d61-a279-e2d9bbff37c1", "name": "he_virtio_disk"}], "external_status": "ok", "href": "/ovirt-engine/api/storagedomains/d315ea71-d0f7-4638-a5cc-75f1cadc6b27", "id": "d315ea71-d0f7-4638-a5cc-75f1cadc6b27", "master": false, "name": "hosted_storage", "permissions": [{"id": "4d2d6706-079c-44e2-94f9-10bffeacf506"}, {"id": "57ff41cd-0037-0079-010a-000000000083"}, {"id": "57ff41f8-035e-026a-0360-000000000119"}, {"id": "6b0ed12a-6a78-4c91-aaf2-5a41766d7ef5"}, {"id": "7c841084-dd29-42c4-b8ac-002bc16d5140"}, {"id": "95cba4f2-29fc-11e9-9010-00163e2e0d81"}, {"id": "db55898a-337e-43b6-9e3e-75a93c8a97b2"}, {"id": "de173d32-14d1-40ca-8067-2a1e7700e647"}, {"id": "fddcd04c-5848-4924-bdfa-60f75cbc8421"}], "storage": {"address": "nas.fqdn", "mount_options": "", "nfs_version": "auto", "path": "/volume1/engine", "type": "nfs"}, "storage_connections": [{"id": "47f830cf-9f17-4266-a6a8-d3acb70ea15a"}], "storage_format": "v4", "supports_discard": false, "supports_discard_zeroes_data": false, 
"templates": [], "type": "data", "used": 3516504473600, "vms": [{"id": "8e5b6556-09b7-4d73-8f23-5674780a8718", "name": "HostedEngine"}], "warning_low_space_indicator": 10, "wipe_after_delete": false}]}, "attempts": 12, "changed": false}
[ ERROR ] Failed to execute stage 'Closing up': Failed executing ansible-playbook


Within the temporary VM, this probably correlates with the following engine.log entries:

2019-03-29 11:36:41,437+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.OvfDataUpdater] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [] Attempting to update VMs/Templates Ovf.
2019-03-29 11:36:41,439+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Before acquiring and wait lock 'EngineLock:{exclusiveLocks='[57ff41c2-00f4-00e2-02e7-0000000000f7=OVF_UPDATE]', sharedLocks=''}'
2019-03-29 11:36:41,439+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Lock-wait acquired to object 'EngineLock:{exclusiveLocks='[57ff41c2-00f4-00e2-02e7-0000000000f7=OVF_UPDATE]', sharedLocks=''}'
2019-03-29 11:36:41,440+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Running command: ProcessOvfUpdateForStoragePoolCommand internal: true. Entities affected :  ID: 57ff41c2-00f4-00e2-02e7-0000000000f7 Type: StoragePool
2019-03-29 11:36:41,495+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Attempting to update VM OVFs in Data Center 'gematik'
2019-03-29 11:36:41,906+01 WARN  [org.ovirt.engine.core.vdsbroker.builder.vminfo.VmInfoBuildUtils] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] No host NUMA nodes found for vm HostedEngine (8e5b6556-09b7-4d73-8f23-5674780a8718)
2019-03-29 11:36:42,577+01 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Command 'org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand' failed: Index: 0, Size: 0
2019-03-29 11:36:42,577+01 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Exception: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:657) [rt.jar:1.8.0_201]
        at java.util.ArrayList.get(ArrayList.java:433) [rt.jar:1.8.0_201]
        at org.ovirt.engine.core.utils.ovf.OvfVmWriter.writeSnapshotsSection(OvfVmWriter.java:135) [utils.jar:]
        at org.ovirt.engine.core.utils.ovf.OvfVmWriter.writeHardware(OvfVmWriter.java:108) [utils.jar:]
        at org.ovirt.engine.core.utils.ovf.OvfWriter.buildVirtualSystem(OvfWriter.java:190) [utils.jar:]
        at org.ovirt.engine.core.utils.ovf.IOvfBuilder.build(IOvfBuilder.java:34) [utils.jar:]
        at org.ovirt.engine.core.utils.ovf.OvfManager.exportVm(OvfManager.java:76) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.OvfUpdateProcessHelper.generateVmMetadata(OvfUpdateProcessHelper.java:155) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.OvfUpdateProcessHelper.buildMetadataDictionaryForVm(OvfUpdateProcessHelper.java:80) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand.populateVmsMetadataForOvfUpdate(ProcessOvfUpdateForStoragePoolCommand.java:364) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand.updateOvfForVmsOfStoragePool(ProcessOvfUpdateForStoragePoolCommand.java:187) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand.executeCommand(ProcessOvfUpdateForStoragePoolCommand.java:121) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeWithoutTransaction(CommandBase.java:1157) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeActionInTransactionScope(CommandBase.java:1315) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.runInTransaction(CommandBase.java:1964) [bll.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInSuppressed(TransactionSupport.java:164) [utils.jar:]
        at org.ovirt.engine.core.utils.transaction.TransactionSupport.executeInScope(TransactionSupport.java:103) [utils.jar:]
        at org.ovirt.engine.core.bll.CommandBase.execute(CommandBase.java:1375) [bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.executeAction(CommandBase.java:419) [bll.jar:]
        at org.ovirt.engine.core.bll.executor.DefaultBackendActionExecutor.execute(DefaultBackendActionExecutor.java:13) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runAction(Backend.java:450) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runActionImpl(Backend.java:432) [bll.jar:]
        at org.ovirt.engine.core.bll.Backend.runInternalAction(Backend.java:377) [bll.jar:]
        at sun.reflect.GeneratedMethodAccessor296.invoke(Unknown Source) [:1.8.0_201]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_201]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_201]
        at org.jboss.as.ee.component.ManagedReferenceMethodInterceptor.processInvocation(ManagedReferenceMethodInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
        at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.delegateInterception(Jsr299BindingsInterceptor.java:79)
        at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.doMethodInterception(Jsr299BindingsInterceptor.java:89)
        at org.jboss.as.weld.interceptors.Jsr299BindingsInterceptor.processInvocation(Jsr299BindingsInterceptor.java:102)
        at org.jboss.as.ee.component.interceptors.UserInterceptorFactory$1.processInvocation(UserInterceptorFactory.java:63)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.invocationmetrics.ExecutionTimeInterceptor.processInvocation(ExecutionTimeInterceptor.java:43) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ee.concurrent.ConcurrentContextInterceptor.processInvocation(ConcurrentContextInterceptor.java:45) [wildfly-ee-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.InitialInterceptor.processInvocation(InitialInterceptor.java:40)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
        at org.jboss.as.ee.component.interceptors.ComponentDispatcherInterceptor.processInvocation(ComponentDispatcherInterceptor.java:52)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.singleton.SingletonComponentInstanceAssociationInterceptor.processInvocation(SingletonComponentInstanceAssociationInterceptor.java:53) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:216) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:418) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:148) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.InterceptorContext$Invocation.proceed(InterceptorContext.java:509)
        at org.jboss.weld.module.ejb.AbstractEJBRequestScopeActivationInterceptor.aroundInvoke(AbstractEJBRequestScopeActivationInterceptor.java:81) [weld-ejb-3.0.5.Final.jar:3.0.5.Final]
        at org.jboss.as.weld.ejb.EjbRequestScopeActivationInterceptor.processInvocation(EjbRequestScopeActivationInterceptor.java:89)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:47) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.security.SecurityContextInterceptor.processInvocation(SecurityContextInterceptor.java:100) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.deployment.processors.StartupAwaitInterceptor.processInvocation(StartupAwaitInterceptor.java:22) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ejb3.component.interceptors.LoggingInterceptor.processInvocation(LoggingInterceptor.java:67) [wildfly-ejb3-15.0.1.Final.jar:15.0.1.Final]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.as.ee.component.NamespaceContextInterceptor.processInvocation(NamespaceContextInterceptor.java:50)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.ContextClassLoaderInterceptor.processInvocation(ContextClassLoaderInterceptor.java:60)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.InterceptorContext.run(InterceptorContext.java:438)
        at org.wildfly.security.manager.WildFlySecurityManager.doChecked(WildFlySecurityManager.java:618)
        at org.jboss.invocation.AccessCheckingInterceptor.processInvocation(AccessCheckingInterceptor.java:57)
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:422)
        at org.jboss.invocation.ChainedInterceptor.processInvocation(ChainedInterceptor.java:53)
        at org.jboss.as.ee.component.ViewService$View.invoke(ViewService.java:198)
        at org.jboss.as.ee.component.ViewDescription$1.processInvocation(ViewDescription.java:185)
        at org.jboss.as.ee.component.ProxyInvocationHandler.invoke(ProxyInvocationHandler.java:81)
        at org.ovirt.engine.core.bll.interfaces.BackendInternal$$$view4.runInternalAction(Unknown Source) [bll.jar:]
        at sun.reflect.GeneratedMethodAccessor294.invoke(Unknown Source) [:1.8.0_201]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.8.0_201]
        at java.lang.reflect.Method.invoke(Method.java:498) [rt.jar:1.8.0_201]
        at org.jboss.weld.util.reflection.Reflections.invokeAndUnwrap(Reflections.java:410) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
        at org.jboss.weld.module.ejb.EnterpriseBeanProxyMethodHandler.invoke(EnterpriseBeanProxyMethodHandler.java:134) [weld-ejb-3.0.5.Final.jar:3.0.5.Final]
        at org.jboss.weld.bean.proxy.EnterpriseTargetBeanInstance.invoke(EnterpriseTargetBeanInstance.java:56) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
        at org.jboss.weld.module.ejb.InjectionPointPropagatingEnterpriseTargetBeanInstance.invoke(InjectionPointPropagatingEnterpriseTargetBeanInstance.java:68) [weld-ejb-3.0.5.Final.jar:3.0.5.Final]
        at org.jboss.weld.bean.proxy.ProxyMethodHandler.invoke(ProxyMethodHandler.java:106) [weld-core-impl-3.0.5.Final.jar:3.0.5.Final]
        at org.ovirt.engine.core.bll.BackendCommandObjectsHandler$BackendInternal$BackendLocal$2049259618$Proxy$_$$_Weld$EnterpriseProxy$.runInternalAction(Unknown Source) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.OvfDataUpdater.performOvfUpdateForStoragePool(OvfDataUpdater.java:65) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.OvfDataUpdater.updateOvfData(OvfDataUpdater.java:85) [bll.jar:]
        at org.ovirt.engine.core.bll.storage.ovfstore.OvfDataUpdater.ovfUpdate(OvfDataUpdater.java:72) [bll.jar:]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_201]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [rt.jar:1.8.0_201]
        at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383) [javax.enterprise.concurrent-1.0.jar:]
        at org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534) [javax.enterprise.concurrent-1.0.jar:]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [rt.jar:1.8.0_201]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [rt.jar:1.8.0_201]
        at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_201]
        at org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250) [javax.enterprise.concurrent-1.0.jar:]
        at org.jboss.as.ee.concurrent.service.ElytronManagedThreadFactory$ElytronManagedThread.run(ElytronManagedThreadFactory.java:78)

2019-03-29 11:36:42,581+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Lock freed to object 'EngineLock:{exclusiveLocks='[57ff41c2-00f4-00e2-02e7-0000000000f7=OVF_UPDATE]', sharedLocks=''}'
2019-03-29 11:36:42,581+01 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.OvfDataUpdater] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Exception while trying to update or remove VMs/Templates ovf in Data Center 'gematik'.
2019-03-29 11:36:42,581+01 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.OvfDataUpdater] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Data Center '{}' domains list for OVF update returned as NULL
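
For anyone triaging a similar engine.log, the following is a minimal sketch (not part of the original report) that pulls out the exception logged by the OVF update and the first oVirt frame beneath it. The sample text is copied from the log above; the function name and regular expressions are illustrative assumptions, not an oVirt tool.

```python
import re

# A few lines copied from the engine.log excerpt above.
SAMPLE_LOG = """\
2019-03-29 11:36:41,495+01 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Attempting to update VM OVFs in Data Center 'gematik'
2019-03-29 11:36:42,577+01 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStoragePoolCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-88) [6dd0f9] Exception: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
        at java.util.ArrayList.rangeCheck(ArrayList.java:657) [rt.jar:1.8.0_201]
        at java.util.ArrayList.get(ArrayList.java:433) [rt.jar:1.8.0_201]
        at org.ovirt.engine.core.utils.ovf.OvfVmWriter.writeSnapshotsSection(OvfVmWriter.java:135) [utils.jar:]
        at org.ovirt.engine.core.utils.ovf.OvfVmWriter.writeHardware(OvfVmWriter.java:108) [utils.jar:]
"""

# ERROR line that carries "Exception: <class>: <message>".
EXCEPTION_RE = re.compile(r"^(\S+ \S+) ERROR .*\] Exception: (\S+): (.*)$")
# Stack frame that belongs to oVirt code (skips java.util.* and friends).
FRAME_RE = re.compile(r"^\s+at (org\.ovirt\.[\w.$]+)\(([\w.]+):(\d+)\)")


def find_ovf_failures(log_text):
    """Return (timestamp, exception class, message, first oVirt frame) tuples."""
    failures = []
    pending = None  # exception header still waiting for its first oVirt frame
    for line in log_text.splitlines():
        header = EXCEPTION_RE.match(line)
        if header:
            pending = header.groups()
            continue
        frame = FRAME_RE.match(line)
        if pending and frame:
            location = f"{frame.group(1)} ({frame.group(2)}:{frame.group(3)})"
            failures.append(pending + (location,))
            pending = None
    return failures


for failure in find_ovf_failures(SAMPLE_LOG):
    print(failure)
```

Note that the exception logged here is java.lang.IndexOutOfBoundsException raised by ArrayList.get, even though the bug title and later comments refer to it as an NPE.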


Expected results:
hosted-engine-setup finishes successfully.

Additional info:

Comment 1 Simone Tiraboschi 2019-04-01 12:00:53 UTC
It comes from here:
https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/utils/src/main/java/org/ovirt/engine/core/utils/ovf/OvfVmWriter.java#L134

after the restore of a backup.

Not sure if it's just the result of a VM with an inconsistent snapshot in the backup or something like that.

Comment 2 André Liebe 2019-04-01 13:48:26 UTC
I'm not sure any of my VMs causes this, as I purposely don't have any additional VMs besides HostedEngine on the HE storage. This is the task that fails:

  - name: Wait until OVF update finishes
    ovirt_storage_domain_facts:
      auth: "{{ ovirt_auth }}"
      fetch_nested: true
      nested_attributes:
        - name
        - image_id
        - id
      pattern: "name={{ he_storage_domain_name }}"
    retries: 12
    delay: 10
    until: "ovirt_storage_domains[0].disks | selectattr('name', 'match', '^OVF_STORE$') | list"
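
To make the wait condition concrete, here is a rough Python approximation of the `until` expression in the task above, run against the disk names from the failed task output in the description. The helper names are made up for illustration, and the filter is only an approximation of the Jinja2 `selectattr(..., 'match', ...)` test.

```python
import re

# Disk names as they appear in the failed task output in the description.
reported_disks = [
    {"name": "he_metadata"},
    {"name": "HostedEngineConfigurationImage"},
    {"name": "he_sanlock"},
    {"name": "he_virtio_disk"},
]


def ovf_store_disks(disks):
    """Approximate `selectattr('name', 'match', '^OVF_STORE$') | list`."""
    pattern = re.compile(r"^OVF_STORE$")
    return [disk for disk in disks if pattern.match(disk["name"])]


def wait_condition_met(storage_domains):
    # An empty list is falsy, so the task keeps retrying until an
    # OVF_STORE disk shows up on the first storage domain returned.
    return bool(ovf_store_disks(storage_domains[0]["disks"]))


print(wait_condition_met([{"disks": reported_disks}]))
```

Since none of the reported disks is named OVF_STORE, the condition never becomes true, which is consistent with the task giving up after its 12 retries ("attempts": 12 in the output).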


As a workaround, I allowed the task to fail so I could finish my recovery, by adding an "ignore_errors: yes".

Let me know if there are any API calls or SQL scripts to run to get to the bottom of this

Comment 3 Simone Tiraboschi 2019-04-01 13:53:04 UTC
(In reply to André Liebe from comment #2)
> I'm not sure if any of my vms causes this as I purposely don't have any
> additional vms beside HostedEngine on HE-storage

I fear it's due to another VM with a non-existent/inconsistent/... snapshot in the same datacenter (not on the same storage domain).

Comment 4 André Liebe 2019-04-01 15:56:47 UTC
Seems so. I remember getting OVF update errors for a while when trying to put domains into maintenance, but I have no clue where this comes from, as I hit many bugs in the past, like bug 1670774 or numerous hyperconverged glusterfs problems.
Is there a sanity-check tool for the database available?

Comment 5 Ryan Barry 2019-04-01 22:34:49 UTC
What's the version of ovirt-engine?

Comment 6 André Liebe 2019-04-08 05:11:45 UTC
That's ovirt-engine-4.3.2.1-1.el7.noarch

Comment 7 Ryan Barry 2019-04-09 03:12:44 UTC
Shmuel, thoughts? This doesn't look like the other OVF issues resolved last week

Comment 8 André Liebe 2019-04-10 11:54:18 UTC
If you need anything more (like a database snapshot, backup file, ...), you'd better tell me now, as I'll be out of office for three weeks starting Friday afternoon (MEST).

Comment 9 Ryan Barry 2019-04-10 12:21:58 UTC
If you can reproduce and attach a full log (or grab the engine log for the day this occurred), that would be great

Comment 10 André Liebe 2019-04-10 13:36:38 UTC
Created attachment 1554236
engine logs

I just did a reinstall today (just ignore error messages from missing old HA storage)


Is there a way to restrict attachments access to employees, so logs are non public?

Comment 12 Simone Tiraboschi 2019-04-10 13:42:51 UTC
(In reply to André Liebe from comment #10)
> Is there a way to restrict attachments access to employees, so logs are non
> public?

Yes, I just uploaded a copy of it as a private attachment.
You can simply delete the file you uploaded.

Comment 13 Shmuel Melamud 2019-04-10 18:28:23 UTC
(In reply to Ryan Barry from comment #7)
> Shmuel, thoughts? This doesn't look like the other OVF issues resolved last
> week

The NPE occurs because there is a snapshot with memory and the memory volume does not have a storage ID assigned. This should never happen in a normal situation, as far as I understand.

But I'm not sure it is related to the backup/restore of the hosted-engine. Does that broken memory snapshot belong to the hosted-engine VM? Does engine-backup finish without errors?
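
The condition described above can be illustrated with a small sketch. The data model here is entirely hypothetical (plain dictionaries with made-up field names such as `memory_disk` and `storage_ids`); it does not reflect the engine's real entities or database schema, and only shows the kind of check that would flag a memory snapshot whose memory volume has no storage ID.

```python
# Hypothetical, simplified records; the engine's real entities differ.
vms = [
    {"name": "vm-ok", "snapshots": [
        {"description": "before upgrade",
         "memory_disk": {"storage_ids": ["sd-1"]}},
        {"description": "no memory", "memory_disk": None},
    ]},
    {"name": "vm-broken", "snapshots": [
        {"description": "with memory",
         "memory_disk": {"storage_ids": []}},
    ]},
]


def find_broken_memory_snapshots(vms):
    """Return (vm name, snapshot description) pairs for memory snapshots
    whose memory volume has no storage ID. All field names are assumptions."""
    broken = []
    for vm in vms:
        for snapshot in vm.get("snapshots", []):
            memory_disk = snapshot.get("memory_disk")
            if memory_disk is None:
                continue  # snapshot without memory: nothing to check
            if not memory_disk.get("storage_ids"):
                # Taking element 0 of this empty list is the kind of access
                # that fails with "Index: 0, Size: 0" in the stack trace.
                broken.append((vm["name"], snapshot["description"]))
    return broken


print(find_broken_memory_snapshots(vms))
```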

Comment 14 André Liebe 2019-04-11 04:54:41 UTC
I remember seeing this NPE in the logs for a while. I'm not sure when it first appeared, probably sometime around the 4.1->4.2 upgrade. As I had severe problems with gluster, NFS (Synology/BtrFS) and sanlock, I ignored this error in the logs.

> Does engine-backup finish without errors?
The logs are fine; no errors were logged.

Comment 15 André Liebe 2019-05-08 11:46:13 UTC
Any news on this issue?

I updated my environment yesterday to 4.3.3.7-1.el7, but still get this NPE when selecting my master storage domain and selecting "Update OVFs".

Additionally my hosted-engine hosts started to log this (which I assume to be a side effect of this)
ovirt-ha-agent[28894]: ovirt-ha-agent ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine.config.vm ERROR Unable to identify the OVF_STORE volume, falling back to initial vm.conf. Please ensure you already added your first data domain for regular VMs

Comment 16 Ryan Barry 2019-05-08 12:18:26 UTC
No reproducer yet 

Do you have a memory snapshot of the HE VM?

Comment 17 André Liebe 2019-05-08 12:25:35 UTC
No.

Comment 18 André Liebe 2019-05-08 12:28:39 UTC
But there may be snapshots on other VMs, as I migrated a lot of VMs from (already deleted) gluster and NFS volumes. Is there an SQL statement I could use to check?

Comment 19 André Liebe 2019-05-29 09:09:51 UTC
(In reply to Shmuel Melamud from comment #13)
> (In reply to Ryan Barry from comment #7)
> > Shmuel, thoughts? This doesn't look like the other OVF issues resolved last
> > week
> 
> The NPE occurs because there is a snapshot with memory and the memory volume
> does not have a storage ID assigned. This should never happen in a normal
> situation, as far as I understand.
> 
> But I'm not sure it is related to the backup/restore of the hosted-engine.
> Does that broken memory snapshot belong to the hosted-engine VM? Does
> engine-backup finish without errors?

How do I identify the invalid snapshots (and purge them)?

Comment 20 André Liebe 2019-06-25 11:46:54 UTC
Okay, I fixed it myself by attaching remotely to the engine JVM and stepping through each VM until the NPE happened. Having noted the VM's name, I purged its snapshots from the web GUI.

It would be nice to add a log.debug statement for each VM being handled. On the other hand, I would expect the engine to handle a missing disk gracefully by raising a useful warning/error/alert in the web GUI.
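
A minimal sketch of that suggestion, written in Python rather than the engine's Java and using made-up names (`export_all`, `fake_export`, the `storage_ids` field): log each VM before its OVF is generated, and record a failure for one VM instead of aborting the whole update. Whether skipping a VM is the right behaviour for the engine is a design question this sketch does not settle.

```python
import logging

logging.basicConfig(level=logging.DEBUG)
log = logging.getLogger("ovf_update_sketch")


def export_all(vms, export_vm):
    """Export each VM, logging its name first and collecting failures
    instead of letting one broken VM abort the whole run.
    `export_vm` is a stand-in for the real per-VM OVF generation."""
    failed = []
    for vm in vms:
        log.debug("Generating OVF for VM %s", vm["name"])
        try:
            export_vm(vm)
        except (IndexError, KeyError) as exc:
            log.warning("Skipping VM %s: OVF generation failed: %r", vm["name"], exc)
            failed.append(vm["name"])
    return failed


def fake_export(vm):
    # Stand-in that fails the same way as an empty list lookup would.
    return vm["storage_ids"][0]


sample_vms = [
    {"name": "vm-a", "storage_ids": ["sd-1"]},
    {"name": "vm-b", "storage_ids": []},
    {"name": "vm-c", "storage_ids": ["sd-2"]},
]
print(export_all(sample_vms, fake_export))
```

With logging like this, the offending VM's name would appear directly before the failure, without the need to attach a debugger.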

Comment 21 André Liebe 2019-06-25 11:49:22 UTC
Created attachment 1584289
engine log while deleting a snapshot (with missing img-file)

Comment 22 Michal Skrivanek 2020-03-18 15:46:21 UTC
This bug didn't get any attention for a while, we didn't have the capacity to make any progress. If you deeply care about it or want to work on it please assign/target accordingly

Comment 23 Michal Skrivanek 2020-03-18 15:51:09 UTC
This bug didn't get any attention for a while, we didn't have the capacity to make any progress. If you deeply care about it or want to work on it please assign/target accordingly

Comment 24 Michal Skrivanek 2020-04-01 14:47:26 UTC
ok, closing. Please reopen if still relevant/you want to work on it.

Comment 25 Michal Skrivanek 2020-04-01 14:51:00 UTC
ok, closing. Please reopen if still relevant/you want to work on it.
