Bug 1779085 - Storage domain can not be deactivated with error Failed executing step 'UPDATE_OVF_STORE'
Summary: Storage domain can not be deactivated with error Failed executing step 'UPDAT...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.4.0
Hardware: Unspecified
OS: Unspecified
unspecified
urgent
Target Milestone: ovirt-4.4.0
: ---
Assignee: Artur Socha
QA Contact: Yosi Ben Shimon
URL:
Whiteboard:
: 1781102 (view as bug list)
Depends On:
Blocks: 1785113
TreeView+ depends on / blocked
 
Reported: 2019-12-03 09:17 UTC by Avihai
Modified: 2020-05-20 20:01 UTC (History)
10 users (show)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-20 20:01:12 UTC
oVirt Team: Storage
Embargoed:
pm-rhel: ovirt-4.4+
pm-rhel: blocker?


Attachments (Terms of Use)
engine and vdsm logs (876.69 KB, application/gzip)
2019-12-03 09:17 UTC, Avihai
no flags Details
After deactivate the iscsi SD (test_detach_attach_new_domain[iscsi] testcase5300 log) (1.24 MB, application/zip)
2020-01-06 12:40 UTC, Shir Fishbain
no flags Details
After updating OVF (Test_change_ovf_store_count[iscsi] testcase6261 log) (313.29 KB, application/zip)
2020-01-06 12:41 UTC, Shir Fishbain
no flags Details
After updating OVF (Test_disk_on_master_domain[iscsi] testcase6247 log ) (134.24 KB, application/zip)
2020-01-06 12:42 UTC, Shir Fishbain
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 105316 0 'None' MERGED core: content-length already present - fix 2021-02-09 16:31:28 UTC
oVirt gerrit 105797 0 master MERGED core: missing Task-id header fix 2021-02-09 16:31:28 UTC

Description Avihai 2019-12-03 09:17:10 UTC
Created attachment 1641621 [details]
engine and vdsm logs

Description of problem:
Moving storage domain to maintenance cause SPM host to lose connectivity for a couple of seconds and if fails with :

2019-12-03 11:02:55,308+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135
] Command 'UploadStreamVDSCommand(HostName = host_mixed_3, UploadStreamVDSCommandParameters:{hostId='152a58c6-7669-4e35-8319-47a913f8fd1e'})' execution failed: org.apache.http.ProtocolException: Content-Length 
header already present
2019-12-03 11:02:55,308+02 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135
] FINISH, UploadStreamVDSCommand, return: , log id: 411db439
2019-12-03 11:02:55,309+02 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] 
Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: org.apache.http.ProtocolException: Content-Length
 header already present (Failed with error VDS_NETWORK_ERROR and code 5022)

2019-12-03 11:02:55,427+02 WARN  [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] 
Validation of action 'UploadStream' failed for user admin@internal-authz. Reasons: VAR__TYPE__STORAGE__DOMAIN,VAR__HOST_STATUS__UP,ACTION_TYPE_FAILED_VDS_STATUS_ILLEGAL
2019-12-03 11:02:55,433+02 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] 
Lock freed to object 'EngineLock:{exclusiveLocks='', sharedLocks='[152a58c6-7669-4e35-8319-47a913f8fd1e=VDS_EXECUTION]'}'
2019-12-03 11:02:55,449+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb3
6135] EVENT_ID: UPDATE_FOR_OVF_STORES_FAILED(1,016), Failed to update OVF disks 86b81a6f-7b33-4428-b829-dc1138646f8f, 51306724-d933-424e-b039-5055b3cbac01, OVF data isn't updated on those OVF stores (Data Cente
r golden_env_mixed, Storage Domain nfs_0).
2019-12-03 11:02:55,466+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb3
6135] EVENT_ID: UPDATE_OVF_FOR_STORAGE_DOMAIN_FAILED(190), Failed to update VMs/Templates OVF data for Storage Domain nfs_0 in Data Center golden_env_mixed.
2019-12-03 11:02:55,471+02 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b
-8590-3745ebb36135] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[a7850294-595f-4921-9719-15aabbe0507e=OVF_UPDATE]'}'
2019-12-03 11:02:55,475+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' with children [] failed when attempting to perform the next operation, marking as 'FAILED'
2019-12-03 11:02:55,475+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2019-12-03 11:02:55,476+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' child commands '[]' executions were completed, status 'FAILED'
2019-12-03 11:02:55,476+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands.
2019-12-03 11:02:56,492+02 ERROR [org.ovirt.engine.core.bll.storage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-5) [f02313c1-8639-457b-8590-3745ebb36135] Command 'DeactivateStorageDomainWithOvfUpdate' id 'b9230777-f4c0-4783-886b-7d7e3b5e5d18' failed executing step 'UPDATE_OVF_STORE'

And then with:
2019-12-03 11:02:55,475+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' with children [] failed when attempting to perform the next operation, marking as 'FAILED'
2019-12-03 11:02:55,475+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2019-12-03 11:02:55,476+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' child commands '[]' executions were completed, status 'FAILED'
2019-12-03 11:02:55,476+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands.
2019-12-03 11:02:56,492+02 ERROR [org.ovirt.engine.core.bll.storage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-5) [f02313c1-8639-457b-8590-3745ebb36135] Command 'DeactivateStorageDomainWithOvfUpdate' id 'b9230777-f4c0-4783-886b-7d7e3b5e5d18' failed executing step 'UPDATE_OVF_STORE'


Version-Release number of selected component (if applicable):
rhv 4.4.0-6

How reproducible:
100% manually also

Steps to Reproduce:
1.Create storage domain( I used NFS)
2. Move storage domain to maintanance


Actual results:
Storage domain fails to move to maintenance.

Expected results:


Additional info:

Comment 2 RHEL Program Management 2019-12-03 09:42:02 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Tal Nisan 2019-12-03 11:40:44 UTC
This seems to be caused by the moving from Apache HTTP client - https://gerrit.ovirt.org/#/q/topic:remove-commons-httpclient+(status:open+OR+status:merged)
Infra team is currently investigating

Comment 4 Artur Socha 2019-12-03 13:08:48 UTC
I confirm it was due to http client change. Reassigning bug to myself

Comment 5 Michal Skrivanek 2019-12-10 06:02:13 UTC
*** Bug 1781102 has been marked as a duplicate of this bug. ***

Comment 7 Lukas Svaty 2019-12-17 13:54:38 UTC
FailedQE. Can you please check if patches got through or we have different path reproducer?

Comment 12 Artur Socha 2019-12-17 16:47:35 UTC
Looks like previous patch fixed only part of the problem. There was another issue which I have addressed in the the new patch.

Verification I run:
1) created new storage domain (nfs,  data)
2) activated it
3) once it was 'green' I put it in maintenance mode(before this patch it failed at this step)
3) detached domain
4) removed domain

Please add other relevant verification paths if there is anything more

Comment 13 RHV bug bot 2019-12-20 17:44:37 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Tag 'ovirt-engine-4.4.0' doesn't contain patch 'https://gerrit.ovirt.org/105316']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.4.0

For more info please contact: infra

Comment 17 Avihai 2020-01-02 16:00:15 UTC
Sorry Michael but we checked on multiple ENVs on rhv-4.4.0-13 and we do not see the issue anymore.
I checked on 5 different ENVs which were all clean reprovisioned.
Maybe as you upgraded manually you get to see this issue?

Can you(Michael) share the ENV you see this issue with the bug assignee so he can debug further?

reproduction/verification step is as simple as deactivating a storage domain, that's it no other magic here.

Comment 18 Michael Burman 2020-01-05 13:46:59 UTC
(In reply to Avihai from comment #17)
> Sorry Michael but we checked on multiple ENVs on rhv-4.4.0-13 and we do not
> see the issue anymore.
> I checked on 5 different ENVs which were all clean reprovisioned.
> Maybe as you upgraded manually you get to see this issue?
> 
> Can you(Michael) share the ENV you see this issue with the bug assignee so
> he can debug further?
> 
> reproduction/verification step is as simple as deactivating a storage
> domain, that's it no other magic here.

Hi
I don't have the setup available at the moment. will update once i have reproduction. If you guys don't see this bug on 4.4.0-13 than OK

Comment 19 Avihai 2020-01-06 06:58:46 UTC
Indeed manually we can not reproduce this issue BUT we see many such issues in rhv-4.4.0-13 Tier2 regression suit run.

Shir will update will all the data.

Comment 20 Avihai 2020-01-06 07:19:35 UTC
Move back to ASSIGNED as checking https://code.engineering.redhat.com/gerrit/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/rhv-4.4.0-13  shows the two related patches being included in rhv-4.4.0-13.

@e still see the same issue meaning storage domain deactivation fails but with different scenarios/flows and only in running automation regression suits.

Also with automation, this is hard to reproduce with a running a single testcase but only when runnning the test suit.
Shir is working to find a scenario the will be more easy to reproduce.

Comment 21 Avihai 2020-01-06 07:39:11 UTC
Hi Artur,

I have one ENV I see this issue 100% reproduce when I force update OVF on a storage domain via webadmin.
Sending you a mail with the ENV details so he can debug further.

Engine:
2020-01-06 09:21:40,395+02 INFO  [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[b218297a-8654-4ebf-910c-bc20e3932acb=OVF_UPDATE]'}'
2020-01-06 09:21:40,398+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] Command 'UpdateOvfStoreForStorageDomain' id: '54f6fe34-d9a1-4adc-932d-73863de3fc4a' with children [] failed when attempting to perform the next operation, marking as 'FAILED'
2020-01-06 09:21:40,398+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
        at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
        at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2020-01-06 09:21:40,399+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] Command 'UpdateOvfStoreForStorageDomain' id: '54f6fe34-d9a1-4adc-932d-73863de3fc4a' child commands '[]' executions were completed, status 'FAILED'
2020-01-06 09:21:41,464+02 INFO  [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-92) [4290cd59] Command 'ProcessOvfUpdateForStorageDomain' id: '30035f6b-5418-41b8-801c-a4563d32683c' child commands '[fbdb6127-7f78-4c0f-9b82-18b441b65fa3, f0798b16-ef34-479d-bf08-bbf6a528daa2]' executions were completed, status 'FAILED'
2020-01-06 09:21:41,476+02 ERROR [org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-92) [574895d7-91b2-4d39-9770-d37f07b0f345] Ending command 'org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand' with failure.

Comment 22 Shir Fishbain 2020-01-06 12:39:04 UTC
The errors below appear in the running of rhv-4.4.0-13 Tier2 (logs attached for each scenario):

1. After deactivate the iscsi SD (test_detach_attach_new_domain[iscsi] testcase5300 log):

from engine.log:
2020-01-03 14:06:07,834+02 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [storagedomains_syncAction_90d78ecd-6] Failed to update OVF_STORE content

2020-01-03 14:06:07,834+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [storagedomains_syncAction_90d78ecd-6] Command 'ProcessOvfUpdateForStorageDomain' id: 'ea5ff01e-2fa5-4696-b1b8-33334b07c35d' with children [ff8286eb-94d7-45d2-a96e-eacf672a82d1, bf3f7dd9-53ac-4487-a0b1-57a313f9d519] failed when attempting to perform the next operation, marking as 'ACTIVE'

2020-01-03 14:06:07,834+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [storagedomains_syncAction_90d78ecd-6] null: java.lang.RuntimeException
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand.performNextOperation(ProcessOvfUpdateForStorageDomainCommand.java:443)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2. After updating OVF (Test_change_ovf_store_count[iscsi] testcase6261 log):

from engine.log:
2020-01-03 23:40:10,506+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-19) [53bed88e] Domain 'e7315b3e-565e-4333-a485-0a2b4020f5fe:iscsi_1' was reported with error code '2001'

2020-01-03 23:40:10,506+02 ERROR [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (EE-ManagedThreadFactory-engine-Thread-19) [53bed88e] Storage Domain 'iscsi_1' of pool 'golden_env_mixed' is in problem in host 'host_mixed_2'

2020-01-03 23:40:10,520+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-19) [53bed88e] EVENT_ID: VDS_STORAGE_VDS_STATS_FAILED(189), Host host_mixed_2 reports about one of the Active Storage Domains as Problematic.

from art.log:
2020-01-03 23:48:35,849 - MainThread - rhevmtests.storage.storage_ovf_on_any_domain.helpers - ERROR Unable to activate the OVF Store LV in order to allow for it to be copied

from engine.log:
2020-01-03 23:49:25,850+02 ERROR [org.ovirt.engine.core.bll.CpuFlagsManagerHandler] (ServerService Thread Pool -- 51) [] Error getting info for CPU ' ', not in expected format.
2020-01-03 23:49:25,851+02 ERROR [org.ovirt.engine.core.bll.CpuFlagsManagerHandler] (ServerService Thread Pool -- 51) [] Error getting info for CPU ' ', not in expected format.


3. After updating OVF (Test_disk_on_master_domain[iscsi] testcase6247 log ):

from art.log
2020-01-03 23:32:44,946 - MainThread - rhevmtests.storage.storage_ovf_on_any_domain.helpers - ERROR - Unable to activate the OVF Store LV in order to allow for it to be copied

from engine.log (almost same errors from the description of this bug):
2020-01-03 23:33:33,100+02 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32) [6dd13d10] Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketException: Connection reset (Failed with error VDS_NETWORK_ERROR and code 5022)

2020-01-03 23:33:33,123+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32) [6dd13d10] Command 'UpdateOvfStoreForStorageDomain' id: '440351a1-be0a-4449-a6db-8dc9d0f8cee3' with children [] failed when attempting to perform the next operation, marking as 'FAILED'

2020-01-03 23:33:33,123+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32) [6dd13d10] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

Comment 23 Shir Fishbain 2020-01-06 12:40:34 UTC
Created attachment 1650105 [details]
After deactivate the iscsi SD (test_detach_attach_new_domain[iscsi] testcase5300 log)

Comment 24 Shir Fishbain 2020-01-06 12:41:23 UTC
Created attachment 1650106 [details]
After updating OVF (Test_change_ovf_store_count[iscsi] testcase6261 log)

Comment 25 Shir Fishbain 2020-01-06 12:42:12 UTC
Created attachment 1650114 [details]
After updating OVF (Test_disk_on_master_domain[iscsi] testcase6247 log )

Comment 26 Artur Socha 2020-01-07 10:54:02 UTC
(In reply to Avihai from comment #21)
> Hi Artur,
> 
> I have one ENV I see this issue 100% reproduce when I force update OVF on a
> storage domain via webadmin.
> Sending you a mail with the ENV details so he can debug further.
> 

<cut>

Hi Avihai,

I  was checking  (and comparing) logs you attached with those from BZ. It seems that on the environment you provided there is a problem with invalid certificate:
"Certificate of host storage-ge5-vdsm2.scl.lab.tlv.redhat.com is invalid. The certificate doesn't contain valid subject alternative name, please enroll new certificate for the host.
1/2/202:42:20 PM"
However, in BZ logs (uploaded by Shir) I can see a bunch of 'connection reset' errors. Perhaps, that is a matter of http/socket timeout settings. Next thing I'm going to do, is to review them. The original change includes replacing old http client with never one which makes it quite possible to have different defaults (if not specified explicitly)

Definitely,  previously reported issues  are already fixed with provided patched.

Comment 27 Martin Perina 2020-01-20 13:21:31 UTC
Parallel certificate issues should be solved in BZ1787195, so moving this one also MODIFIED, if still reproduced please fail  by QE

Comment 28 RHV bug bot 2020-01-24 19:50:48 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Tag 'ovirt-engine-4.4.0' doesn't contain patch 'https://gerrit.ovirt.org/105316']
gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.4.0

For more info please contact: infra

Comment 29 Lukas Svaty 2020-01-27 13:39:48 UTC
Can we retry this on latest build?

Comment 31 Yosi Ben Shimon 2020-01-27 15:40:08 UTC
Tested on:
ovirt-engine-4.4.0-0.17.master.el7.noarch
vdsm-4.40.1-1.el8ev.x86_64

Tried to:
- Activate
- Deactivate
- force OVF update
- start/stop VM with disk running on the storage domain

All went fine without any errors.

VERIFIED

Comment 32 Sandro Bonazzola 2020-05-20 20:01:12 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.