Created attachment 1641621 [details]
engine and vdsm logs

Description of problem:
Moving a storage domain to maintenance causes the SPM host to lose connectivity for a couple of seconds, and the operation fails with:

2019-12-03 11:02:55,308+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UploadStreamVDSCommand(HostName = host_mixed_3, UploadStreamVDSCommandParameters:{hostId='152a58c6-7669-4e35-8319-47a913f8fd1e'})' execution failed: org.apache.http.ProtocolException: Content-Length header already present

2019-12-03 11:02:55,308+02 INFO [org.ovirt.engine.core.vdsbroker.irsbroker.UploadStreamVDSCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] FINISH, UploadStreamVDSCommand, return: , log id: 411db439

2019-12-03 11:02:55,309+02 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: org.apache.http.ProtocolException: Content-Length header already present (Failed with error VDS_NETWORK_ERROR and code 5022)

2019-12-03 11:02:55,427+02 WARN [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Validation of action 'UploadStream' failed for user admin@internal-authz. Reasons: VAR__TYPE__STORAGE__DOMAIN,VAR__HOST_STATUS__UP,ACTION_TYPE_FAILED_VDS_STATUS_ILLEGAL

2019-12-03 11:02:55,433+02 INFO [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Lock freed to object 'EngineLock:{exclusiveLocks='', sharedLocks='[152a58c6-7669-4e35-8319-47a913f8fd1e=VDS_EXECUTION]'}'

2019-12-03 11:02:55,449+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] EVENT_ID: UPDATE_FOR_OVF_STORES_FAILED(1,016), Failed to update OVF disks 86b81a6f-7b33-4428-b829-dc1138646f8f, 51306724-d933-424e-b039-5055b3cbac01, OVF data isn't updated on those OVF stores (Data Center golden_env_mixed, Storage Domain nfs_0).

2019-12-03 11:02:55,466+02 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] EVENT_ID: UPDATE_OVF_FOR_STORAGE_DOMAIN_FAILED(190), Failed to update VMs/Templates OVF data for Storage Domain nfs_0 in Data Center golden_env_mixed.
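For context on the 'Content-Length header already present' failure: HTTP client libraries typically compute Content-Length themselves for entity-enclosing requests and reject a request that already carries a caller-supplied value rather than sending the header twice. A minimal Python sketch of that sanity check (a loose analogue of the Apache HttpClient protocol-interceptor behavior, not the engine's actual code):

```python
class ProtocolError(Exception):
    """Stands in for org.apache.http.ProtocolException in this sketch."""


def prepare_entity_request(headers, body):
    """Return headers with Content-Length computed by the transport.

    Loose analogue of the client-side check: the transport owns
    Content-Length for entity-enclosing requests, so a caller-supplied
    value is rejected instead of being duplicated.
    """
    if any(name.lower() == "content-length" for name in headers):
        raise ProtocolError("Content-Length header already present")
    out = dict(headers)
    out["Content-Length"] = str(len(body))
    return out


# First call is fine; a request that arrives with the header pre-set
# trips the check, which matches the exception seen in the log above.
ok = prepare_entity_request({"Content-Type": "application/octet-stream"}, b"ovf-data")
print(ok["Content-Length"])  # 8
```

If the upload path sets the header itself and then hands the request to a client that computes it again, this is the kind of check that fires.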
2019-12-03 11:02:55,471+02 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[a7850294-595f-4921-9719-15aabbe0507e=OVF_UPDATE]'}'

2019-12-03 11:02:55,475+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' with children [] failed when attempting to perform the next operation, marking as 'FAILED'

2019-12-03 11:02:55,475+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2019-12-03 11:02:55,476+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' child commands '[]' executions were completed, status 'FAILED'

2019-12-03 11:02:55,476+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-87) [f02313c1-8639-457b-8590-3745ebb36135] Command 'UpdateOvfStoreForStorageDomain' id: 'c07980bd-fc35-4813-8284-1da57209138f' Updating status to 'FAILED', The command end method logic will be executed by one of its parent commands.
2019-12-03 11:02:56,492+02 ERROR [org.ovirt.engine.core.bll.storage.domain.DeactivateStorageDomainWithOvfUpdateCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-5) [f02313c1-8639-457b-8590-3745ebb36135] Command 'DeactivateStorageDomainWithOvfUpdate' id 'b9230777-f4c0-4783-886b-7d7e3b5e5d18' failed executing step 'UPDATE_OVF_STORE'

(The same 'UpdateOvfStoreForStorageDomain' failure, stack trace, and 'UPDATE_OVF_STORE' step error then repeat verbatim in the log.)

Version-Release number of selected component (if applicable):
rhv 4.4.0-6

How reproducible:
100%, also manually

Steps to Reproduce:
1. Create a storage domain (I used NFS)
2. Move the storage domain to maintenance

Actual results:
The storage domain fails to move to maintenance.

Expected results:
The storage domain moves to maintenance.

Additional info:
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
This seems to be caused by the migration away from Apache Commons HttpClient: https://gerrit.ovirt.org/#/q/topic:remove-commons-httpclient+(status:open+OR+status:merged) The infra team is currently investigating.
I confirm it was due to the http client change. Reassigning the bug to myself.
*** Bug 1781102 has been marked as a duplicate of this bug. ***
Failed QE. Can you please check whether the patches got through, or whether we have a different reproduction path?
Looks like the previous patch fixed only part of the problem. There was another issue, which I have addressed in the new patch.

Verification I ran:
1) Created a new storage domain (NFS, data)
2) Activated it
3) Once it was 'green', put it into maintenance mode (before this patch it failed at this step)
4) Detached the domain
5) Removed the domain

Please add other relevant verification paths if there is anything more.
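The verification flow above is order-sensitive: the domain must reach maintenance (the step that used to fail) before detach and remove are allowed. A sketch of that sequence against a generic storage-domain service object; the interface (activate/deactivate/detach/remove/status) is a hypothetical minimal one for illustration, not the actual oVirt SDK:

```python
def verify_storage_domain_lifecycle(sd):
    """Drive the verification steps from this comment, in order.

    `sd` is any object exposing a hypothetical minimal interface:
    activate(), deactivate(), detach(), remove(), status().
    """
    sd.activate()
    assert sd.status() == "active", "domain never went 'green'"
    sd.deactivate()  # before the patch, this step failed with the OVF-update error
    assert sd.status() == "maintenance", "deactivation did not reach maintenance"
    sd.detach()
    sd.remove()
    return sd.status()
```

Any environment-specific check (e.g. grepping engine.log for the OVF errors quoted in this bug) would go between the steps.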
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason: [Tag 'ovirt-engine-4.4.0' doesn't contain patch 'https://gerrit.ovirt.org/105316'] gitweb: https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/ovirt-engine-4.4.0 For more info please contact: infra
Sorry Michael, but we checked on multiple ENVs with rhv-4.4.0-13 and we no longer see the issue. I checked on 5 different ENVs, all of them cleanly reprovisioned. Maybe you see this issue because you upgraded manually?

Can you (Michael) share the ENV where you see this issue with the bug assignee so he can debug further?

The reproduction/verification step is as simple as deactivating a storage domain; that's it, no other magic here.
(In reply to Avihai from comment #17)
> Sorry Michael but we checked on multiple ENVs on rhv-4.4.0-13 and we do not
> see the issue anymore.
> I checked on 5 different ENVs which were all clean reprovisioned.
> Maybe as you upgraded manually you get to see this issue?
>
> Can you(Michael) share the ENV you see this issue with the bug assignee so
> he can debug further?
>
> reproduction/verification step is as simple as deactivating a storage
> domain, that's it no other magic here.

Hi, I don't have the setup available at the moment. I will update once I have a reproduction. If you don't see this bug on 4.4.0-13, then OK.
Indeed, we cannot reproduce this issue manually, BUT we see many such failures in the rhv-4.4.0-13 Tier2 regression suite run. Shir will update with all the data.
Moving back to ASSIGNED: checking https://code.engineering.redhat.com/gerrit/gitweb?p=ovirt-engine.git;a=shortlog;h=refs/tags/rhv-4.4.0-13 shows the two related patches are included in rhv-4.4.0-13. We still see the same issue, meaning storage domain deactivation fails, but with different scenarios/flows and only when running the automation regression suites. Also, with automation this is hard to reproduce by running a single test case; it only reproduces when running the full test suite. Shir is working to find a scenario that will be easier to reproduce.
Hi Artur,

I have one ENV where this issue reproduces 100% of the time when I force an OVF update on a storage domain via the webadmin. Sending you a mail with the ENV details so you can debug further.

Engine:

2020-01-06 09:21:40,395+02 INFO [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] Lock freed to object 'EngineLock:{exclusiveLocks='[]', sharedLocks='[b218297a-8654-4ebf-910c-bc20e3932acb=OVF_UPDATE]'}'

2020-01-06 09:21:40,398+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] Command 'UpdateOvfStoreForStorageDomain' id: '54f6fe34-d9a1-4adc-932d-73863de3fc4a' with children [] failed when attempting to perform the next operation, marking as 'FAILED'

2020-01-06 09:21:40,398+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2020-01-06 09:21:40,399+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-37) [4290cd59] Command 'UpdateOvfStoreForStorageDomain' id: '54f6fe34-d9a1-4adc-932d-73863de3fc4a' child commands '[]' executions were completed, status 'FAILED'

2020-01-06 09:21:41,464+02 INFO [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-92) [4290cd59] Command 'ProcessOvfUpdateForStorageDomain' id: '30035f6b-5418-41b8-801c-a4563d32683c' child commands '[fbdb6127-7f78-4c0f-9b82-18b441b65fa3, f0798b16-ef34-479d-bf08-bbf6a528daa2]' executions were completed, status 'FAILED'

2020-01-06 09:21:41,476+02 ERROR [org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-92) [574895d7-91b2-4d39-9770-d37f07b0f345] Ending command 'org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand' with failure.
The errors below appear in the rhv-4.4.0-13 Tier2 run (logs attached for each scenario):

1. After deactivating the iscsi SD (test_detach_attach_new_domain[iscsi], testcase5300 log), from engine.log:

2020-01-03 14:06:07,834+02 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [storagedomains_syncAction_90d78ecd-6] Failed to update OVF_STORE content

2020-01-03 14:06:07,834+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [storagedomains_syncAction_90d78ecd-6] Command 'ProcessOvfUpdateForStorageDomain' id: 'ea5ff01e-2fa5-4696-b1b8-33334b07c35d' with children [ff8286eb-94d7-45d2-a96e-eacf672a82d1, bf3f7dd9-53ac-4487-a0b1-57a313f9d519] failed when attempting to perform the next operation, marking as 'ACTIVE'

2020-01-03 14:06:07,834+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-98) [storagedomains_syncAction_90d78ecd-6] null: java.lang.RuntimeException
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.ovfstore.ProcessOvfUpdateForStorageDomainCommand.performNextOperation(ProcessOvfUpdateForStorageDomainCommand.java:443)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)

2. After updating OVF (Test_change_ovf_store_count[iscsi], testcase6261 log), from engine.log:

2020-01-03 23:40:10,506+02 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsProxy] (EE-ManagedThreadFactory-engine-Thread-19) [53bed88e] Domain 'e7315b3e-565e-4333-a485-0a2b4020f5fe:iscsi_1' was reported with error code '2001'

2020-01-03 23:40:10,506+02 ERROR [org.ovirt.engine.core.bll.InitVdsOnUpCommand] (EE-ManagedThreadFactory-engine-Thread-19) [53bed88e] Storage Domain 'iscsi_1' of pool 'golden_env_mixed' is in problem in host 'host_mixed_2'

2020-01-03 23:40:10,520+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-19) [53bed88e] EVENT_ID: VDS_STORAGE_VDS_STATS_FAILED(189), Host host_mixed_2 reports about one of the Active Storage Domains as Problematic.

from art.log:

2020-01-03 23:48:35,849 - MainThread - rhevmtests.storage.storage_ovf_on_any_domain.helpers - ERROR Unable to activate the OVF Store LV in order to allow for it to be copied

from engine.log:

2020-01-03 23:49:25,850+02 ERROR [org.ovirt.engine.core.bll.CpuFlagsManagerHandler] (ServerService Thread Pool -- 51) [] Error getting info for CPU ' ', not in expected format.

2020-01-03 23:49:25,851+02 ERROR [org.ovirt.engine.core.bll.CpuFlagsManagerHandler] (ServerService Thread Pool -- 51) [] Error getting info for CPU ' ', not in expected format.

3. After updating OVF (Test_disk_on_master_domain[iscsi], testcase6247 log), from art.log:

2020-01-03 23:32:44,946 - MainThread - rhevmtests.storage.storage_ovf_on_any_domain.helpers - ERROR - Unable to activate the OVF Store LV in order to allow for it to be copied

from engine.log (almost the same errors as in the description of this bug):

2020-01-03 23:33:33,100+02 ERROR [org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32) [6dd13d10] Command 'org.ovirt.engine.core.bll.storage.ovfstore.UploadStreamCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: java.net.SocketException: Connection reset (Failed with error VDS_NETWORK_ERROR and code 5022)

2020-01-03 23:33:33,123+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32) [6dd13d10] Command 'UpdateOvfStoreForStorageDomain' id: '440351a1-be0a-4449-a6db-8dc9d0f8cee3' with children [] failed when attempting to perform the next operation, marking as 'FAILED'

2020-01-03 23:33:33,123+02 ERROR [org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-32) [6dd13d10] EngineException: ENGINE (Failed with error ENGINE and code 5001): org.ovirt.engine.core.common.errors.EngineException: EngineException: ENGINE (Failed with error ENGINE and code 5001)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.executeNextOperation(UpdateOvfStoreForStorageDomainCommand.java:124)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.storage.domain.UpdateOvfStoreForStorageDomainCommand.performNextOperation(UpdateOvfStoreForStorageDomainCommand.java:112)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.SerialChildCommandsExecutionCallback.childCommandsExecutionEnded(SerialChildCommandsExecutionCallback.java:32)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.ChildCommandsCallbackBase.doPolling(ChildCommandsCallbackBase.java:80)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethodsImpl(CommandCallbacksPoller.java:175)
	at deployment.engine.ear.bll.jar//org.ovirt.engine.core.bll.tasks.CommandCallbacksPoller.invokeCallbackMethods(CommandCallbacksPoller.java:109)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.access$201(ManagedScheduledThreadPoolExecutor.java:383)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.internal.ManagedScheduledThreadPoolExecutor$ManagedScheduledFutureTask.run(ManagedScheduledThreadPoolExecutor.java:534)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
	at org.glassfish.javax.enterprise.concurrent.0.redhat-1//org.glassfish.enterprise.concurrent.ManagedThreadFactoryImpl$ManagedThread.run(ManagedThreadFactoryImpl.java:250)
Created attachment 1650105 [details] After deactivate the iscsi SD (test_detach_attach_new_domain[iscsi] testcase5300 log)
Created attachment 1650106 [details] After updating OVF (Test_change_ovf_store_count[iscsi] testcase6261 log)
Created attachment 1650114 [details] After updating OVF (Test_disk_on_master_domain[iscsi] testcase6247 log)
(In reply to Avihai from comment #21)
> Hi Artur,
>
> I have one ENV I see this issue 100% reproduce when I force update OVF on a
> storage domain via webadmin.
> Sending you a mail with the ENV details so he can debug further.
<cut>

Hi Avihai,

I was checking (and comparing) the logs you attached with those from the BZ. It seems that on the environment you provided there is a problem with an invalid certificate:

"Certificate of host storage-ge5-vdsm2.scl.lab.tlv.redhat.com is invalid. The certificate doesn't contain valid subject alternative name, please enroll new certificate for the host. 1/2/20 2:42:20 PM"

However, in the BZ logs (uploaded by Shir) I can see a bunch of 'connection reset' errors. Perhaps that is a matter of http/socket timeout settings; the next thing I'm going to do is review them. The original change replaces the old http client with a newer one, which makes it quite possible that different defaults apply (if not specified explicitly).

The previously reported issues are definitely already fixed by the provided patches.
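On the "different defaults" point: networking libraries commonly block indefinitely unless a timeout is set explicitly, so swapping one HTTP client implementation for another can silently change connect/read timeout behavior. A generic illustration using Python's stdlib socket (not the engine's actual client, just the general mechanism):

```python
import socket

# A freshly created socket has no timeout: gettimeout() is None,
# meaning "block forever" on connect/recv if the peer stalls.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(s.gettimeout())  # None

# An explicit timeout makes a stalled peer fail fast with socket.timeout
# instead of hanging. A client migration that drops such an explicit
# setting, or inherits a different library default, changes how and
# when transfers fail (e.g. hangs vs. connection-reset-style errors).
s.settimeout(30.0)
print(s.gettimeout())  # 30.0
s.close()
```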
The parallel certificate issues should be solved in BZ1787195, so moving this one to MODIFIED as well; if it still reproduces, please have QE fail it.
Can we retry this on the latest build?
Tested on:
ovirt-engine-4.4.0-0.17.master.el7.noarch
vdsm-4.40.1-1.el8ev.x86_64

Tried to:
- Activate
- Deactivate
- Force OVF update
- Start/stop a VM with a disk running on the storage domain

All went fine without any errors.

VERIFIED
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020. Since the problem described in this bug report should be resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.