Bug 1752282 - Cannot activate Host while RefreshHostCapabilitiesCommand is running
Summary: Cannot activate Host while RefreshHostCapabilitiesCommand is running
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.3.6.5
Hardware: x86_64
OS: Unspecified
unspecified
high
Target Milestone: ovirt-4.4.0
: ---
Assignee: eraviv
QA Contact: Daniel
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2019-09-15 16:03 UTC by Shir Fishbain
Modified: 2020-05-20 20:02 UTC (History)
5 users (show)

Fixed In Version: 4.4.0-17
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-05-20 20:02:22 UTC
oVirt Team: Network
Embargoed:
pm-rhel: ovirt-4.4?
pm-rhel: ovirt-4.5?
dholler: planning_ack?
dholler: devel_ack+
pm-rhel: testing_ack+


Attachments (Terms of Use)
Logs (1.05 MB, application/zip)
2019-09-15 16:04 UTC, Shir Fishbain
no flags Details


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 106414 0 master MERGED core: Increase SetupNetworksWaitTimeoutSeconds 2020-11-01 17:41:52 UTC

Description Shir Fishbain 2019-09-15 16:03:18 UTC
Description of problem:
From the maintenance point of view, the engine log shows "disconnect" status that not being released and the host can't switch to activate status. 
The disconnection takes around 20 minutes until the test failed.
It looks like a lock on HOST_NETWORK is not freed.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.6.5-0.1.el7.noarch
vdsm-4.30.30-1.el7ev.x86_64

How reproducible:
Once (Couldn't manage to reproduce it, automation TC)

Steps to Reproduce:
1. Add storage domain to data-center 
2.Create VM from template
3.Create new data-center with a new cluster 
4. Move host to the new cluster 
5.Switch host to different cluster {'activate': True}
6.Deactivate the host with {'expected_status': 'maintenance'}
7.Update properties of the host (provided in parameters) with the new cluster
8.Activate the host (set status to UP)

Actual results:

2019-09-10 05:26:52,241+03 INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Before acquiring and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:26:52,241+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:26:55,237+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:26:55,240+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:26:58,162+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:26:58,743+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] FINISH, DisconnectStorageServerVDSCommand, return: {c8cf6f5d-8a77-42db-b7fd-2b7ba5f4217a=0, bfe4f2f8-a6c9-45d9-8cc1-cbda8ad46c48=0, e2c0716c-ba1f-4f76-b055-57939af81387=0}, log id: 548648da
2019-09-10 05:26:58,793+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] START, DisconnectStorageServerVDSCommand(HostName = host_mixed_2, StorageServerConnectionManagementVDSParameters:{hostId='fc25e468-758f-4b16-b33a-14a6229d5168', storagePoolId='b51b71f1-bcea-41d3-bff3-18f964659c96', storageType='ISCSI', connectionList='[StorageServerConnections:{id='1ce81d76-0996-4a26-aa60-c623f16276bf', connection='mantis-iscsi-lif2.lab.eng.tlv2.redhat.com', iqn='iqn.1992-08.com.netapp:vserver-rhv-qe', vfsType='null', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]', sendNetworkEventOnFailure='true'}), log id: 108b27d1
2019-09-10 05:26:59,726+03 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] Setting new tasks map. The map contains now 0 tasks
2019-09-10 05:26:59,726+03 INFO  [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (EE-ManagedThreadFactory-engineScheduled-Thread-14) [] Cleared all tasks of pool 'b51b71f1-bcea-41d3-bff3-18f964659c96'.
2019-09-10 05:27:01,166+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:04,169+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:07,174+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:08,211+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] FINISH, DisconnectStorageServerVDSCommand, return: {1ce81d76-0996-4a26-aa60-c623f16276bf=0}, log id: 108b27d1
2019-09-10 05:27:08,252+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] START, DisconnectStorageServerVDSCommand(HostName = host_mixed_2, StorageServerConnectionManagementVDSParameters:{hostId='fc25e468-758f-4b16-b33a-14a6229d5168', storagePoolId='b51b71f1-bcea-41d3-bff3-18f964659c96', storageType='FCP', connectionList='[StorageServerConnections:{id='00000000-0000-0000-0000-000000000000', connection='null', iqn='null', vfsType='null', mountOptions='null', nfsVersion='null', nfsRetrans='null', nfsTimeo='null', iface='null', netIfaceName='null'}]', sendNetworkEventOnFailure='true'}), log id: 229ec5b7
2019-09-10 05:27:10,256+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:10,260+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:13,184+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:15,779+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] FINISH, DisconnectStorageServerVDSCommand, return: {00000000-0000-0000-0000-000000000000=0}, log id: 229ec5b7
2019-09-10 05:27:15,797+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.RegisterLibvirtSecretsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] START, RegisterLibvirtSecretsVDSCommand(HostName = host_mixed_2, RegisterLibvirtSecretsVDSParameters:{hostId='fc25e468-758f-4b16-b33a-14a6229d5168', libvirtSecrets='[]', clearUnusedSecrets='true'}), log id: 4e334fd2
2019-09-10 05:27:15,815+03 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.RegisterLibvirtSecretsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-17) [4dab9753] FINISH, RegisterLibvirtSecretsVDSCommand, return: , log id: 4e334fd2
2019-09-10 05:27:15,815+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:15,824+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:15,824+03 INFO  [org.ovirt.engine.core.vdsbroker.VdsManager] (default task-22) [hosts_update_9c7ba5d0-0144-4492] vdsManager::disposing
2019-09-10 05:27:15,824+03 INFO  [org.ovirt.engine.core.vdsbroker.RemoveVdsVDSCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] FINISH, RemoveVdsVDSCommand, return: , log id: d4a48f6
2019-09-10 05:27:15,831+03 INFO  [org.ovirt.engine.core.vdsbroker.AddVdsVDSCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] START, AddVdsVDSCommand(HostName = host_mixed_2, AddVdsVDSCommandParameters:{hostId='fc25e468-758f-4b16-b33a-14a6229d5168'}), log id: 12c08446
2019-09-10 05:27:15,831+03 INFO  [org.ovirt.engine.core.vdsbroker.AddVdsVDSCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] AddVds - entered , starting logic to add VDS 'fc25e468-758f-4b16-b33a-14a6229d5168'
2019-09-10 05:27:15,833+03 INFO  [org.ovirt.engine.core.vdsbroker.AddVdsVDSCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] AddVds - VDS 'fc25e468-758f-4b16-b33a-14a6229d5168' was added, will try to add it to the resource manager
2019-09-10 05:27:15,834+03 INFO  [org.ovirt.engine.core.vdsbroker.VdsManager] (default task-22) [hosts_update_9c7ba5d0-0144-4492] Entered VdsManager constructor
2019-09-10 05:27:15,846+03 INFO  [org.ovirt.engine.core.vdsbroker.VdsManager] (default task-22) [hosts_update_9c7ba5d0-0144-4492] Initialize vdsBroker 'lynx26.lab.eng.tlv2.redhat.com:54321'
2019-09-10 05:27:15,848+03 INFO  [org.ovirt.engine.core.vdsbroker.ResourceManager] (default task-22) [hosts_update_9c7ba5d0-0144-4492] VDS 'fc25e468-758f-4b16-b33a-14a6229d5168' was added to the Resource Manager
2019-09-10 05:27:15,848+03 INFO  [org.ovirt.engine.core.vdsbroker.AddVdsVDSCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] FINISH, AddVdsVDSCommand, return: , log id: 12c08446
2019-09-10 05:27:15,863+03 INFO  [org.ovirt.engine.core.bll.AddVdsSpmIdCommand] (default task-22) [4364ebbf] Before acquiring and wait lock 'EngineLock:{exclusiveLocks='[6cb3a9e2-b481-4405-bd5b-be3d7a657ff6=REGISTER_VDS]', sharedLocks=''}'
2019-09-10 05:27:15,864+03 INFO  [org.ovirt.engine.core.bll.AddVdsSpmIdCommand] (default task-22) [4364ebbf] Lock-wait acquired to object 'EngineLock:{exclusiveLocks='[6cb3a9e2-b481-4405-bd5b-be3d7a657ff6=REGISTER_VDS]', sharedLocks=''}'
2019-09-10 05:27:15,867+03 INFO  [org.ovirt.engine.core.bll.AddVdsSpmIdCommand] (default task-22) [4364ebbf] Running command: AddVdsSpmIdCommand internal: true. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDS
2019-09-10 05:27:15,885+03 INFO  [org.ovirt.engine.core.bll.AddVdsSpmIdCommand] (default task-22) [4364ebbf] Lock freed to object 'EngineLock:{exclusiveLocks='[6cb3a9e2-b481-4405-bd5b-be3d7a657ff6=REGISTER_VDS]', sharedLocks=''}'
2019-09-10 05:27:15,885+03 INFO  [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (ForkJoinPool-1-worker-10) [7298807f] Failed to acquire lock and wait lock 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:15,939+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-22) [4364ebbf] EVENT_ID: USER_UPDATE_VDS(43), Host host_mixed_2 configuration was updated by admin@internal-authz.
2019-09-10 05:27:15,940+03 INFO  [org.ovirt.engine.core.bll.ChangeVDSClusterCommand] (default task-22) [4364ebbf] Lock freed to object 'EngineLock:{exclusiveLocks='[fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:15,940+03 INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Lock-wait acquired to object 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:15,946+03 INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Running command: RefreshHostCapabilitiesCommand internal: true. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
2019-09-10 05:27:15,946+03 INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Before acquiring lock in order to prevent monitoring for host 'host_mixed_2' from data-center 'dc_type_1005263257'
2019-09-10 05:27:15,946+03 INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Lock acquired, from now a monitoring of host will be skipped for host 'host_mixed_2' from data-center 'dc_type_1005263257'
2019-09-10 05:27:15,947+03 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to lynx26.lab.eng.tlv2.redhat.com/10.46.16.45
2019-09-10 05:27:15,949+03 INFO  [org.ovirt.engine.core.bll.network.host.PersistentHostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-8034) [5dc1f697] Running command: PersistentHostSetupNetworksCommand internal: true. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDS
2019-09-10 05:27:15,967+03 INFO  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-8034) [5dc1f697] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK]', sharedLocks=''}'
2019-09-10 05:27:15,967+03 WARN  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-8034) [5dc1f697] Validation of action 'HostSetupNetworks' failed for user admin@internal-authz. Reasons: VAR__ACTION__SETUP,VAR__TYPE__NETWORKS,ACTION_TYPE_FAILED_SETUP_NETWORKS_OR_REFRESH_IN_PROGRESS
2019-09-10 05:27:16,055+03 INFO  [org.ovirt.engine.core.bll.hostdeploy.UpdateVdsCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] Running command: UpdateVdsCommand internal: false. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDSAction group EDIT_HOST_CONFIGURATION with role type ADMIN
2019-09-10 05:27:16,080+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-22) [hosts_update_9c7ba5d0-0144-4492] EVENT_ID: VDS_ALERT_FENCE_IS_NOT_CONFIGURED(9,000), Failed to verify Power Management configuration for Host host_mixed_2.
2019-09-10 05:27:16,095+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-22) [hosts_update_9c7ba5d0-0144-4492] EVENT_ID: USER_UPDATE_VDS(43), Host host_mixed_2 configuration was updated by admin@internal-authz.
2019-09-10 05:27:16,216+03 INFO  [org.ovirt.engine.core.bll.ActivateVdsCommand] (default task-22) [hosts_syncAction_0f902f58-bfcc-4c2d] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
2019-09-10 05:27:16,216+03 WARN  [org.ovirt.engine.core.bll.ActivateVdsCommand] (default task-22) [hosts_syncAction_0f902f58-bfcc-4c2d] Validation of action 'ActivateVds' failed for user admin@internal-authz. Reasons: VAR__ACTION__ACTIVATE,VAR__TYPE__HOST,ACTION_TYPE_FAILED_OBJECT_LOCKED
2019-09-10 05:27:16,229+03 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-22) [] Operation Failed: [Cannot activate Host. Related operation is currently in progress. Please try again later.]

Expected results:
The host should be in activate status.

Additional info:
1. I talked with Fred Rolland that told me the host is lock on HOST_NETWORK and it seems network object lock is not freed.
2. I have sent a mail to Danny Hollar and he hasn't replied yet.

Comment 1 Shir Fishbain 2019-09-15 16:04:34 UTC
Created attachment 1615347 [details]
Logs

Comment 2 Shir Fishbain 2019-09-15 16:56:35 UTC
Very important comment :
The disconnection takes around 20 sec until the test failed.

Comment 3 Dominik Holler 2019-09-17 15:25:23 UTC
We changed the lock behavior in this area in 4.4.
Shir, any chance to check if this is issue is reproducible on 4.4?

Comment 4 Dominik Holler 2019-10-08 11:35:06 UTC
The issue is that host activation fails during refreshcaps is running:
INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Lock-wait acquired to object 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Running command: RefreshHostCapabilitiesCommand internal: true. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDSAction group MANIPULATE_HOST with role type ADMIN
INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Before acquiring lock in order to prevent monitoring for host 'host_mixed_2' from data-center 'dc_type_1005263257'
INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [7298807f] Lock acquired, from now a monitoring of host will be skipped for host 'host_mixed_2' from data-center 'dc_type_1005263257'
INFO  [org.ovirt.engine.core.bll.network.host.PersistentHostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-8034) [5dc1f697] Running command: PersistentHostSetupNetworksCommand internal: true. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDS
INFO  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-8034) [5dc1f697] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK]', sharedLocks=''}'
WARN  [org.ovirt.engine.core.bll.network.host.HostSetupNetworksCommand] (EE-ManagedThreadFactory-engine-Thread-8034) [5dc1f697] Validation of action 'HostSetupNetworks' failed for user admin@internal-authz. Reasons: VAR__ACTION__SETUP,VAR__TYPE__NETWORKS,ACTION_TYPE_FAILED_SETUP_NETWORKS_OR_REFRESH_IN_PROGRESS
INFO  [org.ovirt.engine.core.bll.hostdeploy.UpdateVdsCommand] (default task-22) [hosts_update_9c7ba5d0-0144-4492] Running command: UpdateVdsCommand internal: false. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDSAction group EDIT_HOST_CONFIGURATION with role type ADMIN
INFO  [org.ovirt.engine.core.bll.ActivateVdsCommand] (default task-22) [hosts_syncAction_0f902f58-bfcc-4c2d] Failed to Acquire Lock to object 'EngineLock:{exclusiveLocks='[fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'
WARN  [org.ovirt.engine.core.bll.ActivateVdsCommand] (default task-22) [hosts_syncAction_0f902f58-bfcc-4c2d] Validation of action 'ActivateVds' failed for user admin@internal-authz. Reasons: VAR__ACTION__ACTIVATE,VAR__TYPE__HOST,ACTION_TYPE_FAILED_OBJECT_LOCKED
ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-22) [] Operation Failed: [Cannot activate Host. Related operation is currently in progress. Please try again later.]
INFO  [org.ovirt.engine.core.bll.HandleVdsVersionCommand] (ForkJoinPool-1-worker-10) [6d89db25] Running command: HandleVdsVersionCommand internal: true. Entities affected :  ID: fc25e468-758f-4b16-b33a-14a6229d5168 Type: VDS
INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [6d89db25] Refresh host capabilities finished. Lock released. Monitoring can run now for host 'host_mixed_2' from data-center 'dc_type_1005263257'
INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-10) [6d89db25] EVENT_ID: HOST_REFRESHED_CAPABILITIES(606), Successfully refreshed the capabilities of host host_mixed_2.
INFO  [org.ovirt.engine.core.bll.RefreshHostCapabilitiesCommand] (ForkJoinPool-1-worker-10) [6d89db25] Lock freed to object 'EngineLock:{exclusiveLocks='[HOST_NETWORKfc25e468-758f-4b16-b33a-14a6229d5168=HOST_NETWORK, fc25e468-758f-4b16-b33a-14a6229d5168=VDS]', sharedLocks=''}'

Comment 5 Michael Burman 2020-01-20 09:56:40 UTC
(In reply to Dominik Holler from comment #3)
> We changed the lock behavior in this area in 4.4.
> Shir, any chance to check if this is issue is reproducible on 4.4?

Similar issue reproduced on network tests on 4.4 with nmstate.(sync API tests)
We need to handle this kind of failures, may be with increasing the default lock timeout config setting

2020-01-09 11:37:46,793+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-12) [] Operation Failed: [Cannot edit Host. Related operatio
n is currently in progress. Please try again later.]

Comment 6 Dominik Holler 2020-01-28 07:07:31 UTC
Please let me know if the issues happend again with ovirt-engine >= 4.4.0-17.

Comment 9 Dominik Holler 2020-01-28 09:54:06 UTC
oVirt gerrit 106414 did not solve the root of the issue, but maybe it is already good enough.
I like to close this bug, if it is not reproduced anymore.

Comment 10 Daniel 2020-02-25 09:06:46 UTC
Hi Dominik,
As this bug is impossible to reproduce manually we need your help with trying to do so synthetically(via code path if possible)
Can you please provide me a way to reproduce it via code so we can verify this with high confidence? 

We will update about this issue reintroduction from 4.4.0-17 onward but this is a weak reproduction with low confidence.

Comment 11 Dominik Holler 2020-02-25 09:21:47 UTC
(In reply to Daniel from comment #10)
> Hi Dominik,
> As this bug is impossible to reproduce manually we need your help with
> trying to do so synthetically(via code path if possible)
> Can you please provide me a way to reproduce it via code so we can verify
> this with high confidence? 
> 

Triggering a setupNetworks during
curl --header 'Content-Type: application/json' --user admin@internal:$OVIRT_PASSWORD --insecure -X POST https://$OVIRT_HOSTNAME/ovirt-engine/api/hosts/$HOST_ID/refresh --data '{}'
is running.

engine.log should contain: "Failed to acquire lock, will try again until timeout"

During adding the host, the refresh should take less than 20 seconds.


> We will update about this issue reintroduction from 4.4.0-17 onward but this
> is a weak reproduction with low confidence.

If the automation is not running into this issues anymore, I would close the bug.

Comment 14 Shir Fishbain 2020-02-25 11:51:58 UTC
The bug has seen in our automation in storage_import_storage_domain/test_detach_attach.py, the test cases are: TestCase5201 and TestCase5300

Comment 15 Michael Burman 2020-02-25 11:55:36 UTC
(In reply to Shir Fishbain from comment #14)
> The bug has seen in our automation in
> storage_import_storage_domain/test_detach_attach.py, the test cases are:
> TestCase5201 and TestCase5300

Tnx Shir,
So from my point of view, if this tests are pass on rhv 4.40-20 and the issues is not appear we are good)

Comment 16 Daniel 2020-03-04 06:36:13 UTC
According to the conversation above, verified via Jenkins by running the specific test cases again.
Both test cases TestCase5201 & TestCase5300 succeed on version 4.4.0-0.24.master.el7.
Jenkins run output can be found here: https://rhv-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/4.4_dev/job/rhv-4.4-ge-flow-storage/241/

Comment 17 Sandro Bonazzola 2020-05-20 20:02:22 UTC
This bugzilla is included in oVirt 4.4.0 release, published on May 20th 2020.

Since the problem described in this bug report should be
resolved in oVirt 4.4.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.


Note You need to log in before you can comment on or make changes to this bug.