Description of problem: We ran a full cycle of the automation tests in the environment and we discovered a bug fills engine.log with error message: "LastUpdate Date is not initialized in the OVF_STORE disk." continuously in a loop. Log size become huge and could lead to crash. There is an additional storage domain in locked state, which is probably the causing the error. Storage domain name: "Register_domain_network_SD". Version-Release number of selected component (if applicable): Red Hat Virtualization Manager Version: 4.1.1-0.1.el7 How reproducible: Currently unknown. Actual results: Engine continues to write the error message to the engine.log file. Expected results: Engine should avoid filling the log with this error message. Additional info:
Created attachment 1255372 [details] engine.log 2017-02-17 03:00:06,728+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (default task-15) [11773bf0] Command 'PollVDSCommand(HostName = host_mixed_2, VdsIdVDSCommandParametersBase:{runAs ync='true', hostId='5d74b943-02b0-4d78-9771-27e0897bfcec'})' execution failed: java.util.concurrent.ExecutionException: org.ovirt.engine.core.vdsbroker.TransportRunTimeException: Connection issues during se nd request 2017-02-17 03:00:07,374+02 INFO [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) [] Connecting to vega05.qa.lab.tlv.redhat.com/10.35.128.28 2017-02-17 03:00:07,571+02 INFO [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (default task-2) [hosts_syncAction_16087ca6-63cc-445e] Failed to acquire lock and wait lock 'HostEngineLock:{exclusiveLo cks='[5d74b943-02b0-4d78-9771-27e0897bfcec=<VDS_INIT, >]', sharedLocks='null'}' 2017-02-17 03:00:08,437+02 INFO [org.ovirt.engine.core.bll.lock.InMemoryLockManager] (default task-2) [hosts_syncAction_16087ca6-63cc-445e] Failed to acquire lock and wait lock 'HostEngineLock:{exclusiveLo cks='[5d74b943-02b0-4d78-9771-27e0897bfcec=<VDS_INIT, >]', sharedLocks='null'}' ... engine fills the with the above message ... engine fills the with the above message ... engine fills the with the above message
Mor, The message described in the bug description - "LastUpdate Date is not initialized in the OVF_STORE disk" doesn't appear in the provided log. In comment #1 you quoted a different error related to the VDS_INIT lock (seems related to HostSetupNetworksCommand which was recently changed) I'm moving the bug to the network team for further investigation. If the initial described issue is relevant, please open a new bug with the relevant log.
Hi Liron, The error mentioned in the original topic is very relevant to the bug I opened. I just uploaded log which doesn't contains the error. I have newer log with the error message. The problem is that its compressed size is 273MB, and it decompresses to 22GB of log file! I'll take a snippet from it and upload it again. Thanks.
Created attachment 1256714 [details] stripped engine.log with the OVF error
Hi Mor, do you have updated steps to reproduce this bug?
(In reply to Lilach Zitnitski from comment #5) > Hi Mor, do you have updated steps to reproduce this bug? Hi Lilach, no, unfortunately I don't have any specific steps to reproduce it. But looking at the patch it seems that if engine fails to get the first OVF disk then the loop that fills the log occurs. Maybe you can ask Liron Aravot on how to trigger it?
(In reply to Mor from comment #6) > (In reply to Lilach Zitnitski from comment #5) > > Hi Mor, do you have updated steps to reproduce this bug? > > Hi Lilach, no, unfortunately I don't have any specific steps to reproduce > it. But looking at the patch it seems that if engine fails to get the first > OVF disk then the loop that fills the log occurs. Maybe you can ask Liron > Aravot on how to trigger it? Thanks Mor. Liron do you know how can I reproduce this bug?
The way to reproduce it involves manually modifying the vdsm code - 1. Have a DC with two storage domains and one host 2. Move the host to maintenance 3. Modify the vdsm code to raise exception in hsm.py (setVolumeDescription()), for example raise se.StorageException() 4. Restart the vdsm service. 5. Activate the host, move the non master domain to maintenance - the ovf update will fail because of the failure to set the volume description. 6. Detach the non master domain 7. Attach the non master domain. 8. Verify that the described log appears once.
-------------------------------------- Tested with the following code: ---------------------------------------- rhevm-4.1.2-0.1.el7.noarch vdsm-4.19.11-1.el7ev.x86_64 Tested with the following scenario: Steps to Reproduce: 1. Have a DC with two storage domains and one host 2. Move the host to maintenance 3. Modify the vdsm code to raise exception in hsm.py (setVolumeDescription()), for example raise se.StorageException() 4. Restart the vdsm service. 5. Activate the host, move the non master domain to maintenance - the ovf update will fail because of the failure to set the volume description. 6. Detach the non master domain 7. Attach the non master domain. 8. Verify that the described log appears once. Actual results: The described log message appears twice in the log throughout the test. Expected results: Moving to VERIFIED!