Bug 1139261
| Summary: | Add disk takes 3 hours when stoping and starting ovirt-engine | ||||||
|---|---|---|---|---|---|---|---|
| Product: | [Retired] oVirt | Reporter: | Kevin Alon Goldblatt <kgoldbla> | ||||
| Component: | ovirt-engine-core | Assignee: | Ravi Nori <rnori> | ||||
| Status: | CLOSED WORKSFORME | QA Contact: | Petr Kubica <pkubica> | ||||
| Severity: | high | Docs Contact: | |||||
| Priority: | unspecified | ||||||
| Version: | 3.5 | CC: | acanan, amureini, bazulay, bugs, ecohen, gklein, iheim, kgoldbla, mgoldboi, oourfali, rbalakri, rnori, yeylon | ||||
| Target Milestone: | --- | ||||||
| Target Release: | 3.5.1 | ||||||
| Hardware: | Unspecified | ||||||
| OS: | Unspecified | ||||||
| Whiteboard: | infra | ||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||
| Doc Text: | Story Points: | --- | |||||
| Clone Of: | Environment: | ||||||
| Last Closed: | 2014-10-06 13:06:07 UTC | Type: | Bug | ||||
| Regression: | --- | Mount Type: | --- | ||||
| Documentation: | --- | CRM: | |||||
| Verified Versions: | Category: | --- | |||||
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||
| Embargoed: | |||||||
| Attachments: |
|
||||||
If the task is running uninterrupted this is an infra issue of when/how the task is polled. Oved - can your guys take lead on this? Sure. Assigned it to Ravi. I was unable to reproduce this, works on my end. Please check if this is reproducible. From the logs it looks like the host went into Non operation status at 2014-09-07 18:15:21 and came back up at 2014-09-07 20:59:05,775. So it works as expected. Please retest The problem here is: Why did the host become non operational as a result of stopping and then restarting the engine for a short while during the Add Volume operation. Is this the expected behavior. If the engine goes down for 60 seconds, your host enters the non operational state for 3 hours, then comes up and resumes the operation? The host going into non-operational is unrelated to server restart. The expected behavior I mentioned above is regarding the Add Volume Operation resuming after host comes back up. Kevin - are you able to reproduce it? If not we'll have to close it as WORKSFORME. Please reopen if the issue reproduces. Hi Ravi, Can you please answer comment #4? Hi Aharon, I already answered in Kevin's question in #4 in comment #5. I think the host going into non-operational state is unrelated to anything happening on the engine side. So it is not the expected behavior that the host goes into non-operation state |
Created attachment 935346 [details] engine vdsm and server logs Description of problem: Stoping a and then starting the ovirt-engine during and add disk to VM operation takes 3 hours to complete Version-Release number of selected component (if applicable): ovirt-engine-3.5.0-0.0.master.20140821064931.gitb794d66.el6.noarch vdsm-4.16.2-1.gite8cba75.el6.x86_64 How reproducible: Not always Steps to Reproduce: 1. Create VM with 1 preallocated ISCI disk of 1gb on a block domain 2. Once the createVolume Task is initiated by vdsm I stop the enigine, wait 60 seconds then start the engine again. 3.Once the engine comes up I stop the engine again, wait 80 seconds and start the engine again. The asynchroneous Task in the vdsm is continuously reported as 'Running'. This continues for 3 hours and then the task is finally completed Performing a similar create VM and add disk without restarting the engine on the same storage domain takes about 40 seconds Actual results: The add disk operation takes 3 hours to return as completed Expected results: The operation should complete within a minute or 2 at most Additional info: FROM THE ENGINE LOG >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>. [root@kevin-rhevm ~]# less /var/log/ovirt-engine/engine.log-20140908.gz |grep 5519a22c 2014-09-07 18:02:52,194 INFO [org.ovirt.engine.core.bll.AddDiskCommand] (ajp--127.0.0.1-8702-3) [5519a22c] Lock Acquired to object EngineLock [exclusiveLocks= key: ebf1ede1-12d6-4940-91d5-474a54d14100 value: VM_DISK_BOOT 2014-09-07 18:02:52,902 INFO [org.ovirt.engine.core.bll.AddDiskCommand] (ajp--127.0.0.1-8702-3) [5519a22c] Running command: AddDiskCommand internal: false. Entities affected : ID: ebf1ede1-12d6-4940-91d5-474a54d14100 Type: VMAction group CONFIGURE_VM_STORAGE with role type USER, ID: b240d926-3617-4790-8d77-bca4e52776f4 Type: StorageAction group CREATE_DISK with role type USER 2014-09-07 18:02:53,665 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ajp--127.0.0.1-8702-3) [2d4d736] Correlation ID: 5519a22c, Job ID: a24a440f-3f44-4c14-978d-29dcddc3873a, Call Stack: null, Custom Event ID: -1, Message: Add-Disk operation of vm6_Disk1 was initiated on VM vm6 by admin. 2014-09-07 18:05:05,116 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (org.ovirt.thread.pool-8-thread-21) [5519a22c] Adding task 1c880df0-46f9-4077-bf1a-50837fb1bd5f (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters), polling hasn't started yet.. 2014-09-07 18:05:05,920 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (org.ovirt.thread.pool-8-thread-21) [5519a22c] BaseAsyncTask::startPollingTask: Starting to poll task 1c880df0-46f9-4077-bf1a-50837fb1bd5f. 2014-09-07 18:05:05,931 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (org.ovirt.thread.pool-8-thread-21) [5519a22c] Discovered 1 tasks on Storage Pool dc_1, 1 added to manager. 2014-09-07 18:07:11,350 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (org.ovirt.thread.pool-8-thread-21) [5519a22c] Adding task 1c880df0-46f9-4077-bf1a-50837fb1bd5f (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters), polling hasn't started yet.. 2014-09-07 18:07:11,890 INFO [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (org.ovirt.thread.pool-8-thread-21) [5519a22c] BaseAsyncTask::startPollingTask: Starting to poll task 1c880df0-46f9-4077-bf1a-50837fb1bd5f. 2014-09-07 18:07:11,891 INFO [org.ovirt.engine.core.bll.tasks.AsyncTaskManager] (org.ovirt.thread.pool-8-thread-21) [5519a22c] Discovered 1 tasks on Storage Pool dc_1, 1 added to manager. 2014-09-07 20:59:15,728 INFO [org.ovirt.engine.core.bll.AddDiskCommand] (org.ovirt.thread.pool-8-thread-4) [5519a22c] Ending command successfully: org.ovirt.engine.core.bll.AddDiskCommand 2014-09-07 20:59:19,722 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-8-thread-4) Correlation ID: 5519a22c, Call Stack: null, Custom Event ID: -1, Message: The disk vm6_Disk1 was successfully added to VM vm6. FROM THE VDSM LOG >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> c880df0-46f9-4077-bf1a-50837fb1bd5f::DEBUG::2014-09-07 18:02:57,428::task::905::Storage.TaskManager.Task::(_runJobs) Task=`1c880df0-46f9-4077-bf1a-50837fb1bd5f`::Task.run: running job 0: createVolume: <bound method StoragePool.createVolume of <storage.sp.StoragePool object at 0x7f54f80e1850>> (args: ('b240d926-3617-4790-8d77-bca4e52776f4', '7500162a-0b1e-4931-b85d-4a3784d93903', 2097152, 5, 1, 2, 'a66fb59e-040c-4451-92bc-047cd4571358', '', '00000000-0000-0000-0000-000000000000', '00000000-0000-0000-0000-000000000000') kwargs: {}) 1c880df0-46f9-4077-bf1a-50837fb1bd5f::DEBUG::2014-09-07 18:02:57,428::task::331::Storage.TaskManager.Task::(run) Task=`1c880df0-46f9-4077-bf1a-50837fb1bd5f`::Job.run: running createVolume: <bound method StoragePool.createVolume of <storage.sp.StoragePool object at 0x7f54f80e1850>> (args: ('b240d926-3617-4790-8d77-bca4e52776f4', '7500162a-0b1e-4931-b85d-4a3784d93903', 2097152, 5, 1, 2, 'a66fb59e-040c-4451-92bc-047cd4571358', '', '00000000-0000-0000-0000-000000000000', '00000000-0000-0000-0000-000000000000') kwargs: {}) callback None Thread-135::DEBUG::2014-09-07 18:05:08,741::task::1191::Storage.TaskManager.Task::(prepare) Task=`fb5c40dc-5b02-4499-ae88-b7b47976d94b`::finished: {'allTasksInfo': {'1c880df0-46f9-4077-bf1a-50837fb1bd5f': {'verb': 'createVolume', 'id': '1c880df0-46f9-4077-bf1a-50837fb1bd5f'}}} Thread-137::DEBUG::2014-09-07 18:07:15,069::task::1191::Storage.TaskManager.Task::(prepare) Task=`17ecb732-6e6b-4238-86fe-35ea48cc1d19`::finished: {'allTasksInfo': {'1c880df0-46f9-4077-bf1a-50837fb1bd5f': {'verb': 'createVolume', 'id': '1c880df0-46f9-4077-bf1a-50837fb1bd5f'}}} Thread-147::DEBUG::2014-09-07 18:26:53,258::task::1191::Storage.TaskManager.Task::(prepare) Task=`29358927-d19a-4de5-bd35-54b9e1717953`::finished: {'tasks': {'1c880df0-46f9-4077-bf1a-50837fb1bd5f': {'code': 0, 'tag': u'spm', 'state': 'running', 'verb': 'createVolume', 'result': '', 'message': 'running job 1 of 1', 'id': '1c880df0-46f9-4077-bf1a-50837fb1bd5f'}}} Thread-197::DEBUG::2014-09-07 20:59:09,066::task::1191::Storage.TaskManager.Task::(prepare) Task=`6662097f-3e98-430a-b646-4012c761790d`::finished: {'allTasksInfo': {'1c880df0-46f9-4077-bf1a-50837fb1bd5f': {'verb': 'createVolume', 'id': '1c880df0-46f9-4077-bf1a-50837fb1bd5f'}}}