Bug 893689 - [engine-backend] unable to create floating disk - engine fails to clear task
Summary: [engine-backend] unable to create floating disk - engine fails to clear task
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ---
: 3.2.0
Assignee: Liron Aravot
QA Contact: Haim
URL:
Whiteboard: storage
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2013-01-09 17:45 UTC by Haim
Modified: 2016-02-10 16:36 UTC (History)
10 users (show)

Fixed In Version: SF4
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
oVirt Team: Storage
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
engine log. (251.93 KB, application/x-gzip)
2013-01-09 17:51 UTC, Haim
no flags Details
vdsm.log (202.56 KB, application/x-gzip)
2013-01-09 17:54 UTC, Haim
no flags Details

Description Haim 2013-01-09 17:45:54 UTC
Description of problem:

unable to create system disk since engine fails to clear tasks from some unknown reason and hits a null-pointer exception.

repro steps goes as follows:

2013-01-09 19:28:57,337 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-62) Failed in HSMGetAllTasksStatusesVDS method
2013-01-09 19:28:57,337 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-62) Error code VolumeCreationError and error message VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Error creating a new volume
2013-01-09 19:28:57,338 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-62) ERROR, org.ovirt.engine.core.vdsbroker.irsbroker.SPMGetAllTasksStatusesVDSCommand, exception: NullPointerException: , log id: 2f2f092: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.getIrsProxy(IrsBrokerCommand.java:509) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.getIrsProxy(IrsBrokerCommand.java:1498) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.ExecuteVDSCommand(IrsBrokerCommand.java:1506) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.ExecuteCommand(VDSCommandBase.java:63) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.Execute(VdcCommandBase.java:28) [engine-dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:362) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AsyncTaskManager.getSPMsTasksStatuses(AsyncTaskManager.java:289) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AsyncTaskManager.PollAndUpdateAsyncTasks(AsyncTaskManager.java:241) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AsyncTaskManager._timer_Elapsed(AsyncTaskManager.java:93) [engine-bll.jar:]
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) [:1.7.0_09-icedtea]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_09-icedtea]
        at java.lang.reflect.Method.invoke(Method.java:601) [rt.jar:1.7.0_09-icedtea]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [engine-scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz-2.1.2.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz-2.1.2.jar:]

2013-01-09 19:28:57,340 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-62) SPMAsyncTask::PollTask: Polling task 854ed1c0-7dd3-46c1-aee5-1dcfabf9fa97 (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'success'.

Comment 1 Haim 2013-01-09 17:51:55 UTC
Created attachment 675747 [details]
engine log.

Comment 3 Haim 2013-01-09 17:54:23 UTC
Created attachment 675748 [details]
vdsm.log

Comment 4 Ayal Baron 2013-01-10 09:06:12 UTC
Haim, there are no reproduction steps here.
In addition, the log contains the getTaskStatus that shows that the task failed, but the task flow itself is not in the log:
Thread-55321::DEBUG::2013-01-09 17:30:45,530::taskManager::93::TaskManager::(getTaskStatus) Entry. taskID: fb7244f5-bb52-4ed3-b1d4-2750437fc2a0
Thread-55321::DEBUG::2013-01-09 17:30:45,530::taskManager::96::TaskManager::(getTaskStatus) Return. Response: {'code': 205, 'message': 'Error creating a new volume', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'fb7244f5-bb52-4ed3-b1d4-2750437fc2a0'}

The vdsm task to create the disk failed (not sure whether before or after lvcreate because I don't have the relevant log).
The task we're missing info on is: fb7244f5-bb52-4ed3-b1d4-2750437fc2a0

Note that there are many problems with the domain (checksum problems, missing directories in master fs, etc).

Comment 5 Maor 2013-01-10 11:11:08 UTC
From the log I suspect the NPE was caused since the storage pool related to the task is missing or does not exists in the setup.
Haim, is it possible to check that through the DB:
 SELECT action_parameters FROM async_Tasks where task_id = 'fb7244f5-bb52-4ed3-b1d4-2750437fc2a0'
see what is the storage pool id in the task
and then SELECT id FROM storage_pool;
to check if this storage pool exists.

Comment 6 Haim 2013-01-10 13:49:33 UTC
(In reply to comment #5)
> From the log I suspect the NPE was caused since the storage pool related to
> the task is missing or does not exists in the setup.
> Haim, is it possible to check that through the DB:
>  SELECT action_parameters FROM async_Tasks where task_id =
> 'fb7244f5-bb52-4ed3-b1d4-2750437fc2a0'
> see what is the storage pool id in the task
> and then SELECT id FROM storage_pool;
> to check if this storage pool exists.

happens consistently on 3.2 setup, pool does exists, the problem only occurs with floating disk, you are more then welcome to login and take a look.

Comment 7 Allon Mureinik 2013-01-16 10:23:44 UTC
devel-ack - correct sp id should be passed, and we should be able to create the disk.

Comment 8 Liron Aravot 2013-01-20 12:11:24 UTC
relevant log to the issue (add disk)

3-01-09 19:48:14,640 INFO  [org.ovirt.engine.core.utils.transaction.TransactionSupport] (pool-3-thread-47) transaction rolled back
2013-01-09 19:48:14,656 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-3-thread-47) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type Add
Disk threw an exception: java.lang.NullPointerException
        at org.ovirt.engine.core.bll.VmCommand.endVmCommand(VmCommand.java:185) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.VmCommand.endSuccessfully(VmCommand.java:222) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.internalEndSuccessfully(CommandBase.java:523) [engine-bll.jar:]
 

already pushed and merged:
http://gerrit.ovirt.org/#/c/10623/

haim- the log you pasted is related to another issue, there seems to be tasks for storage pool that has been deleted - moving this to modify, adding needinfo for Haim to let you know about this issue if its not a known bug already (you might have opened one for that already).

Comment 9 Leonid Natapov 2013-02-13 10:24:04 UTC
sf6. fixed. can create floating system disk.

Comment 10 Itamar Heim 2013-06-11 08:26:42 UTC
3.2 has been released

Comment 11 Itamar Heim 2013-06-11 08:26:47 UTC
3.2 has been released


Note You need to log in before you can comment on or make changes to this bug.