Bug 893689 - [engine-backend] unable to create floating disk - engine fails to clear task
[engine-backend] unable to create floating disk - engine fails to clear task
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine (Show other bugs)
3.2.0
x86_64 Linux
unspecified Severity high
: ---
: 3.2.0
Assigned To: Liron Aravot
Haim
storage
: Regression, TestBlocker
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2013-01-09 12:45 EST by Haim
Modified: 2016-02-10 11:36 EST (History)
10 users (show)

See Also:
Fixed In Version: SF4
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)
engine log. (251.93 KB, application/x-gzip)
2013-01-09 12:51 EST, Haim
no flags Details
vdsm.log (202.56 KB, application/x-gzip)
2013-01-09 12:54 EST, Haim
no flags Details

  None (edit)
Description Haim 2013-01-09 12:45:54 EST
Description of problem:

unable to create system disk since engine fails to clear tasks from some unknown reason and hits a null-pointer exception.

repro steps goes as follows:

2013-01-09 19:28:57,337 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-62) Failed in HSMGetAllTasksStatusesVDS method
2013-01-09 19:28:57,337 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-62) Error code VolumeCreationError and error message VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = Error creating a new volume
2013-01-09 19:28:57,338 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-62) ERROR, org.ovirt.engine.core.vdsbroker.irsbroker.SPMGetAllTasksStatusesVDSCommand, exception: NullPointerException: , log id: 2f2f092: java.lang.NullPointerException
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand$IrsProxyData.getIrsProxy(IrsBrokerCommand.java:509) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.getIrsProxy(IrsBrokerCommand.java:1498) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand.ExecuteVDSCommand(IrsBrokerCommand.java:1506) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.vdsbroker.VDSCommandBase.ExecuteCommand(VDSCommandBase.java:63) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.dal.VdcCommandBase.Execute(VdcCommandBase.java:28) [engine-dal.jar:]
        at org.ovirt.engine.core.vdsbroker.ResourceManager.runVdsCommand(ResourceManager.java:362) [engine-vdsbroker.jar:]
        at org.ovirt.engine.core.bll.VDSBrokerFrontendImpl.RunVdsCommand(VDSBrokerFrontendImpl.java:33) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AsyncTaskManager.getSPMsTasksStatuses(AsyncTaskManager.java:289) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AsyncTaskManager.PollAndUpdateAsyncTasks(AsyncTaskManager.java:241) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.AsyncTaskManager._timer_Elapsed(AsyncTaskManager.java:93) [engine-bll.jar:]
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) [:1.7.0_09-icedtea]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_09-icedtea]
        at java.lang.reflect.Method.invoke(Method.java:601) [rt.jar:1.7.0_09-icedtea]
        at org.ovirt.engine.core.utils.timer.JobWrapper.execute(JobWrapper.java:60) [engine-scheduler.jar:]
        at org.quartz.core.JobRunShell.run(JobRunShell.java:213) [quartz-2.1.2.jar:]
        at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:557) [quartz-2.1.2.jar:]

2013-01-09 19:28:57,340 INFO  [org.ovirt.engine.core.bll.SPMAsyncTask] (QuartzScheduler_Worker-62) SPMAsyncTask::PollTask: Polling task 854ed1c0-7dd3-46c1-aee5-1dcfabf9fa97 (Parent Command AddDisk, Parameters Type org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters) returned status finished, result 'success'.
Comment 1 Haim 2013-01-09 12:51:55 EST
Created attachment 675747 [details]
engine log.
Comment 3 Haim 2013-01-09 12:54:23 EST
Created attachment 675748 [details]
vdsm.log
Comment 4 Ayal Baron 2013-01-10 04:06:12 EST
Haim, there are no reproduction steps here.
In addition, the log contains the getTaskStatus that shows that the task failed, but the task flow itself is not in the log:
Thread-55321::DEBUG::2013-01-09 17:30:45,530::taskManager::93::TaskManager::(getTaskStatus) Entry. taskID: fb7244f5-bb52-4ed3-b1d4-2750437fc2a0
Thread-55321::DEBUG::2013-01-09 17:30:45,530::taskManager::96::TaskManager::(getTaskStatus) Return. Response: {'code': 205, 'message': 'Error creating a new volume', 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'fb7244f5-bb52-4ed3-b1d4-2750437fc2a0'}

The vdsm task to create the disk failed (not sure whether before or after lvcreate because I don't have the relevant log).
The task we're missing info on is: fb7244f5-bb52-4ed3-b1d4-2750437fc2a0

Note that there are many problems with the domain (checksum problems, missing directories in master fs, etc).
Comment 5 Maor 2013-01-10 06:11:08 EST
From the log I suspect the NPE was caused since the storage pool related to the task is missing or does not exists in the setup.
Haim, is it possible to check that through the DB:
 SELECT action_parameters FROM async_Tasks where task_id = 'fb7244f5-bb52-4ed3-b1d4-2750437fc2a0'
see what is the storage pool id in the task
and then SELECT id FROM storage_pool;
to check if this storage pool exists.
Comment 6 Haim 2013-01-10 08:49:33 EST
(In reply to comment #5)
> From the log I suspect the NPE was caused since the storage pool related to
> the task is missing or does not exists in the setup.
> Haim, is it possible to check that through the DB:
>  SELECT action_parameters FROM async_Tasks where task_id =
> 'fb7244f5-bb52-4ed3-b1d4-2750437fc2a0'
> see what is the storage pool id in the task
> and then SELECT id FROM storage_pool;
> to check if this storage pool exists.

happens consistently on 3.2 setup, pool does exists, the problem only occurs with floating disk, you are more then welcome to login and take a look.
Comment 7 Allon Mureinik 2013-01-16 05:23:44 EST
devel-ack - correct sp id should be passed, and we should be able to create the disk.
Comment 8 Liron Aravot 2013-01-20 07:11:24 EST
relevant log to the issue (add disk)

3-01-09 19:48:14,640 INFO  [org.ovirt.engine.core.utils.transaction.TransactionSupport] (pool-3-thread-47) transaction rolled back
2013-01-09 19:48:14,656 ERROR [org.ovirt.engine.core.bll.EntityAsyncTask] (pool-3-thread-47) EntityAsyncTask::EndCommandAction [within thread]: EndAction for action type Add
Disk threw an exception: java.lang.NullPointerException
        at org.ovirt.engine.core.bll.VmCommand.endVmCommand(VmCommand.java:185) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.VmCommand.endSuccessfully(VmCommand.java:222) [engine-bll.jar:]
        at org.ovirt.engine.core.bll.CommandBase.internalEndSuccessfully(CommandBase.java:523) [engine-bll.jar:]
 

already pushed and merged:
http://gerrit.ovirt.org/#/c/10623/

haim- the log you pasted is related to another issue, there seems to be tasks for storage pool that has been deleted - moving this to modify, adding needinfo for Haim to let you know about this issue if its not a known bug already (you might have opened one for that already).
Comment 9 Leonid Natapov 2013-02-13 05:24:04 EST
sf6. fixed. can create floating system disk.
Comment 10 Itamar Heim 2013-06-11 04:26:42 EDT
3.2 has been released
Comment 11 Itamar Heim 2013-06-11 04:26:47 EDT
3.2 has been released

Note You need to log in before you can comment on or make changes to this bug.