Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1270725

Summary: [Cinder] Stateless VM fails to start with NullPointerException, operation does not rollback
Product: [oVirt] ovirt-engine Reporter: Ori Gofen <ogofen>
Component: BLL.Storage Assignee: Maor <mlipchuk>
Status: CLOSED CURRENTRELEASE QA Contact: Natalie Gavrielov <ngavrilo>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.6.0 CC: acanan, amureini, bugs, mlipchuk, ogofen, tnisan
Target Milestone: ovirt-3.6.2 Flags: amureini: ovirt-3.6.z?
rule-engine: planning_ack?
tnisan: devel_ack+
pstehlik: testing_ack+
Target Release: 3.6.2   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-02-23 13:32:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Infra RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1305809    
Bug Blocks: 1135132, 1288157    
Attachments:
Description Flags
log none

Description Ori Gofen 2015-10-12 09:21:15 UTC
Created attachment 1081926 [details]
log

Description of problem:
Cinder with a Ceph backend does not yet support stateless VM functionality.
When attaching a Cinder disk to a stateless VM and starting it, the operation fails with a NullPointerException:
"2015-10-12 11:56:07,020 INFO  [org.ovirt.engine.core.bll.CreateAllSnapshotsFromVmCommand] (DefaultQuartzScheduler_Worker-6) [] Lock freed to object 'EngineLock:{exclusiveLocks='[a0dcfb3d-591e-47e1-8dba-c6353fbd9cd5=<VM, ACTION_TYPE_FAILED_SNAPSHOT_IS_BEING_TAKEN_FOR_VM$VmName n_vm>]', sharedLocks='null'}'
2015-10-12 11:56:07,150 ERROR [org.ovirt.engine.core.bll.tasks.CommandExecutor] (DefaultQuartzScheduler_Worker-6) [] Error invoking callback method 'onSucceeded' for 'SUCCEEDED' command '1f191124-b705-45c3-843a-48991d879e5f'
2015-10-12 11:56:07,151 ERROR [org.ovirt.engine.core.bll.tasks.CommandExecutor] (DefaultQuartzScheduler_Worker-6) [] Exception: javax.ejb.EJBException: java.lang.NullPointerException
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.handleExceptionInNoTx(CMTTxInterceptor.java:216) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.invokeInNoTx(CMTTxInterceptor.java:268) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.supports(CMTTxInterceptor.java:377) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]
        at org.jboss.as.ejb3.tx.CMTTxInterceptor.processInvocation(CMTTxInterceptor.java:246) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.as.ejb3.component.interceptors.CurrentInvocationContextInterceptor.processInvocation(CurrentInvocationContextInterceptor.java:41) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.as.ejb3.component.invocationmetrics.WaitTimeInterceptor.processInvocation(WaitTimeInterceptor.java:43) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]
        at org.jboss.invocation.InterceptorContext.proceed(InterceptorContext.java:288) [jboss-invocation.jar:1.1.2.Final-redhat-1]
        at org.jboss.as.ejb3.component.interceptors.ShutDownInterceptorFactory$1.processInvocation(ShutDownInterceptorFactory.java:64) [jboss-as-ejb3.jar:7.5.3.Final-redhat-2]"

Webadmin's message is wrong:
"VM n_vm was run as stateless with one or more of disks that do not allow snapshots"

Cinder with a Ceph backend does support snapshots; thus, this operation should be successful.
After the operation failed, I removed the VM, and Cinder then reported one redundant "leftover" snapshot.

[root@ogofen-cinder ~(keystone_admin)]# cinder snapshot-list
+--------------------------------------+--------------------------------------+-----------+--------------+------+
|                  ID                  |              Volume ID               |   Status  | Display Name | Size |
+--------------------------------------+--------------------------------------+-----------+--------------+------+
| b2a5029b-07b3-43b8-8072-bc0feee26085 | fe7b1961-a714-464a-9606-cbfa4bd785fd | available |     None     |  8   |
+--------------------------------------+--------------------------------------+-----------+--------------+------+
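
A minimal sketch for checking for leftovers like the one above: it parses the ASCII table printed by `cinder snapshot-list` and returns the snapshot IDs that still reference a given volume. The function names and the `SAMPLE` constant are illustrative assumptions, not part of any Cinder client API; the sample text is the output shown above.

```python
def parse_cinder_table(text):
    """Parse an ASCII table (as printed by `cinder snapshot-list`) into dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def snapshots_for_volume(text, volume_id):
    """Return the IDs of all snapshots that reference the given volume."""
    return [row["ID"] for row in parse_cinder_table(text)
            if row["Volume ID"] == volume_id]


# Output copied from the report above.
SAMPLE = """\
+--------------------------------------+--------------------------------------+-----------+--------------+------+
|                  ID                  |              Volume ID               |   Status  | Display Name | Size |
+--------------------------------------+--------------------------------------+-----------+--------------+------+
| b2a5029b-07b3-43b8-8072-bc0feee26085 | fe7b1961-a714-464a-9606-cbfa4bd785fd | available |     None     |  8   |
+--------------------------------------+--------------------------------------+-----------+--------------+------+
"""

if __name__ == "__main__":
    print(snapshots_for_volume(SAMPLE, "fe7b1961-a714-464a-9606-cbfa4bd785fd"))
```

A non-empty result for a volume whose VM has already been removed would indicate a leftover snapshot that was not rolled back.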

Version-Release number of selected component (if applicable):
rhevm-3.6-14

How reproducible:
100%

Steps to Reproduce:
1. Have a powered-off VM with one Cinder disk
2. Edit the VM as stateless
3. Attach 2 Cinder disks to the VM and launch the VM
4. After the failure, attempt to remove the VM and its disks

Actual results:
Stateless VM is not supported; the operation described above ends with an unused Ceph volume.

Expected results:
operation successful

Additional info:

Comment 1 Oved Ourfali 2015-10-12 09:34:44 UTC
Please open storage backend bugs in the BLL.NETWORK.Storage component.

Comment 2 Allon Mureinik 2015-10-12 10:32:30 UTC
*** Bug 1270726 has been marked as a duplicate of this bug. ***

Comment 3 Red Hat Bugzilla Rules Engine 2015-10-19 10:57:39 UTC
Target release should be placed once a package build is known to fix an issue. Since this bug is not modified, the target version has been reset. Please use target milestone to plan a fix for an oVirt release.

Comment 4 Yaniv Lavi 2015-10-29 12:40:15 UTC
In oVirt, testing is done on a single release by default. Therefore I'm removing the 4.0 flag. If you think this bug must be tested in 4.0 as well, please re-add the flag. Please note we might not have testing resources to handle the 4.0 clone.

Comment 5 Maor 2015-12-03 17:17:17 UTC
It looks like the NPE occurs because the CoCo infrastructure does not pass steps to the child commands at job/ExecutionHandler.
Moving this to infra and opening a separate bug on the flow of a stateless VM with a Cinder disk.
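
The failure pattern suggested above can be illustrated with a small, language-neutral sketch. This is not ovirt-engine code: `Step`, `ExecutionContext`, and the helper functions are hypothetical names, and Python's `AttributeError` on `None` stands in for Java's `NullPointerException`. It only shows how a success callback can fail when a child context is created without the parent's step.

```python
class Step:
    """Hypothetical stand-in for a job step tracked during command execution."""

    def __init__(self, name):
        self.name = name
        self.finished = False

    def mark_finished(self):
        self.finished = True


class ExecutionContext:
    """Hypothetical per-command context; `step` may be missing (None)."""

    def __init__(self, step=None):
        self.step = step


def create_child_context_buggy(parent):
    # Assumed bug pattern: the parent's step is not handed to the child.
    return ExecutionContext()


def create_child_context_fixed(parent):
    # Assumed fix pattern: propagate the parent's step to the child.
    return ExecutionContext(step=parent.step)


def on_succeeded(context):
    # The callback assumes a step is always present and dereferences it.
    context.step.mark_finished()


if __name__ == "__main__":
    parent = ExecutionContext(step=Step("example step"))
    try:
        on_succeeded(create_child_context_buggy(parent))
    except AttributeError as exc:
        print("callback failed:", exc)
    on_succeeded(create_child_context_fixed(parent))
    print("step finished:", parent.step.finished)
```

In the buggy variant the command itself has already succeeded when the callback fails, which would be consistent with the snapshot being created but never cleaned up.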

Comment 6 Sandro Bonazzola 2015-12-23 13:44:37 UTC
oVirt 3.6.2 RC1 has been released for testing, moving to ON_QA

Comment 7 Jiri Belka 2016-01-19 13:18:54 UTC
(In reply to Ori Gofen from comment #0)

> Steps to Reproduce:
> 1. Have a powered-off VM with one Cinder disk
> 2. Edit the VM as stateless
> 3. Attach 2 Cinder disks to the VM and launch the VM
> 4. After the failure, attempt to remove the VM and its disks
> 
> Actual results:
> Stateless VM is not supported; the operation described above ends with
> an unused Ceph volume.
> 
> Expected results:
> operation successful

Could you please help me to understand the steps?

In step '3' it should fail; should it fail silently or with a message in the UI? I see in the audit events that my test VM was started, but there is no failure event even though it is still down.

In step '4' I should be able to remove this VM with its disks, right? I can't do it as there's a locked snapshot...

engine=# select snapshot_id,vm_id,status,creation_date from snapshots where vm_id = ( select vm_guid from vms where vm_name = 'test');                                                                              
             snapshot_id              |                vm_id                 | status |       creation_date        
--------------------------------------+--------------------------------------+--------+----------------------------
 e314cea1-5edc-43dc-85f7-df31de2af34d | 65e0f3bf-2334-417b-b810-3aa0101c8951 | OK     | 2016-01-19 13:41:22.226+01
 abdf93a4-e8d0-4e94-b704-82a61cd26001 | 65e0f3bf-2334-417b-b810-3aa0101c8951 | LOCKED | 2016-01-19 13:51:02.538+01
(2 rows)

Comment 8 Maor 2016-01-19 13:35:34 UTC
The fix for bug [1] resolves this issue, so I'm not sure you can still reproduce it.

I recommend verifying this bug along with https://bugzilla.redhat.com/1288157 since both of them use steps.

[1] https://bugzilla.redhat.com/1288157