Bug 1754262

Summary: VM with DataVolume doesn't fully start
Product: Container Native Virtualization (CNV) Reporter: Yossi Segev <ysegev>
Component: Storage Assignee: Adam Litke <alitke>
Status: CLOSED DEFERRED QA Contact: Natalie Gavrielov <ngavrilo>
Severity: medium Docs Contact:
Priority: medium    
Version: 2.1.0 CC: alitke, cnv-qe-bugs, fdeutsch, kgoldbla, mhenriks, ncredi, ngavrilo, sgordon, ycui
Target Milestone: ---   
Target Release: 2.2.0   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: virt-cdi-operator-container-v2.2.0-3 hco-bundle-registry-container-v2.2.0-62 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-12-26 09:37:36 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
DataVolume spec none
VM with DataVolume spec none

Description Yossi Segev 2019-09-22 10:26:51 UTC
Description of problem:
A VMI with a DataVolume defined for its disk starts successfully and reaches the "Running" state, but its virtctl console does not respond.


Version-Release number of selected component (if applicable):
OCP: 4.2.0-0.nightly-2019-09-21-183303
Latest CNV v2.1.0 (as of 2019-Sep-22)

How reproducible:
Always

Steps to Reproduce:
1. Create DataVolume using the attached dv.yaml
$ oc create -f dv.yaml

2. Verify the DataVolume was created successfully:
$ oc get dv
NAME        AGE
dv-cirros   23m

3. Verify PVC was created successfully, and that it is in bound state:
$ oc get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
dv-cirros   Bound    pvc-93b30aee-dd1e-11e9-af4b-fa163e474cdd   500Mi      RWO            rook-ceph-block   23m

4. Create a VM that uses this DataVolume. You can use the attached vm-dv-cirros.yaml:
$ oc create -f vm-dv-cirros.yaml

5. Start the VMI:
$ virtctl start vm-cirros

6. Verify the VMI is in "Running" state (you might want to wait ~1 minute for that):
$ oc get vmi
NAME         AGE   PHASE     IP            NODENAME
vm-cirros    24m   Running   10.129.0.41   host-172-16-0-15

7. Open a console to the VMI:
$ virtctl console vm-cirros
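
Steps 5 to 7 can also be scripted so that the console is opened only once the VMI reports "Running". The following is a minimal sketch and not part of the original reproduction: the function name wait_for_vmi_running, the 120-second default timeout and the 5-second poll interval are assumptions. The oc and virtctl commands are the ones used in the steps above, with -o jsonpath reading the VMI's .status.phase field.

```shell
# Hypothetical helper: poll the VMI phase until it is "Running" or a timeout expires.
# Arguments: VMI name, timeout in seconds (assumed default 120),
# poll interval in seconds (assumed default 5).
wait_for_vmi_running() {
    local vmi="$1"
    local timeout="${2:-120}"
    local interval="${3:-5}"
    local waited=0
    local phase=""
    while [ "$waited" -lt "$timeout" ]; do
        # Read only the phase field instead of parsing the table output.
        phase="$(oc get vmi "$vmi" -o jsonpath='{.status.phase}' 2>/dev/null)"
        if [ "$phase" = "Running" ]; then
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "VMI $vmi not Running after ${timeout}s (last phase: ${phase:-unknown})" >&2
    return 1
}

# Usage, mirroring steps 5 to 7:
#   virtctl start vm-cirros
#   wait_for_vmi_running vm-cirros && virtctl console vm-cirros
```

Note that a "Running" phase only removes the manual wait from the reproduction; as this report shows, it does not guarantee that the guest has booted.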

Actual results:
The console is unresponsive: there is no output from the VM start-up, no login prompt, and no shell.

Expected results:
Interactive console, with login and shell CLI.


Additional info:
1. The same thing happens when using a VM with a non-block (filesystem) DV (same DV spec, without volumeMode defined).
2. There is no problem when using the standard vm-cirros spec from https://github.com/kubevirt/kubevirt/tree/master/examples, which uses a containerDisk image from the registry.

Comment 1 Yossi Segev 2019-09-22 10:28:57 UTC
Created attachment 1617756 [details]
DataVolume spec

Comment 2 Yossi Segev 2019-09-22 10:30:32 UTC
Created attachment 1617757 [details]
VM with DataVolume spec

Comment 3 Ying Cui 2019-09-24 06:54:01 UTC
*** Bug 1754261 has been marked as a duplicate of this bug. ***

Comment 4 Natalie Gavrielov 2019-09-25 12:22:17 UTC
Try to reproduce this

Comment 5 Fabian Deutsch 2019-09-30 10:47:41 UTC
Ying, can you tell how reproducible this is?

It looks to be in a primary flow, and if this was 100% reproducible, then I'd expect this to appear in many flows.

Comment 6 Ying Cui 2019-09-30 13:49:21 UTC
(In reply to Fabian Deutsch from comment #5)
> Ying, can you tell how reproducible this is?
> 
> It looks to be in a primary flow, and if this was 100% reproducible, then
> I'd expect this to appear in many flows.

Synced on this bug with the team: it was not 100% reproducible, but it could be reproduced on both BM and PSI environments.

Comment 7 Fabian Deutsch 2019-09-30 13:59:11 UTC
Ok.

Is it known whether the VM is really running (i.e. reacts to ping), or could it be that the VM is not starting correctly?
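
One way to check this could look like the sketch below. It is only an illustration: the helper names vmi_ip_from_table and vmi_responds_to_ping are hypothetical, the column position of the IP is taken from the `oc get vmi` output in the description, and it assumes the check runs from a host that can reach the pod network (for example a cluster node).

```shell
# Hypothetical helper: read `oc get vmi` table output on stdin and print
# the IP column (4th field) of the row whose NAME matches the argument.
vmi_ip_from_table() {
    awk -v name="$1" 'NR > 1 && $1 == name { print $4 }'
}

# Hypothetical check: succeeds if the VMI has an IP and the guest answers
# at least one of 3 pings (2-second wait per reply, Linux ping options).
vmi_responds_to_ping() {
    local ip
    ip="$(oc get vmi | vmi_ip_from_table "$1")"
    [ -n "$ip" ] && ping -c 3 -W 2 "$ip" >/dev/null
}

# Usage:
#   vmi_responds_to_ping vm-cirros && echo "guest is up" || echo "no reply"
```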

Comment 8 Adam Litke 2019-09-30 14:28:58 UTC
This bug deserves a closer look but I don't think it should be considered severe.  The reproduction steps outlined are a very common flow and we have not seen this behavior often at all.  I suggest we push this out to 2.1.1 to give us time to properly analyze it.

Comment 13 Adam Litke 2019-10-15 11:48:18 UTC
We will resolve this bug with an update to upstream documentation to clarify the supported combinations.

Comment 14 Nelly Credi 2019-11-25 08:08:54 UTC
please add fixed in version

Comment 15 Nelly Credi 2019-12-02 11:16:56 UTC
please add 'fixed in version'