Bug 1754262 - VM with DataVolume doesn't fully start
Summary: VM with DataVolume doesn't fully start
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 2.1.0
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 2.2.0
Assignee: Adam Litke
QA Contact: Natalie Gavrielov
URL:
Whiteboard:
Duplicates: 1754261
Depends On:
Blocks:
Reported: 2019-09-22 10:26 UTC by Yossi Segev
Modified: 2019-12-26 09:37 UTC (History)
CC List: 9 users

Fixed In Version: virt-cdi-operator-container-v2.2.0-3 hco-bundle-registry-container-v2.2.0-62
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-12-26 09:37:36 UTC
Target Upstream Version:
Embargoed:


Attachments
DataVolume spec (517 bytes, text/plain)
2019-09-22 10:28 UTC, Yossi Segev
VM with DataVolume spec (706 bytes, text/plain)
2019-09-22 10:30 UTC, Yossi Segev


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt containerized-data-importer pull 1008 | 0 | None | closed | doc: Clarify supported kubevirt formats | 2019-12-02 11:15:54 UTC

Description Yossi Segev 2019-09-22 10:26:51 UTC
Description of problem:
A VMI with a DataVolume defined for its disk starts successfully and reaches the "Running" state, but its virtctl console does not function.


Version-Release number of selected component (if applicable):
OCP: 4.2.0-0.nightly-2019-09-21-183303
Latest CNV v2.1.0 (as of 2019-Sep-22)

How reproducible:
Always

Steps to Reproduce:
1. Create DataVolume using the attached dv.yaml
$ oc create -f dv.yaml

2. Verify the DataVolume was created successfully:
$ oc get dv
NAME        AGE
dv-cirros   23m

3. Verify PVC was created successfully, and that it is in bound state:
$ oc get pvc
NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
dv-cirros   Bound    pvc-93b30aee-dd1e-11e9-af4b-fa163e474cdd   500Mi      RWO            rook-ceph-block   23m

4. Create a VM that uses this DataVolume. You can use the attached vm-dv-cirros.yaml.
$ oc create -f vm-dv-cirros.yaml

5. Start the VMI:
$ virtctl start vm-cirros

6. Verify the VMI is in "Running" state (you might want to wait ~1 minute for that):
$ oc get vmi
NAME         AGE   PHASE     IP            NODENAME
vm-cirros    24m   Running   10.129.0.41   host-172-16-0-15

7. Open a console to the VMI:
$ virtctl console vm-cirros
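
Step 6 relies on a manual wait of about a minute. As a minimal sketch, that wait could be scripted by polling the VMI phase. The function name wait_for_vmi_phase and its default timeout and interval are hypothetical, and the sketch assumes oc is already logged in to the cluster:

```shell
# Poll the VMI phase until it matches the wanted value or the timeout expires.
# Usage: wait_for_vmi_phase NAME [PHASE] [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# The function name and the defaults below are illustrative assumptions.
wait_for_vmi_phase() {
    name=$1
    want=${2:-Running}
    timeout=${3:-120}
    interval=${4:-5}
    elapsed=0
    while [ "$elapsed" -le "$timeout" ]; do
        # .status.phase is the value shown in the PHASE column of `oc get vmi`.
        phase=$(oc get vmi "$name" -o jsonpath='{.status.phase}' 2>/dev/null)
        if [ "$phase" = "$want" ]; then
            return 0
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "timed out waiting for VMI $name to reach $want (last phase: ${phase:-unknown})" >&2
    return 1
}
```

For example, `wait_for_vmi_phase vm-cirros Running 120 5 && virtctl console vm-cirros` would open the console only once the VMI reports "Running". Note that, as this bug shows, a "Running" phase alone does not prove the guest has booted.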

Actual results:
The console is not responsive: it shows no VM start-up progress, no login prompt, and no shell.

Expected results:
Interactive console, with login and shell CLI.


Additional info:
1. The same thing happens when using a VM with a non-block (file-system) DV (same DV spec, but without volumeMode defined).
2. There is no problem when using the standard vm-cirros spec from https://github.com/kubevirt/kubevirt/tree/master/examples, which uses a containerDisk image from the registry.

Comment 1 Yossi Segev 2019-09-22 10:28:57 UTC
Created attachment 1617756
DataVolume spec

Comment 2 Yossi Segev 2019-09-22 10:30:32 UTC
Created attachment 1617757
VM with DataVolume spec

Comment 3 Ying Cui 2019-09-24 06:54:01 UTC
*** Bug 1754261 has been marked as a duplicate of this bug. ***

Comment 4 Natalie Gavrielov 2019-09-25 12:22:17 UTC
Try to reproduce this

Comment 5 Fabian Deutsch 2019-09-30 10:47:41 UTC
Ying, can you tell how reproducible this is?

It looks to be in a primary flow, and if this was 100% reproducible, then I'd expect this to appear in many flows.

Comment 6 Ying Cui 2019-09-30 13:49:21 UTC
(In reply to Fabian Deutsch from comment #5)
> Ying, can you tell how reproducible this is?
> 
> It looks to be in a primary flow, and if this was 100% reproducible, then
> I'd expect this to appear in many flows.

I synced with the team on this bug: it is not 100% reproducible, but it has been reproduced on both BM and PSI environments.

Comment 7 Fabian Deutsch 2019-09-30 13:59:11 UTC
Ok.

Is it known whether the VM is really running (i.e. responds to ping), or could it be that the VM is not starting correctly?
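
One way to check this, sketched under assumptions: take the VMI address from the IP column of `oc get vmi` and ping it. The helper name vmi_column is hypothetical, and the ping has to be issued from somewhere that can reach the pod network (for example a node or another pod), which this sketch does not set up:

```shell
# Print the value of column COL for the row named NAME, reading
# `oc get`-style table output on stdin.
# Usage: oc get vmi | vmi_column NAME COL
# Assumes whitespace-separated columns with no empty cells before COL.
vmi_column() {
    awk -v name="$1" -v col="$2" '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
        $1 == name && idx { print $idx }
    '
}
```

With this helper, `ip=$(oc get vmi | vmi_column vm-cirros IP)` followed by `ping -c 3 "$ip"` would show whether the guest answers on the network even though its console is silent.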

Comment 8 Adam Litke 2019-09-30 14:28:58 UTC
This bug deserves a closer look but I don't think it should be considered severe.  The reproduction steps outlined are a very common flow and we have not seen this behavior often at all.  I suggest we push this out to 2.1.1 to give us time to properly analyze it.

Comment 13 Adam Litke 2019-10-15 11:48:18 UTC
We will resolve this bug with an update to upstream documentation to clarify the supported combinations.

Comment 14 Nelly Credi 2019-11-25 08:08:54 UTC
please add fixed in version

Comment 15 Nelly Credi 2019-12-02 11:16:56 UTC
please add 'fixed in version'

