Bug 1908180 - Add source for template is stuck in preparing PVC
Summary: Add source for template is stuck in preparing PVC
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Console Kubevirt Plugin
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Gilad Lekner
QA Contact: Guohua Ouyang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-16 02:38 UTC by Guohua Ouyang
Modified: 2021-02-24 15:45 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:44:54 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
stucking on preparing pvc (159.94 KB, image/png)
2020-12-16 02:38 UTC, Guohua Ouyang
Template source importing (137.29 KB, image/png)
2020-12-16 20:21 UTC, Phillip Bailey
Not fix on 'Add boot source to template' (203.93 KB, image/png)
2020-12-30 12:06 UTC, Guohua Ouyang


Links
System ID Private Priority Status Summary Last Updated
Github openshift console pull 7671 0 None closed Bug 1908180: Add source for template is stucking in preparing pvc 2021-02-08 10:53:20 UTC
Github openshift console pull 7681 0 None closed Bug 1908180: Add source for template is stucking in preparing pvc 2021-02-08 10:53:20 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:45:16 UTC

Description Guohua Ouyang 2020-12-16 02:38:44 UTC
Created attachment 1739504 [details]
stucking on preparing pvc

Description of problem:
Adding a source to a template gets stuck while preparing the PVC; the status shows "Starting CDI pod".

Creating a VM with a URL source succeeds, so CDI is working. Adding a source to a template gets stuck, and it seems the CDI process is never started at all.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-12-13-202314

How reproducible:
100%

Steps to Reproduce:
1. Go to template tab
2. Select a template and click "Add source"
3. Select boot source type "Import via URL"
4. Fill in the image URL
5. Click "save and import".

Actual results:


Expected results:


Additional info:

Comment 1 Yaacov Zamir 2020-12-16 13:07:16 UTC
Hi Guohua,
do you have more debug info?

Comment 3 Phillip Bailey 2020-12-16 20:21:27 UTC
Created attachment 1739746 [details]
Template source importing

Comment 4 Phillip Bailey 2020-12-16 20:23:46 UTC
Guohua, I have attached a screenshot of boot source being imported on our environment. Please let us know ASAP after you've had a chance to update your cluster. Thanks!

Comment 5 Guohua Ouyang 2020-12-17 00:25:01 UTC
I'm now on CNV 2.6 + OCP 4.7.0-0.nightly-2020-12-14-165231 and still have this issue.

Comment 6 Guohua Ouyang 2020-12-17 04:11:16 UTC
Data uploads under Storage also all fail, with nothing happening in the backend. I think they are hitting the same issue.

Comment 7 Yaacov Zamir 2020-12-17 06:05:16 UTC
@Gilad, hi,

do you know of anything that can help here?

Comment 10 Gilad Lekner 2020-12-21 21:35:55 UTC
I can't seem to reproduce this issue. I tried the steps to reproduce and the import finished successfully.

Comment 11 Guohua Ouyang 2020-12-22 06:05:33 UTC
It's a CDI problem: it cannot handle the SC 'standard'.
$ virtctl image-upload dv rhel6 --size=10Gi --image-path=./cirros-0.4.0-x86_64-disk.img --block-volume --access-mode=ReadWriteOnce --storage-class=standard -n openshift-virtualization-os-images --insecure
DataVolume openshift-virtualization-os-images/rhel6 created
Waiting for PVC rhel6 upload pod to be ready...
timed out waiting for the condition

CDI also cannot handle SC='ocs-storagecluster-ceph-rbd' with accessMode=ReadWriteMany and volumeMode="Filesystem":
$ virtctl image-upload dv rhel8 --size=10Gi --image-path=./cirros-0.4.0-x86_64-disk.img  --access-mode=ReadWriteMany --storage-class=ocs-storagecluster-ceph-rbd -n openshift-virtualization-os-images --insecure
DataVolume openshift-virtualization-os-images/rhel8 created
Waiting for PVC rhel8 upload pod to be ready...
timed out waiting for the condition


There may be more failures with other sc/accessMode/volumeMode combinations.

There is an epic for this: https://issues.redhat.com/browse/CNV-8626, but it targets CNV 2.7. Until then, we might need to find another solution. A similar issue around storageClass is https://bugzilla.redhat.com/show_bug.cgi?id=1900266.

Comment 12 Yaacov Zamir 2020-12-24 06:36:45 UTC
Note from the meeting: we may not support the sc/accessMode/volumeMode combinations that trigger the bug. If that is the case, we can wait for the backend fix and fix the UI in the next release.

@Guohua, can you check whether the sc/accessMode/volumeMode combinations we support work well?

Comment 13 Guohua Ouyang 2020-12-24 09:44:30 UTC
I don't know which combinations are supported.
@Tomas @Adam Do we have a matrix of which sc/accessMode/volumeMode combinations are supported for CDI imports, or is it simply a bug if an upload fails?

Is it due to https://github.com/kubevirt/kubevirt/issues/4601?

Comment 14 Yaacov Zamir 2020-12-24 13:40:36 UTC
@Ying, hi, can you help with comment #13?

Comment 15 Ying Cui 2020-12-28 13:18:46 UTC
(In reply to Guohua Ouyang from comment #11)
> It's CDI problem, it cannot handle SC 'standard'.
> $ virtctl image-upload dv rhel6 --size=10Gi
> --image-path=./cirros-0.4.0-x86_64-disk.img --block-volume
> --access-mode=ReadWriteOnce --storage-class=standard -n
> openshift-virtualization-os-images --insecure
> DataVolume openshift-virtualization-os-images/rhel6 created
> Waiting for PVC rhel6 upload pod to be ready...
> timed out waiting for the condition

FYI: Storageclass Matrix
https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/virtual_disks/virt-features-for-storage.html

--storage-class=standard is not supported.

> 
> CDI also cannot handle
> SC='ocs-storagecluster-ceph-rbd'/,accessMmode=ReadWriteMany and
> volumeMode="Filesystem"
> $ virtctl image-upload dv rhel8 --size=10Gi
> --image-path=./cirros-0.4.0-x86_64-disk.img  --access-mode=ReadWriteMany
> --storage-class=ocs-storagecluster-ceph-rbd -n
> openshift-virtualization-os-images --insecure
> DataVolume openshift-virtualization-os-images/rhel8 created
> Waiting for PVC rhel8 upload pod to be ready...
> timed out waiting for the condition

If storage-class=ocs-storagecluster-ceph-rbd, then you need to set volumeMode="Block".

So, could you recheck this issue?

Comment 16 Natalie Gavrielov 2020-12-28 13:34:43 UTC
Hi Guohua,

It's not clear to me what the issue is.
The bug is missing "Expected results" and the information about the data volume you are trying to create.
From other comments, I'm guessing you tried using the storage class "standard". Not every storage class supports the functionality you're asking for, so it's worth checking the documentation (the supported matrix plus operations is at [1]).

(In reply to Guohua Ouyang from comment #13)
> I don't know what're supported. 
> @Tomas @Adam Do we have a matrix for what sc/accessMode/volumeMode are
> supported when do CDI imports, or is it just a bug if upload is failed?
It depends on the user's needs (what kind of storage features are needed) [1].
AFAIK, we do not have a document specifying such a matrix (and I'm not sure we want one).
Upload times out when it gets values it cannot satisfy (such as the wrong volume mode).

> Is it due to https://github.com/kubevirt/kubevirt/issues/4601?
No, that is only related to HPP, which does not seem to be used here.


[1] https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/virtual_disks/virt-features-for-storage.html

Comment 17 Maya Rashish 2020-12-28 15:06:00 UTC
Hi all, in CNV 2.6.0 we added ceph rbd to the storage matrix that we do have:
https://github.com/kubevirt/hyperconverged-cluster-operator/blob/2fcfb51277e828ce786b6fe2cc6cceb246bea755/pkg/controller/operands/cdi.go#L315-L341

So the UI should have the information that it needs to determine that it should use VolumeMode: Block and accessMode: ReadWriteMany
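
As a rough illustration of the lookup described above, the sketch below resolves the volumeMode and accessMode for a storage class from a flat key/value map. The key layout (`<storageClass>.volumeMode`, `<storageClass>.accessMode`, plus bare fallback keys), the fallback values, and the function name are assumptions made for illustration; only the ceph rbd values (Block, ReadWriteMany) come from this comment.

```python
from typing import Dict, Tuple

# Hypothetical defaults map. Only the ceph rbd entries are taken from this bug;
# the key layout and the global fallback values are assumptions.
STORAGE_DEFAULTS: Dict[str, str] = {
    "volumeMode": "Filesystem",      # assumed global fallback
    "accessMode": "ReadWriteOnce",   # assumed global fallback
    "ocs-storagecluster-ceph-rbd.volumeMode": "Block",
    "ocs-storagecluster-ceph-rbd.accessMode": "ReadWriteMany",
}


def resolve_defaults(storage_class: str, defaults: Dict[str, str]) -> Tuple[str, str]:
    """Return (volumeMode, accessMode), preferring per-class keys over global ones."""
    volume_mode = defaults.get(f"{storage_class}.volumeMode", defaults["volumeMode"])
    access_mode = defaults.get(f"{storage_class}.accessMode", defaults["accessMode"])
    return volume_mode, access_mode


if __name__ == "__main__":
    print(resolve_defaults("ocs-storagecluster-ceph-rbd", STORAGE_DEFAULTS))
```

A UI could preselect the resolved pair when the user picks a storage class, rather than leaving both fields at a generic default.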

Comment 18 Guohua Ouyang 2020-12-29 01:52:39 UTC
@ycui @natalie The OpenShift Virtualization storage feature matrix is not sufficient regarding NFS.
E.g.: uploading an image with nfs/rwo/filesystem works;
    uploading an image with nfs/rwo/block does not. When the SC nfs is selected in the UI, accessMode offers only RWO, which is good, but volumeMode offers both "Block" and "Filesystem". Without clear documentation, it is hard even for the UI to implement the SC/AccessMode/VolumeMode selection.

IMHO, we need different storage feature matrices for different areas, e.g. one page for VM live migration, one page for clone and one page for CDI upload. WDYT?



1. ./virtctl image-upload dv rhel6 --size=10Gi --image-path=./cirros.img --storage-class=nfs --volume-mode=Filesystem --access-mode=ReadWriteOnce --insecure -n openshift-virtualization-os-images
DataVolume openshift-virtualization-os-images/rhel6 created
Waiting for PVC rhel6 upload pod to be ready...
Pod now ready
Uploading data to https://cdi-uploadproxy-openshift-cnv.apps.sys01.cnv-qe.rhcloud.com

 12.13 MiB / 12.13 MiB [===================================================================================================] 100.00% 14s

Uploading data completed successfully, waiting for processing to complete, you can hit ctrl-c without interrupting the progress
Processing completed successfully

2. ./virtctl image-upload dv rhel6 --size=10Gi --image-path=./cirros.img --storage-class=nfs --volume-mode=Block --access-mode=ReadWriteOnce --insecure
DataVolume default/rhel6 created
Waiting for PVC rhel6 upload pod to be ready...
timed out waiting for the condition

Comment 19 Guohua Ouyang 2020-12-29 02:02:13 UTC
(In reply to Natalie Gavrielov from comment #16)
> Hi Guohua,
> 
> It's not so clear to me what is the issue?

Sorry for the confusion; the issue shifted from UI to storage.
The original issue is that uploading an image gets stuck forever if the combination of StorageClass/AccessMode/VolumeMode is not supported.
We need clear documentation so that users can make a choice or UI developers can implement the filter. The documentation [1] is not good enough, as I mentioned in c#18.

> The bug is missing "Expected results" and the information about the data
> volume you are trying to create.
> From other commants, I'm guessing you tried using storage class "standard"
> (This is the supported matrix plus operations [1]), not every storage class
> supports
> the functionality your asking for and it's worth checking the documantation.
> 
> (In reply to Guohua Ouyang from comment #13)
> > I don't know what're supported. 
> > @Tomas @Adam Do we have a matrix for what sc/accessMode/volumeMode are
> > supported when do CDI imports, or is it just a bug if upload is failed?
> Depends on the user needs (what kind of storage features it needs) [1]
> AFAIK, we do not have a document specifying such matrix (and not sure we
> wish to have such)


> Upload times out when it get's values it cannot satisfy (such as wrong
> volume mode..)
> 
> > Is it due to https://github.com/kubevirt/kubevirt/issues/4601?
> No, this is only related to HPP which is not seems to be used here.
> 
> 
[1] https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/virtual_disks/virt-features-for-storage.html

Comment 20 Yaacov Zamir 2020-12-29 05:50:00 UTC
Removing the blocker+ flag because this issue will not happen if users follow the documentation.

The fix can be:
a - add a hint in the UI [modals that add/edit storage] linking to the documentation (provided by Ying in comment #15)
b - show a hint explaining why we suggest this combination of sc/accessMode/volumeMode and disable the others, using the config map (provided by Maya in comment #17)

@Guohua @Gilad WDYT?

Comment 21 Ying Cui 2020-12-29 09:51:19 UTC
(In reply to Guohua Ouyang from comment #18)
> @ycui @natalie The OpenShift Virtualization storage feature matrix is not
> suffient about nfs.
> Eg: upload image with nfs/rwo/filesystem is working
>     upload image with nfs/rwo/block is not working. When SC nfs is selected
> on UI, accessMode only have RWO which is good, but the volumeMode have both
> "Block" and "Filesystem". Without a clear documentation, UI is even hard to
> work on the SC/AccessMode/VolueMode selection.
> 
> IMHO, we need different storage feature matrix for different areas, like
> once page for VM live migration, once page for clone and one page for CDI
> upload. wdyt?
> 
> 
> 
> 1. ./virtctl image-upload dv rhel6 --size=10Gi --image-path=./cirros.img
> --storage-class=nfs --volume-mode=Filesystem --access-mode=ReadWriteOnce
> --insecure -n openshift-virtualization-os-images
> DataVolume openshift-virtualization-os-images/rhel6 created
> Waiting for PVC rhel6 upload pod to be ready...
> Pod now ready
> Uploading data to
> https://cdi-uploadproxy-openshift-cnv.apps.sys01.cnv-qe.rhcloud.com
> 
>  12.13 MiB / 12.13 MiB
> [============================================================================
> =======================] 100.00% 14s
> 
> Uploading data completed successfully, waiting for processing to complete,
> you can hit ctrl-c without interrupting the progress
> Processing completed successfully
> 
> 2. ./virtctl image-upload dv rhel6 --size=10Gi --image-path=./cirros.img
> --storage-class=nfs --volume-mode=Block --access-mode=ReadWriteOnce
> --insecure
> DataVolume default/rhel6 created
> Waiting for PVC rhel6 upload pod to be ready...
> timed out waiting for the condition

NFS supports only the Filesystem volumeMode, not Block.
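
To make the failure mode concrete, here is a minimal sketch of a pre-flight check that reports an unsupported volumeMode up front instead of letting the upload time out. The table holds only the combinations reported in this bug and is not an official support matrix; the names `WORKING_VOLUME_MODES` and `check_volume_mode` are hypothetical.

```python
from typing import Dict, Set

# volumeModes reported as working in this bug; NOT an official support matrix.
WORKING_VOLUME_MODES: Dict[str, Set[str]] = {
    "nfs": {"Filesystem"},
    "ocs-storagecluster-ceph-rbd": {"Block"},
}


def check_volume_mode(storage_class: str, volume_mode: str) -> str:
    """Classify a storage class / volumeMode pair as 'ok', 'unsupported' or 'unknown'."""
    known = WORKING_VOLUME_MODES.get(storage_class)
    if known is None:
        # Nothing is recorded for this storage class, so make no claim either way.
        return "unknown"
    return "ok" if volume_mode in known else "unsupported"


if __name__ == "__main__":
    for sc, mode in [("nfs", "Filesystem"), ("nfs", "Block"), ("standard", "Block")]:
        print(sc, mode, check_volume_mode(sc, mode))
```

Returning "unknown" for unlisted storage classes keeps the check from blocking combinations that nobody has tested.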

Comment 22 Ying Cui 2020-12-29 09:52:14 UTC
(In reply to Guohua Ouyang from comment #19)
> (In reply to Natalie Gavrielov from comment #16)
> > Hi Guohua,
> > 
> > It's not so clear to me what is the issue?
> 
> Sorry if it confuses as the issue transfer from UI to storage.
> The original issue is uploading an image is stucking forever if the combined
> STORAGECLASS/AccessMode/VolumeMode is not supported.
> We need a clear documentation for user to make choice or for UI dev to work
> on the filter, the documentation [1] is not good enough as I mentioned in
> c#18.
> 
> > The bug is missing "Expected results" and the information about the data
> > volume you are trying to create.
> > From other commants, I'm guessing you tried using storage class "standard"
> > (This is the supported matrix plus operations [1]), not every storage class
> > supports
> > the functionality your asking for and it's worth checking the documantation.
> > 
> > (In reply to Guohua Ouyang from comment #13)
> > > I don't know what're supported. 
> > > @Tomas @Adam Do we have a matrix for what sc/accessMode/volumeMode are
> > > supported when do CDI imports, or is it just a bug if upload is failed?
> > Depends on the user needs (what kind of storage features it needs) [1]
> > AFAIK, we do not have a document specifying such matrix (and not sure we
> > wish to have such)

Bug 1901616 - Storage support matrix for CNV is unclear. [ASSIGNED], FYI. 


> 
> 
> > Upload times out when it get's values it cannot satisfy (such as wrong
> > volume mode..)
> > 
> > > Is it due to https://github.com/kubevirt/kubevirt/issues/4601?
> > No, this is only related to HPP which is not seems to be used here.
> > 
> > 
> [1]
> https://docs.openshift.com/container-platform/4.6/virt/virtual_machines/
> virtual_disks/virt-features-for-storage.html

Comment 24 Guohua Ouyang 2020-12-30 05:38:55 UTC
I didn't see any difference in the advanced section of the add source page.

Comment 25 Guohua Ouyang 2020-12-30 12:04:45 UTC
Created attachment 1743198 [details]
Looks good on "Add disk" dialog

Comment 26 Guohua Ouyang 2020-12-30 12:06:10 UTC
Created attachment 1743199 [details]
Not fix on 'Add boot source to template'

Comment 27 Yaacov Zamir 2020-12-30 17:45:42 UTC
*** Bug 1911617 has been marked as a duplicate of this bug. ***

Comment 28 Yaacov Zamir 2020-12-30 17:48:37 UTC
Note when verifying:

The fix should also cover these steps from bug 1911617:

Steps to Reproduce:
1. Make sure none of the defined storage classes has the "storageclass.kubernetes.io/is-default-class" annotation (no default storage class).
2. Go to VM template, and try to add source to one of the templates.
3. On the "Add boot source to template" dialog box, click advanced and verify that the storage class field shows one of the defined storage classes (but don't touch that field).
4. Press "save and upload".

Actual results:
A new data volume is created, but the PVC spec doesn't include the storage class highlighted in the dialog box. As a result, the PVC stays pending because there are no available PVs.

Expected results:
The explicitly selected storage class should be used when creating the data volume for the template source. 


Additional info:
When the storage class drop-down list is opened and a storage class is explicitly selected, the DV is created as expected.
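
The expected behaviour above can be sketched as follows: the storage class shown in the dialog is written into the DataVolume's PVC spec even when the user never touched the field. This is an illustrative sketch, not the console's actual code. The function name and parameters are hypothetical, and the `cdi.kubevirt.io/v1beta1` API version is an assumption about the cluster.

```python
from typing import Any, Dict, Optional


def build_data_volume(
    name: str,
    namespace: str,
    url: str,
    size: str,
    displayed_storage_class: Optional[str],
    selected_storage_class: Optional[str] = None,
    access_mode: str = "ReadWriteOnce",
) -> Dict[str, Any]:
    """Build a DataVolume manifest for an HTTP import as a plain dict."""
    pvc: Dict[str, Any] = {
        "accessModes": [access_mode],
        "resources": {"requests": {"storage": size}},
    }
    # Prefer the explicit selection, but fall back to the class the dialog displays,
    # so the PVC does not end up without a storage class on clusters with no default.
    storage_class = selected_storage_class or displayed_storage_class
    if storage_class:
        pvc["storageClassName"] = storage_class
    return {
        "apiVersion": "cdi.kubevirt.io/v1beta1",  # assumed API version
        "kind": "DataVolume",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"source": {"http": {"url": url}}, "pvc": pvc},
    }


if __name__ == "__main__":
    dv = build_data_volume(
        "rhel8", "openshift-virtualization-os-images",
        "http://example.com/disk.img", "10Gi", displayed_storage_class="nfs",
    )
    print(dv["spec"]["pvc"])
```

With no default storage class on the cluster, omitting `storageClassName` leaves the PVC pending, which matches the actual result described above.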

Comment 30 Guohua Ouyang 2020-12-31 00:52:25 UTC
The issue is fixed on 'Add boot source to template'.

Do you have an example SC YAML to test bug 1911617?

Comment 31 Yaacov Zamir 2020-12-31 05:06:33 UTC
The fix for this bug **will not** fix the use case described in bug 1911617.

I re-opened bug 1911617, and we will have a separate fix for that use case.

Comment 32 Gilad Lekner 2021-01-19 09:38:47 UTC
Clearing redundant needinfo.

Comment 35 errata-xmlrpc 2021-02-24 15:44:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

