Bug 2054782 - DataImportCron status does not show failure when failing to create dataSource
Summary: DataImportCron status does not show failure when failing to create dataSource
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: low
Target Milestone: ---
Target Release: 4.10.1
Assignee: Arnon Gilboa
QA Contact: Yan Du
URL:
Whiteboard:
Depends On:
Blocks: 2083039 2091982
Reported: 2022-02-15 17:03 UTC by Ruth Netser
Modified: 2022-05-31 12:54 UTC

Fixed In Version: CNV v4.10.1-75
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-18 20:27:03 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Github kubevirt containerized-data-importer pull 2220 0 None Merged [release-v1.43] Delete erroneous DVs on DataImportCron digest update 2022-04-10 11:22:12 UTC
Github kubevirt containerized-data-importer pull 2239 0 None Merged [release-v1.43] Fix DataImportCron watch race 2022-04-15 12:49:53 UTC
Github kubevirt containerized-data-importer pull 2249 0 None Merged [release-v1.43] Tighten DataImportCron & DataSource max name length 2022-04-28 05:59:48 UTC
Red Hat Product Errata RHSA-2022:4668 0 None None None 2022-05-18 20:27:23 UTC

Description Ruth Netser 2022-02-15 17:03:21 UTC
Description of problem:
If dataSource creation fails (e.g., because the name is too long), the dataImportCron status does not provide any information about the failure.

Version-Release number of selected component (if applicable):
CDI v4.10.0-89

How reproducible:
100%

Steps to Reproduce:
1. Add a custom DataImportCron to HCO with a long name (namespace + name longer than 63 characters)

Actual results:
The DataImportCron is created, but the dataSource is not.
The DataImportCron gives no indication of the failure.

Expected results:
DataImportCron status should provide information about the failure

Additional info:
============== Added DataImportCron to HCO ================
    dataImportCronTemplates:
    - metadata:
        annotations:
          cdi.kubevirt.io/storage.bind.immediate.requested: "true"
        name: data-import-cron-with-non-existing-source
      spec:
        managedDataSource: non-existing-url-data-source
        retentionPolicy: None
        schedule: '* * * * *'
        template:
          spec:
            source:
              registry:
                pullMethod: node
                url: docker://non-existing-url
            storage:
              resources:
                requests:
                  storage: 10Gi

==============  DataImportCron  ================
$ oc get dic -n openshift-virtualization-os-images data-import-cron-with-non-existing-source -oyaml
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  annotations:
    cdi.kubevirt.io/storage.bind.immediate.requested: "true"
    operator-sdk/primary-resource: openshift-cnv/ssp-kubevirt-hyperconverged
    operator-sdk/primary-resource-type: SSP.ssp.kubevirt.io
  creationTimestamp: "2022-02-15T16:40:44Z"
  generation: 1
  labels:
    app.kubernetes.io/component: templating
    app.kubernetes.io/managed-by: ssp-operator
    app.kubernetes.io/name: data-sources
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.10.0
  name: data-import-cron-with-non-existing-source
  namespace: openshift-virtualization-os-images
  resourceVersion: "3025492"
  uid: 2e690d84-3b2a-4c2c-8f53-4754e07d5abd
spec:
  managedDataSource: non-existing-url-data-source
  retentionPolicy: None
  schedule: '* * * * *'
  template:
    metadata: {}
    spec:
      source:
        registry:
          pullMethod: node
          url: docker://non-existing-url
      storage:
        resources:
          requests:
            storage: 10Gi
    status: {}
status: {}
$

==============  cdi-deployment log  ================
{"level":"error","ts":1644938729.9672325,"logger":"controller.dataimportcron-controller","msg":"Reconciler error","name":"data-import-cron-with-non-existing-source","namespace":"openshift-virtualization-os-images","error":"CronJob.batch \"data-import-cron-with-non-existing-source-49454b5a\" is invalid: metadata.labels: Invalid value: \"openshift-virtualization-os-images.data-import-cron-with-non-existing-source\": must be no more than 63 characters","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
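The log above rejects a label value that appears to be built as `<namespace>.<name>`, and Kubernetes limits label values to 63 characters. A minimal sketch of that length check follows, assuming the value is formed exactly as it appears in the log; the helper names are hypothetical and are not taken from the CDI code.

```python
MAX_LABEL_VALUE_LEN = 63  # Kubernetes limit on label values


def cron_label_value(namespace: str, name: str) -> str:
    """Hypothetical helper: builds the value in the form seen in the log."""
    return f"{namespace}.{name}"


def fits_label_limit(namespace: str, name: str) -> bool:
    """Return True if the combined value fits within the label limit."""
    return len(cron_label_value(namespace, name)) <= MAX_LABEL_VALUE_LEN


if __name__ == "__main__":
    ns = "openshift-virtualization-os-images"
    name = "data-import-cron-with-non-existing-source"
    # Report the combined length and whether it is acceptable.
    print(len(cron_label_value(ns, name)), fits_label_limit(ns, name))
```

Running the same check before adding a DataImportCron template to HCO would flag an over-long name without waiting for the controller to fail.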

Comment 1 Yan Du 2022-02-16 13:33:42 UTC
There are two aspects to this:
1. We should report failures of this kind in the dataimportcron conditions.
2. We should use hashing when naming the datasource, similar to how the datavolume controller creates the PVC.

Comment 2 Ruth Netser 2022-02-17 07:37:07 UTC
(In reply to Yan Du from comment #1)
> There are two aspects to this:
> 1. We should report failures of this kind in the dataimportcron conditions.
> 2. We should use hashing when naming the datasource, similar to how the
> datavolume controller creates the PVC.
This cannot be done; common templates refer to given dataSource names. This is why dataSources were introduced: a resource with a static name that dynamically points to a PVC (which can be replaced).
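For reference, the hashing idea from comment 1 can be sketched as below. This is only an illustration of truncating a name and appending a short hash so that it stays within 63 characters; the function name, the SHA-256 choice, and the 8-character suffix are assumptions, not the CDI implementation. As noted above, it does not suit dataSources, because templates need a predictable static name.

```python
import hashlib

MAX_NAME_LEN = 63  # limit discussed in this bug


def hashed_name(name: str, max_len: int = MAX_NAME_LEN, hash_len: int = 8) -> str:
    """Hypothetical helper: shorten an over-long name deterministically."""
    if len(name) <= max_len:
        return name  # short names are left untouched
    # Hash the full name so distinct long names keep distinct suffixes.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:hash_len]
    # Leave room for "-" plus the digest, and avoid a doubled separator.
    prefix = name[: max_len - hash_len - 1].rstrip("-.")
    return f"{prefix}-{digest}"


if __name__ == "__main__":
    long_name = "openshift-virtualization-os-images.data-import-cron-with-non-existing-source"
    print(hashed_name(long_name))
```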

Comment 4 Yan Du 2022-04-13 01:21:05 UTC
Tested on CNV-v4.10.1-62; got the error below after setting a custom dataimportcron:

{"level":"error","ts":1649760447.0815444,"logger":"controller.dataimportcron-controller","msg":"Reconciler error","name":"","namespace":"openshift-virtualization-os-images","error":"values[0][cdi.kubevirt.io/dataImportCron]: Invalid value: \"openshift-virtualization-os-images.\": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue',  or 'my_value',  or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')","errorCauses":[{"error":"values[0][cdi.kubevirt.io/dataImportCron]: Invalid value: \"openshift-virtualization-os-images.\": a valid label must be an empty string or consist of alphanumeric characters, '-', '_' or '.', and must start and end with an alphanumeric character (e.g. 'MyValue',  or 'my_value',  or '12345', regex used for validation is '(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')"}],"stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:227"}
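The error above shows a label value with an empty name part (`openshift-virtualization-os-images.`), which fails validation because it ends with a `.`. A minimal sketch reproducing that check, using the regex quoted in the log message; the function name is hypothetical, and the 63-character bound is the limit reported earlier in this bug.

```python
import re

# Regex copied from the validation message in the log above.
LABEL_VALUE_RE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    """Hypothetical helper: apply the length limit and the quoted regex."""
    return len(value) <= 63 and LABEL_VALUE_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("openshift-virtualization-os-images.", "my_value", ""):
        print(repr(candidate), is_valid_label_value(candidate))
```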

Comment 5 Maya Rashish 2022-04-18 13:52:47 UTC
Updated "Fixed In Version" to a bundle that includes PR 2239.

Comment 6 Yan Du 2022-04-20 06:39:34 UTC
Tested on CNV v4.10.1-78; the issue has been fixed.

Comment 8 Yan Du 2022-04-28 07:02:13 UTC
Need to retest on a build that contains PR 2249.

Comment 9 Yan Du 2022-05-09 06:35:54 UTC
Verified on CNV-v4.10.1-101

Comment 10 Yan Du 2022-05-09 07:01:19 UTC
We can see a recurring Reconciler error in the cdi-deployment pod after adding a custom dataimportcron to HCO, until the import finishes.
Talked with Arnon: since it doesn't affect the import function, let's track the issue in #2083039.

Comment 15 errata-xmlrpc 2022-05-18 20:27:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.1 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:4668

