Bug 2025878

Summary: The import cron pod is not deleted after deleting the DataImportCron if the import failed
Product: Container Native Virtualization (CNV)
Component: Storage
Version: 4.10.0
Status: CLOSED ERRATA
Severity: high
Priority: unspecified
Target Release: 4.10.0
Hardware: Unspecified
OS: Unspecified
Reporter: Yan Du <yadu>
Assignee: Arnon Gilboa <agilboa>
QA Contact: Yan Du <yadu>
CC: cnv-qe-bugs, dholler, mrashish
Fixed In Version: CNV v4.10.0-605
Doc Type: If docs needed, set a value
Last Closed: 2022-03-16 15:56:33 UTC
Type: Bug

Description Yan Du 2021-11-23 09:40:37 UTC
Description of problem:
The import cron pod is not deleted after the DataImportCron is deleted if the import failed.

Version-Release number of selected component (if applicable):
CNV 4.10

How reproducible:
Always

Steps to Reproduce:
1. Create a DataImportCron that references a non-existent ConfigMap
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataImportCron
metadata:
  name: fedora-image-import-cron
  namespace: openshift-virtualization-os-images
spec:
  template:
    spec:
      source:
        registry:
          url: "docker://quay.io/kubevirt/fedora-cloud-registry-disk-demo:latest"
          pullMethod: node
          certConfigMap: no-certs
      storage:
        resources:
          requests:
            storage: 5Gi
        storageClassName: hostpath-provisioner
  schedule: "* * * * *"
  garbageCollect: Outdated
  managedDataSource: fedora

2. Check the CronJobs
$ oc get cronjobs
NAMESPACE                              NAME                                SCHEDULE       SUSPEND   ACTIVE   LAST SCHEDULE   AGE
openshift-cnv                          fedora-image-import-cron-415f66b2   * * * * *      False     0        <none>          36s

3. Check the import pod in the openshift-cnv namespace. The import fails because the ConfigMap does not exist.

4. Delete the DataImportCron
$ oc delete DataImportCron fedora-image-import-cron

5. Check the CronJobs. The CronJob is deleted after the DataImportCron is deleted.
 
6. Check the import pod in the openshift-cnv namespace again


Actual results:
The pod stays in CreateContainerError and is never deleted, even though the DataImportCron has been deleted.
$ oc get pod -n openshift-cnv | grep fedora
fedora-image-import-cron-415f66b2-27294271--1-7wvrw             0/1     CreateContainerError   0            35m


Expected results:
The import pod should be deleted 
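
A minimal sketch for the check in step 6, assuming the output of `oc get pod -n openshift-cnv` has been captured as text. The function name `leftover_pods` and the name-prefix matching are illustrative assumptions, not part of CDI; the sample line is the one from the actual results above.

```python
def leftover_pods(listing: str, cron_name: str) -> list[tuple[str, str]]:
    """Return (pod name, status) for pods whose name starts with cron_name."""
    found = []
    for line in listing.splitlines():
        fields = line.split()
        # Expected columns: NAME READY STATUS RESTARTS AGE
        if len(fields) < 5 or fields[0] == "NAME":
            continue
        name, status = fields[0], fields[2]
        # Assumption: pods spawned for the cron are named "<cron name>-...".
        if name.startswith(cron_name + "-"):
            found.append((name, status))
    return found


# Sample taken from the "Actual results" listing in this report.
SAMPLE = """\
fedora-image-import-cron-415f66b2-27294271--1-7wvrw             0/1     CreateContainerError   0            35m
"""

for name, status in leftover_pods(SAMPLE, "fedora-image-import-cron"):
    print(f"leftover pod: {name} ({status})")
```

An empty result after the DataImportCron is deleted would correspond to the expected behavior; any entry is a pod that was left behind.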

Additional info:

Comment 1 Arnon Gilboa 2021-11-30 16:28:00 UTC
Yan, when I reproduce this in my kubevirt CI environment, I see the cronjob pod (in your case, too, it is the cronjob pod rather than the importer pod) stuck in ContainerCreating, with the following warning in its describe output:
  Warning  FailedMount  4m6s                  kubelet            Unable to attach or mount volumes: unmounted volumes=[cdi-cert-vol], unattached volumes=[cdi-cert-vol kube-api-access-d4tjp]: timed out waiting for the condition

What warning do you see in your pod's describe output?
It's interesting that your pod got to CreateContainerError, unlike mine.
How long did it take to reach this status?

Comment 2 Yan Du 2021-12-01 03:35:04 UTC
It seems the pod is in Error status now:
$ oc get pod -n openshift-cnv | grep fedora
fedora-image-import-cron-5981a50c-27296764--1-bjps8             0/1     Error               0               6d
fedora-image-import-cron-5981a50c-27296764--1-ftgxf             0/1     ContainerCreating   0               6d
fedora-image-import-cron-5981a50c-27296764--1-kpmhn             0/1     Error               0               6d
fedora-image-import-cron-5981a50c-27296764--1-mctdc             0/1     Error               0               6d
fedora-image-import-cron-9bb74e4e-27296766--1-kw8jv             0/1     ContainerCreating   0               6d
fedora-image-import-cron-fb0f3696-27305437--1-x7x5m             0/1     ContainerCreating   0               20m


Pod describe output:
$ oc describe pod fedora-image-import-cron-5981a50c-27296764--1-bjps8 -n openshift-cnv
Name:         fedora-image-import-cron-5981a50c-27296764--1-bjps8
Namespace:    openshift-cnv
Priority:     0
Node:         yadu-kt9ms-worker-0-tvdwk/192.168.0.209
Start Time:   Thu, 25 Nov 2021 02:04:07 +0000
Labels:       controller-uid=8b956f0d-fa45-4dec-ba4d-e2de0aaaf72c
              job-name=fedora-image-import-cron-5981a50c-27296764
Annotations:  k8s.v1.cni.cncf.io/network-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.134"
                    ],
                    "default": true,
                    "dns": {}
                }]
              k8s.v1.cni.cncf.io/networks-status:
                [{
                    "name": "openshift-sdn",
                    "interface": "eth0",
                    "ips": [
                        "10.129.2.134"
                    ],
                    "default": true,
                    "dns": {}
                }]
              openshift.io/scc: restricted
Status:       Failed
IP:           10.129.2.134
IPs:
  IP:           10.129.2.134
Controlled By:  Job/fedora-image-import-cron-5981a50c-27296764
Containers:
  cdi-source-update-poller:
    Container ID:  cri-o://98fbaf43b7bee5e3787300769bdc5713b879146150344c8c8d39053c790894c2
    Image:         registry.redhat.io/container-native-virtualization/virt-cdi-importer@sha256:be877b23a9c5df1e7374e6b1029d1e71a5f02e2ffc6788ea7d5648684be4d1a0
    Image ID:      registry.redhat.io/container-native-virtualization/virt-cdi-importer@sha256:ae1c9df86982ae0a650f08cc49faeee4b07dd004e44e5c8d4d86ffb0236ab174
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/cdi-source-update-poller
      -ns
      openshift-virtualization-os-images
      -cron
      fedora-image-import-cron
      -url
      docker://quay.io/kubevirt/fedora-cloud-registry-disk-demo:latest
      -certdir
      /certs
    State:          Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Thu, 25 Nov 2021 02:04:10 +0000
      Finished:     Thu, 25 Nov 2021 02:04:11 +0000
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /certs from cdi-cert-vol (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xwrlz (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  cdi-cert-vol:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      some-certs
    Optional:  false
  kube-api-access-xwrlz:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:                      <none>
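
To correlate a leftover pod with the schedule tick that created it, the Job name from the `job-name` label above can be decoded. This is a sketch under one assumption: that the Job name follows the upstream Kubernetes CronJob controller convention of `<cronjob-name>-<scheduled time in minutes since the Unix epoch>`. The helper name `scheduled_time` is hypothetical.

```python
from datetime import datetime, timezone


def scheduled_time(job_name: str) -> tuple[str, datetime]:
    """Split a Job name into (CronJob name, scheduled time in UTC).

    Assumes the name has the form <cronjob-name>-<minutes since epoch>.
    """
    cronjob_name, _, minutes = job_name.rpartition("-")
    if not cronjob_name or not minutes.isdigit():
        raise ValueError(f"unexpected Job name: {job_name!r}")
    when = datetime.fromtimestamp(int(minutes) * 60, tz=timezone.utc)
    return cronjob_name, when


# Job name taken from the job-name label in the describe output.
cronjob, when = scheduled_time("fedora-image-import-cron-5981a50c-27296764")
print(cronjob, when.isoformat())
```

For this Job the decoded time is 2021-11-25 02:04:00 UTC, which lines up with the pod's Start Time of 02:04:07.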


$ oc logs fedora-image-import-cron-5981a50c-27296764--1-bjps8 -n openshift-cnv
I1125 02:04:10.672251       1 transport.go:227] Inspecting image from 'docker://quay.io/kubevirt/fedora-cloud-registry-disk-demo:latest'
Digest is sha256:6f5afce978111b0968d1d718435df7ef4f0b266715acd41620321b6ed3c28ad6
W1125 02:04:10.892811       1 client_config.go:614] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2021/11/25 02:04:10 Failed getting DataImportCron, cronNamespace openshift-virtualization-os-images cronName fedora-image-import-cron: dataimportcrons.cdi.kubevirt.io "fedora-image-import-cron" not found

Comment 3 Maya Rashish 2022-01-09 10:37:56 UTC
The PR was closed; moving back to ASSIGNED.

Comment 4 Yan Du 2022-01-21 10:26:30 UTC
Tested on CNV v4.10.0-605; the issue has been fixed.

Comment 9 errata-xmlrpc 2022-03-16 15:56:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947