Bug 1862453

Summary: The cron job resources are not purged after deleting the corresponding compliancesuite object
Product: OpenShift Container Platform
Reporter: xiyuan
Component: Compliance Operator
Assignee: Juan Antonio Osorio <josorior>
Status: CLOSED ERRATA
QA Contact: Prashant Dhamdhere <pdhamdhe>
Severity: medium
Priority: medium
Version: 4.6
CC: jhrozek, josorior, mrogers, nkinder, xiyuan
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-10-27 16:21:52 UTC
Type: Bug

Description xiyuan 2020-07-31 13:23:40 UTC
Description of problem:
The cron job resources are not purged after the corresponding compliancesuite object is deleted.


Cluster version:
4.6.0-0.nightly-2020-07-25-091217

How reproducible:
Always

Steps to reproduce:
1. Install the Compliance Operator:
 1.1 Clone the compliance-operator Git repository.
 $ git clone https://github.com/openshift/compliance-operator.git
 1.2 Create the 'openshift-compliance' namespace.
 $ oc create -f compliance-operator/deploy/ns.yaml
 1.3 Switch to the 'openshift-compliance' namespace.
 $ oc project openshift-compliance
 1.4 Deploy the CustomResourceDefinitions.
 $ for f in compliance-operator/deploy/crds/*crd.yaml; do oc create -f "$f"; done
 1.5 Deploy the compliance-operator.
 $ oc create -f compliance-operator/deploy/

2. Deploy a ComplianceSuite CR with a schedule that reruns the suite every 3 minutes:
 $ oc create -f - <<EOF
 apiVersion: compliance.openshift.io/v1alpha1
 kind: ComplianceSuite
 metadata:
   name: example-compliancesuite1
 spec:
   autoApplyRemediations: false
   schedule: "*/3 * * * *"
   scans:
     - name: worker-scan
       profile: xccdf_org.ssgproject.content_profile_moderate
       content: ssg-rhcos4-ds.xml
       contentImage: quay.io/complianceascode/ocp4:latest
       rule: "xccdf_org.ssgproject.content_rule_no_netrc_files"
       debug: true
       nodeSelector:
         node-role.kubernetes.io/worker: ""
 EOF
3. After about 10 minutes, delete the ComplianceSuite object:
 $ oc delete compliancesuite example-compliancesuite1

4. Check the resources in the 'openshift-compliance' namespace:
 $ oc get all
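
Steps 3 and 4 can be scripted so that only the suite's rerunner resources are listed. This is a minimal sketch and not part of the original report: the function name `delete_suite_and_list_leftovers` is hypothetical, and it assumes `oc` is logged in to the cluster and that rerunner resources are named `<suite>-rerunner...`, as in the output shown in this report.

```shell
# Hypothetical helper (not from the report): delete a ComplianceSuite and
# print any rerunner cron jobs, jobs or pods that are still present.
# Assumes `oc` is logged in; the namespace defaults to openshift-compliance.
delete_suite_and_list_leftovers() {
  suite="$1"
  ns="${2:-openshift-compliance}"

  # Step 3: delete the ComplianceSuite object (messages go to stderr so that
  # stdout carries only the leftover resource names).
  oc delete compliancesuite "$suite" -n "$ns" >&2 || return 1

  # Step 4: list resources as kind/name and keep only "<suite>-rerunner*".
  # `|| true` keeps the exit status at 0 when nothing is left over.
  oc get cronjobs,jobs,pods -n "$ns" -o name \
    | grep -F -- "/${suite}-rerunner" || true
}
```

An empty result means everything was purged. Any printed name is a leftover resource. Deletion of dependent resources may take a moment, so a check made immediately after the delete could still list resources that are about to be removed.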

Actual result:
The cron job resources are not purged after the corresponding compliancesuite object is deleted.

$ oc get all
NAME                                                     READY   STATUS      RESTARTS   AGE
pod/example-compliancesuite1-rerunner-1596198000-wbbkr   0/1     Completed   0          11m
pod/example-compliancesuite1-rerunner-1596198300-4x4gt   0/1     Completed   0          6m26s
pod/example-compliancesuite1-rerunner-1596198600-k2dfm   0/1     Completed   0          85s
...
NAME                                                     COMPLETIONS   DURATION   AGE
job.batch/example-compliancesuite1-rerunner-1596198000   1/1           7s         11m
job.batch/example-compliancesuite1-rerunner-1596198300   1/1           7s         6m30s
job.batch/example-compliancesuite1-rerunner-1596198600   1/1           8s         89s

Expected result:
After the compliancesuite object is deleted, the corresponding cron job resources should be purged.

Comment 5 Prashant Dhamdhere 2020-09-09 11:51:52 UTC
Hi Juan,

After deleting the compliancesuite object, the rerunner pods are removed, but the rerunner
jobs are still there. These jobs should be removed along with the compliancesuite object.

OCP version: 4.6.0-0.nightly-2020-09-08-123737
Compliance Operator: v0.1.15

$ oc get pods

NAME                                                         READY   STATUS      RESTARTS   AGE
aggregator-pod-worker-scan                                   0/1     Completed   0          20s
compliance-operator-869646dd4f-ls7nx                         1/1     Running     0          15m
example-compliancesuite-rerunner-1599646860-ffwkp            0/1     Completed   0          7m22s
example-compliancesuite-rerunner-1599647040-477s8            0/1     Completed   0          4m21s
example-compliancesuite-rerunner-1599647220-5mmqs            0/1     Completed   0          81s
ocp4-pp-6786c5f5b-dr6lf                                      1/1     Running     0          14m
rhcos4-pp-78c8cc9d44-76rcd                                   1/1     Running     0          14m
worker-scan-ip-10-0-156-82.us-east-2.compute.internal-pod    0/2     Completed   0          70s
worker-scan-ip-10-0-177-248.us-east-2.compute.internal-pod   0/2     Completed   0          70s
worker-scan-ip-10-0-223-164.us-east-2.compute.internal-pod   0/2     Completed   0          70s
worker-scan-rs-7ffc9b88f6-w6dx4                              1/1     Running     0          70s

$ oc get cronjob
NAME                               SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
example-compliancesuite-rerunner   */3 * * * *   False     0        102s            9m37s

$ oc get jobs --watch
NAME                                          COMPLETIONS   DURATION   AGE
example-compliancesuite-rerunner-1599646860   1/1           8s         7m43s
example-compliancesuite-rerunner-1599647040   1/1           8s         4m43s
example-compliancesuite-rerunner-1599647220   1/1           7s         102s

$ oc get compliancesuite
NAME                      PHASE   RESULT
example-compliancesuite   DONE    COMPLIANT

$ oc delete compliancesuite example-compliancesuite
compliancesuite.compliance.openshift.io "example-compliancesuite" deleted

$ oc get all
NAME                                       READY   STATUS    RESTARTS   AGE
pod/compliance-operator-869646dd4f-ls7nx   1/1     Running   0          16m
pod/ocp4-pp-6786c5f5b-dr6lf                1/1     Running   0          15m
pod/rhcos4-pp-78c8cc9d44-76rcd             1/1     Running   0          15m

NAME                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)             AGE
service/compliance-operator-metrics   ClusterIP   172.30.193.92   <none>        8383/TCP,8686/TCP   15m

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/compliance-operator   1/1     1            1           16m
deployment.apps/ocp4-pp               1/1     1            1           15m
deployment.apps/rhcos4-pp             1/1     1            1           15m

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/compliance-operator-869646dd4f   1         1         1       16m
replicaset.apps/ocp4-pp-6786c5f5b                1         1         1       15m
replicaset.apps/rhcos4-pp-78c8cc9d44             1         1         1       15m

NAME                                                    COMPLETIONS   DURATION   AGE
job.batch/example-compliancesuite-rerunner-1599646860   1/1           8s         8m35s   <<-----
job.batch/example-compliancesuite-rerunner-1599647040   1/1           8s         5m35s   <<-----
job.batch/example-compliancesuite-rerunner-1599647220   1/1           7s         2m34s   <<-----
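
To see at a glance which resource kinds survive the deletion, the `-o name` output can be tallied per kind. This is a minimal sketch: `count_by_kind` is a hypothetical helper, not something used in the report, and it expects one `kind/name` entry per line on standard input.

```shell
# Hypothetical helper: count resources per kind from `oc get ... -o name`
# style input (one "kind/name" per line on stdin).
count_by_kind() {
  # Split on "/", count occurrences of the kind, print "kind count" sorted.
  awk -F/ 'NF >= 2 { n[$1]++ } END { for (k in n) print k, n[k] }' | sort
}

# Example usage against a live cluster (assumes `oc` is logged in):
#   oc get all -n openshift-compliance -o name \
#     | grep -F -- "/example-compliancesuite-rerunner" | count_by_kind
```

For the three leftover jobs listed above, the tally would contain a single `job.batch 3` line and no `pod` line, which matches the observation that the pods are removed while the jobs remain.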

Comment 7 Prashant Dhamdhere 2020-09-16 15:48:09 UTC
It looks good now. All rerunner pods and their jobs are removed along with the compliancesuite object.

OCP version: 4.6.0-0.nightly-2020-09-16-000734
Compliance Operator: v0.1.16


$ oc get pod
NAME                                                 READY   STATUS      RESTARTS   AGE
aggregator-pod-worker-scan                           0/1     Completed   0          36s
compliance-operator-869646dd4f-tlb4g                 1/1     Running     0          6h56m
example-compliancesuite1-rerunner-1600270380-lbt2l   0/1     Completed   0          8m28s
example-compliancesuite1-rerunner-1600270560-zl9jq   0/1     Completed   0          5m25s
example-compliancesuite1-rerunner-1600270740-bndml   0/1     Completed   0          2m26s
ocp4-pp-6786c5f5b-pmh57                              1/1     Running     0          6h55m
rhcos4-pp-78c8cc9d44-r2c8r                           1/1     Running     0          6h55m
worker-scan-pdhamdhe-vsp-1-b7flr-worker-sbtxh-pod    0/2     Completed   0          2m6s
worker-scan-pdhamdhe-vsp-1-b7flr-worker-wsk6j-pod    0/2     Completed   0          2m6s
worker-scan-rs-5c746d594b-7nfhz                      1/1     Running     0          2m7s

$ oc get cronjob
NAME                                SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
example-compliancesuite1-rerunner   */3 * * * *   False     0        2m36s           27m

$ oc get jobs --watch
NAME                                           COMPLETIONS   DURATION   AGE
example-compliancesuite1-rerunner-1600270380   1/1           16s        8m39s
example-compliancesuite1-rerunner-1600270560   1/1           14s        5m36s
example-compliancesuite1-rerunner-1600270740   1/1           14s        2m37s

$ oc get compliancesuite
NAME                       PHASE   RESULT
example-compliancesuite1   DONE    COMPLIANT

$ oc delete compliancesuite example-compliancesuite1
compliancesuite.compliance.openshift.io "example-compliancesuite1" deleted

$ oc get all
NAME                                       READY   STATUS    RESTARTS   AGE
pod/compliance-operator-869646dd4f-tlb4g   1/1     Running   0          6h57m
pod/ocp4-pp-6786c5f5b-pmh57                1/1     Running   0          6h56m
pod/rhcos4-pp-78c8cc9d44-r2c8r             1/1     Running   0          6h56m

NAME                                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/compliance-operator-metrics   ClusterIP   172.30.100.116   <none>        8383/TCP,8686/TCP   6h56m

NAME                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/compliance-operator   1/1     1            1           6h57m
deployment.apps/ocp4-pp               1/1     1            1           6h56m
deployment.apps/rhcos4-pp             1/1     1            1           6h56m

NAME                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/compliance-operator-869646dd4f   1         1         1       6h57m
replicaset.apps/ocp4-pp-6786c5f5b                1         1         1       6h56m
replicaset.apps/rhcos4-pp-78c8cc9d44             1         1         1       6h56m
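
The manual verification above could be turned into a pass/fail check that polls until no rerunner resources remain. This is only a sketch under stated assumptions: the function name `wait_for_rerunner_purge` and the default attempt count and delay are illustrative, `oc` is assumed to be logged in, and the `openshift-compliance` namespace is taken from the reproduction steps.

```shell
# Hypothetical check: succeed once no "<suite>-rerunner*" cron job, job or
# pod is left in the openshift-compliance namespace; fail after the polls
# run out. Defaults (12 polls, 5 s apart) are assumptions, not from the report.
wait_for_rerunner_purge() {
  suite="$1"
  attempts="${2:-12}"
  delay="${3:-5}"
  i=0
  left=""
  while [ "$i" -lt "$attempts" ]; do
    # -o name prints one kind/name per line; grep -F matches the name literally.
    left=$(oc get cronjobs,jobs,pods -n openshift-compliance -o name \
      | grep -F -- "/${suite}-rerunner" || true)
    if [ -z "$left" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  # Report whatever is still present before failing.
  printf '%s\n' "$left" >&2
  return 1
}
```

For example, running `wait_for_rerunner_purge example-compliancesuite1` right after the delete returns 0 once the rerunner resources are gone, or prints the leftovers and returns 1 if they never disappear.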

Comment 9 errata-xmlrpc 2020-10-27 16:21:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196