Bug 2008140 - [4.10.0] CNV fails to deploy due to unavailable SSP virt-template-validator
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: SSP
Version: 4.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.10.0
Assignee: Karel Šimon
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-09-27 12:12 UTC by Lukas Bednar
Modified: 2022-03-16 15:55 UTC (History)

Fixed In Version: kubevirt-template-validator-container-v4.10.0-8
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-16 15:55:26 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2022:0947 | 0 | None | None | None | 2022-03-16 15:55:58 UTC

Internal Links: 2008975

Description Lukas Bednar 2021-09-27 12:12:20 UTC
Description of problem:

CNV deployment fails because the SSP virt-template-validator is unavailable.


Version-Release number of selected component (if applicable):
hco-bundle-registry:v4.10.0-145
OCP-4.9.0-rc.3


How reproducible: 100%


Steps to Reproduce:
1. Deploy CNV
2. Observe HCO status

Actual results: The HCO Available condition is False with the message "SSP is not available:  openshift-cnv/virt-template-validator: No validator pods are running. Expected: 2"


Expected results: CNV deployed successfully


Additional info:

[cnv-qe-jenkins@verify-49-hv2s8-executor ~]$ oc get hco -n openshift-cnv -o yaml
apiVersion: v1
items:
- apiVersion: hco.kubevirt.io/v1beta1
  kind: HyperConverged
  metadata:
    creationTimestamp: "2021-09-27T11:57:39Z"
    finalizers:
    - kubevirt.io/hyperconverged
    generation: 2
    labels:
      app: kubevirt-hyperconverged
    name: kubevirt-hyperconverged
    namespace: openshift-cnv
    resourceVersion: "16923520"
    uid: 9936a0fa-15e1-4c74-a4ce-01625987d460
  spec:
    certConfig:
      ca:
        duration: 48h0m0s
        renewBefore: 24h0m0s
      server:
        duration: 24h0m0s
        renewBefore: 12h0m0s
    featureGates:
      enableCommonBootImageImport: false
      sriovLiveMigration: true
      withHostPassthroughCPU: false
    infra: {}
    liveMigrationConfig:
      bandwidthPerMigration: 64Mi
      completionTimeoutPerGiB: 800
      parallelMigrationsPerCluster: 5
      parallelOutboundMigrationsPerNode: 2
      progressTimeout: 150
    workloadUpdateStrategy:
      batchEvictionInterval: 1m0s
      batchEvictionSize: 10
      workloadUpdateMethods:
      - LiveMigrate
      - Evict
    workloads: {}
  status:
    conditions:
    - lastTransitionTime: "2021-09-27T11:57:40Z"
      message: Reconcile completed successfully
      observedGeneration: 2
      reason: ReconcileCompleted
      status: "True"
      type: ReconcileComplete
    - lastTransitionTime: "2021-09-27T11:57:40Z"
      message: 'SSP is not available:  openshift-cnv/virt-template-validator: No validator
        pods are running. Expected: 2'
      observedGeneration: 2
      reason: SSPNotAvailable
      status: "False"
      type: Available
    - lastTransitionTime: "2021-09-27T11:57:40Z"
      message: 'SSP is progressing:  openshift-cnv/virt-template-validator: Not all
        template validator pods are running. Expected: 2, running: 0'
      observedGeneration: 2
      reason: SSPProgressing
      status: "True"
      type: Progressing
    - lastTransitionTime: "2021-09-27T11:57:40Z"
      message: 'SSP is degraded:  openshift-cnv/virt-template-validator: Not all template
        validator pods are running. Expected: 2, running: 0'
      observedGeneration: 2
      reason: SSPDegraded
      status: "True"
      type: Degraded
    - lastTransitionTime: "2021-09-27T11:57:40Z"
      message: 'SSP is progressing:  openshift-cnv/virt-template-validator: Not all
        template validator pods are running. Expected: 2, running: 0'
      observedGeneration: 2
      reason: SSPProgressing
      status: "False"
      type: Upgradeable
    dataImportSchedule: 8 */12 * * *
    observedGeneration: 2
    relatedObjects:
    - apiVersion: scheduling.k8s.io/v1
      kind: PriorityClass
      name: kubevirt-cluster-critical
      resourceVersion: "16677400"
      uid: 9c15b607-fa4a-4990-86ba-0a229fdf8ed3
    - apiVersion: kubevirt.io/v1
      kind: KubeVirt
      name: kubevirt-kubevirt-hyperconverged
      namespace: openshift-cnv
      resourceVersion: "16922979"
      uid: 3f6c1fa6-a202-411a-a372-836a24944826
    - apiVersion: cdi.kubevirt.io/v1beta1
      kind: CDI
      name: cdi-kubevirt-hyperconverged
      resourceVersion: "16922543"
      uid: 898b40f8-5947-4d9c-8962-b22a5b07bb24
    - apiVersion: v1
      kind: ConfigMap
      name: kubevirt-storage-class-defaults
      namespace: openshift-cnv
      resourceVersion: "16922344"
      uid: 2ed2f7d1-5354-455f-9abc-544c6a72b033
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: Role
      name: hco.kubevirt.io:config-reader
      namespace: openshift-cnv
      resourceVersion: "16922354"
      uid: 317bc902-672e-4183-a7f5-b90f145fa3b2
    - apiVersion: rbac.authorization.k8s.io/v1
      kind: RoleBinding
      name: hco.kubevirt.io:config-reader
      namespace: openshift-cnv
      resourceVersion: "16922363"
      uid: 443789fb-ba7d-43bd-bbb1-259aef97e9b7
    - apiVersion: networkaddonsoperator.network.kubevirt.io/v1
      kind: NetworkAddonsConfig
      name: cluster
      resourceVersion: "16923504"
      uid: 6d54685c-455e-45c8-873b-1e3738928648
    - apiVersion: ssp.kubevirt.io/v1beta1
      kind: SSP
      name: ssp-kubevirt-hyperconverged
      namespace: openshift-cnv
      resourceVersion: "16923518"
      uid: 12f072b0-8698-46b5-be42-05a0cc020d6f
    - apiVersion: v1
      kind: Service
      name: kubevirt-hyperconverged-operator-metrics
      namespace: openshift-cnv
      resourceVersion: "16922377"
      uid: 18e5d6c6-6e10-4f11-8691-6ed8da34f85b
    - apiVersion: monitoring.coreos.com/v1
      kind: ServiceMonitor
      name: kubevirt-hyperconverged-operator-metrics
      namespace: openshift-cnv
      resourceVersion: "16922384"
      uid: c6ce7f10-d46d-46bb-8cd3-ba86c52d6ac1
    - apiVersion: monitoring.coreos.com/v1
      kind: PrometheusRule
      name: kubevirt-hyperconverged-prometheus-rule
      namespace: openshift-cnv
      resourceVersion: "16922387"
      uid: 01015110-a4a8-46b6-b57f-635c9cb4b6a1
    - apiVersion: console.openshift.io/v1
      kind: ConsoleCLIDownload
      name: virtctl-clidownloads-kubevirt-hyperconverged
      resourceVersion: "16922389"
      uid: 4d496dd0-ba65-402b-91ad-7f26c89a0035
    - apiVersion: route.openshift.io/v1
      kind: Route
      name: hyperconverged-cluster-cli-download
      namespace: openshift-cnv
      resourceVersion: "16922396"
      uid: 6a1e1bd7-c3ca-48ff-b5d2-d8750f92d504
    - apiVersion: v1
      kind: Service
      name: hyperconverged-cluster-cli-download
      namespace: openshift-cnv
      resourceVersion: "16922400"
      uid: 38f51464-c1ae-445f-a990-b5a82dce4c98
    - apiVersion: console.openshift.io/v1
      kind: ConsoleQuickStart
      name: connect-ext-net-to-vm
      resourceVersion: "16922411"
      uid: b69c083d-62b6-4ef8-ad03-268eec41209e
    - apiVersion: console.openshift.io/v1
      kind: ConsoleQuickStart
      name: create-win10-vm
      resourceVersion: "16922412"
      uid: a201ed37-e12a-49c3-993b-1069a3818235
    - apiVersion: console.openshift.io/v1
      kind: ConsoleQuickStart
      name: create-rhel-vm
      resourceVersion: "16922414"
      uid: bb18c796-f697-4364-b5e5-8dfb250e7e18
    - apiVersion: console.openshift.io/v1
      kind: ConsoleQuickStart
      name: import-vmware-vm
      resourceVersion: "16922416"
      uid: c60fda46-d200-477c-9183-6fb819f5f4f7
    - apiVersion: v1
      kind: ConfigMap
      name: grafana-dashboard-kubevirt-top-consumers
      namespace: openshift-config-managed
      resourceVersion: "16922418"
      uid: c33dfbe7-d9b5-4db2-9c3e-066af2794695
    versions:
    - name: operator
      version: v4.10.0
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
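
The conditions in the output above can be summarized with a short script. This is a minimal sketch, not part of the original report: the HEALTHY mapping is an assumption based on the usual operator-condition convention (Available and Upgradeable should be True, Progressing and Degraded should be False), and the function name unhealthy_conditions is hypothetical. The condition types, statuses and reasons are copied from the output.

```python
# Expected "healthy" status per condition type. This mapping is an assumption
# based on the common operator-condition convention, not taken from HCO code.
HEALTHY = {
    "ReconcileComplete": "True",
    "Available": "True",
    "Progressing": "False",
    "Degraded": "False",
    "Upgradeable": "True",
}


def unhealthy_conditions(conditions):
    """Return (type, reason) pairs whose status differs from the healthy value."""
    return [
        (c["type"], c["reason"])
        for c in conditions
        if c["type"] in HEALTHY and c["status"] != HEALTHY[c["type"]]
    ]


# Types, statuses and reasons copied from the HyperConverged status above.
conditions = [
    {"type": "ReconcileComplete", "status": "True", "reason": "ReconcileCompleted"},
    {"type": "Available", "status": "False", "reason": "SSPNotAvailable"},
    {"type": "Progressing", "status": "True", "reason": "SSPProgressing"},
    {"type": "Degraded", "status": "True", "reason": "SSPDegraded"},
    {"type": "Upgradeable", "status": "False", "reason": "SSPProgressing"},
]

for ctype, reason in unhealthy_conditions(conditions):
    print(f"{ctype}: {reason}")
```

Under that assumption, every condition except ReconcileComplete is flagged, and all flagged reasons point at SSP.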

Comment 1 Erkan Erol 2021-09-27 12:38:11 UTC
The problem is in the readiness probe of virt-template-validator.


$ oc -n openshift-cnv describe pod virt-template-validator-559bd7b786-lcbfq
     ...
      Warning  Unhealthy       122m (x10 over 124m)    kubelet            Readiness probe failed: HTTP probe failed with statuscode: 404
      Warning  ProbeError      4m13s (x814 over 124m)  kubelet            Readiness probe error: HTTP probe failed with statuscode: 404
    ...
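
For context on why a 404 keeps the pods from becoming ready: Kubernetes counts an HTTP probe as successful only when the response status code is at least 200 and below 400. The sketch below illustrates that rule only; the function name http_probe_succeeds is hypothetical, and the report does not say why the validator endpoint returned 404.

```python
def http_probe_succeeds(status_code: int) -> bool:
    """Mirror the Kubernetes HTTP probe rule: 200 <= code < 400 is success."""
    return 200 <= status_code < 400


# 404 is the status code reported in the events above; the others are
# included only for comparison.
for code in (200, 302, 404, 503):
    result = "success" if http_probe_succeeds(code) else "failure"
    print(f"{code}: {result}")
```

Because the probe never succeeds, the pods are never marked Ready, which is consistent with the "Expected: 2, running: 0" messages in the HCO status.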

Comment 3 Lukas Bednar 2021-09-28 05:27:24 UTC
Verified with HCO-v4.10.0-148

Comment 9 errata-xmlrpc 2022-03-16 15:55:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.10.0 Images security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0947

