Bug 2088022

Summary: Default CPU requests on Velero/Restic are too demanding making scheduling fail in certain environments
Product: Migration Toolkit for Containers
Component: Operator
Version: 1.7.1
Target Release: 1.7.2
Status: CLOSED ERRATA
Severity: high
Priority: high
Hardware: Unspecified
OS: Unspecified
Type: Bug
Reporter: Pranav Gaikwad <pgaikwad>
Assignee: Pranav Gaikwad <pgaikwad>
QA Contact: Prasad Joshi <prajoshi>
Docs Contact: Richard Hoch <rhoch>
CC: prajoshi, rjohnson
Last Closed: 2022-07-01 09:53:23 UTC

Description Pranav Gaikwad 2022-05-18 16:51:12 UTC
Description of problem:
The default CPU requests for the Velero and Restic Pods are set to 500m, which is too high for the resource-constrained environments MTC often operates in. These requests can be configured in the DPA through the `podConfig` field for both Velero and Restic. The migration operator should set the CPU requests to a lower value, such as 100m, so that the Velero and Restic Pods can be scheduled in such environments.

Version-Release number of selected component (if applicable):
1.7.1

How reproducible:
Always

Steps to Reproduce:
Deploy the migration operator in an environment where some of the nodes have less than 500m of CPU available.


Actual results:
Restic Pods fail to schedule on nodes that do not have enough CPU available.

Expected results:
Restic Pods should request less CPU so that they can be scheduled on all nodes.

Additional info:
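For illustration only, here is a minimal Python sketch of the arithmetic behind the failure: a Pod fits on a node only if its CPU request is no larger than the node's allocatable CPU minus what other Pods have already requested. The function names and the node figures (2 CPUs allocatable, 1700m already requested) are hypothetical, and this is a simplification of the Kubernetes scheduler, which considers much more than CPU.

```python
from decimal import Decimal


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "500m", "1", or "0.5" to millicores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Plain numbers are whole or fractional CPUs; 1 CPU == 1000 millicores.
    return int(Decimal(quantity) * 1000)


def fits(request: str, allocatable: str, already_requested: str) -> bool:
    """Return True if the request fits in the node's unreserved CPU."""
    free = cpu_to_millicores(allocatable) - cpu_to_millicores(already_requested)
    return cpu_to_millicores(request) <= free


# Hypothetical node: 2 CPUs allocatable, 1700m already requested by other Pods.
print(fits("500m", "2", "1700m"))  # False: 500m > 300m free
print(fits("100m", "2", "1700m"))  # True: 100m <= 300m free
```

Under these assumed numbers, the 500m default cannot be placed on the node, while a 100m request can.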

Comment 6 Prasad Joshi 2022-06-20 13:04:44 UTC
Verified with MTC 1.7.2 Pre-stage 

metadata_nvr: openshift-migration-operator-metadata-container-v1.7.2-15

DPA CR: 
  spec:
    backupImages: false
    configuration:
      restic:
        enable: true
        podConfig:
          labels:
            app.kubernetes.io/part-of: openshift-migration
          resourceAllocations:
            requests:
              cpu: 100m
        supplementalGroups: []
        timeout: 1h
      velero:
        defaultPlugins:
        - openshift
        - aws
        - gcp
        - azure
        noDefaultBackupLocation: true
        podConfig:
          labels:
            app.kubernetes.io/part-of: openshift-migration
          resourceAllocations:
            requests:
              cpu: 100m

$ oc get pod -n openshift-migration  velero-57c48b4bb-82mff -o yaml
    resources:
      limits:
        cpu: "1"
        memory: 512Mi
      requests:
        cpu: 100m
        memory: 128Mi

$ oc get pod -n openshift-migration  restic-xdsdb -o yaml 
    name: restic
    resources:
      limits:
        cpu: "1"
        memory: 512Mi
      requests:
        cpu: 100m
        memory: 128Mi

I see the correct CPU request value (100m), as per the PR above.

Moving this to verified status.
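
The manual check above could also be scripted. The following is a rough Python sketch, not part of MTC, that extracts the CPU request of each container from a Pod object in JSON form. The function name and the sample Pod are assumptions made for illustration; the sample mirrors the restic output shown above.

```python
def container_cpu_requests(pod: dict) -> dict:
    """Map each container name to its CPU request (None if no request is set)."""
    result = {}
    for container in pod.get("spec", {}).get("containers", []):
        requests = container.get("resources", {}).get("requests", {})
        result[container["name"]] = requests.get("cpu")
    return result


# Hypothetical sample modeled on the restic Pod output in this comment.
sample_pod = {
    "spec": {
        "containers": [
            {
                "name": "restic",
                "resources": {
                    "limits": {"cpu": "1", "memory": "512Mi"},
                    "requests": {"cpu": "100m", "memory": "128Mi"},
                },
            }
        ]
    }
}

print(container_cpu_requests(sample_pod))  # {'restic': '100m'}
```

To run this against a live cluster, the Pod could be fetched with `oc get pod -n openshift-migration <pod-name> -o json` and loaded with `json.load(sys.stdin)` in place of the sample.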

Comment 12 errata-xmlrpc 2022-07-01 09:53:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Migration Toolkit for Containers (MTC) 1.7.2 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5483