Bug 1967086
| Summary: | Cloning DataVolumes between namespaces fails while creating cdi-upload pod |  |  |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | nijin ashok <nashok> |
| Component: | Storage | Assignee: | Alexander Wels <awels> |
| Status: | CLOSED ERRATA | QA Contact: | Kevin Alon Goldblatt <kgoldbla> |
| Severity: | high | Docs Contact: |  |
| Priority: | unspecified |  |  |
| Version: | 2.5.5 | CC: | alitke, awels, cnv-qe-bugs, kgershon, yadu |
| Target Milestone: | --- |  |  |
| Target Release: | 2.6.6 |  |  |
| Hardware: | All |  |  |
| OS: | Linux |  |  |
| Whiteboard: |  |  |  |
| Fixed In Version: | v2.6.6-37 registry-proxy.engineering.redhat.com/rh-osbs/iib:89865 | Doc Type: | If docs needed, set a value |
| Doc Text: |  | Story Points: | --- |
| Clone Of: |  |  |  |
|  | 1982269 (view as bug list) | Environment: |  |
| Last Closed: | 2021-08-10 17:33:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: |  |
| Verified Versions: |  | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |  |
| Cloudforms Team: | --- | Target Upstream Version: |  |
| Embargoed: |  |  |  |
| Bug Depends On: |  |  |  |
| Bug Blocks: | 1982269 |  |  |
Can you check if the target namespace has a LimitRange defined?

(In reply to Alexander Wels from comment #2)
> Can you check if the target namespace has a LimitRange defined?

The target namespace does not have a LimitRange defined.

So I triple-checked the code, and nothing we do sets the limits or requests to anything other than what is specified in the defaultPodResourceRequirements in the CDIConfig object (which is set in the CDI CR). So there must be a mutating webhook somewhere that automatically modifies those values, and the usual suspect is a LimitRange for those fields. However, as we saw, the must-gather doesn't report anything about a LimitRange, and there is no cluster-wide LimitRange object in OpenShift.

That being said, all zeros is probably not a great default value. After some testing, the following values are reasonable defaults, and we created a PR to set them by default if not otherwise specified:

- CPULimit: 750m (3/4 of a CPU max)
- MemLimit: 600M (600M of memory max)
- CPURequest: 100m (1/10 of a CPU minimum)
- MemRequest: 60M (60M of memory minimum)

That should be sufficient for most workloads. As a workaround, you can set those values in the CDI CR, and we can see if that lets them continue testing. The linked PR makes these the default values.

Moving back to POST because we haven't modified the release branch yet.

Please attach the cherry-pick PR to this bug.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Virtualization 2.6.6 Images security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3119
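The workaround described above can be sketched as a CDI CR fragment. This is a sketch, not the exact fix from this bug: it assumes the `spec.config.podResourceRequirements` field of the upstream CDI custom resource, so verify the field path against the CRD installed in your cluster before applying it.

```yaml
# Hypothetical CDI CR fragment setting the resource defaults suggested in the
# comment above; the cdi-upload/importer pods inherit these requirements.
apiVersion: cdi.kubevirt.io/v1beta1
kind: CDI
metadata:
  name: cdi
spec:
  config:
    podResourceRequirements:
      limits:
        cpu: 750m
        memory: 600M
      requests:
        cpu: 100m
        memory: 60M
```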
Description of problem:

While cloning a DataVolume between namespaces, the clone is scheduled but never starts.

```
$ oc get dvs
NAME                   PHASE            PROGRESS   RESTARTS   AGE
dv-tests-cloning-001   CloneScheduled   N/A                   30s
```

The PVC status is "Bound":

```
$ oc get pvc
NAME                   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
dv-tests-cloning-001   Bound    pvc-326a53af-b8f7-4328-a30f-76ec1d21ee21   12Gi       RWO            ocs-external-storagecluster-ceph-rbd   33s
```

But there is no cdi-upload pod. The cdi-deployment logs show the error `Pod "cdi-upload-dv-tests-cloning-001" is invalid: spec.containers[0].resources.requests: Invalid value: "1m": must be less than or equal to cpu limit`:

```
{"level":"error","ts":1622451245.2904956,"logger":"controller","msg":"Reconciler error","controller":"upload-controller","name":"dv-tests-cloning-001","namespace":"tests-cloning","error":"Pod \"cdi-upload-dv-tests-cloning-001\" is invalid: spec.containers[0].resources.requests: Invalid value: \"1m\": must be less than or equal to cpu limit","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/github.com/go-logr/zapr/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:237\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:209\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:188\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/src/kubevirt.io/containerized-data-importer/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:90"}
```

As per my understanding, pod.Spec.Containers[0].Resources is obtained from defaultPodResourceRequirements, which has the default values:

```
$ oc get cdiconfig -o yaml
apiVersion: v1
items:
- apiVersion: cdi.kubevirt.io/v1beta1
  kind: CDIConfig
  status:
    defaultPodResourceRequirements:
      limits:
        cpu: "0"
        memory: "0"
      requests:
        cpu: "0"
        memory: "0"
```

I cannot find a way to see the pod spec sent by the CDI controller to create the pod, but it looks like it is sending requests greater than the limits. However, that doesn't make sense, since the CDIConfig has the default values. There are no quotas or limits for the namespace, and the permissions are also mapped correctly.

Version-Release number of selected component (if applicable):
2.5.5

How reproducible:
Observed in a customer environment; not reproduced locally.

Steps to Reproduce:
1. The issue is observed when cloning a DataVolume between namespaces.

Actual results:
Cloning DataVolumes between namespaces fails while creating the cdi-upload pod.

Expected results:
Cloning should work.

Additional info:
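The apiserver rejection quoted in the log follows the standard Kubernetes pod validation rule: each container resource request must not exceed the corresponding limit. The sketch below is a hypothetical, simplified re-implementation of that check (it is not the Kubernetes code, and it only handles the plain and milli-unit CPU quantities seen in this report); it shows why a `1m` request against a `0` limit is invalid, while the proposed `100m`/`750m` defaults pass.

```python
from typing import Optional

def parse_cpu(quantity: str) -> float:
    """Parse a CPU quantity like "1m", "750m", or "0" into cores.

    Simplified sketch: only bare numbers and the milli ("m") suffix are
    supported, unlike the full Kubernetes quantity grammar.
    """
    if quantity.endswith("m"):
        return float(quantity[:-1]) / 1000.0  # millicores -> cores
    return float(quantity)

def validate_cpu(request: str, limit: str) -> Optional[str]:
    """Return the validation error when the request exceeds the limit."""
    if parse_cpu(request) > parse_cpu(limit):
        return (f'spec.containers[0].resources.requests: Invalid value: '
                f'"{request}": must be less than or equal to cpu limit')
    return None

# The failing combination from the log: a 1m request against a 0 limit.
print(validate_cpu("1m", "0"))
# The proposed defaults (100m request, 750m limit) pass validation.
print(validate_cpu("100m", "750m"))
```

Note the remaining puzzle the bug describes: CDIConfig reported both request and limit as `0`, yet the pod that reached the apiserver carried a `1m` request, which is what pointed the investigation at a mutating admission step such as a LimitRange.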