Bug 1835446
| Summary: | Special resource operator gpu-driver-container pod error related to elfutils-libelf-devel | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Paige Rubendall <prubenda> |
| Component: | Special Resource Operator | Assignee: | Zvonko Kosic <zkosic> |
| Status: | CLOSED NOTABUG | QA Contact: | Walid A. <wabouham> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.5 | CC: | aos-bugs, ematysek, mifiedle |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-26 18:29:30 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
The cluster is not entitled. Please entitle the cluster and try again. This is not a bug.
Created attachment 1688185 [details]
This is the output of oc logs for the nvidia gpu driver container

Description of problem:
The GPU driver container pod hits an error; it looks like it is unable to find the elfutils-libelf-devel.x86_64 package.

Version-Release number of selected component (if applicable):
4.5

How reproducible:
100%

Steps to Reproduce:
1. Deploy ipi cluster on RHCOS
2. Deploy SRO from github master

Actual results:
Events:

    Type     Reason          Age                     From                                                  Message
    ----     ------          ----                    ----                                                  -------
    Normal   Scheduled       7m1s                    default-scheduler                                     Successfully assigned nvidia-gpu/nvidia-gpu-driver-container-rhel8-tzbr9 to ip-10-0-148-114.us-east-2.compute.internal
    Normal   AddedInterface  7m                      multus                                                Add eth0 [10.129.4.22/23]
    Normal   Started         6m8s (x4 over 6m59s)    kubelet, ip-10-0-148-114.us-east-2.compute.internal   Started container nvidia-gpu-driver-container-rhel8
    Normal   Pulling         5m18s (x5 over 6m59s)   kubelet, ip-10-0-148-114.us-east-2.compute.internal   Pulling image "image-registry.openshift-image-registry.svc:5000/nvidia-gpu/nvidia-gpu-driver-container:v4.18.0-147.8.1.el8_1.x86_64"
    Normal   Pulled          5m18s (x5 over 6m59s)   kubelet, ip-10-0-148-114.us-east-2.compute.internal   Successfully pulled image "image-registry.openshift-image-registry.svc:5000/nvidia-gpu/nvidia-gpu-driver-container:v4.18.0-147.8.1.el8_1.x86_64"
    Normal   Created         5m18s (x5 over 6m59s)   kubelet, ip-10-0-148-114.us-east-2.compute.internal   Created container nvidia-gpu-driver-container-rhel8
    Warning  BackOff         111s (x23 over 6m54s)   kubelet, ip-10-0-148-114.us-east-2.compute.internal   Back-off restarting failed container

    $ oc get pods
    NAME                                         READY   STATUS             RESTARTS   AGE
    nvidia-gpu-driver-build-1-build              0/1     Completed          0          14m
    nvidia-gpu-driver-container-rhel8-tzbr9      0/1     CrashLoopBackOff   6          6m43s
    special-resource-operator-76b658c584-lxzwr   1/1     Running            0          14m

Expected results:
Container running successfully

Additional info:
Using $ oc logs nvidia-gpu-driver-container-rhel8-tzbr9 I can see the following error message:
"Error: Unable to find a match: elfutils-libelf-devel.x86_64"
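
For anyone triaging a similar CrashLoopBackOff, the log check described above can be sketched as a small shell filter. This is only an illustrative helper, not part of SRO: the function name `find_missing_packages` is hypothetical, and it assumes the package manager reports the failure in the same "Unable to find a match" form quoted in this report. The `nvidia-gpu` namespace is taken from the events output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SRO or oc): read log text on stdin and
# print each package named in an "Error: Unable to find a match: ..." line,
# one package per line, de-duplicated.
find_missing_packages() {
  sed -n 's/.*Error: Unable to find a match: \([^"]*\).*/\1/p' \
    | tr ' ' '\n' \
    | sed '/^$/d' \
    | sort -u
}

# Example usage against the failing pod from this report
# (requires a logged-in oc session, so it is left commented out):
#   oc logs nvidia-gpu-driver-container-rhel8-tzbr9 -n nvidia-gpu | find_missing_packages
```

If the filter prints a package such as elfutils-libelf-devel.x86_64, the container could not resolve that package from its repositories, which matches the resolution given on this bug: the cluster needs to be entitled before retrying.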