Description of problem:
The installation of the NVIDIA GPU Operator fails on OpenShift 4.4.11 on AWS when installation is attempted from OperatorHub.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
1. Create an OCP 4.4.11 cluster on AWS.
2. Follow the NVIDIA GPU Operator instructions in the NVIDIA online documentation: https://docs.nvidia.com/datacenter/kubernetes/openshift-on-gpu-install-guide/index.html
   a) Make sure entitlements are working properly.
   b) Create a new project “gpu-operator-resources”.
   c) Install the NVIDIA GPU Operator. From the side menu, select Operators > OperatorHub, then search for the NVIDIA GPU Operator.
   d) Click install.
3. View the log:
   oc logs -f nvidia-driver-daemonset-XXXXX
   "Error: Unable to find a match: kernel-headers-4.18.0-147.20.1.el8_1.x86_64 kernel-devel-4.18.0-147.20.1.el8_1.x86_64"

Actual results:
$ oc logs -f nvidia-driver-daemonset-vrx5c
+ set -eu
+ RUN_DIR=/run/nvidia
+ PID_FILE=/run/nvidia/nvidia-driver.pid
+ DRIVER_VERSION=440.64.00
+ KERNEL_UPDATE_HOOK=/run/kernel/postinst.d/update-nvidia-driver
+ '[' 1 -eq 0 ']'
+ command=init
+ shift
+ case "${command}" in
++ getopt -l accept-license -o a --
+ options=' --'
+ '[' 0 -ne 0 ']'
+ eval set -- ' --'
++ set -- --
+ ACCEPT_LICENSE=
++ uname -r
+ KERNEL_VERSION=4.18.0-147.20.1.el8_1.x86_64
+ PRIVATE_KEY=
+ PACKAGE_TAG=
+ for opt in ${options}
+ case "$opt" in
+ shift
+ break
+ '[' 0 -ne 0 ']'
+ init
========== NVIDIA Software Installer ==========
+ echo -e '\n========== NVIDIA Software Installer ==========\n'
Starting installation of NVIDIA driver version 440.64.00 for Linux kernel version 4.18.0-147.20.1.el8_1.x86_64
+ echo -e 'Starting installation of NVIDIA driver version 440.64.00 for Linux kernel version 4.18.0-147.20.1.el8_1.x86_64\n'
+ exec
+ flock -n 3
+ echo 138704
+ trap 'echo '\''Caught signal'\''; exit 1' HUP INT QUIT PIPE TERM
+ trap _shutdown EXIT
+ _unload_driver
+ rmmod_args=()
+ local rmmod_args
+ local nvidia_deps=0
+ local nvidia_refs=0
+ local nvidia_uvm_refs=0
+ local nvidia_modeset_refs=0
+ echo 'Stopping NVIDIA persistence daemon...'
Stopping NVIDIA persistence daemon...
+ '[' -f /var/run/nvidia-persistenced/nvidia-persistenced.pid ']'
Unloading NVIDIA driver kernel modules...
+ echo 'Unloading NVIDIA driver kernel modules...'
+ '[' -f /sys/module/nvidia_modeset/refcnt ']'
+ '[' -f /sys/module/nvidia_uvm/refcnt ']'
+ '[' -f /sys/module/nvidia/refcnt ']'
+ '[' 0 -gt 0 ']'
+ '[' 0 -gt 0 ']'
+ '[' 0 -gt 0 ']'
+ '[' 0 -gt 0 ']'
+ return 0
+ _unmount_rootfs
Unmounting NVIDIA driver rootfs...
+ echo 'Unmounting NVIDIA driver rootfs...'
+ findmnt -r -o TARGET
+ grep /run/nvidia/driver
+ _kernel_requires_package
+ local proc_mount_arg=
Checking NVIDIA driver packages...
+ echo 'Checking NVIDIA driver packages...'
+ [[ ! -d /usr/src/nvidia-440.64.00/kernel ]]
+ cd /usr/src/nvidia-440.64.00/kernel
+ proc_mount_arg='--proc-mount-point /lib/modules/4.18.0-147.20.1.el8_1.x86_64/proc'
++ ls -d -1 'precompiled/**'
+ return 0
+ _update_package_cache
Updating the package cache...
+ '[' '' '!=' builtin ']'
+ echo 'Updating the package cache...'
+ yum -q makecache
+ _install_prerequisites
++ mktemp -d
+ local tmp_dir=/tmp/tmp.u1TOrlvdnA
+ trap 'rm -rf /tmp/tmp.u1TOrlvdnA' EXIT
+ cd /tmp/tmp.u1TOrlvdnA
+ dnf install -q -y elfutils-libelf.x86_64 elfutils-libelf-devel.x86_64
+ rm -rf /lib/modules/4.18.0-147.20.1.el8_1.x86_64
+ mkdir -p /lib/modules/4.18.0-147.20.1.el8_1.x86_64/proc
Installing Linux kernel headers...
+ echo 'Installing Linux kernel headers...'
+ dnf -q -y install kernel-headers-4.18.0-147.20.1.el8_1.x86_64 kernel-devel-4.18.0-147.20.1.el8_1.x86_64
Error: Unable to find a match: kernel-headers-4.18.0-147.20.1.el8_1.x86_64 kernel-devel-4.18.0-147.20.1.el8_1.x86_64
++ rm -rf /tmp/tmp.u1TOrlvdnA
+ _shutdown
+ _unload_driver
+ rmmod_args=()
+ local rmmod_args
Stopping NVIDIA persistence daemon...
+ local nvidia_deps=0
+ local nvidia_refs=0
+ local nvidia_uvm_refs=0
+ local nvidia_modeset_refs=0
+ echo 'Stopping NVIDIA persistence daemon...'
+ '[' -f /var/run/nvidia-persistenced/nvidia-persistenced.pid ']'
Unloading NVIDIA driver kernel modules...
+ echo 'Unloading NVIDIA driver kernel modules...'
+ '[' -f /sys/module/nvidia_modeset/refcnt ']'
+ '[' -f /sys/module/nvidia_uvm/refcnt ']'
+ '[' -f /sys/module/nvidia/refcnt ']'
+ '[' 0 -gt 0 ']'
+ '[' 0 -gt 0 ']'
+ '[' 0 -gt 0 ']'
+ '[' 0 -gt 0 ']'
+ return 0
+ _unmount_rootfs
+ echo 'Unmounting NVIDIA driver rootfs...'
Unmounting NVIDIA driver rootfs...
+ findmnt -r -o TARGET
+ grep /run/nvidia/driver
+ rm -f /run/nvidia/nvidia-driver.pid /run/kernel/postinst.d/update-nvidia-driver
+ return 0

Expected results:
The NVIDIA GPU Operator installs successfully when you click "install" in OperatorHub.

Additional info:
$ oc logs -f cluster-entitled-build-pod
Updating Subscription Management repositories.
Unable to read consumer identity
Subscription Manager is operating in container mode.
Red Hat Enterprise Linux 8 for x86_64 - BaseOS   34 MB/s |  20 MB     00:00
Red Hat Enterprise Linux 8 for x86_64 - AppStre  17 MB/s |  19 MB     00:01
Red Hat Universal Base Image 8 (RPMs) - BaseOS  4.7 MB/s | 767 kB     00:00
Red Hat Universal Base Image 8 (RPMs) - AppStre  22 MB/s | 3.9 MB     00:00
Red Hat Universal Base Image 8 (RPMs) - CodeRea  56 kB/s |  11 kB     00:00
====================== Name Exactly Matched: kernel-devel ======================
kernel-devel-4.18.0-80.1.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.4.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.7.1.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.11.1.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.11.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-80.7.2.el8_0.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.0.3.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.8.1.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.0.2.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.3.1.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-147.5.1.el8_1.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.el8.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.1.2.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.6.3.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.13.2.el8_2.x86_64 : Development package for building kernel modules to match the kernel
kernel-devel-4.18.0-193.14.3.el8_2.x86_64 : Development package for building kernel modules to match the kernel
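The mismatch can be confirmed mechanically by comparing the kernel reported by `uname -r` in the driver log against the kernel-devel builds listed by the entitled build pod. The following is only a rough sketch in Python: the function names are hypothetical and not part of the GPU Operator, and the package list is copied by hand from the output above.

```python
import re

# Kernel the driver container tried to build against (`uname -r` in the driver log).
RUNNING_KERNEL = "4.18.0-147.20.1.el8_1.x86_64"

# kernel-devel builds visible to the entitled build pod, copied from this report.
AVAILABLE = """\
kernel-devel-4.18.0-80.1.2.el8_0.x86_64
kernel-devel-4.18.0-80.el8.x86_64
kernel-devel-4.18.0-80.4.2.el8_0.x86_64
kernel-devel-4.18.0-80.7.1.el8_0.x86_64
kernel-devel-4.18.0-80.11.1.el8_0.x86_64
kernel-devel-4.18.0-147.el8.x86_64
kernel-devel-4.18.0-80.11.2.el8_0.x86_64
kernel-devel-4.18.0-80.7.2.el8_0.x86_64
kernel-devel-4.18.0-147.0.3.el8_1.x86_64
kernel-devel-4.18.0-147.8.1.el8_1.x86_64
kernel-devel-4.18.0-147.0.2.el8_1.x86_64
kernel-devel-4.18.0-147.3.1.el8_1.x86_64
kernel-devel-4.18.0-147.5.1.el8_1.x86_64
kernel-devel-4.18.0-193.el8.x86_64
kernel-devel-4.18.0-193.1.2.el8_2.x86_64
kernel-devel-4.18.0-193.6.3.el8_2.x86_64
kernel-devel-4.18.0-193.13.2.el8_2.x86_64
kernel-devel-4.18.0-193.14.3.el8_2.x86_64
"""


def available_versions(listing):
    """Return the version-release.arch string of every kernel-devel line."""
    return set(re.findall(r"^kernel-devel-(\S+)", listing, flags=re.MULTILINE))


def has_matching_devel(running, listing):
    """True only if a kernel-devel build exactly matches the running kernel."""
    return running in available_versions(listing)


def same_series(running, listing):
    """List available builds that share the running kernel's base (e.g. 4.18.0-147)."""
    base = re.match(r"\d+\.\d+\.\d+-\d+", running).group(0)
    return sorted(v for v in available_versions(listing) if v.startswith(base + "."))


if __name__ == "__main__":
    print("exact match available:", has_matching_devel(RUNNING_KERNEL, AVAILABLE))
    print("builds in the same series:")
    for version in same_series(RUNNING_KERNEL, AVAILABLE):
        print("  ", version)
```

With the data above, the exact-match check is false: the list contains several 4.18.0-147 builds, but not 147.20.1, which is the build the node is running and the one `dnf` was asked to install.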
This is a known problem:
https://access.redhat.com/solutions/5232481
https://bugzilla.redhat.com/show_bug.cgi?id=1862229
*** Bug 1867854 has been marked as a duplicate of this bug. ***
*** Bug 1886059 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 1853726 ***