Description of problem:
kernel-headers are now absent from RHCOS, but MCO extensions cannot be used to install them

Version-Release number of selected component (if applicable):
4.6.0-rc.3

Steps to Reproduce:
1. oc debug node/...
2. chroot /host
3. rpm -qa | grep kernel
4. apply MachineConfig

   apiVersion: machineconfiguration.openshift.io/v1
   kind: MachineConfig
   metadata:
     labels:
       machineconfiguration.openshift.io/role: worker
     name: 03-worker-extensions
   spec:
     config:
       ignition:
         version: 3.1.0
     extensions:
       - kernel-headers
       - kernel-devel

5. invalid extensions found: kernel-headers

EDIT: note, using just kernel-devel under extensions also does not install the kernel-headers package.

Actual results:
RHCOS removed kernel-headers and kernel-devel, but they cannot both be installed with MCO extensions

rpm -qa | grep kernel
kernel-core-4.18.0-193.24.1.el8_2.dt1.x86_64
kernel-modules-4.18.0-193.24.1.el8_2.dt1.x86_64
kernel-modules-extra-4.18.0-193.24.1.el8_2.dt1.x86_64
kernel-4.18.0-193.24.1.el8_2.dt1.x86_64

Expected results:
Access to kernel-headers and kernel-devel on RHCOS workers

kernel-core-4.18.0-193.14.3.el8_2.ppc64le
kernel-modules-extra-4.18.0-193.14.3.el8_2.ppc64le
kernel-modules-4.18.0-193.14.3.el8_2.ppc64le
kernel-4.18.0-193.14.3.el8_2.ppc64le
kernel-devel-4.18.0-193.14.3.el8_2.ppc64le
kernel-headers-4.18.0-193.14.3.el8_2.ppc64le
Thank you for reporting this issue. We will work on a fix to include kernel-headers as part of the kernel-devel extension.
Sinny, could you share some insights? This problem impacts the Spectrum Scale beta program. Dan, could you add this bug for discussion on the Thursday multi-arch call?
(In reply to Hendrik Brueckner from comment #2)
> Sinny, could you share some insights?

As currently implemented in MCO, the kernel-devel extension installs only the kernel-devel package (https://github.com/openshift/machine-config-operator/blob/master/pkg/daemon/update.go#L912) plus any missing dependencies. It seems the kernel-headers package is not an install dependency of kernel-devel, and hence it doesn't get installed. To fix the issue, we will need to patch MCO to also install the kernel-headers package when kernel-devel is specified as an extension in a MachineConfig.

> This problem impacts the Spectrum Scale beta program.

We understand the urgency here; this bug is on our top-priority list to get fixed.
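For readers following along, the behaviour proposed above can be sketched as a lookup from extension name to package list. This is only an illustration under assumptions: the function name packages_for_extension is hypothetical, the real logic lives in MCO's Go code and may look quite different, and only the kernel-devel entry discussed in this bug is modelled.

```shell
# Hypothetical sketch, NOT the MCO implementation: expand an extension
# name into the RPM packages that should be installed for it.
packages_for_extension() {
    case "$1" in
        kernel-devel)
            # Proposed fix: kernel-devel also pulls in kernel-headers.
            echo "kernel-devel kernel-headers"
            ;;
        *)
            # Mirrors the error text from the report for unknown names.
            echo "invalid extensions found: $1" >&2
            return 1
            ;;
    esac
}
```

With this shape, kernel-headers stays invalid as a standalone extension name, but asking for kernel-devel yields both packages.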
Sinny, thanks for the update.
A patch has been submitted to the upstream master branch: https://github.com/openshift/machine-config-operator/pull/2170. Once the pull request is merged and verified, we will backport it to 4.6.
Hi @Hendrik, please see Sinny's Comment 22. Do we still need to add this bug to Thursday's meeting discussion? If so, I will add it to the list.
@danili With the PR already merged, there is great progress here. The Spectrum Scale team is in contact with Red Hat on this topic as well.
Backported fix is in PR https://github.com/openshift/machine-config-operator/pull/2187
Added this bug to the MA meeting agenda. I'm removing the "needinfo"
Verified on 4.6.0-0.nightly-2020-11-15-104235. Successfully installed kernel-devel extension

$ oc get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
ci-ln-30rd65b-f76d1-zd8c9-master-0         Ready    master   33m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-1         Ready    master   33m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-2         Ready    master   33m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv   Ready    worker   24m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk   Ready    worker   25m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-d-lz6q2   Ready    worker   25m   v1.19.0+9f84db3

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-11-15-104235   True        False         106s    Cluster version is 4.6.0-0.nightly-2020-11-15-104235

$ oc debug node/ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv -- chroot /host rpm -qa | grep kernel
Starting pod/ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv-debug ...
To use host binaries, run `chroot /host`
kernel-core-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-4.18.0-193.29.1.el8_2.x86_64
kernel-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-extra-4.18.0-193.29.1.el8_2.x86_64

Removing debug pod ...

$ cat << EOF > 03-worker-extensions.yaml
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: worker
>   name: 03-worker-extensions
> spec:
>   config:
>     ignition:
>       version: 3.1.0
>   extensions:
>     - kernel-devel
> EOF

$ oc create -f 03-worker-extensions.yaml
machineconfig.machineconfiguration.openshift.io/03-worker-extensions created

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
00-worker                                          fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-master-container-runtime                        fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-master-kubelet                                  fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-worker-container-runtime                        fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-worker-kubelet                                  fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
03-worker-extensions                                                                          3.1.0             4s
99-master-generated-registries                     fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
99-master-ssh                                                                                 3.1.0             38m
99-worker-generated-registries                     fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
99-worker-ssh                                                                                 3.1.0             38m
rendered-master-3b5af39131e1d19c816b48cfd34e0192   fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
rendered-worker-57f8c43ccd91bebc038c12a94565c3a8   fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m

$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-57f8c43ccd91bebc038c12a94565c3a8   False     True       False      3              0                   0                     0                      33m

$ watch oc get nodes

$ oc get nodes
NAME                                       STATUS                     ROLES    AGE   VERSION
ci-ln-30rd65b-f76d1-zd8c9-master-0         Ready                      master   35m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-1         Ready                      master   35m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-2         Ready                      master   35m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv   Ready                      worker   26m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk   Ready,SchedulingDisabled   worker   26m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-d-lz6q2   Ready                      worker   26m   v1.19.0+9f84db3

$ watch oc get mcp/worker

$ oc debug node/ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk -- chroot /host rpm -qa | grep kernel
Starting pod/ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk-debug ...
To use host binaries, run `chroot /host`
kernel-core-4.18.0-193.29.1.el8_2.x86_64
kernel-headers-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-4.18.0-193.29.1.el8_2.x86_64
kernel-4.18.0-193.29.1.el8_2.x86_64
kernel-devel-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-extra-4.18.0-193.29.1.el8_2.x86_64

Removing debug pod ...
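The final check in the verification above (both kernel-devel and kernel-headers present on the updated node) could be scripted roughly as follows. This is a minimal sketch, not part of the original verification: the helper name has_kernel_dev_packages is hypothetical, and it only inspects a package list passed on stdin.

```shell
# Hypothetical helper: reads `rpm -qa` style output on stdin and succeeds
# only if both a kernel-devel and a kernel-headers package are listed.
has_kernel_dev_packages() {
    pkgs=$(cat)
    # Match the package name followed by a version digit so that other
    # kernel-* packages (kernel-core, kernel-modules, ...) do not count.
    printf '%s\n' "$pkgs" | grep -q '^kernel-devel-[0-9]' &&
        printf '%s\n' "$pkgs" | grep -q '^kernel-headers-[0-9]'
}
```

It would be fed the same command used above, for example: oc debug node/<node-name> -- chroot /host rpm -qa | has_kernel_dev_packages, where <node-name> is a placeholder for a worker that has finished updating.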
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.6.6 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:5115