Bug 1888853 - MCO extension kernel-headers is invalid
Summary: MCO extension kernel-headers is invalid
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Machine Config Operator
Version: 4.6
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.6.z
Assignee: Sinny Kumari
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On: 1890074
Blocks:
Reported: 2020-10-15 23:46 UTC by Evan Dunn
Modified: 2020-11-30 16:46 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1890074
Environment:
Last Closed: 2020-11-30 16:45:28 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 2187 | 0 | None | closed | Bug 1888853: daemon: allow an extension to install group of packages | 2021-02-04 02:21:13 UTC
Red Hat Product Errata | RHBA-2020:5115 | 0 | None | None | None | 2020-11-30 16:46:18 UTC

Description Evan Dunn 2020-10-15 23:46:09 UTC
Description of problem:
kernel-headers are now absent from RHCOS, but MCO extensions cannot be used to install them

Version-Release number of selected component (if applicable):
4.6.0-rc.3

Steps to Reproduce:
1. oc debug node/... 
2. chroot /host
3. rpm -qa | grep kernel
4. Apply the following MachineConfig:

apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 03-worker-extensions
spec:
  config:
    ignition:
      version: 3.1.0
  extensions:
    - kernel-headers
    - kernel-devel
5. Observe the error: invalid extensions found: kernel-headers

EDIT: note, using just kernel-devel under extensions also does not install kernel-headers package.

Actual results:

RHCOS removed kernel-headers and kernel-devel, but they cannot both be installed with MCO extensions

rpm -qa | grep kernel

kernel-core-4.18.0-193.24.1.el8_2.dt1.x86_64
kernel-modules-4.18.0-193.24.1.el8_2.dt1.x86_64
kernel-modules-extra-4.18.0-193.24.1.el8_2.dt1.x86_64
kernel-4.18.0-193.24.1.el8_2.dt1.x86_64

Expected results:

Access to kernel-headers and kernel-devel on RHCOS workers

kernel-core-4.18.0-193.14.3.el8_2.ppc64le
kernel-modules-extra-4.18.0-193.14.3.el8_2.ppc64le
kernel-modules-4.18.0-193.14.3.el8_2.ppc64le
kernel-4.18.0-193.14.3.el8_2.ppc64le
kernel-devel-4.18.0-193.14.3.el8_2.ppc64le
kernel-headers-4.18.0-193.14.3.el8_2.ppc64le

Comment 1 Sinny Kumari 2020-10-21 09:36:34 UTC
Thank you for reporting this issue. We will work on a fix so that kernel-headers is installed as part of the kernel-devel extension.

Comment 2 Hendrik Brueckner 2020-10-21 09:47:53 UTC
Sinny, could you share some insights?

This problem impacts the Spectrum Scale beta program.

Dan, could you add this bug for discussion on the Thu multi-arch call?

Comment 4 Sinny Kumari 2020-10-21 10:21:17 UTC
(In reply to Hendrik Brueckner from comment #2)
> Sinny, could you share some insights?

Currently, the kernel-devel extension in the MCO installs only the kernel-devel package (https://github.com/openshift/machine-config-operator/blob/master/pkg/daemon/update.go#L912) plus any missing dependencies. It seems the kernel-headers package is not an install dependency of kernel-devel, and hence it doesn't get installed.

To fix the issue, we will need to patch the MCO to also install the kernel-headers package when kernel-devel is specified as an extension in a MachineConfig.
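
The approach described above (one extension name expanding to a group of packages) can be sketched roughly as follows. This is an illustrative Python sketch, not the MCO's actual Go code: the table `EXTENSION_PACKAGES` and the functions `validate_extensions` and `packages_for` are hypothetical names, and only the kernel-devel entry is taken from this bug. The error text is modeled on the message quoted in the description.

```python
# Hypothetical table mapping an extension name to the RPM packages it installs.
# Only the kernel-devel entry reflects this bug report; it is NOT the MCO's real table.
EXTENSION_PACKAGES = {
    "kernel-devel": ["kernel-devel", "kernel-headers"],
}


def validate_extensions(requested):
    """Return the requested names that are not known extensions."""
    return [name for name in requested if name not in EXTENSION_PACKAGES]


def packages_for(requested):
    """Expand extension names into a de-duplicated, ordered package list."""
    invalid = validate_extensions(requested)
    if invalid:
        # Message modeled on the error quoted in the bug description.
        raise ValueError("invalid extensions found: " + ", ".join(invalid))
    packages = []
    for name in requested:
        for pkg in EXTENSION_PACKAGES[name]:
            if pkg not in packages:
                packages.append(pkg)
    return packages
```

Under this sketch, requesting `kernel-devel` yields both packages, while requesting `kernel-headers` directly is still rejected because it is a package inside a group rather than an extension name.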

> This problem impacts the Spectrum Scale beta program.

We understand the urgency here; this bug is on our top-priority list to get fixed.

Comment 6 Hendrik Brueckner 2020-10-21 12:30:38 UTC
Sinny, thanks for the update.

Comment 12 Sinny Kumari 2020-10-22 14:25:17 UTC
A patch has been submitted to the upstream master branch: https://github.com/openshift/machine-config-operator/pull/2170 . Once the pull request is merged and verified, we will backport it to 4.6.

Comment 13 Dan Li 2020-10-26 12:10:04 UTC
Hi @Hendrik, please see Sinny's Comment 12. Do we still need to add this bug to Thursday's meeting discussion? If so, I will add it to the list.

Comment 14 Hendrik Brueckner 2020-10-26 13:41:45 UTC
@danili With the PR already merged, there is great progress here. The Spectrum Scale team is in contact with Red Hat on this topic as well.

Comment 15 Sinny Kumari 2020-10-29 13:24:30 UTC
Backported fix is in PR https://github.com/openshift/machine-config-operator/pull/2187

Comment 16 Dan Li 2020-11-06 18:18:36 UTC
Added this bug to the MA meeting agenda. I'm removing the "needinfo"

Comment 19 Michael Nguyen 2020-11-16 18:50:15 UTC
Verified on 4.6.0-0.nightly-2020-11-15-104235. Successfully installed the kernel-devel extension.


$ oc get nodes
NAME                                       STATUS   ROLES    AGE   VERSION
ci-ln-30rd65b-f76d1-zd8c9-master-0         Ready    master   33m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-1         Ready    master   33m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-2         Ready    master   33m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv   Ready    worker   24m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk   Ready    worker   25m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-d-lz6q2   Ready    worker   25m   v1.19.0+9f84db3
$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-0.nightly-2020-11-15-104235   True        False         106s    Cluster version is 4.6.0-0.nightly-2020-11-15-104235
$ oc debug node/ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv -- chroot /host rpm -qa | grep kernel
Starting pod/ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv-debug ...
To use host binaries, run `chroot /host`
kernel-core-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-4.18.0-193.29.1.el8_2.x86_64
kernel-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-extra-4.18.0-193.29.1.el8_2.x86_64

Removing debug pod ...
$ cat << EOF > 03-worker-extensions.yaml
> apiVersion: machineconfiguration.openshift.io/v1
> kind: MachineConfig
> metadata:
>   labels:
>     machineconfiguration.openshift.io/role: worker
>   name: 03-worker-extensions
> spec:
>   config:
>     ignition:
>       version: 3.1.0
>   extensions:
>     - kernel-devel
> EOF
$ oc create -f 03-worker-extensions.yaml 
machineconfig.machineconfiguration.openshift.io/03-worker-extensions created
$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
00-worker                                          fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-master-container-runtime                        fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-master-kubelet                                  fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-worker-container-runtime                        fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
01-worker-kubelet                                  fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
03-worker-extensions                                                                          3.1.0             4s
99-master-generated-registries                     fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
99-master-ssh                                                                                 3.1.0             38m
99-worker-generated-registries                     fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
99-worker-ssh                                                                                 3.1.0             38m
rendered-master-3b5af39131e1d19c816b48cfd34e0192   fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
rendered-worker-57f8c43ccd91bebc038c12a94565c3a8   fbb7093f17ef1183b4a2e620daf064e50a56e720   3.1.0             32m
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-57f8c43ccd91bebc038c12a94565c3a8   False     True       False      3              0                   0                     0                      33m
$ watch oc get nodes
$ oc get nodes
NAME                                       STATUS                     ROLES    AGE   VERSION
ci-ln-30rd65b-f76d1-zd8c9-master-0         Ready                      master   35m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-1         Ready                      master   35m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-master-2         Ready                      master   35m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-b-9xktv   Ready                      worker   26m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk   Ready,SchedulingDisabled   worker   26m   v1.19.0+9f84db3
ci-ln-30rd65b-f76d1-zd8c9-worker-d-lz6q2   Ready                      worker   26m   v1.19.0+9f84db3
$ watch oc get mcp/worker 
$ oc debug node/ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk -- chroot /host rpm -qa | grep kernel
Starting pod/ci-ln-30rd65b-f76d1-zd8c9-worker-c-m5zjk-debug ...
To use host binaries, run `chroot /host`
kernel-core-4.18.0-193.29.1.el8_2.x86_64
kernel-headers-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-4.18.0-193.29.1.el8_2.x86_64
kernel-4.18.0-193.29.1.el8_2.x86_64
kernel-devel-4.18.0-193.29.1.el8_2.x86_64
kernel-modules-extra-4.18.0-193.29.1.el8_2.x86_64

Removing debug pod ...
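
The manual check above (compare the `rpm -qa | grep kernel` output before and after the MachineConfig rolls out) could be scripted along these lines. This is a minimal sketch under assumptions: the helper names `installed_names` and `missing_packages` are hypothetical, and the parsing assumes package lines of the form `name-version-release.arch` as shown in the transcript.

```python
import re

# Matches a kernel package name followed by "-<digit>", i.e. the start of the version.
# Non-package lines from the debug pod ("Starting pod/...", etc.) do not match.
_KERNEL_PKG = re.compile(r"^(kernel(?:-[a-z]+)*)-\d")


def installed_names(rpm_output):
    """Extract kernel package names from `rpm -qa | grep kernel` style output."""
    names = set()
    for line in rpm_output.splitlines():
        match = _KERNEL_PKG.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def missing_packages(rpm_output, required=("kernel-devel", "kernel-headers")):
    """Return the required package names that do not appear in the output, sorted."""
    return sorted(set(required) - installed_names(rpm_output))
```

Fed the first `oc debug` output in this comment, `missing_packages` would report both kernel-devel and kernel-headers; fed the second, it would return an empty list.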

Comment 21 errata-xmlrpc 2020-11-30 16:45:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.6 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:5115

