Bug 1861894

Summary: [4.5] rpm-ostree crashes when deploying realtime kernel
Product: Red Hat Enterprise Linux 8 Reporter: Jonathan Lebon <jlebon>
Component: rpm-ostree Assignee: Jonathan Lebon <jlebon>
Status: CLOSED ERRATA QA Contact: atomic-bugs <atomic-bugs>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 8.2 CC: imcleod, jlebon, miabbott, walters
Target Milestone: rc Keywords: ZStream
Target Release: 8.3 Flags: pm-rhel: mirror+
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1862233 Environment:
Last Closed: 2020-11-04 03:11:13 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1862233    

Description Jonathan Lebon 2020-07-29 20:08:11 UTC
This bug was initially created as a copy of Bug #1860926

I am copying this bug because: 

We need to backport an upstream patch in order to fix this bug in 8.2-based RHCOS/OCP4:
https://github.com/coreos/rpm-ostree/pull/2178

Description of problem:


Version-Release number of selected component (if applicable):
4.5.0-0.nightly-2020-07-25-031342

How reproducible:
always

Steps to Reproduce:
1. ./openshift-install create manifests
2. Add these MachineConfigs to the manifests:
   cat <<EOF >openshift/99-master-kerneltype.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "master"
  name: 99-master-kerneltype
spec:
  kernelType: realtime
EOF
   cat <<EOF >openshift/99-worker-kerneltype.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "worker"
  name: 99-worker-kerneltype
spec:
  kernelType: realtime
EOF
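
The two manifests above differ only in the role name, so they could also be generated in a loop. The following is a minimal sketch, not part of the original report: the function names `write_kerneltype_mc` and `write_all_kerneltype_mcs` are hypothetical, and the target directory is assumed to be the `openshift` directory produced by `./openshift-install create manifests`.

```shell
# Hypothetical helper (not from the report): write a kernelType
# MachineConfig manifest for one role into the given directory.
write_kerneltype_mc() {
    role="$1"
    dir="$2"
    # The heredoc body and its terminator must start at column 0.
    cat <<EOF >"${dir}/99-${role}-kerneltype.yaml"
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: "${role}"
  name: 99-${role}-kerneltype
spec:
  kernelType: realtime
EOF
}

# Generate the master and worker manifests from step 2.
# Assumption: the directory passed in already exists.
write_all_kerneltype_mcs() {
    dir="$1"
    for role in master worker; do
        write_kerneltype_mc "$role" "$dir"
    done
}
```

Under these assumptions, `write_all_kerneltype_mcs openshift` would produce the same two files as the `cat` commands in step 2.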


Actual results:
Bootstrap fails.

Expected results:
Installation completes.

Additional info:
Applying the realtime MachineConfig as a Day 2 operation succeeds.

Comment 7 Micah Abbott 2020-09-28 15:21:09 UTC
Was able to verify with an OCP 4.6 cluster, installed from scratch with an MC configuring the realtime kernel.

```
$ oc get clusterversion
NAME      VERSION      AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.0-fc.8   True        False         26m     Cluster version is 4.6.0-fc.8

$ oc get nodes
NAME                                         STATUS   ROLES    AGE   VERSION
ip-10-0-134-59.us-west-2.compute.internal    Ready    master   46m   v1.19.0+8a39924
ip-10-0-138-143.us-west-2.compute.internal   Ready    worker   34m   v1.19.0+8a39924
ip-10-0-169-97.us-west-2.compute.internal    Ready    master   47m   v1.19.0+8a39924
ip-10-0-183-134.us-west-2.compute.internal   Ready    worker   35m   v1.19.0+8a39924
ip-10-0-206-51.us-west-2.compute.internal    Ready    worker   35m   v1.19.0+8a39924
ip-10-0-218-105.us-west-2.compute.internal   Ready    master   47m   v1.19.0+8a39924

$ oc get mc
NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
00-master                                          a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
00-worker                                          a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
01-master-container-runtime                        a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
01-master-kubelet                                  a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
01-worker-container-runtime                        a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
01-worker-kubelet                                  a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
99-master-generated-registries                     a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
99-master-kerneltype                                                                                            56m
99-master-ssh                                                                                 3.1.0             56m
99-worker-generated-registries                     a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
99-worker-ssh                                                                                 3.1.0             56m
rendered-master-7c7ce4964dcd4db1aaf39a3363516a47   a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m
rendered-worker-d34893d4740e2a7a1902fef3210a7cd6   a3c9532c8e8f2efe9b0f739fbd761b32cc0bfa2b   3.1.0             45m

$ oc get -o yaml mc/99-master-kerneltype
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2020-09-28T14:22:29Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: master
  managedFields:
  - apiVersion: machineconfiguration.openshift.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:labels:
          .: {}
          f:machineconfiguration.openshift.io/role: {}
      f:spec:
        .: {}
        f:kernelType: {}
    manager: cluster-bootstrap
    operation: Update
    time: "2020-09-28T14:22:29Z"
  name: 99-master-kerneltype
  resourceVersion: "1471"
  selfLink: /apis/machineconfiguration.openshift.io/v1/machineconfigs/99-master-kerneltype
  uid: 6b677c73-f313-4e58-8c0a-2a192105ddd3
spec:
  kernelType: realtime

$ oc debug node/ip-10-0-134-59.us-west-2.compute.internal 
Starting pod/ip-10-0-134-59us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.134.59
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# uname -a
Linux ip-10-0-134-59 4.18.0-193.19.1.rt13.70.el8_2.x86_64 #1 SMP PREEMPT RT Wed Aug 26 17:57:22 EDT 2020 x86_64 x86_64 x86_64 GNU/Linux
sh-4.4# rpm -q rpm-ostree
rpm-ostree-2020.4-1.el8.x86_64
sh-4.4# 
```
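
The `uname -a` check above could be scripted roughly as follows. This is only a sketch: `is_rt_kernel` is a hypothetical helper, and it assumes that realtime kernel releases always carry an `.rt<N>` component like the `4.18.0-193.19.1.rt13.70.el8_2.x86_64` string shown in the transcript.

```shell
# Hypothetical check (not from the report): succeed if a kernel release
# string, as printed by `uname -r`, contains an ".rt<digit>" component.
# Assumption: realtime builds are named like
# 4.18.0-193.19.1.rt13.70.el8_2.x86_64, as in the transcript above.
is_rt_kernel() {
    case "$1" in
        *.rt[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live cluster the release string might come from something like `oc debug node/<name> -- chroot /host uname -r`, with the result passed to `is_rt_kernel`.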

Comment 8 Colin Walters 2020-09-28 17:12:33 UTC
FWIW there is a periodic "gcp-rt" test run on the nightly stream; here's an example run from
this release image: https://openshift-release.apps.ci.l2s4.p1.openshiftapps.com/releasestream/4.6.0-0.nightly/release/4.6.0-0.nightly-2020-09-28-110510

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-gcp-rt-4.6/1310566118355636224

It failed e2e tests but it did install, which should also be sufficient to verify this bug.

Comment 11 errata-xmlrpc 2020-11-04 03:11:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (rpm-ostree bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:4708