Bug 1914469

Summary: real-time kernel in RHCOS is not synchronized
Product: OpenShift Container Platform Reporter: Dave Cain <dcain>
Component: RHCOS Assignee: Micah Abbott <miabbott>
Status: CLOSED NOTABUG QA Contact: Michael Nguyen <mnguyen>
Severity: urgent Docs Contact:
Priority: high    
Version: 4.6.z CC: bbreard, brault, dornelas, imcleod, jligon, kholtz, mapfelba, miabbott, mrussell, nstielau
Target Milestone: ---   
Target Release: 4.7.0   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1914988 Environment:
Last Closed: 2021-01-11 17:01:23 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1914988, 1922262, 1922263    

Description Dave Cain 2021-01-08 23:54:37 UTC
Description of problem:
The Realtime (RT) variant of the RHEL kernel shipped in downstream RHCOS does not appear to be synchronized with the standard kernel.

For example, the latest stable version shipping as of the authoring of this BZ is 4.6.9.  That corresponds to RHCOS 46.82.202012151054-0, which has kernel version 4.18.0-193.37.1.el8_2.x86_64.

When switching to the RT variant of the kernel via the Performance AddOn Operator, the node boots into RT kernel version 4.18.0-193.28.1.rt13.77.el8_2.x86_64.  This should be a more recent kernel.

Additional info:
The 28.1 kernel is from October.  The RT kernel should be on a later release, such as 37.1, which is from December.

The nightly builds for 4.6 still carry the kernel version from October.  This was checked on 4.6.0-0.nightly-2021-01-08-200800 by mounting the machine-os-content and looking in the extensions folder:

./extensions/kernel-rt/kernel-headers-4.18.0-193.28.1.el8_2.x86_64.rpm
./extensions/kernel-rt/kernel-rt-core-4.18.0-193.28.1.rt13.77.el8_2.x86_64.rpm
./extensions/kernel-rt/kernel-rt-devel-4.18.0-193.28.1.rt13.77.el8_2.x86_64.rpm
./extensions/kernel-rt/kernel-rt-kvm-4.18.0-193.28.1.rt13.77.el8_2.x86_64.rpm
./extensions/kernel-rt/kernel-rt-modules-4.18.0-193.28.1.rt13.77.el8_2.x86_64.rpm
./extensions/kernel-rt/kernel-rt-modules-extra-4.18.0-193.28.1.rt13.77.el8_2.x86_64.rpm
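
The mismatch can be checked mechanically. The following is a minimal Python sketch, not part of the RHCOS tooling: the function names `kernel_release` and `rt_kernel_lags` are hypothetical, and the comparison is a plain numeric comparison of the dotted fields before the `.rtNN`/`.elN` suffix, which is a simplification of how RPM itself orders versions.

```python
import re
from typing import Tuple

# Captures the upstream version and the numeric RPM release that precede the
# ".rtNN" or ".elN" suffix, e.g. "4.18.0" and "193.28.1".
_KERNEL_RE = re.compile(r"(\d+\.\d+\.\d+)-(\d+(?:\.\d+)*)\.(?:rt|el)\d")


def kernel_release(name: str) -> Tuple[int, ...]:
    """Return the numeric kernel version found in a uname string or RPM file name."""
    match = _KERNEL_RE.search(name)
    if match is None:
        raise ValueError(f"no kernel version found in {name!r}")
    version, release = match.groups()
    return tuple(int(part) for part in (version + "." + release).split("."))


def rt_kernel_lags(standard: str, realtime: str) -> bool:
    """True when the RT kernel is based on an older build than the standard one."""
    return kernel_release(realtime) < kernel_release(standard)


# Values taken from this report: the standard kernel in RHCOS 4.6.9 and one of
# the kernel-rt packages found in the extensions folder.
standard = "4.18.0-193.37.1.el8_2.x86_64"
realtime = "kernel-rt-core-4.18.0-193.28.1.rt13.77.el8_2.x86_64.rpm"
print(rt_kernel_lags(standard, realtime))  # True: 193.28.1 < 193.37.1
```

The RT-specific part of the release (`rt13.77`) is deliberately ignored, since only the base kernel build is being compared against the standard kernel.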

There are numerous fixes in more recent RT kernel versions that are absolutely critical for low-latency applications running on OpenShift 4.6.

Comment 1 Micah Abbott 2021-01-11 14:44:34 UTC
RHCOS 4.6 is billed as an EUS release and uses the RHEL 8.2 EUS sources.  The kernel-rt package does not have an EUS release; instead, it uses the moniker "Telecommunications Update Service" (TUS).  See the most recent advisory for `kernel-rt`: https://access.redhat.com/errata/RHSA-2020:5428

The RHCOS build process is using the wrong location for TUS updates on `kernel-rt`, so we'll have to update our build process/configuration to use the proper location.

Comment 2 Micah Abbott 2021-01-11 17:01:23 UTC
The RHCOS build configuration for 4.7 is tracking the standard RHEL 8 repos, so this problem does not need to be addressed in 4.7 directly.  Will close this as NOTABUG for 4.7.

The fix applies to 4.6.z, so I've created the clone here - https://bugzilla.redhat.com/show_bug.cgi?id=1914988

Please see the 4.6.z BZ to follow the fixes.