Bug 1907946 - Kernel update breaks 3rd party kernel drivers
Summary: Kernel update breaks 3rd party kernel drivers
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: CentOS Stream
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: 8.4
Assignee: Brian Stinson
QA Contact: Storage QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-12-15 15:04 UTC by Phil Perry
Modified: 2022-06-15 07:27 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-06-15 07:27:34 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Description Phil Perry 2020-12-15 15:04:45 UTC
Description of problem: Updating to kernel-4.18.0-257.el8.x86_64.rpm breaks 3rd party drivers, leaving the system unbootable


Version-Release number of selected component (if applicable):
kernel-4.18.0-257.el8.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install CentOS Stream system using 3rd party kmod-3w-sas driver for SAS/RAID array
2. Apply latest kernel update
3. Reboot system

Actual results:
System hangs and does not boot because it is unable to find the SAS/RAID storage array

Expected results:
System boots and works as expected

Additional info:
The latest kernel update (kernel-4.18.0-257.el8.x86_64) to Stream breaks 3rd party SAS/RAID driver kmod-3w-sas-3.26.02.000-5.el8_3.elrepo.x86_64.rpm due to changes in kernel symbols.

Comment 2 Tomas Henzl 2020-12-16 15:59:20 UTC
Lukas,
this looks similar to the third party issue we once had, can you please look into it?

Comment 3 Phil Perry 2020-12-16 16:49:01 UTC
Not sure if this is useful...

I think I tracked the affected symbols down to sysfs_create_bin_file and sysfs_remove_bin_file, but when I looked at the actual source code for these symbols in the -257 kernel, I could see that they'd changed from that in the RHEL8.3 kernel (but I'm not a kernel developer)

To be clear, I'm not expecting this to be something Red Hat can (or should) fix, or it to be something that shouldn't break, just trying to highlight that releasing a constant stream of development kernels to CentOS Stream means that the kernel in Stream is now no longer compatible with those kernels released to RHEL at the same point in time.

A solution would be to make regular RHEL kernels also available in Stream, either on an opt-in or opt-out basis, so those users for whom kernel ABI compatibility is critical can continue to contribute to, and benefit from, CentOS Stream.

I have other examples of similar breakage if you need more examples (kmod-aacraid and kmod-qla2xxx within the storage space) but I've not traced specific symbols for these.

Comment 4 Tomas Henzl 2020-12-16 18:58:07 UTC
(In reply to Phil Perry from comment #3)
> Not sure if this is useful...

Thanks, I had missed the part about the changes in kernel symbols.

> I think I tracked the affected symbols down to sysfs_create_bin_file and
> sysfs_remove_bin_file, but when I looked at the actual source code for these
> symbols in the -257 kernel, I could see that they'd changed from that in the
> RHEL8.3 kernel (but I'm not a kernel developer)

Cestmir,
I don't see sysfs_create_bin_file and sysfs_remove_bin_file in the kABI protected list, but I am also having difficulty finding when we changed these symbols. Please look at the symbols used by 3w-sas, aacraid, and qla2xxx between RHEL 8.3 and kernel-257 (actually, the first one that breaks a symbol is sufficient).

 
> To be clear, I'm not expecting this to be something Red Hat can (or should)
> fix, or it to be something that shouldn't break, just trying to highlight
> that releasing a constant stream of development kernels to CentOS Stream
> means that the kernel in Stream is now no longer compatible with those
> kernels released to RHEL at the same point in time.
> 
> A solution would be to make regular RHEL kernels also available in Stream,
> either on an opt-in or opt-out basis, so those users for whom kernel ABI
> compatibility is critical can continue to contribute to, and benefit from,
> CentOS Stream.
>
> I have other examples of similar breakage if you need more examples
> (kmod-aacraid and kmod-qla2xxx within the storage space) but I've not traced
> specific symbols for these.

Comment 5 Phil Perry 2020-12-16 22:32:34 UTC
Hi Tomas,

Correct, sysfs_create_bin_file and sysfs_remove_bin_file are not on the kABI whitelist.

The process I used to identify those two symbols was to cross reference those symbols listed in /usr/share/doc/kmod-3w-sas-3.26.02.000/greylist.txt against a diff of symbols that changed between kernel-4.18.0-240.el8.x86_64 and kernel-4.18.0-257.el8.x86_64 from the output of 'rpm -q --provides' for each kernel (some 3595 of them).
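The cross-referencing described above can be sketched roughly as follows. This is only an illustration, not the exact script used: it assumes the 'rpm -q --provides' output contains lines of the form 'kernel(symbol) = 0xchecksum', and the helper names (parse_provides, changed_symbols, broken_for_module) are made up for the example. The list of symbols the module uses is passed in as plain names, since the exact layout of greylist.txt is not shown here.

```python
import re

# Assumed line format from `rpm -q --provides`, for example:
#   kernel(sysfs_create_bin_file) = 0x1234abcd
_PROVIDE_RE = re.compile(r"^kernel\((?P<sym>\w+)\)\s*=\s*(?P<crc>0x[0-9a-fA-F]+)$")


def parse_provides(text):
    """Return {symbol: checksum} for every kernel(symbol) provide in text."""
    symbols = {}
    for line in text.splitlines():
        match = _PROVIDE_RE.match(line.strip())
        if match:
            symbols[match.group("sym")] = match.group("crc").lower()
    return symbols


def changed_symbols(old, new):
    """Symbols from the old kernel whose checksum differs or is gone in the new one."""
    return {sym for sym, crc in old.items() if new.get(sym) != crc}


def broken_for_module(module_symbols, old, new):
    """Symbols used by a module (hypothetical input list) that changed between kernels."""
    return sorted(set(module_symbols) & changed_symbols(old, new))
```

In practice the two text inputs would be the saved provides output for kernel-4.18.0-240.el8.x86_64 and kernel-4.18.0-257.el8.x86_64, and module_symbols would be the names taken from the kmod's greylist.txt.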

Comment 9 Tomas Henzl 2021-01-26 17:51:58 UTC
Hi Phil,
I know that the kABI is a problem. From my angle, it looks like you'll need to detect kernel changes that affect the modules you maintain and rebuild them; I only hope that it doesn't happen often.
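One way such detection could be automated is sketched below. It is a rough illustration under stated assumptions, not an existing tool: it assumes each module's required symbol versions are available as text in the 'checksum symbol' layout (as I understand 'modprobe --dump-modversions' to print), and that the new kernel's exported checksums are already loaded into a dict. All function names are hypothetical.

```python
def parse_modversions(text):
    """Parse lines of the assumed form '<0xchecksum> <symbol>' into {symbol: checksum}."""
    required = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].lower().startswith("0x"):
            required[parts[1]] = int(parts[0], 16)
    return required


def mismatched_symbols(required, provided):
    """Symbols a module needs whose checksum is missing or different in the kernel."""
    return sorted(sym for sym, crc in required.items() if provided.get(sym) != crc)


def modules_to_rebuild(modules, provided):
    """Given {module name: required symbols}, return {module name: mismatched symbols}.

    Modules with no mismatches are left out of the result.
    """
    result = {}
    for name, required in modules.items():
        bad = mismatched_symbols(required, provided)
        if bad:
            result[name] = bad
    return result
```

Running this against every maintained kmod for each new Stream kernel would give the list of packages that need a rebuild, which is the per-update cost being discussed here.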

I'll reassign the bz to the CentOS technical lead so you can discuss the other option - 'make regular RHEL kernels also available in Stream'.

Comment 10 Tomas Henzl 2021-01-26 17:57:41 UTC
Hi Brian,

CentOS Stream suffers from kABI changes, see comment #3 (I think part of CentOS is drivers that we do not build but CentOS does).

"To be clear, I'm not expecting this to be something Red Hat can (or should) fix, or it to be something that shouldn't break, just trying to highlight that releasing a constant stream of development kernels to CentOS Stream means that the kernel in Stream is now no longer compatible with those kernels released to RHEL at the same point in time.

A solution would be to make regular RHEL kernels also available in Stream, either on an opt-in or opt-out basis, so those users for whom kernel ABI compatibility is critical can continue to contribute to, and benefit from, CentOS Stream."

I don't know how to help further so I'm reassigning the bz.

Comment 11 Phil Perry 2021-01-26 18:11:32 UTC
(In reply to Tomas Henzl from comment #9)
> Hi Phil,
> I know that the kABI is a problem. From my angle, it looks like you'll need
> to detect kernel changes that affect the modules you maintain and rebuild
> them; I only hope that it doesn't happen often.
> 
> I'll reassign the bz to the CentOS technical lead so you can discuss the other
> option - 'make regular RHEL kernels also available in Stream'.

Thanks Tomas,

For the first Stream kernel update (kernel-4.18.0-257.el8.x86_64.rpm), 13 out of 44 packages I maintain were broken as a result of the update, so unfortunately it's a significant number, and not something we are able to maintain (for Stream) going forward, hence the request for Stream to provide an in situ solution.

Comment 12 Akemi Yagi 2021-01-26 18:42:36 UTC
This bug report was once opened to the public but then made private again. I do not see anything related to security / private stuff. Can someone explain?

Comment 13 Brian Stinson 2021-01-26 18:46:11 UTC
(In reply to Akemi Yagi from comment #12)
> This bug report was once opened to the public but then made private again. I
> do not see anything related to security / private stuff. Can someone explain?

There are many layers of Bugzilla automation especially when dealing with the kernel. We're working on a longer-term solution to this problem.

Comment 14 Akemi Yagi 2021-01-26 18:47:40 UTC
Thanks, Brian, for making the change so quickly.

Comment 16 Akemi Yagi 2022-02-20 22:45:00 UTC
I want to point to quite an extensive analysis of the kABI of CentOS Stream kernels, given in a recent presentation by Pat Riehecky, "Tracking kernel rate of change", at the CentOS Dojo, 2022.

https://wiki.centos.org/Events/Dojo/FOSDEM2022?action=AttachFile&do=view&target=kernel+rate+of+change.pdf

Comment 18 RHEL Program Management 2022-06-15 07:27:34 UTC
After evaluating this issue, there are no plans to address it further or fix it in an upcoming release.  Therefore, it is being closed.  If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

