Bug 1752902 - [regression] kmod 20-25 runs much slower than 20-23 due to traversing extra folders of non-existent kernels
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kmod
Version: 7.7
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Yauheni Kaliuta
QA Contact: Ziqian SUN (Zamir)
URL:
Whiteboard:
Depends On:
Blocks: 1710953
 
Reported: 2019-09-17 14:12 UTC by Oleksandr Natalenko
Modified: 2020-03-31 20:06 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1755196
Environment:
Last Closed: 2020-03-31 20:06:12 UTC
Target Upstream Version:
Embargoed:


Attachments
proposed implementation (32.01 KB, application/x-shellscript)
2019-09-20 16:42 UTC, Yauheni Kaliuta


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:1142 0 None None None 2020-03-31 20:06:23 UTC

Description Oleksandr Natalenko 2019-09-17 14:12:18 UTC
Based on the support case 02464164.

The customer has used the 3rd-party module "openafs" since the early RHEL 7 days. Over time, each update has left its traces under the /lib/modules/`uname -r`/extra folder.

With the kmod 20-25 update (BZ 1643299, I believe), each new kernel update takes an enormous amount of time, because weak-modules now traverses all the folders from previously installed kernels even if those kernels no longer exist.

Practically, what we see is this:

===
…
/lib/modules/3.10.0-123.13.2.el7.x86_64/extra/openafs/openafs.ko
/lib/modules/3.10.0-123.13.2.el7.x86_64/extra/openafs
/lib/modules/3.10.0-123.13.2.el7.x86_64/extra
/lib/modules/3.10.0-123.13.2.el7.x86_64
…
===

(and almost the same for each kernel the customer has ever installed)

And weak-modules processes all of them (a brief snippet of debug output):

===
…
weak module for openafs.ko already exists for kernel 3.10.0-1062.el7.x86_64, update case?
Module openafs.ko from kernel 3.10.0-123.8.1.el7.x86_64 is not compatible with kernel 3.10.0-1062.el7.x86_64 in symbols:  truncate_inode_pages __d_drop d_instantiate page_put_link key_alloc noop_fsync lookup_one_len d_make_root invalidate_mapping_pages unregister_key_type d_splice_alias key_instantiate_and_link register_key_type path_put shrink_dcache_parent d_prune_aliases d_drop page_follow_link_light mntput key_put d_move page_readlink d_rehash kern_path key_validate flock_lock_file_wait truncate_setsize generic_read_dir have_submounts keyring_search names_cachep d_find_alias dput test_set_page_writeback dget_parent mntget d_invalidate d_path
Removing compatible module openafs.ko from kernel 3.10.0-1062.el7.x86_64
…
===
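
For illustration, the "not compatible ... in symbols" message above can be thought of as a comparison between what the module needs and what the kernel exports. The following is only a minimal sketch under assumptions: both sides are modelled as mappings from symbol name to a checksum string, and the function name `incompatible_symbols` is hypothetical. This thread does not show how weak-modules actually collects or compares the symbols.

```python
def incompatible_symbols(module_needs, kernel_exports):
    """Return the sorted names of symbols the kernel cannot satisfy.

    Both arguments are assumed to map symbol name -> checksum string.
    A symbol is reported when the kernel does not export it at all,
    or exports it with a different checksum.
    """
    return sorted(
        sym
        for sym, crc in module_needs.items()
        if kernel_exports.get(sym) != crc
    )
```

An empty result would correspond to the "is compatible with kernel" message; a non-empty one to the list of symbols printed in the debug output.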

This takes over 5 minutes to complete. With the "empty" folders removed (except the one that contains the actual module to be referenced by the weak-update mechanism and the two belonging to the currently installed kernels), it takes only about 30 seconds.
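
To find such leftover folders, something like the sketch below could be used. It rests on an assumption taken from the listing above: a directory is treated as residue when its only top-level entry is `extra`. The function name `find_residue_dirs` is hypothetical, and the function only lists candidates without deleting anything, because one of them may still hold the module that the weak-update mechanism needs.

```python
from pathlib import Path


def find_residue_dirs(modules_root):
    """List kernel directories whose only top-level entry is 'extra'.

    Assumption: a directory belonging to an installed kernel holds
    more than just 'extra', so a directory with nothing else is
    residue left behind by a 3rd-party module.
    """
    residue = []
    for kdir in sorted(Path(modules_root).iterdir()):
        if not kdir.is_dir():
            continue
        entries = {entry.name for entry in kdir.iterdir()}
        if entries == {"extra"}:
            residue.append(kdir.name)
    return residue
```

Calling `find_residue_dirs("/lib/modules")` on an affected system would return the candidate directory names for manual review.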

Clearly, weak-modules should avoid touching non-existent kernels. The issue is not "openafs"-specific.

This was discussed with ykaliuta via IRC, and now I'm framing our discussion and my investigation with the customer into this BZ.

Comment 2 Oleksandr Natalenko 2019-09-17 14:14:15 UTC
And, oh yeah, kmod 20-26 (BZ 1695763 I suppose) doesn't make any difference for the customer's case.

Comment 4 Ziqian SUN (Zamir) 2019-09-18 11:16:28 UTC
I'm setting qa_ack+ to speed things up before actually reproducing the issue.

Hi Oleksandr,

Is the customer willing to test the fix, or to provide some info on how I can get the openafs module?

Thanks.

Comment 5 Oleksandr Natalenko 2019-09-18 11:26:20 UTC
Hi.

(In reply to Ziqian SUN (Zamir) from comment #4)
> Is the customer willing to test the fix, or to provide some info on how I
> can get the openafs module?

Yes, so far the customer is coöperative, so I'd say they will be able to check the fix.

As I've mentioned in the BZ description, the issue is not specific to the openafs module, and can be reproduced with any 3rd-party module as long as it is left as residue from previous kernel installations.

If you'd like to stick to openafs specifically, though, it can be taken from here: [1].

[1] https://www.openafs.org/release/openafs-1.6.22.4.html

Comment 7 Yauheni Kaliuta 2019-09-18 20:29:14 UTC
> Clearly, weak-modules should avoid touching non-existent kernels.

Not exactly. Having an "extra" module compiled for one kernel but providing a weak link to another is a valid use case, even if the original kernel is not on the system (it is no longer there, or it never was and the module was installed separately from a package).

But the current logic is to check all the extra/ kernels from the oldest to the newest, trying to install the weak links. As a result, it ends up with the latest compatible weak module installed.

For that use case the logic should be reversed, in the sense that the kernels should be checked from newest to oldest, and if a module was already installed from a newer version, it should be skipped (and if a kernel provides only already installed modules, it should be skipped completely).

Implementing it in bash with the current grouping is a bit tricky for me, but hopefully I won't break other use cases.
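
The newest-to-oldest idea can be sketched roughly as follows. This is not the attached bash implementation, only an illustration in Python under stated assumptions: `version_key` is a simplified ordering for kernel version strings (not the real RPM version comparison), `plan_weak_links` and its parameters are hypothetical names, the compatibility check is passed in by the caller, and skipping the target kernel's own directory is an assumption of this sketch.

```python
import re


def version_key(kver):
    """Simplified sort key: numeric chunks compare as integers."""
    return [
        (0, int(tok)) if tok.isdigit() else (1, tok)
        for tok in re.findall(r"\d+|[A-Za-z_]+", kver)
    ]


def plan_weak_links(extra_modules, target_kernel, is_compatible):
    """Pick, per module, the newest compatible source kernel.

    extra_modules maps kernel version -> list of module names found
    under that kernel's extra/ folder. is_compatible(mod, src, dst)
    is a caller-supplied (hypothetical) compatibility check.
    """
    chosen = {}
    for kver in sorted(extra_modules, key=version_key, reverse=True):
        if kver == target_kernel:
            continue  # assumption: no weak link from a kernel to itself
        pending = [m for m in extra_modules[kver] if m not in chosen]
        if not pending:
            continue  # kernel provides only already handled modules
        for mod in pending:
            if is_compatible(mod, kver, target_kernel):
                chosen[mod] = kver
    return chosen
```

With this ordering, once a module has been linked from a newer kernel, older kernels that carry only that module are never checked at all, which is where the time saving would come from.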

Comment 8 Oleksandr Natalenko 2019-09-19 06:10:54 UTC
> Not exactly

I totally get it. This is why:

> except the one that contains the actual module to be referenced by the weak-update mechanism and the two belonging to the currently installed kernels

Comment 10 Yauheni Kaliuta 2019-09-20 16:42:31 UTC
Created attachment 1617277
proposed implementation

I have not tested it much, but it passes my local test suite.

Comment 11 Charles Slivkoff 2019-09-20 17:38:02 UTC
This appears to work in my test case.


$ time ./weak-modules --dry-run --add-kernel --verbose
weak module for openafs.ko already exists for kernel 3.10.0-1062.el7.x86_64, update case?
Module openafs.ko from kernel 3.10.0-862.el7.x86_64 is compatible with kernel 3.10.0-1062.el7.x86_64
/sbin/depmod -C /tmp/weak-modules.anp88V/depmod.conf -ae -F /boot/System.map-3.10.0-1062.el7.x86_64 3.10.0-1062.el7.x86_64

real	0m30.828s
user	0m20.119s
sys	0m6.739s

Comment 23 errata-xmlrpc 2020-03-31 20:06:12 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1142

