Bug 1825940

Summary: kernel-core-5.7.0-0.rc1.20200416git9786cab67457.1 took very long time to install
Product: [Fedora] Fedora
Component: kernel
Version: rawhide
Hardware: Unspecified
OS: Unspecified
Status: NEW
Severity: unspecified
Priority: unspecified
Reporter: H.J. Lu <hongjiu.lu>
Assignee: Kernel Maintainer List <kernel-maint>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: airlied, bskeggs, hdegoede, ichavero, itamar, jarodwilson, jcline, jeremy, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, masami256, mchehab, mjg59, mrmazda, steved, y9t7sypezp, ykaliuta
Doc Type: If docs needed, set a value
Type: Bug

Description H.J. Lu 2020-04-20 14:34:01 UTC
kernel-core-5.7.0-0.rc1.20200416git9786cab67457.1 took a very long time,
more than 20 minutes, to install.  Most of the time was spent in /usr/sbin/weak-modules:

posttrans scriptlet (using /bin/sh):
if [ -x /usr/sbin/weak-modules ]
then
    /usr/sbin/weak-modules --add-kernel 5.7.0-0.rc1.20200416git9786cab67457.1.1.cet.fc31.x86_64 || exit $?
fi
/bin/kernel-install add 5.7.0-0.rc1.20200416git9786cab67457.1.1.cet.fc31.x86_64 /lib/modules/5.7.0-0.rc1.20200416git9786cab67457.1.1.cet.fc31.x86_64/vmlinuz || exit $?

Comment 1 Steve 2020-04-27 17:38:44 UTC
Confirmed in an F31 VM with:

# dnf update --nogpg kernel-0:5.7.0-0.rc2.20200422git18bf34080c4c.1.fc33.x86_64 --releasever=33

Top shows CPU usage by depmod approaching 100% at times:

$ top -bc -n 1 -u root | head -9
top - 10:31:33 up 28 min,  1 user,  load average: 1.20, 1.11, 0.99
Tasks: 186 total,   2 running, 184 sleeping,   0 stopped,   0 zombie
%Cpu(s): 94.1 us,  5.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3928.0 total,   2000.5 free,    734.9 used,   1192.6 buff/cache
MiB Swap:    512.0 total,    512.0 free,      0.0 used.   2942.6 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
  38906 root      20   0   41960  39528   4220 R  81.2   1.0   0:03.82 /sbin/depmod -C /tmp/weak-modules.Os10HS/depmod.conf +
      1 root      20   0  172764  16432   9752 S   0.0   0.4   0:02.78 /usr/lib/systemd/systemd --switched-root --system --d+

$ rpm -qf /usr/sbin/depmod
kmod-26-4.fc31.x86_64

Comment 2 Steve 2020-04-27 18:09:55 UTC
Opened against kmod:

Bug 1828455 - depmod running for a long time when installing F33 kernel

This problem could have something to do with the ARK project:

kernel-ark
https://gitlab.com/cki-project/kernel-ark

Comment 3 Jeremy Cline 2020-04-27 19:08:39 UTC
(In reply to H.J. Lu from comment #0)
> kernel-core-5.7.0-0.rc1.20200416git9786cab67457.1 took a very long time,
> more than 20 minutes, to install.  Most of the time was spent in /usr/sbin/weak-modules:
> 
> posttrans scriptlet (using /bin/sh):
> if [ -x /usr/sbin/weak-modules ]
> then
>     /usr/sbin/weak-modules --add-kernel 5.7.0-0.rc1.20200416git9786cab67457.1.1.cet.fc31.x86_64 || exit $?
> fi
> /bin/kernel-install add 5.7.0-0.rc1.20200416git9786cab67457.1.1.cet.fc31.x86_64 /lib/modules/5.7.0-0.rc1.20200416git9786cab67457.1.1.cet.fc31.x86_64/vmlinuz || exit $?

This is due to old kernels installing kernel-modules-extra into the wrong directory. Current kernels now install them into kernel/ rather than the confusingly-named extra/ directory, but old kernels dropped them in extra/. A work-around would be to uninstall old versions of the kernel-modules-extra package.
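
To see which installed kernels might be affected, something like the following sketch could list the module trees that still hold modules under extra/. It assumes the per-kernel trees live under /lib/modules (the path used by the kernel-install call in comment 0); MODULES_ROOT and list_extra_trees are illustrative names, not part of any package. Third-party modules can also live in extra/, so a hit is only a hint.

```shell
#!/bin/sh
# List kernel module trees that still contain modules under extra/.
# MODULES_ROOT is an illustrative variable (an assumption, not from the
# report); it defaults to /lib/modules.
MODULES_ROOT="${MODULES_ROOT:-/lib/modules}"

list_extra_trees() {
    # $1: directory holding one subdirectory per kernel version
    for dir in "$1"/*/extra; do
        # Skip the unexpanded glob and anything that is not a directory.
        [ -d "$dir" ] || continue
        # Count module files (.ko, possibly compressed) below extra/.
        count=$(find "$dir" -type f -name '*.ko*' | wc -l | tr -d ' ')
        if [ "$count" -gt 0 ]; then
            kver=$(basename "$(dirname "$dir")")
            echo "$kver $count"
        fi
    done
}

# Read-only: prints "<kernel version> <module count>" per affected tree.
list_extra_trees "$MODULES_ROOT"
```

Each output line names a kernel version whose extra/ directory is populated. Running rpm -q kernel-modules-extra then shows which of those versions come from an installed kernel-modules-extra package, which is the package the workaround above proposes removing.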

Comment 4 Steve 2020-04-27 19:22:11 UTC
(In reply to Jeremy Cline from comment #3)
...
> This is due to old kernels installing kernel-modules-extra into the wrong
> directory. Current kernels now install them into kernel/ rather than the
> confusingly-named extra/ directory, but old kernels dropped them in extra/.
> A work-around would be to uninstall old versions of the kernel-modules-extra
> package.

That doesn't explain why depmod is spawning a process storm, instead of failing with an error.

See Bug 1828455, Comment 3, for a summary of strace results captured after only a few seconds. depmod is obviously failing to detect invalid input.
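
As a rough way to tell one long-running depmod from many short-lived ones, a sketch like this could sample the process table during the install and count the distinct depmod PIDs it sees. The function names are illustrative, and the approach assumes a ps that accepts the POSIX -e and -o options. Processes that start and exit between two samples are missed, so the result is only a lower bound.

```shell
#!/bin/sh
# Count distinct depmod PIDs seen over a number of samples.
# Function names are illustrative, not part of kmod or weak-modules.

depmod_pids() {
    # Reads "PID COMMAND" lines on stdin; prints PIDs whose command is depmod.
    awk '$2 == "depmod" { print $1 }'
}

sample_depmod() {
    # $1: number of samples, $2: seconds to sleep between samples
    n=$1
    seen=""
    while [ "$n" -gt 0 ]; do
        seen="$seen $(ps -e -o pid= -o comm= | depmod_pids)"
        n=$((n - 1))
        if [ "$n" -gt 0 ]; then
            sleep "$2"
        fi
    done
    # One PID per line, de-duplicated, then count the non-empty lines.
    # $seen is intentionally unquoted so that it splits into words.
    printf '%s\n' $seen | sort -u | grep -c . || true
}
```

For example, running sample_depmod 30 1 in a second terminal while the posttrans scriptlet is active samples once per second for about 30 seconds. A count well above 1 would suggest that depmod is being started repeatedly.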

Validating your input is an elementary principle of software engineering.

Comment 5 Steve 2020-04-30 12:52:09 UTC
(In reply to Jeremy Cline from comment #3)
...
> This is due to old kernels installing kernel-modules-extra into the wrong
> directory. Current kernels now install them into kernel/ rather than the
> confusingly-named extra/ directory, but old kernels dropped them in extra/.
> A work-around would be to uninstall old versions of the kernel-modules-extra package.

I am still seeing this depmod process storm with:

kernel-5.7.0-0.rc3.20200429git1d2cc5ac6f66.1.fc33 
https://bodhi.fedoraproject.org/updates/FEDORA-2020-297b9b9340

I would like to refer a bug reporter* who is on F31 to a rawhide kernel for testing, but cannot do that if the install is going to cause him problems.

Please clarify the status of the current rawhide builds and suggest a workaround that doesn't require a complicated procedure.

* Bug 1817368 - Random hangs with sof-audio-pci

Comment 6 Steve 2020-04-30 14:08:55 UTC
(In reply to Steve from comment #5)
> ... and suggest a workaround that doesn't require a complicated procedure.

Here is a simple workaround:

$ cd /usr/sbin
# mv -i weak-modules weak-modules.DISABLE

The installed kernel boots fine:

$ uname -r
5.7.0-0.rc3.20200429git1d2cc5ac6f66.1.fc33.x86_64

BTW, weak-modules is a 1200-line bash script. weak-modules should be rewritten in Python or C.

Even better would be to remove weak-modules from the install process entirely.

==
$ wc -l /usr/sbin/weak-modules.DISABLE 
1199 /usr/sbin/weak-modules.DISABLE
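
The rename above can be wrapped so that it is easy to undo after testing. This is only a sketch of the same workaround: the function names and the WEAK_MODULES variable are illustrative, and both functions need to run as root on a real system. It relies on the scriptlet's "-x" test failing once the script is no longer at its usual path.

```shell
#!/bin/sh
# Reversible form of the rename workaround.
# WEAK_MODULES and the function names are illustrative (assumptions);
# the default path is the one named in this report.
WEAK_MODULES="${WEAK_MODULES:-/usr/sbin/weak-modules}"

disable_weak_modules() {
    # $1: path to the weak-modules script
    [ -e "$1" ] || { echo "not found: $1" >&2; return 1; }
    # -i asks before overwriting an existing .DISABLE file.
    mv -i -- "$1" "$1.DISABLE"
}

enable_weak_modules() {
    # $1: original path of the weak-modules script
    [ -e "$1.DISABLE" ] || { echo "not found: $1.DISABLE" >&2; return 1; }
    mv -i -- "$1.DISABLE" "$1"
}

# Usage, as root:
#   disable_weak_modules "$WEAK_MODULES"   # before installing the kernel
#   enable_weak_modules "$WEAK_MODULES"    # to restore the script afterwards
```

A later update of whichever package owns the script may put it back at the original path, so the rename may need repeating.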

Comment 7 Steve 2020-04-30 16:54:21 UTC
(In reply to Steve from comment #6)
> Even better would be to remove weak-modules from the install process entirely.

Or, at least, make it optional. The kernel scripts are half-way there:

$ rpm -q --scripts -p kernel-core-5.7.0-0.rc3.20200429git1d2cc5ac6f66.1.fc33.x86_64.rpm | grep -m1 weak-modules
if [ -x /usr/sbin/weak-modules ]

A second test on a config variable would stop weak-modules from running for people who don't want it.
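
A minimal sketch of what that second test might look like follows. RUN_WEAK_MODULES is a hypothetical variable, and keeping it in /etc/sysconfig/kernel is likewise an assumption; neither appears in the current kernel scriptlets.

```shell
#!/bin/sh
# Sketch of the posttrans check with an additional opt-out test.
# RUN_WEAK_MODULES and the config file location are hypothetical.

should_run_weak_modules() {
    # $1: path to weak-modules, $2: path to an optional config file
    # First test, as in the existing scriptlet: the script must be executable.
    [ -x "$1" ] || return 1
    # Default to running, so behaviour is unchanged without a config file.
    RUN_WEAK_MODULES=yes
    # Second test: the config file may set RUN_WEAK_MODULES=no.
    if [ -r "$2" ]; then
        . "$2"
    fi
    [ "$RUN_WEAK_MODULES" != "no" ]
}

# How the scriptlet could use it (KVER stands for the kernel version string):
#   if should_run_weak_modules /usr/sbin/weak-modules /etc/sysconfig/kernel
#   then
#       /usr/sbin/weak-modules --add-kernel "$KVER" || exit $?
#   fi
```

Defaulting to "yes" keeps existing installs unchanged. Only people who explicitly set the variable to "no" would skip weak-modules.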

Comment 8 Steve 2020-04-30 17:56:46 UTC
(In reply to Steve from comment #7)
> Or, at least, make it optional.

If weak-modules is truly optional, it should go into a sub-package: kmod-weak-modules.