Bug 2189320
| Summary: | cifsiod kernel has high cpu usage and server is freezing up after which only fix is to force reboot. | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | akarnafel |
| Component: | cifs-utils | Assignee: | Nobody <nobody> |
| Status: | ASSIGNED --- | QA Contact: | xiaoli feng <xifeng> |
| Severity: | urgent | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | CentOS Stream | CC: | brentw, bstinson, dmoraes1, dpulkowski, john.horne, jwboyer, lawrence.gorman, ryan.brothers, xzhou |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

I had the same behavior

| PID | %CPU Used | Size KB | Res Set | Res Text | Res Data | Res Lib | Shared KB | Faults Min | Faults Maj | Command |
|---|---|---|---|---|---|---|---|---|---|---|
| 745 | 99.7 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | kworker/5:2+cifsiod |
| 1115 | 0.5 | 456588 | 9452 | 32 | 27828 | 0 | 7860 | 0 | 0 | vmtoolsd |
| 5071 | 0.5 | 8332 | 5032 | 156 | 4680 | 0 | 2476 | 0 | 0 | nmon16m_x86_64_ |
| 1 | 0.0 | 169644 | 13408 | 44 | 19740 | 0 | 9748 | 0 | 0 | systemd |
| 2 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | kthreadd |

Created attachment 1959633 [details]
nmon - top
Additional Information:
Kernel Version : 5.14.0-289.el9.x86_64
RPM : cifs-utils-7.0-1.el9.x86_64
OS Version : CentOS Stream release 9
VMware Tools: Running, version:12325 (Guest Managed)

I think this issue is the same as bz2177562. It's a regression present since kernel-5.14.0-276.el9.

I can't read bz2177562. Can you set the right permissions? Or tell us what we need to do?

(In reply to Daniel de Morais Carneiro from comment #5)
> I can't read about bz2177562. Can you do the right permissions ? Or
> tell us what we need to do ?

The fix for bz2177562 is still in progress; there is no solution yet. Let me see if I can set the permission. Thanks.

I also have this exact problem since a kernel upgrade (to 5.14.0-284.11.1.el9_2.x86_64). At the same time cifs-utils was upgraded to 7.0-1.el9.x86_64. High CPU usage (~99%) for kworker/cifsiod, followed by a CPU soft lockup. I'm unable to kill the process, so only a reboot sorts out the problem. Then, within about 30-60 minutes, it's back again. Rolling cifs-utils back to 6.14-1.el9.x86_64 did not fix the problem. Booting the previous kernel version (5.14.0-162.23.1.el9_1.x86_64) does seem to fix it. Additionally, I can't view https://bugzilla.redhat.com/show_bug.cgi?id=2177562, so I am unable to see if there's been any progress with the problem.

I can confirm the same issue. I tried reverting to cifs-utils 6.14-1.el9.x86_64 and that did not fix the problem either. As a temporary workaround I will also try reverting to the previous kernel (5.14.0-162.23.1.el9_1.x86_64). There is a CVE for that kernel:
https://nvd.nist.gov/vuln/detail/CVE-2022-3028
https://bugzilla.redhat.com/show_bug.cgi?id=2122228

Could you try kernel-5.14.0-301.el9? I find that bz2177562 is gone on kernel-5.14.0-301.el9. Thanks.

I am running into this issue too after upgrading to 5.14.0-284.11.1.el9_2.x86_64. Will a fix for it be released soon?

> Could you try kernel-5.14.0-301.el9? I find this bug bz2177562 is
> gone on kernel-5.14.0-301.el9.
>
> Thanks.
This kernel-5.14.0-301.el9 is not available in the rhel-9-for-x86_64-baseos-rpms yet.
The latest kernel available is: 5.14.0-284.11.1.el9_2
Is there any timeline of when the fix will be released for RHEL 9?

I'm still monitoring, but kernel version 5.14.0-284.18.1.el9_2.x86_64 (just released) appears to have fixed the issue for me.

I also tested the new version, 5.14.0-284.18.1.el9_2.x86_64, and have been monitoring it for about 4+ hours. I also think this fixed the issue, since on the troublesome version I would notice within 4 hours that a reboot was necessary. Will report back if any issue arises throughout the day.
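
The kernel builds mentioned in the comments above can be collected into a small lookup. What follows is only a minimal Python sketch, not a Red Hat tool: the names `parse_release` and `classify` are hypothetical, and the thresholds encode nothing more than what commenters in this thread reported (regression present since 5.14.0-276.el9, gone in 5.14.0-301.el9, and apparently fixed by the 5.14.0-284.18.1.el9_2 update). Treat the result as a hint for triage, not as an authoritative statement about any kernel.

```python
import platform
import re


def parse_release(release):
    """Split an el9 kernel release such as '5.14.0-284.11.1.el9_2.x86_64'
    into (base_version, build_tuple). Returns None if it does not look
    like an el9 kernel release string."""
    m = re.match(r"^(\d+\.\d+\.\d+)-(\d+(?:\.\d+)*)\.el9", release)
    if m is None:
        return None
    base, build = m.groups()
    return base, tuple(int(part) for part in build.split("."))


def classify(release):
    """Classify a kernel release using ONLY the reports in this thread.
    The thresholds are assumptions taken from the comments, not from
    any official errata."""
    parsed = parse_release(release)
    if parsed is None or parsed[0] != "5.14.0":
        return "unknown"
    build = parsed[1]
    if build < (276,):
        return "before reported regression"
    if build >= (301,):
        return "reported fixed"
    # 284.x is an update stream; users reported 284.18.1 as fixed.
    if build[0] == 284 and build >= (284, 18, 1):
        return "reported fixed"
    return "reported affected"


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    running = platform.release()
    print(running, "->", classify(running))
```

Run on a host, the script prints the running kernel release with one of the four labels; on a non-el9 kernel it falls back to `unknown` rather than guessing.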
Created attachment 1959632 [details]
image1

Description of problem:
This is a new VM. I tried kernel version 5.14.0.234 and it works just fine. Kernels 5.14.0.289 through 299 do not work.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Install keyutils and cifs-utils.
2. Mount a CIFS share on the server.
3. Wait 30-180 minutes, until dmesg starts showing something like "watchdog: BUG: soft lockup - CPU## stuck for xxxs! [kworker/0:3.....]".

Actual results:
dmesg shows messages like "watchdog: BUG: soft lockup - CPU## stuck for xxxs! [kworker/0:3.....]" and the server is very slow to do anything.

Expected results:
The server should not be slow.

Additional info:
I can access the files without any issues until the server becomes very slow.
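
Step 3 of the reproduction relies on spotting the watchdog message in dmesg, which the kernel prints as "soft lockup". As a rough aid, the Python sketch below scans log text for such lines and extracts the CPU number, the stuck time, and the task name. The function name `find_soft_lockups` is hypothetical, and the regular expression assumes the usual `watchdog: BUG: soft lockup - CPU#N stuck for NNs! [task:pid]` layout; if your kernel formats the line differently, the pattern needs adjusting.

```python
import re

# Assumed layout of the kernel watchdog line, for example:
#   watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [kworker/5:2:745]
# (the example values here are illustrative, not copied from the report)
SOFT_LOCKUP = re.compile(
    r"watchdog: BUG: soft lockup - CPU#(?P<cpu>\d+) "
    r"stuck for (?P<secs>\d+)s! \[(?P<task>[^\]]+)\]"
)


def find_soft_lockups(log_text):
    """Return a list of (cpu, seconds_stuck, task) tuples, one for each
    soft-lockup line found in the given kernel log text."""
    hits = []
    for line in log_text.splitlines():
        m = SOFT_LOCKUP.search(line)
        if m:
            hits.append((int(m["cpu"]), int(m["secs"]), m["task"]))
    return hits


def kworker_lockups(log_text):
    """Keep only the lockups whose task is a kworker thread, which is
    where the cifsiod work in this report was running."""
    return [hit for hit in find_soft_lockups(log_text)
            if hit[2].startswith("kworker/")]
```

To use it on a live system, pass in the output of `dmesg`, for example `subprocess.run(["dmesg"], capture_output=True, text=True).stdout`; reading the kernel log may require root privileges depending on system settings.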