Bug 2056383 - System freezes with callstack in dmesg: ret_from_fork
Summary: System freezes with callstack in dmesg: ret_from_fork
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel
Version: 8.6
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Phil Auld
QA Contact: Waylon Cude
URL:
Whiteboard:
Duplicates: 2046454 2061658 2079179
Depends On: 2037123
Blocks: 2090535 2096305
Reported: 2022-02-21 06:24 UTC by Mark Assad
Modified: 2022-11-08 11:46 UTC (History)

Fixed In Version: kernel-4.18.0-392.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 2096305
Environment:
Last Closed: 2022-11-08 10:21:54 UTC
Type: Bug
Target Upstream Version:
Embargoed:




Links
System                  ID                                                 Private  Priority  Status  Summary  Last Updated
Gitlab                  redhat/rhel/src/kernel rhel-8 merge_requests 2613  0        None      None    None     2022-04-27 13:29:28 UTC
Red Hat Issue Tracker   RHELPLAN-112866                                    0        None      None    None     2022-02-21 06:47:48 UTC
Red Hat Product Errata  RHSA-2022:7683                                     0        None      None    None     2022-11-08 10:23:04 UTC

Description Mark Assad 2022-02-21 06:24:06 UTC
Description of problem:

The kernel gets into a deadlock, with the following warning in dmesg:

kernel: ------------[ cut here ]------------
kernel: cfs_rq->avg.load_avg || cfs_rq->avg.util_avg || cfs_rq->avg.runnable_avg
kernel: WARNING: CPU: 62 PID: 383337 at kernel/sched/fair.c:3348 update_blocked_averages+0x62a/0x650
kernel: Modules linked in: ip_vs_rr xt_mark xt_ipvs xt_state ip_vs xt_nat veth vxlan ip6_udp_tunnel udp_tunnel xt_policy xt_conntrack ipt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink br_netfilter bridge stp llc rpcsec_gss_krb5 nfsv4 dns_resolver nfs lockd grace fscache overlay intel_rapl_msr intel_rapl_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp ledtrig_audio dell_smbios rfkill coretemp iTCO_wdt iTCO_vendor_support video crct10dif_pclmul wmi_bmof dell_wmi_descriptor crc32_pclmul dcdbas ghash_clmulni_intel ipmi_ssif rapl intel_cstate intel_uncore pcspkr ses enclosure scsi_transport_sas joydev i2c_i801 lpc_ich mei_me mei acpi_ipmi wmi ext4 ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter mbcache jbd2 auth_rpcgss sunrpc xfs sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea bnx2x sysfillrect sysimgblt fb_sys_fops drm ahci libahci mdio
kernel:  libcrc32c megaraid_sas libata crc32c_intel i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod fuse
kernel: CPU: 62 PID: 383337 Comm: kworker/62:0 Kdump: loaded Not tainted 4.18.0-365.el8.x86_64 #1
kernel: Hardware name: Dell Inc. PowerEdge M640/05YC4P, BIOS 2.12.2 07/12/2021
kernel: Workqueue:  0x0 (events)
kernel: RIP: 0010:update_blocked_averages+0x62a/0x650
kernel: Code: c0 99 ad 9b c6 05 78 2e c3 01 01 e8 39 2f fc ff 0f 0b e9 47 fa ff ff 48 c7 c7 e0 9d ad 9b c6 05 5a 2e c3 01 01 e8 1f 2f fc ff <0f> 0b 8b 93 38 01 00 00 e9 8a fc ff ff 80 3d 46 2e c3 01 00 75 93
kernel: RSP: 0018:ffffa9e9a04efd68 EFLAGS: 00010086
kernel: RAX: 0000000000000000 RBX: ffff8e377ffeaec0 RCX: 0000000000000007
kernel: RDX: 0000000000000007 RSI: 00000000ffff7fff RDI: ffff8e377ffd6750
kernel: RBP: ffff8e377ffeb000 R08: 0000000000000000 R09: c0000000ffff7fff
kernel: R10: 0000000000000001 R11: ffffa9e9a04efb80 R12: ffff8e377ffeb668
kernel: R13: 0000000000000001 R14: ffff8e377ffeae40 R15: 0000000000000000
kernel: FS:  0000000000000000(0000) GS:ffff8e377ffc0000(0000) knlGS:0000000000000000
kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00000000000000b0 CR3: 000000275ba10001 CR4: 00000000007706e0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: PKRU: 55555554
kernel: Call Trace:
kernel:  ? entry_SYSCALL_64_after_hwframe+0xb9/0xca
kernel:  newidle_balance+0xcb/0x3c0
kernel:  pick_next_task_fair+0x3e/0x3b0
kernel:  __schedule+0x146/0x830
kernel:  ? create_worker+0x1a0/0x1a0
kernel:  schedule+0x35/0xa0
kernel:  worker_thread+0xb7/0x390
kernel:  ? create_worker+0x1a0/0x1a0
kernel:  kthread+0x10a/0x120
kernel:  ? set_kthread_struct+0x40/0x40
kernel:  ret_from_fork+0x35/0x40
kernel: ---[ end trace 00c4093b0733bf91 ]---
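
When the same warning shows up on several hosts, it can help to pull the key fields out of the WARNING header line so that reports can be compared. The following is a minimal sketch, assuming the log text has already been collected (for example from dmesg or the journal) into a string. The function name `find_warnings` and the field names are hypothetical labels chosen for this example; only the line layout is taken from the trace above.

```python
import re

# Matches the WARNING header line as it appears in the trace above.
# Group names (cpu, pid, file, line, symbol) are labels chosen for this sketch.
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<symbol>\w+)\+"
)


def find_warnings(log_text):
    """Return one dict per kernel WARNING header line found in log_text."""
    hits = []
    for m in WARNING_RE.finditer(log_text):
        hits.append({
            "cpu": int(m.group("cpu")),
            "pid": int(m.group("pid")),
            "file": m.group("file"),
            "line": int(m.group("line")),
            "symbol": m.group("symbol"),
        })
    return hits


# Sample input copied from the report.
sample = (
    "kernel: WARNING: CPU: 62 PID: 383337 at kernel/sched/fair.c:3348 "
    "update_blocked_averages+0x62a/0x650\n"
)

for hit in find_warnings(sample):
    print(hit["file"], hit["line"], hit["symbol"])
```

Grouping hits by `file`, `line`, and `symbol` would be one way to check whether different machines are hitting the same check in kernel/sched/fair.c.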

Version-Release number of selected component (if applicable):
Linux version 4.18.0-365.el8.x86_64 (mockbuild.centos.org) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)) #1 SMP Thu Feb 10 16:11:23 UTC 2022

How reproducible:
It happens consistently when the system is under load, but I am unable to reproduce the error at will.


Steps to Reproduce:
1. Run the system for a few days and wait. Sorry for the poor description; I haven't been able to reproduce it on demand.
2. Has occurred on multiple systems.
3. Did not occur on CentOS 8; it has only started since moving to CentOS Stream.


Actual results:
The system freezes. The console doesn't respond to the keyboard. Existing processes may continue to run for a while, but will eventually lock up.

System will respond to ping, but will not accept an ssh session.


Expected results:
System will run normally. 


Additional info:
I am not sure what else to do to try and resolve this error. I'm happy to try any suggestions.

Edit: typo

Comment 1 Mark Assad 2022-02-22 03:14:16 UTC
If I leave the box for longer, I start to see the following errors too:

 INFO: task kworker/64:2:125809 blocked for more than 120 seconds.
       Tainted: G        W        --------- -  - 4.18.0-365.el8.x86_64 #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:kworker/64:2    state:D stack:    0 pid:125809 ppid:     2 flags:0x80004080
 Workqueue: cgroup_destroy css_free_rwork_fn
 Call Trace:
  __schedule+0x2d1/0x830
  schedule+0x35/0xa0
  schedule_timeout+0x274/0x300
  ? load_balance+0x163/0xc20
  ? recalibrate_cpu_khz+0x10/0x10
  ? ktime_get+0x3e/0xa0
  wait_for_completion+0x96/0x100
  flush_workqueue+0x14d/0x440
  ? __switch_to_asm+0x35/0x70
  cgroup1_pidlist_destroy_all+0x7c/0xa0
  css_free_rwork_fn+0xe3/0x3a0
  process_one_work+0x1a7/0x360
  ? create_worker+0x1a0/0x1a0
  worker_thread+0x30/0x390
  ? create_worker+0x1a0/0x1a0
  kthread+0x10a/0x120
  ? set_kthread_struct+0x40/0x40
  ret_from_fork+0x35/0x40
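
The hung-task reports above follow a fixed header format, so they can be counted or listed per task with a small script. This is a rough sketch, assuming the log text is available as a string; `find_hung_tasks` and the field names are hypothetical and exist only for this example.

```python
import re

# Matches the hung-task header line as it appears in the comment above.
# The task name itself may contain colons (e.g. "kworker/64:2"), so the
# PID is taken as the last colon-separated number before " blocked".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(log_text):
    """Return (task name, pid, seconds) for each hung-task report header."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
        for m in HUNG_TASK_RE.finditer(log_text)
    ]


# Sample input copied from the report.
sample_hung = (
    " INFO: task kworker/64:2:125809 blocked for more than 120 seconds.\n"
)

for comm, pid, secs in find_hung_tasks(sample_hung):
    print(comm, pid, secs)
```

The "Workqueue:" line that follows each header (here cgroup_destroy css_free_rwork_fn) is not parsed by this sketch; it would need a second pattern.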

Comment 6 Phil Auld 2022-04-27 11:31:43 UTC
*** Bug 2079179 has been marked as a duplicate of this bug. ***

Comment 8 Phil Auld 2022-04-28 13:59:31 UTC
*** Bug 2061658 has been marked as a duplicate of this bug. ***

Comment 19 Phil Auld 2022-07-07 16:58:42 UTC
*** Bug 2046454 has been marked as a duplicate of this bug. ***

Comment 21 errata-xmlrpc 2022-11-08 10:21:54 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7683
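
To get a rough idea of whether a given machine is running a kernel at or past the "Fixed In Version" listed on this bug (kernel-4.18.0-392.el8), the release string can be compared numerically. The sketch below is only a heuristic under stated assumptions: it compares the leading numeric fields and is not a full RPM version comparison, and it cannot tell whether a lower-numbered z-stream build carries a backport of the fix. The helper names `release_key` and `has_fix` are hypothetical.

```python
# "Fixed In Version" from this bug, without the "kernel-" package prefix.
FIXED_IN = "4.18.0-392.el8"


def release_key(release):
    """Split e.g. '4.18.0-365.el8.x86_64' into ((4, 18, 0), (365,)).

    Only the leading numeric build fields are kept; parsing stops at the
    first non-numeric field such as 'el8'.
    """
    version, _, rest = release.partition("-")
    build = []
    for part in rest.split("."):
        if not part.isdigit():
            break
        build.append(int(part))
    return tuple(int(p) for p in version.split(".")), tuple(build)


def has_fix(release, fixed_in=FIXED_IN):
    """Heuristic: True if release sorts at or after fixed_in."""
    return release_key(release) >= release_key(fixed_in)


# The kernel from the original report, and the fixed build.
print(has_fix("4.18.0-365.el8.x86_64"))
print(has_fix("4.18.0-392.el8.x86_64"))
```

On a live system the running release could be passed in from `platform.release()`. A result of False for a z-stream kernel should be checked against the relevant advisory rather than trusted as-is.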

