Bug 1903578 - kernel-rt-debug: do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000050e86018>] handle_userfault+0x530/0x1820
Keywords:
Status: CLOSED ERRATA
Alias: None
Deadline: 2021-09-13
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: kernel-rt
Version: 8.4
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
: 8.4
Assignee: Andrew Halaney
QA Contact: Chunyu Hu
URL:
Whiteboard:
Depends On:
Blocks: 2020013 2029420 2029421 2029422
Reported: 2020-12-02 12:23 UTC by Chunyu Hu
Modified: 2023-08-08 03:02 UTC

Fixed In Version: kernel-rt-4.18.0-343.rt7.125.el8
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2029420 2029421 2029422
Environment:
Last Closed: 2022-05-10 14:41:26 UTC
Type: Bug
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Gitlab | redhat/rhel/src/kernel rhel-8 merge_requests 1249 | 0 | None | None | None | 2021-09-08 21:02:08 UTC
Red Hat Product Errata | RHSA-2022:1975 | 0 | None | None | None | 2022-05-10 14:42:45 UTC

Description Chunyu Hu 2020-12-02 12:23:18 UTC
Description of problem:

Ran the memory test case on the kernel-rt debug kernel and got the following kernel call trace.

[  688.899567] EDAC DEBUG: i3000_check: MC0
[  689.923539] EDAC DEBUG: i3000_check: MC0
[  689.945182] ------------[ cut here ]------------
[  689.950117] do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000050e86018>] handle_userfault+0x530/0x1820
[  689.976560] WARNING: CPU: 1 PID: 5861 at kernel/sched/core.c:6652 __might_sleep+0x146/0x1a0
[  689.976565] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc iTCO_wdt iTCO_vendor_support gpio_ich dcdbas intel_powerclamp pcspkr joydev lpc_ich i2c_i801 i3000_edac ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg ata_generic ata_piix libata tg3 serio_raw dm_mirror dm_region_hash dm_log dm_mod
[  689.976658] CPU: 1 PID: 5861 Comm: userfaultfd Not tainted 4.18.0-255.rt7.20.el8.x86_64+debug #1
[  689.976662] Hardware name: Dell Inc.                 PowerEdge SC430              /0M9873, BIOS A02 02/06/2006
[  689.976670] RIP: 0010:__might_sleep+0x146/0x1a0
[  689.976677] Code: 65 48 8b 1c 25 80 02 02 00 48 8d 7b 10 48 89 fe 48 c1 ee 03 80 3c 06 00 75 2b 48 8b 73 10 48 c7 c7 40 37 49 8b e8 68 2e f6 ff <0f> 0b e9 46 ff ff ff e8 6e fc 65 00 e9 1d ff ff ff e8 64 fc 65 00
[  689.976681] RSP: 0018:ffff88808a007ac8 EFLAGS: 00010282
[  689.976690] RAX: 0000000000000000 RBX: ffff8880a8128000 RCX: 0000000000000000
[  689.976693] RDX: 1ffff11015025000 RSI: ffffffff8bb4b8e8 RDI: ffff8880a812800c
[  689.976697] RBP: ffffffff8b4a76e0 R08: ffffed101577e975 R09: ffffed101577e974
[  689.976700] R10: ffffed101577e974 R11: ffff8880abbf4ba7 R12: 00000000000000be
[  689.976703] R13: 0000000000000000 R14: ffff888000000848 R15: ffff8880346f6848
[  689.976708] FS:  00007f3165b3f700(0000) GS:ffff8880aba00000(0000) knlGS:0000000000000000
[  689.976712] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  689.976715] CR2: 00007f3166d09028 CR3: 000000009a8ec000 CR4: 00000000000006e0
[  689.976718] Call Trace:
[  689.976746]  __up_read+0x2f/0x80
[  689.976760]  handle_userfault+0x741/0x1820
[  689.976796]  ? userfaultfd_ioctl+0x2f00/0x2f00
[  689.976827]  ? _raw_spin_unlock_irqrestore+0xc6/0xe0
[  689.976837]  ? lockdep_hardirqs_on+0x1b1/0x390
[  689.976867]  ? __rt_mutex_futex_unlock+0x10/0x10
[  689.976889]  ? userfaultfd_ctx_put+0x2e0/0x2e0
[  689.976933]  __handle_mm_fault+0x1531/0x1ab0
[  689.976951]  ? __pmd_alloc+0x540/0x540
[  689.976956]  ? get_lock_stats+0x18/0x120
[  689.976964]  ? put_lock_stats.isra.19+0xb/0xa0
[  689.977042]  handle_mm_fault+0x2bd/0x7a0
[  689.977068]  __do_page_fault+0x505/0xce0
[  689.977086]  ? trace_hardirqs_off_thunk+0x1a/0x20
[  689.977092]  ? page_fault+0x8/0x30
[  689.977113]  do_page_fault+0x12e/0x8a0
[  689.977129]  ? page_fault+0x8/0x30
[  689.977142]  page_fault+0x1e/0x30
[  689.977151] RIP: 0033:0x401da5
[  689.977159] Code: 04 85 c0 0f 84 eb 00 00 00 48 8b 05 c5 45 20 00 48 8b 4d a8 48 8b 15 6a 45 20 00 48 0f af d1 48 83 c2 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 e0 48 83 7d e0 00 75 43 48 8b 05 66 45 20 00 48
[  689.977162] RSP: 002b:00007f3165b3ee40 EFLAGS: 00010206
[  689.977168] RAX: 00007f3166d09028 RBX: 0000000000000000 RCX: 00000000000011c8
[  689.977171] RDX: 00000000011c802f RSI: 00007f3165b3ee94 RDI: 0000000000000000
[  689.977174] RBP: 00007f3165b3eef0 R08: 00007f3165b3ee88 R09: 00007f3165b3ee90
[  689.977177] R10: 0000000000000004 R11: 00007f3175f12240 R12: 00007ffe66b48cfe
[  689.977181] R13: 00007ffe66b48cff R14: 0000000000606480 R15: 00007f3165b3efc0
[  689.977231] irq event stamp: 332
[  689.977238] hardirqs last  enabled at (331): [<ffffffff8af36ed4>] _raw_spin_unlock_irq+0x24/0xd0
[  689.977245] hardirqs last disabled at (332): [<ffffffff89004eda>] trace_hardirqs_off_thunk+0x1a/0x20
[  689.977253] softirqs last  enabled at (0): [<ffffffff891f0fbb>] copy_process+0x1c9b/0x6600
[  689.977259] softirqs last disabled at (0): [<0000000000000000>] 0x0
[  689.977263] ---[ end trace 0000000000000002 ]---
[  690.947721] EDAC DEBUG: i3000_check: MC0
[  692.679402] EDAC DEBUG: i3000_check: MC0
[  693.701391] EDAC DEBUG: i3000_check: MC0
 

Version-Release number of selected component (if applicable):
4.18.0-255.rt7.20.el8.x86_64+debug

How reproducible:
always

Steps to Reproduce:
1.  yum -y install kernel-kernel-general-memory-function-userfaultfd2.noarch
2.  cd /mnt/tests/kernel/general/memory/function/userfaultfd2
3.  make run

Actual results:
An unexpected "do not call blocking ops when !TASK_RUNNING" warning from __might_sleep is shown in the kernel log.

Expected results:
The test runs without this warning appearing in the kernel log.

Additional info:
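A minimal sketch of how one might scan a saved kernel log for this warning, for example after re-running the steps above. The regular expression is written against the single warning line shown in this report. The function name is hypothetical, and parsing the state value as hexadecimal is an assumption about the message format.

```python
import re

# Pattern for the warning line as it appears in the trace in this report.
# Reading "state" as hexadecimal is an assumption about the message format.
WARNING_RE = re.compile(
    r"do not call blocking ops when !TASK_RUNNING; "
    r"state=(?P<state>[0-9a-f]+) set at \[<(?P<addr>[0-9a-f]+)>\] (?P<symbol>\S+)"
)


def find_blocking_ops_warnings(log_text):
    """Return one dict per matching warning line found in a kernel log."""
    hits = []
    for line in log_text.splitlines():
        match = WARNING_RE.search(line)
        if match:
            hits.append({
                "state": int(match.group("state"), 16),
                "symbol": match.group("symbol"),
                "line": line.strip(),
            })
    return hits


# Three lines copied from the trace in the description.
SAMPLE = """\
[  689.945182] ------------[ cut here ]------------
[  689.950117] do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000050e86018>] handle_userfault+0x530/0x1820
[  689.976560] WARNING: CPU: 1 PID: 5861 at kernel/sched/core.c:6652 __might_sleep+0x146/0x1a0
"""

if __name__ == "__main__":
    for hit in find_blocking_ops_warnings(SAMPLE):
        print(hit["state"], hit["symbol"])
```

An empty result for a full log captured after the test run would suggest the warning did not fire, though it does not replace reading the log itself.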

Comment 1 Chunyu Hu 2020-12-02 12:25:20 UTC
This is only seen on kernel-rt.

Comment 3 Juri Lelli 2020-12-04 13:48:19 UTC
This can be reproduced on upstream as well (will report).

Also, it can be reproduced on all 8.x RT versions.
Guess we never noticed until now because the test is actually a PASS.

Comment 4 Andrew Halaney 2021-04-06 22:24:46 UTC
FYI, I think this is a false warning and I've posted a patch upstream to address it: https://lore.kernel.org/lkml/20210406221952.50399-1-ahalaney@redhat.com/
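For readers unfamiliar with the check that printed the warning, here is a toy Python model of the ordering visible in the trace: the task state is set in handle_userfault, and afterwards a read-unlock path (__up_read in the trace) reaches a might-sleep check. This is not kernel code. The class and function names are hypothetical, and only the message text, the value state=1, and the call order are taken from the report. It does not attempt to show why the warning is or is not harmful.

```python
TASK_RUNNING = 0
TASK_INTERRUPTIBLE = 1  # corresponds to "state=1" in the trace


class Task:
    """Hypothetical stand-in for the current task."""

    def __init__(self):
        self.state = TASK_RUNNING
        self.state_set_at = None
        self.warnings = []


def set_current_state(task, state, where):
    # Remember where the state was changed, like the "set at" part of the message.
    task.state = state
    task.state_set_at = where


def might_sleep(task):
    """Record a warning if called while the task is not TASK_RUNNING."""
    if task.state != TASK_RUNNING:
        task.warnings.append(
            "do not call blocking ops when !TASK_RUNNING; "
            f"state={task.state:x} set at {task.state_set_at}"
        )


def up_read_with_check(task):
    # Stand-in for a read-unlock path that contains a might-sleep check.
    might_sleep(task)


def fault_path(task):
    # Same order as in the trace: state change first, read-unlock second.
    set_current_state(task, TASK_INTERRUPTIBLE, "handle_userfault")
    up_read_with_check(task)


if __name__ == "__main__":
    task = Task()
    fault_path(task)
    print(task.warnings[0])
```

In this model the check looks only at the task state and not at whether the unlock would really block, which is why a check of this kind can fire even when nothing goes wrong afterwards.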

Comment 5 Andrew Halaney 2021-04-27 15:29:47 UTC
tglx said this would land in the next RT release, will wait until I see it there to backport: https://lore.kernel.org/lkml/877dkoud19.ffs@nanos.tec.linutronix.de/

Comment 7 Andrew Halaney 2021-08-23 13:50:36 UTC
Sorry for the delay -- I've been slowly trying to get this to land somewhere upstream since the last comment.

linux-rt-devel overhauled their rwsem implementation for RT and thus tglx never picked up the patch after that above thread.
I've been poking the stable maintainers offline to get it landed there, and Steven has included it in an RC today. Once that is officially released I'll go ahead and post a patch. Thanks for being patient.

https://lore.kernel.org/linux-rt-users/20210820234737.244832083@goodmis.org/

Comment 8 Andrew Halaney 2021-08-26 14:04:44 UTC
Landed upstream in the 5.10 RT stable branch: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v5.10-rt&id=b2ed0a4302faf2bb09e97529dd274233c082689b

Too close to 8.5 beta to target that. Setting ITR to 8.6 for that reason. It sounds like 8.6 might use the new RT patchset, in which case this won't ever land on 8.6 and will have to target 8.5 directly. Until that happens though I'm assuming this needs to land in 8.6.

Also, this affects 8.2-rt - 8.5-rt as well. I don't think I'm supposed to set ZTR until this is verified though, so holding off on that.

Comment 9 Andrew Halaney 2021-08-26 19:30:30 UTC
Premature, but here's the brew build for the debug kernel: http://brew-task-repos.usersys.redhat.com/repos/scratch/ahalaney/kernel-rt/4.18.0/337.rt7.118.el8bz1903578v1/

I've posted the MR but I'm going to leave this in the ASSIGNED state until 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to MODIFIED. Please let me know if anyone has objections to that strategy.

MR: https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1249

Comment 10 Chunyu Hu 2021-09-08 02:02:44 UTC
(In reply to Andrew Halaney from comment #9)
> Premature, but here's the brew build for the debug kernel:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/ahalaney/kernel-rt/4.
> 18.0/337.rt7.118.el8bz1903578v1/

This is tested, result is good.

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   /kernel/general/memory/function/userfaultfd2
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [ 21:54:45 ] :: [   LOG    ] :: JOURNAL XML: /var/tmp/beakerlib-REi9TkB/journal.xml
:: [ 21:54:45 ] :: [   LOG    ] :: JOURNAL TXT: /var/tmp/beakerlib-REi9TkB/journal.txt
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   Duration: 213s
::   Phases: 3 good, 0 bad
::   OVERALL RESULT: PASS (/kernel/general/memory/function/userfaultfd2)

[root@ibm-x3650m4-05 userfaultfd2]# 
[root@ibm-x3650m4-05 userfaultfd2]# dmesg
[  507.204710] runtest.sh (1898): drop_caches: 3
[root@ibm-x3650m4-05 userfaultfd2]# dmesg
[  507.204710] runtest.sh (1898): drop_caches: 3
[root@ibm-x3650m4-05 userfaultfd2]# uname -r
4.18.0-337.rt7.118.el8bz1903578v1.x86_64+debug
[root@ibm-x3650m4-05 userfaultfd2]# dmesg
[  507.204710] runtest.sh (1898): drop_caches: 3

> 
> I've posted the MR but I'm going to leave this in the ASSIGNED state until
> 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to
> MODIFIED. Please let me know if anyone has objections to that strategy.

The 'Devel target Milestone' is '2'. Not sure if we got enough reviews/acks for the
MR; is it time to move to MODIFIED, as it's ITM 2 already?

> 
> MR: https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1249

Comment 11 Andrew Halaney 2021-09-08 13:09:53 UTC
(In reply to Chunyu Hu from comment #10)
> (In reply to Andrew Halaney from comment #9)
> > I've posted the MR but I'm going to leave this in the ASSIGNED state until
> > 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to
> > MODIFIED. Please let me know if anyone has objections to that strategy.
> 
> The 'Devel target Milestone' is '2'. Not sure if we got enough reviews/acks
> for the MR; is it time to move to MODIFIED, as it's ITM 2 already?

Ah, my apologies @chuhu, the rhel-8 kernel doesn't have a place to
land 8.6-rt changes yet but that should happen today.

I didn't realize that would be the case (I assumed by DTM 2 there would be
somewhere to land the patch).
I'm moving to DTM 3 to give buffer room for review after the branching happens today,
can you please confirm that ITM 5 is ok and if not adjust? Thanks and sorry!

Comment 13 Chunyu Hu 2021-09-09 09:01:24 UTC
(In reply to Andrew Halaney from comment #11)
> (In reply to Chunyu Hu from comment #10)
> > (In reply to Andrew Halaney from comment #9)
> > > I've posted the MR but I'm going to leave this in the ASSIGNED state until
> > > 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to
> > > MODIFIED. Please let me know if anyone has objections to that strategy.
> > 
> > The 'Devel target Milestone' is '2'. Not sure if we got enough reviews/acks
> > for the MR; is it time to move to MODIFIED, as it's ITM 2 already?
> 
> Ah, my apologies @chuhu, the rhel-8 kernel doesn't have a place to
> land 8.6-rt changes yet but that should happen today.
> 
> I didn't realize that would be the case (I assumed by DTM 2 there would be
> somewhere to land the patch).
> I'm moving to DTM 3 to give buffer room for review after the branching
> happens today,
> can you please confirm that ITM 5 is ok and if not adjust? Thanks and sorry!

ITM 5 works for me. Thanks! The MR is there and the bot has already attached the build.
It would be great if the workflow bot could notify us when the MR gets enough reviews/acks.

Comment 14 Andrew Halaney 2021-09-09 13:36:51 UTC
You are right, the MR is now targeting the correct branch for 8.6-rt release.

The MR has enough reviews/acks now (I guess the bot doesn't post that),
so all that should be left is adding "Verified: Tested" when
appropriate (not sure if you want to take the MR artifacts for
a spin or if you trust my brew artifacts) and a maintainer to merge.
Thanks!

Comment 15 Chunyu Hu 2021-09-10 01:40:22 UTC
(In reply to Andrew Halaney from comment #14)
> You are right, the MR is now targeting the correct branch for 8.6-rt release.
> 
> The MR has enough reviews/acks now (I guess the bot doesn't post that),
> so all that should be left is adding "Verified: Tested" when
> appropriate (not sure if you want to take the MR artifacts for
> a spin or if you trust my brew artifacts) and a maintainer to merge.
> Thanks!

I trust the brew build and have set 'Tested'; there's no debug kernel build in the MR. Thanks!

Comment 16 Chunyu Hu 2021-09-27 02:09:48 UTC
Hi Andrew,

When are we going to add this to a candidate/official kernel-rt build? The target ITM
for this is ITM 5, which runs from Sep 28 to Oct 4. Maybe we need to defer this for
several ITMs? If that's the case, I'll adjust the ITM field of the bz. Thanks!

Regards,
Chunyu Hu

Comment 31 errata-xmlrpc 2022-05-10 14:41:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel-rt security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:1975
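
A rough sketch of how one might compare a running kernel-rt version string against the "Fixed In Version" value of this bug. The function names are hypothetical. The comparison assumes that the numeric fields increase monotonically within one release stream, which this report does not state. It only handles version strings shaped like the ones quoted in this bug, so other formats raise an error, and a scratch build that carries the patch (such as the one tested in comment 10) is still reported as older than the fix.

```python
import re

# "Fixed In Version" as recorded in this bug.
FIXED_IN = "kernel-rt-4.18.0-343.rt7.125.el8"

# Matches strings such as "4.18.0-255.rt7.20.el8.x86_64+debug";
# an optional "kernel-rt-" package prefix is accepted.
_VERSION_RE = re.compile(
    r"^(?:kernel-rt-)?(\d+)\.(\d+)\.(\d+)-(\d+)\.rt(\d+)\.(\d+)\.el\d+"
)


def parse_rt_version(text):
    """Return the numeric fields of a kernel-rt version string as a tuple."""
    match = _VERSION_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised kernel-rt version: {text!r}")
    return tuple(int(part) for part in match.groups())


def at_or_after_fix(running, fixed_in=FIXED_IN):
    # Assumption: tuple ordering reflects build order within one stream.
    return parse_rt_version(running) >= parse_rt_version(fixed_in)


if __name__ == "__main__":
    for version in (
        "4.18.0-255.rt7.20.el8.x86_64+debug",
        "4.18.0-337.rt7.118.el8bz1903578v1.x86_64+debug",
        FIXED_IN,
    ):
        print(version, at_or_after_fix(version))
```

The advisory linked above remains the authoritative source for which packages contain the fix.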

