Description of problem:
Ran the memory test case on the kernel-rt debug kernel and got a kernel call trace.

[ 688.899567] EDAC DEBUG: i3000_check: MC0
[ 689.923539] EDAC DEBUG: i3000_check: MC0
[ 689.945182] ------------[ cut here ]------------
[ 689.950117] do not call blocking ops when !TASK_RUNNING; state=1 set at [<0000000050e86018>] handle_userfault+0x530/0x1820
[ 689.976560] WARNING: CPU: 1 PID: 5861 at kernel/sched/core.c:6652 __might_sleep+0x146/0x1a0
[ 689.976565] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc iTCO_wdt iTCO_vendor_support gpio_ich dcdbas intel_powerclamp pcspkr joydev lpc_ich i2c_i801 i3000_edac ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg ata_generic ata_piix libata tg3 serio_raw dm_mirror dm_region_hash dm_log dm_mod
[ 689.976658] CPU: 1 PID: 5861 Comm: userfaultfd Not tainted 4.18.0-255.rt7.20.el8.x86_64+debug #1
[ 689.976662] Hardware name: Dell Inc. PowerEdge SC430 /0M9873, BIOS A02 02/06/2006
[ 689.976670] RIP: 0010:__might_sleep+0x146/0x1a0
[ 689.976677] Code: 65 48 8b 1c 25 80 02 02 00 48 8d 7b 10 48 89 fe 48 c1 ee 03 80 3c 06 00 75 2b 48 8b 73 10 48 c7 c7 40 37 49 8b e8 68 2e f6 ff <0f> 0b e9 46 ff ff ff e8 6e fc 65 00 e9 1d ff ff ff e8 64 fc 65 00
[ 689.976681] RSP: 0018:ffff88808a007ac8 EFLAGS: 00010282
[ 689.976690] RAX: 0000000000000000 RBX: ffff8880a8128000 RCX: 0000000000000000
[ 689.976693] RDX: 1ffff11015025000 RSI: ffffffff8bb4b8e8 RDI: ffff8880a812800c
[ 689.976697] RBP: ffffffff8b4a76e0 R08: ffffed101577e975 R09: ffffed101577e974
[ 689.976700] R10: ffffed101577e974 R11: ffff8880abbf4ba7 R12: 00000000000000be
[ 689.976703] R13: 0000000000000000 R14: ffff888000000848 R15: ffff8880346f6848
[ 689.976708] FS: 00007f3165b3f700(0000) GS:ffff8880aba00000(0000) knlGS:0000000000000000
[ 689.976712] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 689.976715] CR2: 00007f3166d09028 CR3: 000000009a8ec000 CR4: 00000000000006e0
[ 689.976718] Call Trace:
[ 689.976746] __up_read+0x2f/0x80
[ 689.976760] handle_userfault+0x741/0x1820
[ 689.976796] ? userfaultfd_ioctl+0x2f00/0x2f00
[ 689.976827] ? _raw_spin_unlock_irqrestore+0xc6/0xe0
[ 689.976837] ? lockdep_hardirqs_on+0x1b1/0x390
[ 689.976867] ? __rt_mutex_futex_unlock+0x10/0x10
[ 689.976889] ? userfaultfd_ctx_put+0x2e0/0x2e0
[ 689.976933] __handle_mm_fault+0x1531/0x1ab0
[ 689.976951] ? __pmd_alloc+0x540/0x540
[ 689.976956] ? get_lock_stats+0x18/0x120
[ 689.976964] ? put_lock_stats.isra.19+0xb/0xa0
[ 689.977042] handle_mm_fault+0x2bd/0x7a0
[ 689.977068] __do_page_fault+0x505/0xce0
[ 689.977086] ? trace_hardirqs_off_thunk+0x1a/0x20
[ 689.977092] ? page_fault+0x8/0x30
[ 689.977113] do_page_fault+0x12e/0x8a0
[ 689.977129] ? page_fault+0x8/0x30
[ 689.977142] page_fault+0x1e/0x30
[ 689.977151] RIP: 0033:0x401da5
[ 689.977159] Code: 04 85 c0 0f 84 eb 00 00 00 48 8b 05 c5 45 20 00 48 8b 4d a8 48 8b 15 6a 45 20 00 48 0f af d1 48 83 c2 2f 48 01 d0 48 83 e0 f8 <48> 8b 00 48 89 45 e0 48 83 7d e0 00 75 43 48 8b 05 66 45 20 00 48
[ 689.977162] RSP: 002b:00007f3165b3ee40 EFLAGS: 00010206
[ 689.977168] RAX: 00007f3166d09028 RBX: 0000000000000000 RCX: 00000000000011c8
[ 689.977171] RDX: 00000000011c802f RSI: 00007f3165b3ee94 RDI: 0000000000000000
[ 689.977174] RBP: 00007f3165b3eef0 R08: 00007f3165b3ee88 R09: 00007f3165b3ee90
[ 689.977177] R10: 0000000000000004 R11: 00007f3175f12240 R12: 00007ffe66b48cfe
[ 689.977181] R13: 00007ffe66b48cff R14: 0000000000606480 R15: 00007f3165b3efc0
[ 689.977231] irq event stamp: 332
[ 689.977238] hardirqs last enabled at (331): [<ffffffff8af36ed4>] _raw_spin_unlock_irq+0x24/0xd0
[ 689.977245] hardirqs last disabled at (332): [<ffffffff89004eda>] trace_hardirqs_off_thunk+0x1a/0x20
[ 689.977253] softirqs last enabled at (0): [<ffffffff891f0fbb>] copy_process+0x1c9b/0x6600
[ 689.977259] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 689.977263] ---[ end trace 0000000000000002 ]---
[ 690.947721] EDAC DEBUG: i3000_check: MC0
[ 692.679402] EDAC DEBUG: i3000_check: MC0
[ 693.701391] EDAC DEBUG: i3000_check: MC0

Version-Release number of selected component (if applicable):
4.18.0-255.rt7.20.el8.x86_64+debug

How reproducible:
always

Steps to Reproduce:
1. yum -y install kernel-kernel-general-memory-function-userfaultfd2.noarch
2. cd /mnt/tests/kernel/general/memory/function/userfaultfd2
3. make run

Actual results:
An unexpected __might_sleep warning ("do not call blocking ops when !TASK_RUNNING") appears in the kernel log.

Expected results:
No such warning is logged.

Additional info:
This is only seen on kernel-rt.
This can be reproduced on upstream as well (will report). It can also be reproduced on all 8.x RT versions. I guess we never noticed until now because the test itself still reports PASS.
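Since the test result is PASS even when the splat fires, the warning only shows up in the kernel log. A minimal sketch of a post-run log check follows, assuming the log text has been captured from `dmesg` by some other means. The function names are hypothetical and not part of the test suite; the patterns are taken from the trace in the description.

```python
import re

# Matches the first line of a kernel WARNING splat. The "[ seconds.micros]"
# timestamp prefix is optional so the pattern also works on untimed logs.
WARNING_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"WARNING: CPU: \d+ PID: \d+ at (?P<where>\S+) (?P<func>\S+)"
)

# Message printed by __might_sleep in the trace from the description.
MIGHT_SLEEP_MSG = "do not call blocking ops when !TASK_RUNNING"


def find_kernel_warnings(log_text):
    """Return (source_location, function+offset) for each WARNING line."""
    hits = []
    for line in log_text.splitlines():
        match = WARNING_RE.match(line.strip())
        if match:
            hits.append((match.group("where"), match.group("func")))
    return hits


def has_might_sleep_warning(log_text):
    """Return True if the blocking-ops message appears anywhere in the log."""
    return MIGHT_SLEEP_MSG in log_text
```

A test wrapper could fail the run whenever `find_kernel_warnings` returns a non-empty list, which would have surfaced this issue despite the PASS result.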
FYI, I think this is a false warning and I've posted a patch upstream to address it: https://lore.kernel.org/lkml/20210406221952.50399-1-ahalaney@redhat.com/
tglx said this would land in the next RT release, will wait until I see it there to backport: https://lore.kernel.org/lkml/877dkoud19.ffs@nanos.tec.linutronix.de/
Sorry for the delay -- I've been slowly trying to get this to land somewhere upstream since the last comment. linux-rt-devel overhauled their rwsem implementation for RT, so tglx never picked up the patch after the thread above. I've been poking the stable maintainers offline to get it landed there, and Steven has included it in an RC today. Once that is officially released I'll go ahead and post a patch. Thanks for being patient. https://lore.kernel.org/linux-rt-users/20210820234737.244832083@goodmis.org/
Landed upstream in the 5.10 RT stable branch: https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v5.10-rt&id=b2ed0a4302faf2bb09e97529dd274233c082689b Too close to 8.5 beta to target that. Setting ITR to 8.6 for that reason. It sounds like 8.6 might use the new RT patchset, in which case this won't ever land on 8.6 and will have to target 8.5 directly. Until that happens though I'm assuming this needs to land in 8.6. Also, this affects 8.2-rt - 8.5-rt as well. I don't think I'm supposed to set ZTR until this is verified though, so holding off on that.
Premature, but here's the brew build for the debug kernel: http://brew-task-repos.usersys.redhat.com/repos/scratch/ahalaney/kernel-rt/4.18.0/337.rt7.118.el8bz1903578v1/ I've posted the MR but I'm going to leave this in the ASSIGNED state until 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to MODIFIED. Please let me know if anyone has objections to that strategy. MR: https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1249
(In reply to Andrew Halaney from comment #9)
> Premature, but here's the brew build for the debug kernel:
> http://brew-task-repos.usersys.redhat.com/repos/scratch/ahalaney/kernel-rt/4.18.0/337.rt7.118.el8bz1903578v1/

This has been tested and the result is good.

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: /kernel/general/memory/function/userfaultfd2
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [ 21:54:45 ] :: [ LOG ] :: JOURNAL XML: /var/tmp/beakerlib-REi9TkB/journal.xml
:: [ 21:54:45 ] :: [ LOG ] :: JOURNAL TXT: /var/tmp/beakerlib-REi9TkB/journal.txt
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
:: Duration: 213s
:: Phases: 3 good, 0 bad
:: OVERALL RESULT: PASS (/kernel/general/memory/function/userfaultfd2)

[root@ibm-x3650m4-05 userfaultfd2]#
[root@ibm-x3650m4-05 userfaultfd2]# dmesg
[ 507.204710] runtest.sh (1898): drop_caches: 3
[root@ibm-x3650m4-05 userfaultfd2]# dmesg
[ 507.204710] runtest.sh (1898): drop_caches: 3
[root@ibm-x3650m4-05 userfaultfd2]# uname -r
4.18.0-337.rt7.118.el8bz1903578v1.x86_64+debug
[root@ibm-x3650m4-05 userfaultfd2]# dmesg
[ 507.204710] runtest.sh (1898): drop_caches: 3

> I've posted the MR but I'm going to leave this in the ASSIGNED state until
> 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to
> MODIFIED. Please let me know if anyone has objections to that strategy.

The 'Devel Target Milestone' is 2. I'm not sure whether the MR has enough reviews/acks yet. Is it time to move to MODIFIED, given that we are already at ITM 2?

> MR: https://gitlab.com/redhat/rhel/src/kernel/rhel-8/-/merge_requests/1249
(In reply to Chunyu Hu from comment #10)
> (In reply to Andrew Halaney from comment #9)
> > I've posted the MR but I'm going to leave this in the ASSIGNED state until
> > 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to
> > MODIFIED. Please let me know if anyone has objections to that strategy.
>
> The 'Devel target Milestone' is '2', not sure if we got enough reviews/acks
> for the MR, is time to move to MODIFIED? as it's ITM2 already.

Ah, my apologies @chuhu, the rhel-8 kernel doesn't have a place to land 8.6-rt changes yet but that should happen today.

I didn't realize that would be the case (I assumed by DTM 2 there would be somewhere to land the patch). I'm moving to DTM 3 to give buffer room for review after the branching happens today, can you please confirm that ITM 5 is ok and if not adjust? Thanks and sorry!
(In reply to Andrew Halaney from comment #11)
> (In reply to Chunyu Hu from comment #10)
> > (In reply to Andrew Halaney from comment #9)
> > > I've posted the MR but I'm going to leave this in the ASSIGNED state until
> > > 8.5 branches off. Once main-rt targets the 8.6 release I'll change it to
> > > MODIFIED. Please let me know if anyone has objections to that strategy.
> >
> > The 'Devel target Milestone' is '2', not sure if we got enough reviews/acks
> > for the MR, is time to move to MODIFIED? as it's ITM2 already.
>
> Ah, my apologies @chuhu, the rhel-8 kernel doesn't have a place to
> land 8.6-rt changes yet but that should happen today.
>
> I didn't realize that would be the case (I assumed by DTM 2 there would be
> somewhere to land the patch).
> I'm moving to DTM 3 to give buffer room for review after the branching
> happens today,
> can you please confirm that ITM 5 is ok and if not adjust? Thanks and sorry!

It works for me for ITM-5. Thanks! The MR is there and bot has already attached the build, and if the workflow-bot can notify when the MR gets enough reviews/acks, that would be great.
You are right, the MR is now targeting the correct branch for 8.6-rt release. The MR has enough reviews/acks now (I guess the bot doesn't post that), so all that should be left is adding "Verified: Tested" when appropriate (not sure if you want to take the MR artifacts for a spin or if you trust my brew artifacts) and a maintainer to merge. Thanks!
(In reply to Andrew Halaney from comment #14)
> You are right, the MR is now targeting the correct branch for 8.6-rt release.
>
> The MR has enough reviews/acks now (I guess the bot doesn't post that),
> so all that should be left is adding "Verified: Tested" when
> appropriate (not sure if you want to take the MR artifacts for
> a spin or if you trust my brew artifacts) and a maintainer to merge.
> Thanks!

I trust the brew build, set 'Tested', there's no debug kernel build in the MR. Thanks!
Hi Andrew,

When are we going to add this to a candidate/official kernel-rt build? The target ITM for this is ITM-5, which runs from Sep 28 to Oct 4. Do we need to defer this for several ITMs? If so, I'll adjust the ITM field of the bz.

Thanks!

Regards,
Chunyu Hu
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: kernel-rt security and bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:1975