Bug 1896982
| Summary: | kernel-rt: kernel BUG at lib/list_debug.c:28! | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Chunyu Hu <chuhu> |
| Component: | kernel-rt | Assignee: | Juri Lelli <jlelli> |
| kernel-rt sub component: | Memory Management | QA Contact: | Chunyu Hu <chuhu> |
| Status: | CLOSED CURRENTRELEASE | Docs Contact: | |
| Severity: | unspecified | ||
| Priority: | unspecified | CC: | bhu, liwan, mm-maint, pifang, rt-maint, rt-qe |
| Version: | 8.4 | ||
| Target Milestone: | rc | ||
| Target Release: | 8.4 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-03-12 08:42:23 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Chunyu Hu
2020-11-12 01:08:36 UTC
The original job ran on a dt kernel-rt build, 4.18.0-246.rt4.11.el8.dt2.x86_64, and also hit a list corruption panic, but at a different line of list_debug. I then ran the mainline version and got the similar panic shown in comment#0.

Vmcore: http://ibm-x3250m4-03.rhts.eng.pek2.redhat.com/vmcore/chuhu/4.18.0-246.rt4.11.el8.dt2.x86_64/4718130/hp-dl380eg8-01.rhts.eng.pek2.redhat.com/10.73.194.73-2020-11-10-06:23:36/vmcore-dmesg.txt

[ 2211.243652] ------------[ cut here ]------------
[ 2211.243655] kernel BUG at lib/list_debug.c:56!
[ 2211.243673] invalid opcode: 0000 [#1] PREEMPT_RT SMP PTI
[ 2211.243678] CPU: 18 PID: 4154990 Comm: runtest.sh Kdump: loaded Tainted: G I --------- - - 4.18.0-246.rt4.11.el8.dt2.x86_64 #1
[ 2211.243679] Hardware name: HP ProLiant DL380e Gen8, BIOS P73 07/01/2013
[ 2211.243701] RIP: 0010:__list_del_entry_valid.cold.1+0x20/0x4c
[ 2211.243706] Code: 43 4f 87 e8 7c e7 cb ff 0f 0b 48 89 fe 48 89 c2 48 c7 c7 b0 43 4f 87 e8 68 e7 cb ff 0f 0b 48 c7 c7 60 44 4f 87 e8 5a e7 cb ff <0f> 0b 48 89 f2 48 89 fe 48 c7 c7 20 44 4f 87 e8 46 e7 cb ff 0f 0b
[ 2211.243708] RSP: 0018:ffffbb1921a6fc70 EFLAGS: 00010246
[ 2211.243711] RAX: 0000000000000054 RBX: fffff76aa13b5208 RCX: 0000000000000001
[ 2211.243712] RDX: 0000000000000000 RSI: ffffffff874e1ce3 RDI: 00000000ffffffff
[ 2211.243714] RBP: 00000000000005b7 R08: ffffffff8698aa90 R09: 0000000000000544
[ 2211.243715] R10: 000000000001d3d4 R11: ffffbb1921a6fb20 R12: ffff9a846f4f4630
[ 2211.243717] R13: fffff76aa0f57388 R14: ffffbb1921a6fcd0 R15: ffff9a846f4f4650
[ 2211.243720] FS: 00007f229a208740(0000) GS:ffff9a846f580000(0000) knlGS:0000000000000000
[ 2211.243721] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2211.243723] CR2: 00005643b9d6bb20 CR3: 000000084dd10003 CR4: 00000000000606e0
[ 2211.243725] Call Trace:
[ 2211.243737] isolate_pcp_pages+0xf3/0x1c0
[ 2211.243747] drain_pages_zone+0x17c/0x250
[ 2211.243751] drain_pages+0x39/0x50
[ 2211.243754] drain_all_pages+0xce/0x120
[ 2211.243759] start_isolate_page_range+0x1ce/0x2f0
[ 2211.243768] __offline_pages+0xfa/0x8f0
[ 2211.243776] ? rt_spin_unlock+0x13/0x40
[ 2211.243784] ? klist_next+0xd5/0xe0
[ 2211.243790] ? device_is_dependent+0xa0/0xa0
[ 2211.243800] memory_subsys_offline+0x45/0x60
[ 2211.243806] device_offline+0x84/0xb0
[ 2211.243812] state_store+0x63/0xb0
[ 2211.243821] kernfs_fop_write+0xf6/0x1a0
[ 2211.243827] vfs_write+0xa5/0x1a0
[ 2211.243833] ksys_write+0x52/0xc0
[ 2211.243840] do_syscall_64+0x87/0x1a0
[ 2211.243845] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 2211.243849] RIP: 0033:0x7f22998ea198
[ 2211.243852] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 c5 43 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[ 2211.243853] RSP: 002b:00007ffdb428e4c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 2211.243856] RAX: ffffffffffffffda RBX: 0000000000000008 RCX: 00007f22998ea198
[ 2211.243857] RDX: 0000000000000008 RSI: 00005643b9b30380 RDI: 0000000000000001
[ 2211.243858] RBP: 00005643b9b30380 R08: 000000000000000a R09: 00007f229997a4c0
[ 2211.243860] R10: 000000000000000a R11: 0000000000000246 R12: 00007f2299bba6c0
[ 2211.243861] R13: 0000000000000008 R14: 00007f2299bb5880 R15: 0000000000000008
[ 2211.243865] Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr iTCO_wdt iTCO_vendor_support intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore ipmi_ssif intel_rapl_perf pcspkr hpwdt hpilo ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter ioatdma lpc_ich ip_tables xfs sd_mod t10_pi sg mgag200 drm_kms_helper uas syscopyarea sysfillrect sysimgblt usb_storage fb_sys_fops drm_vram_helper drm_ttm_helper ttm ahci sfc serio_raw igb bnx2x libahci drm dca mtd libcrc32c i2c_algo_bit mdio crc32c_intel libata dm_mirror dm_region_hash dm_log dm_mod

There is no such issue with the 8.4 GA version, kernel-rt-4.18.0-240.rt7.54.el8:
https://beaker.engineering.redhat.com/jobs/4723325

Hi,

Would it be possible to test again with the latest 8.4-rt build (kernel-rt-4.18.0-296.rt7.63.el8 at the time of writing)?

We merged an RT specific change lately that it would be interesting to see if it might play a role here.

Thanks!

(In reply to Juri Lelli from comment #3)
> Hi,
>
> Would it be possible to test again with latest 8.4-rt build
> (kernel-rt-4.18.0-296.rt7.63.el8 at the time of writing).

Job submitted. I will update when the job finishes running.

>
> We merged an RT specific change lately that it would be interesting
> to see if it might play a role here.
>
> Thanks!