Bug 2154880

Summary: [rhel8] LTP: read_all_sys - RIP: 0010:intel_rps_get_max_frequency+0x5/0x40 [i915]
Product: Red Hat Enterprise Linux 8 Reporter: Bruno Goncalves <bgoncalv>
Component: kernel Assignee: Jocelyn Falempe <jfalempe>
kernel sub component: Graphics QA Contact: Desktop QE <desktop-qa-list>
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: jfalempe, liwan, ndegraef, pifang, tpelka
Version: 8.8 Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: kernel-4.18.0-456.el8 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2162003 Environment:
Last Closed: 2023-05-16 08:59:38 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2162003    

Description Bruno Goncalves 2022-12-19 13:35:15 UTC
Description of problem:

During the LTP read_all_sys test from [1], the following panic occurs:

[ 4354.948286] LTP: starting read_all_sys (read_all -d /sys -q -r 3)
[ 4355.800842] rtc_cmos 00:03: Deprecated ABI, please use nvmem
[ 4356.407119] hp_wmi: query 0x4 returned error 0x5
[ 4356.409006] hp_wmi: query 0x4 returned error 0x5
[ 4356.410570] hp_wmi: query 0x4 returned error 0x5
[ 4356.412182] hp_wmi: query 0x4 returned error 0x5
[ 4356.414104] hp_wmi: query 0x1 returned error 0x3
[ 4356.415580] hp_wmi: query 0x4 returned error 0x5
[ 4356.417539] hp_wmi: query 0x1 returned error 0x3
[ 4356.419016] hp_wmi: query 0x4 returned error 0x5
[ 4356.420765] hp_wmi: query 0x1 returned error 0x3
[ 4356.422758] hp_wmi: query 0x2 returned error 0x3
[ 4356.424534] hp_wmi: query 0x2 returned error 0x3
[ 4356.426046] hp_wmi: query 0x3 returned error 0x5
[ 4356.427669] hp_wmi: query 0x2a returned error 0x5
[ 4356.429520] hp_wmi: query 0x2 returned error 0x3
[ 4356.431006] hp_wmi: query 0x3 returned error 0x5
[ 4356.432518] hp_wmi: query 0x2a returned error 0x5
[ 4356.434014] hp_wmi: query 0x3 returned error 0x5
[ 4356.435467] hp_wmi: query 0x2a returned error 0x5
[ 4356.511993] BUG: unable to handle kernel paging request at 000000000001002b
[ 4356.518947] PGD 80000001079ee067 P4D 80000001079ee067 PUD 4607a4067 PMD 0 
[ 4356.525820] Oops: 0000 [#1] SMP PTI
[ 4356.529314] CPU: 1 PID: 525114 Comm: read_all Kdump: loaded Tainted: G           OE    --------- -  - 4.18.0-443.el8.mr3821_719784222.gc844.x86_64 #1
[ 4356.542699] Hardware name: HP HP Z240 SFF Workstation/802E, BIOS N51 Ver. 01.21 03/02/2016
[ 4356.550934] RIP: 0010:intel_rps_get_max_frequency+0x5/0x40 [i915]
[ 4356.557110] Code: 04 7e 14 80 bf d1 33 ff ff 00 74 0b 80 bf 31 32 ff ff 00 74 02 eb be 0f b6 b7 c8 00 00 00 e9 d2 ed ff ff 66 90 0f 1f 44 00 00 <83> bf a4 2e ff ff 04 7e 1d 80 bf d1 33 ff ff 00 74 14 80 bf 31 32
[ 4356.575814] RSP: 0018:ffffb852019e3e10 EFLAGS: 00010206
[ 4356.581057] RAX: ffff8940868e5530 RBX: ffff8940868f54b0 RCX: 0000000000000001
[ 4356.588173] RDX: ffffffffc047f100 RSI: ffff89408966ff29 RDI: 000000000001d187
[ 4356.595289] RBP: 0000000000000000 R08: ffff89408966ff20 R09: ffff8940890159c0
[ 4356.602399] R10: ffff8943e0803000 R11: 0000000000000001 R12: ffff8940868f54d0
[ 4356.609512] R13: ffffffffc047f100 R14: 0000000000000001 R15: ffff8943e1405680
[ 4356.616620] FS:  00007fcf228ea740(0000) GS:ffff8943ef480000(0000) knlGS:0000000000000000
[ 4356.624686] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4356.630410] CR2: 000000000001002b CR3: 000000045faec005 CR4: 00000000003706e0
[ 4356.637541] DR0: 0000000000000001 DR1: 0000000000000000 DR2: 0000000000000000
[ 4356.644648] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
[ 4356.651757] Call Trace:
[ 4356.654224]  sysfs_gt_attribute_r_func.isra.4+0x52/0xb0 [i915]
[ 4356.660112]  max_freq_mhz_show+0x1a/0x40 [i915]
[ 4356.664708]  sysfs_kf_seq_show+0x9b/0x110
[ 4356.668710]  seq_read+0x163/0x420
[ 4356.672020]  vfs_read+0x91/0x150
[ 4356.675239]  ksys_read+0x4f/0xb0
[ 4356.678456]  do_syscall_64+0x5b/0x1b0
[ 4356.682114]  entry_SYSCALL_64_after_hwframe+0x61/0xc6
[ 4356.687142] RIP: 0033:0x7fcf224baa82
[ 4356.690710] Code: 95 20 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b6 0f 1f 80 00 00 00 00 f3 0f 1e fa 8b 05 c6 d9 20 00 85 c0 75 12 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 41 54 49 89 d4 55 48 89
[ 4356.709427] RSP: 002b:00007ffff2080e48 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 4356.716967] RAX: ffffffffffffffda RBX: 000000000000000b RCX: 00007fcf224baa82
[ 4356.724070] RDX: 00000000000003ff RSI: 00007ffff2080f00 RDI: 000000000000000b
[ 4356.731195] RBP: 00007fcf228e5000 R08: 00007ffff21c51b0 R09: 00000000006b3e42
[ 4356.738305] R10: 00000000006b3e42 R11: 0000000000000246 R12: 0000000000000001
[ 4356.745412] R13: 0000000000000018 R14: 00007fcf228e9028 R15: 000000000008033a
[ 4356.752520] Modules linked in: snd_seq_dummy binfmt_misc n_gsm pps_ldisc slcan ppp_synctty n_hdlc ppp_async ppp_generic slip slhc nfsv3 nfs_acl tun brd overlay fuse vfat fat ext4 mbcache jbd2 loop tcp_diag udp_diag inet_diag rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc snd_hda_codec_hdmi snd_ctl_led snd_hda_codec_realtek intel_rapl_msr intel_rapl_common snd_hda_codec_generic ledtrig_audio x86_pkg_temp_thermal snd_hda_intel intel_powerclamp coretemp snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm_intel hp_wmi snd_hda_core snd_hwdep snd_seq kvm irqbypass sparse_keymap snd_seq_device snd_pcm rfkill snd_timer crct10dif_pclmul crc32_pclmul iTCO_wdt mei_wdt iTCO_vendor_support wmi_bmof ghash_clmulni_intel snd rapl intel_cstate intel_uncore pcspkr soundcore wmi i2c_i801 tpm_infineon acpi_pad intel_pmc_core intel_pch_thermal mei_me mei xfs libcrc32c raid0 sr_mod sd_mod cdrom t10_pi sg i915 i2c_algo_bit cec drm_buddy intel_gtt drm_display_helper
[ 4356.752583]  drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm e1000e ahci libahci libata serio_raw crc32c_intel video dm_mirror dm_region_hash dm_log dm_mod [last unloaded: init_module]
[ 4356.856850] Red Hat flags: eBPF/sock
[ 4356.860422] CR2: 000000000001002b

Version-Release number of selected component (if applicable):
kernel-4.18.0-443.el8

How reproducible:
Not sure yet.

Steps to Reproduce:
1. Run LTP from [1].


Additional info:
test logs: https://datawarehouse.cki-project.org/kcidb/tests/6341165
cki issue tracker: https://datawarehouse.cki-project.org/issue/1772#hosts

[1] https://gitlab.com/cki-project/kernel-tests/-/tree/main/distribution/ltp/lite
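The faulting address can be cross-checked against the register dump in the oops above. The following is a minimal sketch, not part of the original report: it hand-decodes only the one instruction at the faulting RIP (the bytes starting at the `<83>` marker in the "Code:" line) and adds its displacement to the logged RDI value. The helper name `decode_cmp_rdi_disp32` is made up for illustration, and the decoder handles just this single encoding.

```python
import struct

# Bytes at the faulting RIP, copied from the "Code:" line of the oops
# (the byte in <...> marks where RIP points).
insn = bytes.fromhex("83bfa42effff04")

# RDI as logged in the register dump.
rdi = 0x1D187


def decode_cmp_rdi_disp32(code):
    """Decode 'cmpl $imm8, disp32(%rdi)' (hypothetical helper, this encoding only)."""
    opcode, modrm = code[0], code[1]
    assert opcode == 0x83, "expected the group-1 opcode with an 8-bit immediate"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod=2: 32-bit displacement; reg=7: CMP; rm=7: base register is RDI.
    assert (mod, reg, rm) == (2, 7, 7)
    (disp,) = struct.unpack_from("<i", code, 2)  # signed little-endian disp32
    imm = code[6]
    return disp, imm


disp, imm = decode_cmp_rdi_disp32(insn)
fault_addr = (rdi + disp) & 0xFFFFFFFFFFFFFFFF

print(f"cmpl ${imm:#x}, {disp:#x}(%rdi)")
print(f"computed fault address: {fault_addr:#018x}")
```

The computed address matches the CR2 value in the log (000000000001002b), which suggests that the first argument passed to intel_rps_get_max_frequency() was not a valid kernel pointer when the sysfs attribute was read.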

Comment 3 Jocelyn Falempe 2022-12-23 14:51:19 UTC
Backporting the following upstream fix makes it work on the hp-z240 test machine:

a8a4f0467d706f drm/i915: Fix CFI violations in gt_sysfs
https://cgit.freedesktop.org/drm-intel/commit/?id=a8a4f0467d706fc22d286dfa973946e5944b793c

grep . *
id:0
punit_req_freq_mhz:0
rc6_enable:1
rc6_residency_ms:592221
rps_act_freq_mhz:0
rps_boost_freq_mhz:950
rps_cur_freq_mhz:350
rps_max_freq_mhz:950
rps_min_freq_mhz:350
rps_RP0_freq_mhz:950
rps_RP1_freq_mhz:350
rps_RPn_freq_mhz:350

I will propose an MR soon.
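For anyone who wants to repeat the `grep . *` check from a script, here is a rough Python sketch. It is an illustration rather than part of the fix: the function name `grep_dot` is made up, and the directory to inspect is left as a parameter because the comment above does not name it.

```python
from pathlib import Path


def grep_dot(directory):
    """Return 'name:line' for each non-empty line of every regular file.

    Rough equivalent of running `grep . *` inside `directory`; entries
    that cannot be read are reported instead of aborting the scan.
    """
    results = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories and other non-regular entries
        try:
            text = path.read_text(errors="replace")
        except OSError as exc:
            results.append(f"{path.name}: <read failed: {exc}>")
            continue
        for line in text.splitlines():
            if line:
                results.append(f"{path.name}:{line}")
    return results
```

Calling `grep_dot()` on the relevant gt sysfs directory and printing each returned entry should give output in the same `name:value` form as the listing above, although the ordering may differ from the shell's glob order.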

Comment 7 Jocelyn Falempe 2023-01-17 08:08:37 UTC
FYI, the upstream patch is now included in the stable v6.1.7 branch.

https://www.spinics.net/lists/kernel/msg4653539.html

Comment 13 errata-xmlrpc 2023-05-16 08:59:38 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: kernel security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2951