Bug 1518274
Summary: | backport: c4ccd6b1ce locking/rtmutex: Prevent dequeue vs. unlock race | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | Luiz Capitulino <lcapitulino> |
Component: | kernel-rt | Assignee: | Clark Williams <williams> |
kernel-rt sub component: | Other | QA Contact: | Jiri Kastner <jkastner> |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | unspecified | ||
Priority: | unspecified | CC: | bhu, lcapitulino, lgoncalv, lmiksik, williams |
Version: | 7.5 | ||
Target Milestone: | rc | ||
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | No Doc Update | |
Doc Text: | |
Story Points: | --- |
Clone Of: | Environment: | ||
Last Closed: | 2018-04-10 09:00:10 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 1442258 | ||
Attachments: |
Description
Luiz Capitulino
2017-11-28 14:31:05 UTC
Created attachment 1359903 [details]
c4ccd6b1ce locking/rtmutex: Prevent dequeue vs. unlock race
Created attachment 1359904 [details]
Revert "[rt] avoid disabling preemption during fast iova allocations"
Created attachment 1359905 [details]
a761e53ed4 iommu/iova: Don't disable preempt around this_cpu_ptr()
Created attachment 1359918 [details]
raw_cpu_ptr -> this_cpu_ptr
I forgot to post the "scheduling while atomic" dump that patches 1-3 fix:

[18094.434539] BUG: scheduling while atomic: ksoftirqd/5/58/0x00000002
[18094.434567] Modules linked in: vhost_net vhost macvtap macvlan xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink bridge stp llc iTCO_wdt iTCO_vendor_support sb_edac intel_powerclamp dcdbas coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd ipmi_ssif pcspkr sg ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter shpchp mei_me mei lpc_ich nfsd auth_rpcgss nfs_acl lockd grace ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200 i2c_algo_bit drm_kms_helper
[18094.434574] syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm ahci tg3 i2c_core libahci ptp pps_core libata mxm_wmi megaraid_sas wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod scsi_transport_iscsi
[18094.434576] CPU: 5 PID: 58 Comm: ksoftirqd/5 Not tainted 3.10.0-789.rt56.723.fix1.el7.x86_64 #1
[18094.434577] Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.0.1 04/11/2016
[18094.434578] Call Trace:
[18094.434586] [<ffffffffb56d8e16>] dump_stack+0x19/0x1b
[18094.434589] [<ffffffffb56d3691>] __schedule_bug+0x62/0x70
[18094.434592] [<ffffffffb56ddaf6>] __schedule+0x6b6/0x830
[18094.434594] [<ffffffffb56ddca0>] schedule+0x30/0xa0
[18094.434595] [<ffffffffb56dea7d>] rt_spin_lock_slowlock+0x13d/0x360
[18094.434597] [<ffffffffb56dffe5>] rt_spin_lock+0x25/0x30
[18094.434601] [<ffffffffb5575067>] alloc_iova_fast+0x167/0x220
[18094.434605] [<ffffffffb55809e5>] intel_alloc_iova+0xa5/0xd0
[18094.434607] [<ffffffffb5584eb5>] intel_map_sg+0xc5/0x240
[18094.434611] [<ffffffffb54879ba>] scsi_dma_map+0xaa/0xe0
[18094.434618] [<ffffffffc02c0c4c>] megasas_build_io_fusion+0xfc/0x8e0 [megaraid_sas]
[18094.434623] [<ffffffffb50c1cec>] ? try_to_wake_up+0x6c/0x560
[18094.434629] [<ffffffffc02c16dd>] megasas_build_and_issue_cmd_fusion+0xed/0x300 [megaraid_sas]
[18094.434633] [<ffffffffc02b022e>] megasas_queue_command+0x11e/0x130 [megaraid_sas]
[18094.434635] [<ffffffffb547cf7a>] scsi_dispatch_cmd+0xaa/0x290
[18094.434637] [<ffffffffb5486546>] scsi_request_fn+0x4f6/0x6b0
[18094.434642] [<ffffffffb5303233>] __blk_run_queue+0x33/0x40
[18094.434644] [<ffffffffb5303286>] blk_run_queue+0x26/0x40
[18094.434646] [<ffffffffb5485148>] scsi_run_queue+0x288/0x320
[18094.434647] [<ffffffffb547c2bd>] ? __scsi_put_command+0x2d/0x90
[18094.434649] [<ffffffffb5486740>] scsi_next_command+0x20/0x40
[18094.434650] [<ffffffffb5486899>] scsi_end_request+0x139/0x1e0
[18094.434652] [<ffffffffb5486b08>] scsi_io_completion+0x168/0x6a0
[18094.434655] [<ffffffffb547b955>] scsi_finish_command+0xd5/0x130
[18094.434657] [<ffffffffb5486022>] scsi_softirq_done+0x132/0x160
[18094.434659] [<ffffffffb530e0d0>] blk_done_softirq+0xa0/0xe0
[18094.434661] [<ffffffffb508a750>] do_current_softirqs+0x240/0x470
[18094.434663] [<ffffffffb508aa6a>] run_ksoftirqd+0x3a/0x70
[18094.434665] [<ffffffffb50b5e62>] smpboot_thread_fn+0x202/0x2d0
[18094.434667] [<ffffffffb50b5c60>] ? lg_local_unlock+0x20/0x20
[18094.434670] [<ffffffffb50ac98f>] kthread+0xcf/0xe0
[18094.434672] [<ffffffffb50ac8c0>] ? kthread_worker_fn+0x170/0x170
[18094.434673] [<ffffffffb56e8e58>] ret_from_fork+0x58/0x90
[18094.434675] [<ffffffffb50ac8c0>] ? kthread_worker_fn+0x170/0x170

For some reason I don't have the commit IDs listed in c#1, so I tracked down the equivalent changes in 7.5:
revert 295e3dc1a7187 [rt] avoid disabling preemption during fast iova allocations
apply aaffaa8a3b595 iommu/iova: Don't disable preempt around this_cpu_ptr()
Then change references to raw_cpu_ptr to this_cpu_ptr.

They are from the upstream RT devel repo.

Created attachment 1364032 [details]
locking/rtmutex: Prevent dequeue vs. unlock race
Created attachment 1364033 [details]
iommu/iova: Don't disable preempt around this_cpu_ptr()
Patches sent to kernel-rt-team list.

https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=14701302
Scratch build done against kernel-rt-3.10.0-809.rt56.745.el7
Booted, 12h rteval running on realtime-03.khw.lab.eng.bos.redhat.com

We need to review the "iommu/iova: Don't disable preempt around this_cpu_ptr()" patch in light of this commit, which introduced raw_cpu_ptr:
b3ca1c10d7b3 percpu: add raw_cpu_ops
In our current code, raw_cpu_ptr() maps directly to __this_cpu_ptr(), as this is the version that does not check preemption. Upstream later changed things so that this_cpu_ptr() dropped the checks it used to do. My impression is that the change will be basically this:
s/this_cpu_ptr/__this_cpu_ptr/g

@jiri, yes looks correct to me.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:0676
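
For reference, the s/this_cpu_ptr/__this_cpu_ptr/g substitution suggested in the review above can be sketched as a small Python helper. This is only an illustration, not the patch that was shipped: the function name use_unchecked_accessor and the sample source lines are hypothetical. The one design choice worth noting is the whole-identifier match, which a plain sed expression would not have, so a call that is already __this_cpu_ptr() is not prefixed a second time.

```python
import re

# Match this_cpu_ptr only as a whole identifier. The lookbehind rejects
# names such as __this_cpu_ptr, which would otherwise get a second prefix.
_THIS_CPU_PTR = re.compile(r"(?<![A-Za-z0-9_])this_cpu_ptr\b")


def use_unchecked_accessor(source: str) -> str:
    """Rewrite this_cpu_ptr() calls to __this_cpu_ptr() in the given text.

    Hypothetical helper for illustration; it works on plain text and does
    not parse C.
    """
    return _THIS_CPU_PTR.sub("__this_cpu_ptr", source)


if __name__ == "__main__":
    # Illustrative input only, not copied from the actual patch.
    sample = (
        "cpu_rcache = this_cpu_ptr(rcache->cpu_rcaches);\n"
        "counter = __this_cpu_ptr(&some_counter);\n"
    )
    print(use_unchecked_accessor(sample), end="")
```

Running the helper over a patch file before applying it would be one way to do the conversion; the result still needs manual review, since the regular expression cannot tell whether a given call site is actually safe without the preemption check.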