Bug 1512875

Summary: WARNING: CPU: 7 PID: 1090 at drivers/target/target_core_transport.c:3009 __transport_check_aborted_status+0x153/0x190 [target_core_mod]
Product: Red Hat Enterprise Linux 7
Reporter: Zhang Yi <yizhan>
Component: kernel-rt
Assignee: Arnaldo Carvalho de Melo <acme>
kernel-rt sub component: Kernel-rt Target (LIO)
QA Contact: Zhang Yi <yizhan>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: bhu, cleech, daolivei, jkastner, lgoncalv, williams, yizhan
Version: 7.5
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-10-30 09:40:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1532680

Description Zhang Yi 2017-11-14 11:41:36 UTC
Description of problem:
WARNING: CPU: 7 PID: 1090 at drivers/target/target_core_transport.c:3009 __transport_check_aborted_status+0x153/0x190 [target_core_mod]

Version-Release number of selected component (if applicable):
3.10.0-768.rt56.699.el7.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Set up an SRP target
2. Connect to the target from the client
3.

Actual results:


Expected results:


Additional info:
This issue was reproduced on the RT kernel.

[ 7552.799997] ------------[ cut here ]------------
[ 7552.800016] WARNING: CPU: 7 PID: 1090 at drivers/target/target_core_transport.c:3009 __transport_check_aborted_status+0x153/0x190 [target_core_mod]
[ 7552.800037] Modules linked in: target_core_user uio target_core_pscsi target_core_file target_core_iblock ib_srpt ib_srp scsi_transport_srp scsi_tgt xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter bridge stp llc ib_isert iscsi_target_mod target_core_mod ib_ucm rpcrdma mlx5_ib sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm libiscsi ib_umad ib_ipoib scsi_transport_iscsi ib_cm sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support hfi1 ipmi_ssif sg rdmavt ib_core hpilo hpwdt pcspkr ipmi_si
[ 7552.800055]  ipmi_devintf ipmi_msghandler wmi acpi_power_meter ioatdma dca shpchp pcc_cpufreq lpc_ich ip_tables xfs libcrc32c mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ata_generic pata_acpi mlx5_core ata_piix tg3 drm devlink libata i2c_core ptp hpsa scsi_transport_sas pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ib_srpt]
[ 7552.800058] CPU: 7 PID: 1090 Comm: kworker/7:1H Not tainted 3.10.0-768.rt56.699.el7.x86_64 #1
[ 7552.800058] Hardware name: HP ProLiant DL380p Gen8, BIOS P70 11/14/2013
[ 7552.800066] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[ 7552.800067] Call Trace:
[ 7552.800075]  [<ffffffffb76cd055>] dump_stack+0x19/0x1b
[ 7552.800078]  [<ffffffffb70807bb>] __warn+0xfb/0x120
[ 7552.800080]  [<ffffffffb70808fd>] warn_slowpath_null+0x1d/0x20
[ 7552.800085]  [<ffffffffc0ab3983>] __transport_check_aborted_status+0x153/0x190 [target_core_mod]
[ 7552.800091]  [<ffffffffc0ab5c04>] target_execute_cmd+0x34/0x2e0 [target_core_mod]
[ 7552.800096]  [<ffffffffc0ab5fc2>] transport_generic_new_cmd+0x112/0x240 [target_core_mod]
[ 7552.800100]  [<ffffffffc0ab6132>] transport_handle_cdb_direct+0x42/0x90 [target_core_mod]
[ 7552.800105]  [<ffffffffc0ab62cd>] target_submit_cmd_map_sgls+0x14d/0x210 [target_core_mod]
[ 7552.800107]  [<ffffffffc09c15b4>] srpt_handle_new_iu+0x254/0x660 [ib_srpt]
[ 7552.800109]  [<ffffffffc09c1bc8>] srpt_recv_done+0x38/0x60 [ib_srpt]
[ 7552.800113]  [<ffffffffc07a5fb5>] __ib_process_cq+0x65/0xe0 [ib_core]
[ 7552.800118]  [<ffffffffc07a60a0>] ib_cq_poll_work+0x20/0x60 [ib_core]
[ 7552.800120]  [<ffffffffb70a4336>] process_one_work+0x176/0x4a0
[ 7552.800121]  [<ffffffffb70a50ec>] worker_thread+0x16c/0x3f0
[ 7552.800123]  [<ffffffffb70a4f80>] ? manage_workers.isra.36+0x2b0/0x2b0
[ 7552.800125]  [<ffffffffb70ac62f>] kthread+0xcf/0xe0
[ 7552.800139]  [<ffffffffb70ac560>] ? kthread_worker_fn+0x170/0x170
[ 7552.800151]  [<ffffffffb76dd1d8>] ret_from_fork+0x58/0x90
[ 7552.800153]  [<ffffffffb70ac560>] ? kthread_worker_fn+0x170/0x170
[ 7552.800154] ---[ end trace 0000000000000002 ]---
[ 7554.164964] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.231254] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.294860] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.360810] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.421867] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.485931] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.546909] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.607820] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.671883] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.
[ 7554.730826] srpt/0xf4521403000e4aa0f4521403000e4ad0: Unsupported SCSI Opcode 0xa3, sending CHECK_CONDITION.

Comment 2 Beth Uptagrafft 2017-11-14 16:06:58 UTC
Assigning to an RT engineer.

Comment 3 Arnaldo Carvalho de Melo 2018-02-28 19:12:51 UTC
An rpm with a patch converting that WARN_ON_ONCE() to the _NONRT() variant is available for testing at:

http://people.redhat.com/acme/torture/RPMS/x86_64/kernel-rt-3.10.0-843.rt56.784.test.el7.x86_64.rpm

Please let me know if it is possible to test it or alternatively to loan me the machines and provide instructions about how to test it myself.

Comment 4 Zhang Yi 2018-03-01 05:27:24 UTC
(In reply to Arnaldo Carvalho de Melo from comment #3)
> An rpm with a patch converting that WARN_ON_ONCE() to the _NONRT() variant
> is available for testing at:
> 
> http://people.redhat.com/acme/torture/RPMS/x86_64/kernel-rt-3.10.0-843.rt56.784.test.el7.x86_64.rpm
> 
> Please let me know if it is possible to test it or alternatively to loan me
> the machines and provide instructions about how to test it myself.

I cannot reproduce this issue with the above kernel, thanks.

Yi

Comment 6 Luis Claudio R. Goncalves 2018-04-11 15:21:10 UTC
The solution proposed by Arnaldo was included in kernel-rt-3.10.0-863.rt56.805.el7 as this commit:

    a6196419b848 target: No need to WARN_ON if !irqs_disabled() when checking aborted status

The full commit is:

commit a6196419b8480dcf483a1ea2073a57cb6874af81
Author: Arnaldo Carvalho de Melo <acme>
Date:   Wed Feb 28 11:13:49 2018 -0300

    target: No need to WARN_ON if !irqs_disabled() when checking aborted status
    
    Since we already require that cmd->t_state_lock be held.
    
    This was introduced in commit 310d3d314be7 ("target: Fix race with
    SCF_SEND_DELAYED_TAS handling").
    
    Reported at https://bugzilla.redhat.com/show_bug.cgi?id=1512875
    
    Signed-off-by: Arnaldo Carvalho de Melo <acme>

diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index dd6fd003860f..e6b25d9dff21 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -3008,7 +3008,7 @@ static int __transport_check_aborted_status(struct se_cmd *cmd, int send_status)
 	__acquires(&cmd->t_state_lock)
 {
 	assert_spin_locked(&cmd->t_state_lock);
-	WARN_ON_ONCE(!irqs_disabled());
+	WARN_ON_ONCE_NONRT(!irqs_disabled());
 
 	if (!(cmd->transport_state & CMD_T_ABORTED))
 		return 0;
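
To illustrate why the original check fires only on the RT kernel, here is a minimal userspace sketch in C. It is a model, not kernel code: every model_* name is hypothetical. It assumes that spin_lock_irq() on an RT kernel takes a sleeping lock without disabling hardware interrupts, and that the _NONRT() warning variants compile to nothing on RT builds. The "once" behaviour of WARN_ON_ONCE() is not modelled.

```c
/* Userspace model only; none of these model_* names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the local CPU's interrupt state. */
struct model_cpu {
	bool irqs_disabled;
};

/*
 * Assumption: on a non-RT kernel spin_lock_irq() disables local interrupts,
 * while on an RT kernel the lock sleeps and interrupts stay enabled.
 */
static void model_spin_lock_irq(struct model_cpu *cpu, bool rt_kernel)
{
	if (!rt_kernel)
		cpu->irqs_disabled = true;
}

/*
 * Models the check at the top of __transport_check_aborted_status().
 * Returns the number of warnings the check would emit (0 or 1).
 */
static int model_check_aborted_status(bool rt_kernel, bool use_nonrt_variant)
{
	struct model_cpu cpu = { .irqs_disabled = false };

	/* The caller takes cmd->t_state_lock before calling the check. */
	model_spin_lock_irq(&cpu, rt_kernel);

	/* Assumption: the _NONRT() variant is compiled out on RT builds. */
	if (use_nonrt_variant && rt_kernel)
		return 0;

	/* Plain WARN_ON_ONCE(!irqs_disabled()). */
	return cpu.irqs_disabled ? 0 : 1;
}

/* Walks through the four combinations of kernel flavour and macro variant. */
static void model_self_check(void)
{
	assert(model_check_aborted_status(false, false) == 0); /* non-RT, old check */
	assert(model_check_aborted_status(true, false) == 1);  /* RT, old check: warns */
	assert(model_check_aborted_status(false, true) == 0);  /* non-RT, patched */
	assert(model_check_aborted_status(true, true) == 0);   /* RT, patched */
}
```

Calling model_self_check() from a main() exercises all four cases. In this model only the RT kernel with the unpatched check produces a warning, which is consistent with the trace in the description and with the test result in comment 4.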

Comment 10 errata-xmlrpc 2018-10-30 09:40:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:3096