Description of problem:
These two commands cause a kernel panic:
==================================
iscsiadm -m discovery -t st -p 192.0.1.1 -I iser
iscsiadm -m node -T iqn.2010-10.com.example:storage-1000 -l
==================================
Unable to handle kernel paging request at 00002b7edc527310 RIP:
 [<ffffffff88260077>] :ib_ipath:ipath_sg_dma_address+0x5/0x66
PGD 0
Oops: 0000 [1] SMP
last sysfs file: /block/sdb/removable
CPU 2
Modules linked in: ib_iser ib_srp rds ib_sdp ib_ipoib rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ib_sa ib_mad iw_cxgb3 ib_ipath ib_core be2iscsi iscsi_tcp bnx2i cnic uio cxgb3i cxgb3 8021q libiscsi_tcp libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi ipoib_helper autofs4 hidp rfcomm l2cap bluetooth lockd sunrpc ipv6 xfrm_nalgo crypto_api loop dm_multipath scsi_dh video backlight sbs power_meter hwmon i2c_ec i2c_core dell_wmi wmi button battery asus_acpi acpi_memhotplug ac parport_pc lp parport ide_cd bnx2 i5000_edac edac_mc sg cdrom serio_raw pcspkr dm_raid45 dm_message dm_region_hash dm_mem_cache dm_snapshot dm_zero dm_mirror dm_log dm_mod ata_piix libata shpchp megaraid_sas sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 5124, comm: iscsi_q_3 Not tainted 2.6.18-194.el5 #1
RIP: 0010:[<ffffffff88260077>] [<ffffffff88260077>] :ib_ipath:ipath_sg_dma_address+0x5/0x66
RSP: 0018:ffff81006064fc88 EFLAGS: 00010246
RAX: ffffffff882aa5e0 RBX: ffff81005d3b5000 RCX: ffff81005d3b4e00
RDX: 00002b7edc527310 RSI: ffff81007a50a000 RDI: 0000000000000000
RBP: ffff81005d3b4e00 R08: ffff810001000058 R09: 0000000000000020
R10: ffff81006f0928d0 R11: 0000000000000050 R12: ffff81006f092a70
R13: 000000000000001f R14: ffff81007a5b0000 R15: ffff81007a50a000
FS: 0000000000000000(0000) GS:ffff81007ff4ce40(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 00002b7edc527310 CR3: 0000000000201000 CR4: 00000000000006e0
Process iscsi_q_3 (pid: 5124, threadinfo ffff81006064e000, task ffff810077193080)
Stack: ffffffff8862d0fc 0000000000000003 ffff81006f092880 0000000000000000
 ffff81005c7e2290 ffff81005d6f3f10 ffff81005bccce40 ffff81006f0929d0
 0000000278525e00 ffff81006064fe30 000000020000004c ffff810002388740
Call Trace:
 [<ffffffff8862d0fc>] :ib_iser:iser_reg_rdma_mem+0x105/0x7b4
 [<ffffffff8868e262>] :libiscsi2:iscsi_xmitworker+0x0/0x2a8
 [<ffffffff8862c927>] :ib_iser:iser_send_command+0x157/0x397
 [<ffffffff8868e262>] :libiscsi2:iscsi_xmitworker+0x0/0x2a8
 [<ffffffff8862dafd>] :ib_iser:iscsi_iser_task_xmit+0xd6/0x1ac
 [<ffffffff8868e167>] :libiscsi2:iscsi_prep_scsi_cmd_pdu+0x416/0x511
 [<ffffffff80063ff8>] thread_return+0x62/0xfe
 [<ffffffff8868d756>] :libiscsi2:iscsi_xmit_task+0x36/0x69
 [<ffffffff8868e3e5>] :libiscsi2:iscsi_xmitworker+0x183/0x2a8
 [<ffffffff8004dc37>] run_workqueue+0x94/0xe4
 [<ffffffff8004a472>] worker_thread+0x0/0x122
 [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff8004a562>] worker_thread+0xf0/0x122
 [<ffffffff8008e16d>] default_wake_function+0x0/0xe
 [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032bdc>] kthread+0xfe/0x132
 [<ffffffff8005efb1>] child_rip+0xa/0x11
 [<ffffffff800a198c>] keventd_create_kthread+0x0/0xc4
 [<ffffffff80032ade>] kthread+0x0/0x132
 [<ffffffff8005efa7>] child_rip+0x0/0x11
Code: 48 8b 0a 48 c1 e9 33 48 89 c8 48 c1 e8 09 48 8b 04 c5 80 43
RIP [<ffffffff88260077>] :ib_ipath:ipath_sg_dma_address+0x5/0x66
 RSP <ffff81006064fc88>
CR2: 00002b7edc527310
 <0>Kernel panic - not syncing: Fatal exception
==================================

Version-Release number of selected component (if applicable):
RHEL 5.6 Beta 1
kernel-2.6.18-229.el5
scsi-target-utils-1.0.8-0.el5
iscsi-initiator-utils-6.2.0.872-4.el5
openib-1.4.1-5.el5

How reproducible:
100%

Steps to Reproduce:
1. Create IPoIB
2. Create iscsi target using scsi-target-utils:
================================
/etc/init.d/tgtd start
dd if=/dev/zero of=/tmp/lun1 count=1 bs=1MB seek=2048
dd if=/dev/zero of=/tmp/lun2 count=1 bs=1MB seek=2048
tgtadm --lld iscsi --mode target --op new --tid 1 --targetname iqn.2010-10.com.example:storage-1000
tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL
tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --backing-store /tmp/lun1
tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 2 --backing-store /tmp/lun2
================================
3. Perform iscsi login via iSER:
iscsiadm -m discovery -t st -p 192.0.1.1 -I iser   #192.0.1.1 is the IP address of target IB
iscsiadm -m node -T iqn.2010-10.com.example:storage-1000 -l

Actual results:
Kernel panic and the system hangs

Expected results:
iSCSI login successful

Additional info:
Please feel free to reach me via email or irc if you need any more information.
Additional info:
The iscsi login was successful. Please check this log, which was reported just before the kernel panic:
==============================
iser: iser_connect:connecting to: ffff81005b1ee04cI4, port 0xbc0c
iser: iser_cma_handler:event 0 conn ffff81005d7a7910 id ffff81005d3b5c00
iser: iser_cma_handler:event 2 conn ffff81005d7a7910 id ffff81005d3b5c00
iser: iser_create_ib_conn_res:setting conn ffff81005d7a7910 cma_id ffff81005d3b5c00: fmr_pool ffff81005bccccc0 qp ffff81007bf48800
iser: iser_cma_handler:event 8 conn ffff81005d7a7910 id ffff81005d3b5c00
iser: iser_cma_handler:event: 8, error: 10
iser: iscsi_iser_ep_poll:ib conn ffff81005d7a7910 rc = -1
iser: iscsi_iser_ep_disconnect:ib conn ffff81005d7a7910 state 4
iser: iser_free_ib_conn_res:freeing conn ffff81005d7a7910 cma_id ffff81005d3b5c00 fmr pool ffff81005bccccc0 qp ffff81007bf48800
iser: iser_device_try_release:device ffff81005bcccec0 refcount 0
iser: iser_connect:connecting to: ffff81005d7a784cI4, port 0xbc0c
iser: iser_cma_handler:event 0 conn ffff81005d6f3f10 id ffff81005d3b5c00
iser: iser_cma_handler:event 2 conn ffff81005d6f3f10 id ffff81005d3b5c00
iser: iser_create_ib_conn_res:setting conn ffff81005d6f3f10 cma_id ffff81005d3b5c00: fmr_pool ffff81005bcccec0 qp ffff81007a698800
iser: iser_cma_handler:event 9 conn ffff81005d6f3f10 id ffff81005d3b5c00
iser: iscsi_iser_ep_poll:ib conn ffff81005d6f3f10 rc = 1
iser: iscsi_iser_conn_bind:binding iscsi conn ffff81005c7e2290 to iser_conn ffff81005d6f3f10
 Vendor: IET Model: Controller Rev: 0001
 Type: RAID ANSI SCSI revision: 05
scsi 3:0:0:0: Attached scsi generic sg2 type 12
 Vendor: IET Model: VIRTUAL-DISK Rev: 0001
 Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdb: 4001953 512-byte hdwr sectors (2049 MB)
sdb: Write Protect is off
SCSI device sdb: drive cache: write back
SCSI device sdb: 4001953 512-byte hdwr sectors (2049 MB)
sdb: Write Protect is off
SCSI device sdb: drive cache: write back
sd 3:0:0:1: Attached scsi disk sdb
sd 3:0:0:1: Attached scsi generic sg3 type 0
 Vendor: IET Model: VIRTUAL-DISK Rev: 0001
 Type: Direct-Access ANSI SCSI revision: 05
SCSI device sdc: 4001953 512-byte hdwr sectors (2049 MB)
sdc: Write Protect is off
SCSI device sdc: drive cache: write back
SCSI device sdc: 4001953 512-byte hdwr sectors (2049 MB)
sdc: Write Protect is off
SCSI device sdc: drive cache: write back
sd 3:0:0:2: Attached scsi disk sdc
sd 3:0:0:2: Attached scsi generic sg4 type 0
Unable to handle kernel paging request at 00002b7edc527310 RIP:
 [<ffffffff88260077>] :ib_ipath:ipath_sg_dma_address+0x5/0x66
PGD 0
Oops: 0000 [1] SMP
last sysfs file: /block/sdb/removabl
===========================
Got the same kernel panic on RHEL 5.5, kernel-2.6.18-194.el5.
You're hitting a problem in ipath_sg_dma_address, which means the underlying hardware is a QLogic card and the driver in use is ipath. I don't have such a testbed, as the cards I'm using are all Mellanox ones. It would be great if you could run with both iser and libiscsi2 debug prints enabled and attach the output from before the crash: for ib_iser set debug_level=2, and for libiscsi2 set debug_libiscsi=1.
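For reference, one way to apply these settings on RHEL 5 is through module options. This is only a sketch: the parameter names are the ones given in the comment above and have not been checked against the module source, and /etc/modprobe.conf is assumed to be the options file in use on the test machine.

```
# /etc/modprobe.conf (assumed location on RHEL 5)
# Parameter names as quoted in the comment above; not verified here.
options ib_iser debug_level=2
options libiscsi2 debug_libiscsi=1
```

The modules have to be reloaded (or the machine rebooted) before the options take effect.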
One more thing: are you running a 32-bit or 64-bit kernel? If 32-bit, is PAE or something similar active?
Or Gerlitz,
It's 64-bit. I don't own that server, so I will request Gurhan's help.

Gurhan Ozen,
Can you provide Or Gerlitz with the information above? If you don't have time to do so, please loan the servers to me. Thank you.
Gris, I spoke with Gurhan this afternoon. He had said you had exchanged emails and he had given you the names of the systems you could use. ib-test#..... They were all configured and ready for you. Were you able to get the information you needed? Thank you, Jeff
RHEL 5.5 GA
kernel 2.6.18-194.el5 x86_64
iscsi-initiator-utils-6.2.0.872-4.el5

System crashed after executing:
[root@dell-pe1950-03 ~]# iscsiadm -m node -T iqn.2010-10.com.example:storage-1000 -l
Logging in to [iface: iser, target: iqn.2010-10.com.example:storage-1000, portal: 192.0.1.1,3260]
Login to [iface: iser, target: iqn.2010-10.com.example:storage-1000, portal: 192.0.1.1,3260] successful.

This is the log (/var/log/messages) from before the crash:
==========================================
Jan 3 21:25:24 dell-pe1950-03 kernel: 802.1Q VLAN Support v1.8 Ben Greear <greearb>
Jan 3 21:25:24 dell-pe1950-03 kernel: All bugs added by David S. Miller <davem>
Jan 3 21:25:24 dell-pe1950-03 kernel: cxgb3i: tag itt 0x1fff, 13 bits, age 0xf, 4 bits.
Jan 3 21:25:24 dell-pe1950-03 kernel: iscsi: registered transport (cxgb3i)
Jan 3 21:25:24 dell-pe1950-03 kernel: Broadcom NetXtreme II CNIC Driver cnic v2.1.0 (Oct 10, 2009)
Jan 3 21:25:24 dell-pe1950-03 kernel: cnic: Added CNIC device: eth0
Jan 3 21:25:24 dell-pe1950-03 kernel: cnic: Added CNIC device: eth1
Jan 3 21:25:24 dell-pe1950-03 kernel: Broadcom NetXtreme II iSCSI Driver bnx2i v2.1.0 (Dec 06, 2009)
Jan 3 21:25:24 dell-pe1950-03 kernel: iscsi: registered transport (bnx2i)
Jan 3 21:25:24 dell-pe1950-03 kernel: scsi1 : Broadcom Offload iSCSI Initiator
Jan 3 21:25:24 dell-pe1950-03 kernel: scsi2 : Broadcom Offload iSCSI Initiator
Jan 3 21:25:26 dell-pe1950-03 kernel: iscsi: registered transport (tcp)
Jan 3 21:25:26 dell-pe1950-03 kernel: iscsi: registered transport (be2iscsi)
Jan 3 21:25:26 dell-pe1950-03 iscsid: iSCSI logger with pid=6145 started!
Jan 3 21:25:27 dell-pe1950-03 iscsid: transport class version 2.0-871. iscsid version 2.0-872
Jan 3 21:25:27 dell-pe1950-03 iscsid: iSCSI daemon with pid=6146 started!
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iser_connect:connecting to: ffff810058f7be4cI4, port 0xbc0c
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iser_cma_handler:event 0 conn ffff810058f7b910 id ffff810057acf800
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iser_cma_handler:event 2 conn ffff810058f7b910 id ffff810057acf800
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iser_create_ib_conn_res:setting conn ffff810058f7b910 cma_id ffff810057acf800: fmr_pool ffff81005ae5f740 qp ffff810057acfc00
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iser_cma_handler:event 9 conn ffff810058f7b910 id ffff810057acf800
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iscsi_iser_ep_poll:ib conn ffff810058f7b910 rc = 1
Jan 3 21:25:54 dell-pe1950-03 kernel: scsi3 : iSCSI Initiator over iSER, v.0.1
Jan 3 21:25:54 dell-pe1950-03 kernel: iser: iscsi_iser_conn_bind:binding iscsi conn ffff81005a0a0a90 to iser_conn ffff810058f7b910
Jan 3 21:25:54 dell-pe1950-03 iscsid: Could not set session1 priority. READ/WRITE throughout and latency could be affected.
Jan 3 21:25:54 dell-pe1950-03 setroubleshoot: SELinux is preventing iscsid (iscsid_t) "sys_ptrace" to <Unknown> (iscsid_t). For complete SELinux messages. run sealert -l a20587a6-8048-4dc8-961b-cdc048431b21
=====================================================

I have also enabled kdump. The core dump file exceeds the maximum attachment size (20MB+), and I don't have a Red Hat people page to share it on. If you badly need the vmcore, I can split it into pieces.

Let me know if you need any more info.
Created attachment 471592 [details]
Log file extracted from the kdump vmcore.

Or,
This log file might be what you want. I just extracted it from the vmcore.
(In reply to comment #9)
> Or, This log file might be what you want.

Yep, looking into that. Any chance you can also test over a Mellanox HCA on this server? I don't have the Qlogic/ipath HCA here.
Or,
I only have one mlx4 server currently. This server acts as both iscsi target and iscsi initiator.

[root@ib-test1 ~]# iscsiadm -m session
iser: [1] 192.0.0.1:3260,1 iqn.2010-10.com.example:storage-1000

It works well on both RHEL 5.5 GA and RHEL 5.6 RC1.
I will test it again if I get another mlx4 server.
I spoke to the ipath maintainer, and I guess iser is not supported on the driver. It has never been run/tested before. Should we just close this bugzilla? I am not too interested in supporting this setup if Qlogic does not and there is no customer demand - sorry Gris, you do not count as customer demand :)

However, if this is a bug that can show up in other drivers, we can use this bz to bring a fix into RHEL.
I am OK with closing this bug. If GSS needs a tech note, we can reopen it.
(In reply to comment #12)
> I spoke to the ipath maintainer, and I guess iser is not supported on the
> driver. It has never been run/tested before. Should we just close this
> bugzilla?

Mike, I'm not sure where the problem lies. It could be a bug in iser exposed only/easily over ipath, a bug in ipath exposed by iser, or some combination. Basically, I suspect something is broken w.r.t. the DMA mapping emulation done by the ipath/qib drivers (these drivers assume that each page provided for DMA emulation is mapped into the kernel virtual address space, and they aren't supported on 32-bit). Again, the problem may be in iser, but that is not certain. I wouldn't throw this away, maybe open a kernel.org ticket, and cc myself and the ipath maintainer?
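One observation from the oops in the original report may be relevant to this hypothesis: CR2 and RDX hold the same value, and that value lies in the lower half of the x86_64 address space, which is normally the user-space range rather than a kernel mapping. The sketch below only classifies register values copied from the dump. The function name and the 48-bit canonical split are assumptions made for illustration, and the result does not show which layer (iser or ipath) produced the bad pointer.

```python
# Canonical address halves on x86_64, assuming 48-bit virtual addresses.
USER_SPACE_END = 0x0000_8000_0000_0000      # first address past the lower half
KERNEL_SPACE_START = 0xFFFF_8000_0000_0000  # first address of the upper half


def classify_address(hex_addr: str) -> str:
    """Classify a hex address string (hypothetical helper, for illustration)."""
    addr = int(hex_addr, 16)
    if addr < USER_SPACE_END:
        return "user-space range"
    if addr >= KERNEL_SPACE_START:
        return "kernel range"
    return "non-canonical"


# Register values copied from the oops in the original report.
registers = {
    "CR2": "00002b7edc527310",
    "RDX": "00002b7edc527310",
    "RSI": "ffff81007a50a000",
    "RBX": "ffff81005d3b5000",
}

for name, value in registers.items():
    print(f"{name} = {value}: {classify_address(value)}")
```

If the pointer handed to ipath_sg_dma_address was expected to be a kernel virtual address, a value in the lower half would be consistent with the suspicion above, but the vmcore would be needed to confirm where it came from.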
(In reply to comment #14)
> throw this away, maybe open a kernel.org ticket, and cc myself and the ipath
> maintainer?

Sounds good to me. Will do. Thanks.
This request was evaluated by Red Hat Product Management for inclusion in Red Hat Enterprise Linux 5.7, and Red Hat does not plan to fix this issue in the currently developed update. Contact your manager or support representative in case you need to escalate this bug.
I am going to close this for now. If we find the fix or get an actual customer wanting to use ipath+iser or if Qlogic decides they want to support iser on their hw then we can reopen it.