Description of problem:
Kernel panic when logging in to an iSCSI target provided by OpenSolaris
Version-Release number of selected component (if applicable):
On AMD x86_64 hardware with 2G of physical mem
Steps to Reproduce:
1. Create target on Sun host:
sun# iscsitadm create target -b /dev/zvol/dsk/pool0/iscsi/target0 target0
sun# iscsitadm list target
iSCSI Name: iqn.1986-03.com.sun:02:607b69d6-fd3d-60b3-95d6-e629f3365280.target0
2. Discover LUN on Linux client:
rhel5# iscsiadm -m discovery -p 10.1.0.140:3260
3. Attempt to log in to target:
rhel5# iscsiadm -m node -T
Actual results:
Panic / reboot. This persists until I boot to single user, start iscsid, and '-o
delete' the node and discovery records.
Expected results:
Successful login to target, or an error message if a problem is found
Additional info:
I have only seen this behavior with the OpenSolaris iSCSI target. I used
OpenFiler and ietd both without issue previously.
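The recovery described above (boot to single user, start iscsid, then delete the node and discovery records) can be sketched roughly as follows. This is only a sketch under assumptions: TARGET and PORTAL are copied from the steps in this report, DRY_RUN and the function names are hypothetical helpers added for illustration, and the exact iscsiadm options accepted for deleting a discovery record may differ between open-iscsi versions.

```shell
#!/bin/sh
# Sketch: remove the stored node and discovery records so that the
# initiator no longer tries to log in to the target at boot.
# TARGET and PORTAL are taken from the report; adjust for your setup.
TARGET="iqn.1986-03.com.sun:02:607b69d6-fd3d-60b3-95d6-e629f3365280.target0"
PORTAL="10.1.0.140:3260"

# Hypothetical helper: with DRY_RUN=1 the command is printed, not run.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

delete_iscsi_records() {
    # Delete the node record for this target/portal pair.
    run iscsiadm -m node -T "$TARGET" -p "$PORTAL" -o delete
    # Delete the discovery record for the portal (assumed syntax).
    run iscsiadm -m discovery -p "$PORTAL" -o delete
}
```

Setting DRY_RUN=1 before calling delete_iscsi_records prints the two commands without touching the record database, which may be worth doing first on a box that panics on login.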
FYI - I got a crash dump out of the box this afternoon, if anyone is interested.
I packaged the vmcore and a copy of the uncompressed debug kernel bzip'd up -
since I have 2G of mem, it's a bit on the large side, but compressed it's under
200M. If you're interested, let me know privately, and I'll put it up for download.
Do you have the panic? If your crash is occurring right when you try to log in, it
may be this bug
It very well may be. A panic string never got written to messages that I've
found, so I can't compare the exact panic. Here's what I get from the crash
DATE: Thu Jun 7 17:51:18 2007
LOAD AVERAGE: 0.04, 0.12, 0.10
NODENAME: <uname -n>
VERSION: #1 SMP Thu May 17 03:16:52 EDT 2007
MACHINE: x86_64 (997 Mhz)
MEMORY: 2 GB
TASK: ffff81007f8f37e0 [THREAD_INFO: ffff810037cdc000]
STATE: TASK_RUNNING (PANIC)
PID: 473 TASK: ffff81007f8f37e0 CPU: 0 COMMAND: "udevd"
#0 [ffffffff80402830] crash_kexec at ffffffff800a95f2
#1 [ffffffff804028b8] iscsi_tcp_data_recv at ffffffff88beac3a
#2 [ffffffff804028f0] __die at ffffffff80062e9d
#3 [ffffffff80402930] die at ffffffff80069459
#4 [ffffffff80402960] do_invalid_op at ffffffff80069a0f
#5 [ffffffff80402978] iscsi_tcp_data_recv at ffffffff88beac3a
#6 [ffffffff80402988] __wake_up at ffffffff8002dd9b
#7 [ffffffff804029c8] sock_def_readable at ffffffff800127fd
#8 [ffffffff804029d8] netlink_sendskb at ffffffff8021bcc3
#9 [ffffffff80402a08] __iscsi_complete_pdu at ffffffff88bd99bf
#10 [ffffffff80402a20] error_exit at ffffffff8005be1d
[exception RIP: iscsi_tcp_data_recv+4808]
RIP: ffffffff88beac3a RSP: ffffffff80402ad0 RFLAGS: 00010202
RAX: 0000000000000164 RBX: ffff810072f31a90 RCX: ffff81006fef0080
RDX: ffff81006fef0080 RSI: 0000000000000286 RDI: ffff81006fef0080
RBP: ffff81006fef0080 R8: ffff81007ae5e0b8 R9: ffff810071ecf500
R10: ffff81007a9dd880 R11: ffff810071ecf500 R12: 0000000000000023
R13: 0000000000000162 R14: 00000000d49a7d66 R15: ffff81005eb6c1d8
ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#11 [ffffffff80402b28] __qdisc_run at ffffffff80216d97
#12 [ffffffff80402b78] ip_output at ffffffff800316cc
#13 [ffffffff80402ba8] ip_queue_xmit at ffffffff80033d84
#14 [ffffffff80402c08] tcp_read_sock at ffffffff80035506
#15 [ffffffff80402c58] iscsi_tcp_data_ready at ffffffff88beb06a
#16 [ffffffff80402c98] tcp_rcv_established at ffffffff8001b344
#17 [ffffffff80402ce8] tcp_v4_do_rcv at ffffffff8003ab13
#18 [ffffffff80402d08] ip_confirm at ffffffff88ce0135
#19 [ffffffff80402d48] tcp_v4_rcv at ffffffff80026d39
#20 [ffffffff80402d68] nf_hook_slow at ffffffff80054866
#21 [ffffffff80402dd8] ip_local_deliver at ffffffff80033f76
#22 [ffffffff80402e08] ip_rcv at ffffffff8003504e
#23 [ffffffff80402e38] netif_receive_skb at ffffffff8001fdad
#24 [ffffffff80402e78] tg3_poll at ffffffff88210948
#25 [ffffffff80402ef8] net_rx_action at ffffffff8000c39b
#26 [ffffffff80402f00] tg3_interrupt at ffffffff882092f1
#27 [ffffffff80402f38] __do_softirq at ffffffff80011c19
#28 [ffffffff80402f40] end_level_ioapic_vector at ffffffff80075658
#29 [ffffffff80402f68] call_softirq at ffffffff8005c330
#30 [ffffffff80402f80] do_softirq at ffffffff8006a312
#31 [ffffffff80402f90] do_IRQ at ffffffff8006a19a
Please let me know if you need more information, or if this is a dup.
I'd be very interested in the patch as well. I would like to move forward with
my work, even if it's not officially supported.
Thanks for your time.
Would it be helpful to get access to the machine(s) in question? I can arrange for an interactive session, just shoot me a message. I have no idea if this is already
known, although if it is, I'd really like to see the previously-mentioned patch.
Created attachment 156808
Sorry about the delay for the patch. I got some review comments on it and had
to respin it, so I am still testing it. It is stable enough for you to
test and see if it fixes your problem, but not stable enough for production.
Also one other question. Are you using data digests?
I am not explicitly using data digests (i.e. the defaults are in place in the
I don't know how to gather this information in OpenSolaris - it's not exposed on
the target-side (iscsitadm(8)) as an option to specify.
I'll plug the attachment in tomorrow and start playing. I understand it is not
production-ready code, but will help very much in my testing.
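On the Linux initiator side, the digest settings in effect for a node can probably be read from the stored node record. The sketch below assumes that running iscsiadm in node mode with only a target and portal prints the record, and that the digest settings appear on lines containing the word "digest" (for example node.conn[0].iscsi.HeaderDigest). Both are assumptions about the installed open-iscsi version, and filter_digest_settings is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: keep only the digest-related settings from an iscsiadm node
# record dump supplied on standard input.
filter_digest_settings() {
    grep -i 'digest'
}
```

A possible use would be piping the output of the iscsiadm node record query for the target and portal 10.1.0.140:3260 into filter_digest_settings. This says nothing about what the OpenSolaris target itself is configured to offer.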
Hmm - no dice for me:
[root@macosta-crash linux]# make modules >/tmp/build.err 2>&1
[root@macosta-crash linux]# cat /tmp/build.err
CC [M] drivers/scsi/iscsi_tcp.o
drivers/scsi/iscsi_tcp.c: In function â:
drivers/scsi/iscsi_tcp.c:1965: error: â undeclared (first use in this function)
drivers/scsi/iscsi_tcp.c:1965: error: (Each undeclared identifier is reported
drivers/scsi/iscsi_tcp.c:1965: error: for each function it appears in.)
drivers/scsi/iscsi_tcp.c:1970: error: â undeclared (first use in this function)
make: *** [drivers/scsi/iscsi_tcp.o] Error 1
make: *** [drivers/scsi] Error 2
make: *** [drivers] Error 2
[root@macosta-crash linux]# sed -n '1965p;1970p' drivers/scsi/iscsi_tcp.c
I do have some "extra" patches in, such as the NVidia layer and FUSE. Other than
that, I believe my kernel source is the stock kernel-2.6.18-8.1.4.el5 SRPM
install, linked to /usr/src/linux.
Sorry about that. You need to build with our experimental kernel. Here
is our kernel maintainer's snapshot of the current development RHEL5 kernel. Of
course this is going to be very unstable, but it gives you a look at what we are
You need his newest kernel snapshot here
The source is here
I've just upgraded to the 2.6.18-32.el5 kernel that includes the
"linux-2.6-scsi-update-iscsi_tcp-driver.patch" patch that appears to match this
issue. Since then, logging in to an OpenSolaris iSCSI target no longer panics
the system, but instead of being able to access the disk, I see this in
Jul 9 10:38:58 macosta-crash kernel: scsi1 : iSCSI Initiator over TCP/IP
Jul 9 10:38:59 macosta-crash iscsid: connection1:0 is operational now
Jul 9 10:39:00 macosta-crash kernel: iscsi: Got CHECK_CONDITION but invalid
data buffer size of 0
Is what Solaris presents forbidden by the spec, or should this work?
(In reply to comment #9)
> I've just upgraded to the 2.6.18-32.el5 kernel that includes the
> "linux-2.6-scsi-update-iscsi_tcp-driver.patch" patch that appears to match this
> issue. Since then, logging in to an OpenSolaris iSCSI target no longer panics
> the system, but instead of being able to access the disk, I see this in
> Jul 9 10:38:58 macosta-crash kernel: scsi1 : iSCSI Initiator over TCP/IP
> Jul 9 10:38:59 macosta-crash iscsid: connection1:0 is operational now
> Jul 9 10:39:00 macosta-crash kernel: iscsi: Got CHECK_CONDITION but invalid
> data buffer size of 0
> Is what Solaris presents forbidden by the spec, or should this work?
Returning a check condition without returning sense data is not allowed by the SCSI
or iSCSI specs.
Is this the current OpenSolaris target or an older version?
(In reply to comment #10)
> Is this the current OpenSolaris target or an older version?
It's *relatively* new:
rmike ~ # cat /etc/release
Solaris Nevada snv_64a X86
Copyright 2007 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 18 May 2007
rmike ~ # uname -a
SunOS macosta-crash-iscsi 5.11 snv_64a i86pc i386 i86pc
Is there any new progress here? I'm willing to set up a custom kernel, and
provide remote access to get this working. Right now I have an idle OpenSolaris
box waiting for when RHEL5 can mount it (the MS iSCSI initiator doesn't appear
to have any issues in this space.)
Send me mail at firstname.lastname@example.org with login details. I think this problem of
getting a check condition with no sense data is fixed on the OpenSolaris iSCSI target
side in some version, though.
This should be fixed in recent OpenSolaris targets now. Closing. If it still occurs, please reopen.