Bug 549771
Summary: | brcm_iscsiuio daemon segfaults after boot
---|---
Product: | Red Hat Enterprise Linux 5
Component: | iscsi-initiator-utils
Version: | 5.4
Hardware: | x86_64
OS: | Linux
Status: | CLOSED WONTFIX
Severity: | medium
Priority: | low
Target Milestone: | rc
Target Release: | ---
Reporter: | Phillip Sorensen <pas37>
Assignee: | Chris Leech <cleech>
QA Contact: | Red Hat Kernel QE team <kernel-qe>
CC: | benli, coughlan, emoryb, enarvaez, gideonn, mchan, mchristi, pas37, thenzl
Flags: | pas37: needinfo-
Doc Type: | Bug Fix
Last Closed: | 2014-06-02 13:07:31 UTC
Description
Phillip Sorensen
2009-12-22 16:17:43 UTC
Proper bug number for selinux trouble is Bug #548599

Adding broadcom's Ben Li.

Hi Phillip,
Could you provide more context on what was happening when the segfault occurred? Also would you have the core file or a stack trace?
Thanks again.
-Ben

I have not been able to tell much about the context. It seems to happen if I reboot and never log in, or if I log in and run various programs. I loaded the debuginfo files for iscsi-initiator-utils and glibc and then attached to the daemon with gdb. I got the following backtrace (bt full):

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x4458b940 (LWP 2114)]
    0x000000383b27bf0b in memcpy () from /lib64/libc.so.6
    (gdb) bt full
    #0  0x000000383b27bf0b in memcpy () from /lib64/libc.so.6
    No symbol table info available.
    #1  0x000000000040d75a in cnic_read ()
    No symbol table info available.
    #2  0x0000000000405adb in process_packets ()
    No symbol table info available.
    #3  0x0000000000406242 in nic_loop ()
    No symbol table info available.
    #4  0x000000383c206617 in start_thread (arg=<value optimized out>) at pthread_create.c:297
            __res = <value optimized out>
            pd = Could not find the frame base for "start_thread".
            unwind_buf = Could not find the frame base for "start_thread".
            not_first_call = <value optimized out>
            robust = <value optimized out>
    #5  0x000000383b2d3c2d in clone () from /lib64/libc.so.6
            fstab_state = {fs_fp = 0x0, fs_buffer = 0x0, fs_mntres = {
                mnt_fsname = 0x0, mnt_dir = 0x0, mnt_type = 0x0, mnt_opts = 0x0,
                mnt_freq = 0, mnt_passno = 0}, fs_ret = {fs_spec = 0x0, fs_file = 0x0,
                fs_vfstype = 0x0, fs_mntops = 0x0, fs_type = 0x0, fs_freq = 0,
                fs_passno = 0}}
            __elf_set___libc_subfreeres_element_fstab_free__ = (
                const void *) 0x383b30a9e0

I am attaching the core file I got with the gdb command generate-core-file. I don't know how good it is. I got the following errors when I ran the command:

    warning: Memory read failed for corefile section, 77824 bytes at 0x00002aaaaaaad000.
    warning: Memory read failed for corefile section, 77824 bytes at 0x00002aaaaaac4000.

The core file is too big to attach. It can be downloaded from http://staff.chess.cornell.edu/~sorensen/core.1899

Unfortunately, gdb didn't like the core provided. I think we will have to debug this via the system logs and the brcm_iscsiuio logs. Phillip, could you also attach the /var/log/messages* and /var/log/brcm_iscsiuio log files?
Thanks again.

Created attachment 382315 [details]
/var/log/messages file
Created attachment 382316 [details]
/var/log/brcm_iscsi.log file
Attaching /var/log/messages and /var/log/brcm_iscsi.log
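Since the thread compares backtraces across package versions, a small helper can reduce a gdb `bt` dump to its call sequence so two crashes are easy to compare. This is only a sketch: the function name `call_sequence` and the regular expression are illustrative assumptions, not part of gdb or iscsi-initiator-utils, and the sample frames are copied from the backtrace quoted in this report.

```python
import re

# Matches gdb frame lines such as:
#   #1  0x000000000040d75a in cnic_read ()
#   #4  0x000000383c206617 in start_thread (arg=...) at pthread_create.c:297
# The address part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?([A-Za-z_][\w.]*)\s*\("
)


def call_sequence(backtrace_text):
    """Return the function names in a gdb backtrace, innermost frame first.

    Hypothetical helper for comparing crashes; it ignores locals and
    'No symbol table info available.' lines.
    """
    frames = {}
    for match in FRAME_RE.finditer(backtrace_text):
        # Keep the first occurrence of each frame number.
        frames.setdefault(int(match.group(1)), match.group(2))
    return [frames[number] for number in sorted(frames)]


# Frames copied from the backtrace quoted in this bug report.
SAMPLE = """
#0 0x000000383b27bf0b in memcpy () from /lib64/libc.so.6
#1 0x000000000040d75a in cnic_read ()
#2 0x0000000000405adb in process_packets ()
#3 0x0000000000406242 in nic_loop ()
#4 0x000000383c206617 in start_thread (arg=<value optimized out>) at pthread_create.c:297
#5 0x000000383b2d3c2d in clone () from /lib64/libc.so.6
"""

if __name__ == "__main__":
    print(" <- ".join(call_sequence(SAMPLE)))
```

Two crashes can then be treated as the same failure path when their lists compare equal, which is a coarser but quicker check than reading both `bt full` dumps side by side.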
Also added Emory to see if he had seen anything like this before during his testing.

After looking through the /var/log/messages and /var/log/brcm_iscsi.log files I didn't see anything suspicious. But I did notice that you were using an older version of brcm_iscsiuio. The logs show version 0.4.3, which must mean you are using iscsi-initiator-utils-6.2.0.871-0.10.el5.

If you get a chance, could you try iscsi-initiator-utils-6.2.0.871-0.12.el5? There were a number of bug fixes when going from 0.4.3 -> 0.4.8. Some of the bug fixes have to do with resource allocation and cleanup.
Thanks again.

I am seeing the same thing with the updated iscsi-initiator-utils-6.2.0.871-0.12.el5. The backtrace shows the same call sequence.

Thanks Phillip for trying the later version of the iscsi-initiator-utils. But I think we will need to reproduce this problem here in the Broadcom Lab to better understand what is going on. Could you provide a description of your test machine configuration (number of test machines, RHEL configuration, iSCSI iface files, and so on) so that we can mimic it here in our lab? Could you also provide the reproduction steps for what you did to cause this segfault (daemon running, commands executed)?

Emory, do you have a machine in the lab where we can try Phillip's configuration?
Thanks again.

Phillip,
As Ben wrote, can you please provide us the exact details as requested?
Thanks,
Gidi

My initial test machine is a Cybertron-built SuperMicro system based on the PDSML+ motherboard with an Intel X3220 processor. We are using the HP NC382T card for the iSCSI. The install is standard RHEL 5.4 with iscsi-initiator-utils-6.2.0.871-0.12.el5. I am connecting to a RHEL5 host running scsi-target-utils-0.0-0.20070620snap. I will attach the output of ps -e and dmesg after reboot and login.
The network settings are:

    ::::::::::::::
    /etc/sysconfig/network-scripts/ifcfg-eth2
    ::::::::::::::
    # Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express
    DEVICE=eth2
    BOOTPROTO=dhcp
    HWADDR=18:A9:05:78:B7:FC
    ONBOOT=yes
    ::::::::::::::
    /etc/sysconfig/network-scripts/ifcfg-eth3
    ::::::::::::::
    # Broadcom NetXtreme II BCM5709 1000Base-T (C0) PCI Express
    DEVICE=eth3
    BOOTPROTO=static
    HWADDR=18:A9:05:78:B7:FE
    IPADDR=192.168.182.132
    NETMASK=255.255.255.0
    ONBOOT=yes

The iface files look like:

    ::::::::::::::
    ifaces/bnx2i.18:a9:05:78:b7:fd
    ::::::::::::::
    # BEGIN RECORD 2.0-871
    iface.iscsi_ifacename = bnx2i.18:a9:05:78:b7:fd
    iface.ipaddress = 128.84.182.243
    iface.hwaddress = 18:a9:05:78:b7:fd
    iface.transport_name = bnx2i
    # END RECORD
    ::::::::::::::
    ifaces/bnx2i.18:a9:05:78:b7:ff
    ::::::::::::::
    # BEGIN RECORD 2.0-871
    iface.iscsi_ifacename = bnx2i.18:a9:05:78:b7:ff
    iface.ipaddress = 192.168.182.133
    iface.hwaddress = 18:a9:05:78:b7:ff
    iface.transport_name = bnx2i
    # END RECORD

The segfault occurs under different conditions. All I have to do is boot and wait. It will happen without me even logging in. Sometimes it seems to be right after boot, sometimes it will take 10 or 15 minutes.

Yesterday I set up one of our Dell R210 production servers (a Xeon 3440 based system with the Dell BCM5709 card). I have not done much testing with it yet, but my initial testing shows a segfault. I still need to test whether it looks the same.

Let me know if there are additional details.

Created attachment 383853 [details]
Output of the 'ps -e' command after boot
Created attachment 383876 [details]
Result of dmesg command after boot
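When mimicking this setup in a lab, it can help to load the iface records quoted above into a structure and sanity-check them against the ifcfg settings. The sketch below is an assumption-laden illustration: `parse_iface_record` and `same_subnet` are hypothetical helper names, the parser only handles the simple `key = value` layout shown in this report, and nothing here is claimed to be related to the cause of the segfault.

```python
import ipaddress


def parse_iface_record(text):
    """Parse 'key = value' lines of an iface record into a dict.

    Hypothetical helper: blank lines and '#' comment lines (such as
    '# BEGIN RECORD 2.0-871') are skipped.
    """
    record = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            record[key.strip()] = value.strip()
    return record


def same_subnet(iface_ip, host_ip, netmask):
    """Return True if iface_ip lies in the network defined by host_ip/netmask."""
    network = ipaddress.ip_network(f"{host_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(iface_ip) in network


# Record copied from the iface file quoted in this bug report.
SAMPLE_RECORD = """\
# BEGIN RECORD 2.0-871
iface.iscsi_ifacename = bnx2i.18:a9:05:78:b7:ff
iface.ipaddress = 192.168.182.133
iface.hwaddress = 18:a9:05:78:b7:ff
iface.transport_name = bnx2i
# END RECORD
"""

if __name__ == "__main__":
    record = parse_iface_record(SAMPLE_RECORD)
    # IPADDR and NETMASK values are taken from the quoted ifcfg-eth3 file.
    print(record["iface.iscsi_ifacename"],
          same_subnet(record["iface.ipaddress"],
                      "192.168.182.132", "255.255.255.0"))
```

The subnet comparison uses only the statically configured eth3 values from the report; eth2 is configured by DHCP, so its network is not known from the quoted files and is left out.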
Hi Phillip,
I was wondering if Emory and I could get access to bugzilla https://bugzilla.redhat.com/show_bug.cgi?id=545999 to see if there is any additional setup we would need for the brcm_iscsiuio daemon to segfault.
Thanks again.
-Ben

Ben, it's actually bug #548599.

Since it is too late to address this issue in RHEL 5.5, it has been proposed for RHEL 5.6. Contact your support representative if you need to escalate this issue.

This request was evaluated by Red Hat Product Management for inclusion in the current release of Red Hat Enterprise Linux. Because the affected component is not scheduled to be updated in the current release, Red Hat is unfortunately unable to address this request at this time. Red Hat invites you to ask your support representative to propose this request, if appropriate and relevant, in the next release of Red Hat Enterprise Linux.

This bug/component is not included in scope for RHEL-5.11.0, which is the last RHEL5 minor release. This Bugzilla will soon be CLOSED as WONTFIX (at the end of RHEL5.11 development phase (Apr 22, 2014)). Please contact your account manager or support representative in case you need to escalate this bug.

Thank you for submitting this request for inclusion in Red Hat Enterprise Linux 5. We've carefully evaluated the request, but are unable to include it in the RHEL5 stream. If the issue is critical for your business, please provide additional business justification through the appropriate support channels (https://access.redhat.com/site/support).