Bug 577935 - Smartctl segmentation fault and crash followed by kernel invalid opcode trace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Fedora
Classification: Fedora
Component: smartmontools
Version: 12
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Michal Hlavinka
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-03-29 19:21 UTC by Przemek Klosowski
Modified: 2010-11-23 22:01 UTC (History)

Fixed In Version: smartmontools-5.39.1-3.fc12
Clone Of:
Environment:
Last Closed: 2010-11-23 21:59:04 UTC
Type: ---
Embargoed:



Description Przemek Klosowski 2010-03-29 19:21:02 UTC
Description of problem:

smartctl crashes with a segmentation fault when asked to run a SMART
self-test on a disk behind a Dell MegaRAID controller.

Version-Release number of selected component (if applicable):

smartctl 5.39.1 2010-01-28 r3054 [x86_64-redhat-linux-gnu] (local build)
smartmontools-5.39.1-1.fc12.x86_64


How reproducible:

Always reproducible

Steps to Reproduce:
1. smartctl -t short /dev/sda -d megaraid,0
2. segmentation fault and crash
  
Actual results:
smartctl 5.39.1 2010-01-28 r3054 [x86_64-redhat-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Segmentation fault
 
Message from syslogd@webster at Mar 29 14:45:01 ...
 kernel:invalid opcode: 0000 [#8] SMP 
 kernel:last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
 kernel:Stack:
 kernel:Call Trace:
 kernel:Code: 00 08 00 00 49 c1 ef 0b 4c 8b 75 c0 49 81 c6 ff 07 00 00 49 c1 ee 0b 48 81 7d c0 01 10 00 00 45 19 ed 41 83 c5 02 45 85 f6 75 04 <0f> 0b eb fe 48 c7 c7 c0 7a a0 81 45 89 ec e8 a9 10 22 00 49 89 
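
Note on the Code: line above: in an x86 oops dump the byte in angle brackets is the one at the faulting instruction pointer, and the pair 0f 0b encodes ud2, the instruction the kernel's BUG() macro uses to trap on purpose. That suggests, without proving it, that the kernel hit an internal assertion rather than executing stray data. The following is only a minimal sketch of how such a line can be checked mechanically; the names OopsCode, parse_oops_code and faults_on_ud2 are made up for this illustration and belong to no existing tool.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Result of scanning one "Code:" line from a kernel oops (hypothetical type).
struct OopsCode {
    std::vector<std::uint8_t> bytes;  // all opcode bytes, in order
    int fault_index = -1;             // index of the <xx> byte, -1 if absent
};

// True if s is exactly two hexadecimal digits.
static bool is_hex_pair(const std::string& s) {
    return s.size() == 2 &&
           std::isxdigit(static_cast<unsigned char>(s[0])) &&
           std::isxdigit(static_cast<unsigned char>(s[1]));
}

// Collect the hex byte tokens of the line and remember which one was
// wrapped in angle brackets. Tokens that are not hex pairs are ignored.
OopsCode parse_oops_code(const std::string& line) {
    OopsCode out;
    std::istringstream in(line);
    std::string tok;
    while (in >> tok) {
        bool marked = tok.size() == 4 && tok.front() == '<' && tok.back() == '>';
        if (marked)
            tok = tok.substr(1, 2);
        if (!is_hex_pair(tok))
            continue;  // skips prefixes such as "kernel:Code:"
        if (marked)
            out.fault_index = static_cast<int>(out.bytes.size());
        out.bytes.push_back(
            static_cast<std::uint8_t>(std::stoul(tok, nullptr, 16)));
    }
    return out;
}

// True if the marked byte starts the two-byte ud2 encoding (0f 0b).
bool faults_on_ud2(const OopsCode& c) {
    if (c.fault_index < 0)
        return false;
    std::size_t i = static_cast<std::size_t>(c.fault_index);
    assert(i < c.bytes.size());
    return i + 1 < c.bytes.size() && c.bytes[i] == 0x0f && c.bytes[i + 1] == 0x0b;
}
```

Passing the Code: line from this report to parse_oops_code and then calling faults_on_ud2 on the result should report a ud2 at the marked position. The parser is deliberately naive: it treats any two-hex-digit token as an opcode byte, which is fine for this log format but not for arbitrary text.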

Expected results:

A friendly message confirming that the SMART self-test has been started.

Additional info:
Dell PowerEdge R710 with two Xeon E5530 CPUs and 8 GB of RAM, running Fedora 12 x86_64
LSI Logic / Symbios Logic MegaRAID SAS 1078 (rev 04)

Comment 1 Przemek Klosowski 2010-03-29 20:08:00 UTC
A stack trace of smartctl at the system call that triggers the problem is
a little hard to get: the crash happens in the kernel, so you cannot simply run the debugger up to the error (the stack is gone by then). It does seem, however, that
the problem is in an ioctl:

#0  os_linux::linux_megaraid_device::megasas_cmd (this=0x7ffff821a030, cdbLen=<value optimized out>, cdb=0x7fffffffc8f0, dataLen=-134715736, data=0x0) at os_linux.cpp:1112
#1  0x00007ffff7fd682f in os_linux::linux_megaraid_device::scsi_pass_through (this=<value optimized out>, iop=0x7fffffffc870) at os_linux.cpp:1076
#2  0x00007ffff7fcba52 in scsiSendDiagnostic (device=0x7ffff821a030, functioncode=<value optimized out>, pBuf=<value optimized out>, bufLen=<value optimized out>) at scsicmds.cpp:722
#3  0x00007ffff7fcbb9f in scsiSmartExtendSelfTest (device=<value optimized out>) at scsicmds.cpp:1699
#4  0x00007ffff7fd45ad in scsiPrintMain (device=<value optimized out>, options=<value optimized out>) at scsiprint.cpp:1703
#5  0x00007ffff7fbbcf2 in main_worker (argc=<value optimized out>, argv=<value optimized out>) at smartctl.cpp:951
#6  0x00007ffff7fbc049 in main (argc=<value optimized out>, argv=<value optimized out>) at smartctl.cpp:967

Line 1112 of os_linux.cpp is:
	  rc = ioctl(m_fd, MEGASAS_IOC_FIRMWARE, &uio);

where uio is:

{host_no = 2, __pad1 = 0, sgl_off = 48, sge_count = 1, sense_off = 0, sense_len = 0, frame = {
    raw = "\004\000\377\000\000\000\006\001\000\000\000\000\000\000\000\000\020", '\000' <repeats 15 times>, "\035@", '\000' <repeats 93 times>, hdr = {cmd = 4 '\004', sense_len = 0 '\000', 
      cmd_status = 255 '\377', scsi_status = 0 '\000', target_id = 0 '\000', lun = 0 '\000', cdb_len = 6 '\006', sge_count = 1 '\001', context = 0, pad_0 = 0, flags = 16, timeout = 0, data_xferlen = 0}}, 
  sgl = {{iov_base = 0x0, iov_len = 0} <repeats 16 times>}}
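
One thing that stands out in the dump above is that the frame header advertises sge_count = 1 while data_xferlen is 0 and every sgl entry has iov_base = 0x0 and iov_len = 0. Whether that mismatch is what upsets the kernel is an assumption on my part, not something this report confirms. As a minimal sketch of the idea only, the code below fills a frame so that a scatter/gather entry is advertised only when a data buffer is really supplied. SketchSge, SketchFrame and fill_frame are hypothetical, simplified stand-ins and not the real megaraid structures or the smartmontools code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified stand-ins for the driver's scatter/gather entry
// and pass-through frame; the real definitions live in the megaraid headers.
struct SketchSge {
    void*       base;
    std::size_t len;
};

struct SketchFrame {
    std::uint8_t  cdb_len;
    std::uint8_t  sge_count;
    std::uint32_t data_xferlen;
    std::uint8_t  cdb[16];
    SketchSge     sgl[16];
};

// Fill a frame, advertising a scatter/gather entry only when the caller
// actually supplies a data buffer. Returns false on inconsistent input.
bool fill_frame(SketchFrame& f, const std::uint8_t* cdb, std::size_t cdb_len,
                void* data, std::size_t data_len)
{
    if (cdb == nullptr || cdb_len == 0 || cdb_len > sizeof(f.cdb))
        return false;
    if ((data == nullptr) != (data_len == 0))
        return false;  // pointer and length must agree

    std::memset(&f, 0, sizeof(f));
    f.cdb_len = static_cast<std::uint8_t>(cdb_len);
    std::memcpy(f.cdb, cdb, cdb_len);

    if (data_len > 0) {
        // Data command: exactly one entry describing the caller's buffer.
        f.sge_count    = 1;
        f.data_xferlen = static_cast<std::uint32_t>(data_len);
        f.sgl[0].base  = data;
        f.sgl[0].len   = data_len;
    }
    // Non-data command: sge_count stays 0 and the sgl array stays zeroed.
    assert((f.sge_count == 0) == (data_len == 0));
    return true;
}
```

With the 6-byte CDB from this report ("\035@" followed by zeros) and no data buffer, fill_frame leaves sge_count at 0, which is the case the dump above gets differently.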


I don't know whether it is useful, but the last non-hardware-specific call up the stack is line 722 of scsicmds.cpp:
   if (!device->scsi_pass_through(&io_hdr)); 

At that point, io_hdr is:
$1 = {cmnd = 0x7fffffffc8f0 "\035@", cmnd_len = 6, dxfer_dir = 0, dxferp = 0x0, dxfer_len = 0, sensep = 0x7fffffffc8d0 "HITACHI ", max_sense_len = 32, timeout = 18000, resp_sense_len = 0, 
  scsi_status = 0 '\000', resid = 0}

Comment 3 Bug Zapper 2010-11-03 18:19:27 UTC
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12.  It is Fedora's policy to close all
bug reports from releases that are no longer maintained.  At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '12'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 12's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 12 is end of life.  If you 
would still like to see this bug fixed and are able to reproduce it 
against a later version of Fedora please change the 'version' of this 
bug to the applicable version.  If you are unable to change the version, 
please add a comment here and someone will do it for you.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events.  Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

The process we are following is described here: 
http://fedoraproject.org/wiki/BugZappers/HouseKeeping

Comment 4 Fedora Update System 2010-11-15 15:14:40 UTC
smartmontools-5.40-3.fc14 has been submitted as an update for Fedora 14.
https://admin.fedoraproject.org/updates/smartmontools-5.40-3.fc14

Comment 5 Fedora Update System 2010-11-15 15:14:56 UTC
smartmontools-5.40-3.fc13 has been submitted as an update for Fedora 13.
https://admin.fedoraproject.org/updates/smartmontools-5.40-3.fc13

Comment 6 Fedora Update System 2010-11-15 15:47:59 UTC
smartmontools-5.39.1-3.fc12 has been submitted as an update for Fedora 12.
https://admin.fedoraproject.org/updates/smartmontools-5.39.1-3.fc12

Comment 7 Fedora Update System 2010-11-15 22:21:50 UTC
smartmontools-5.39.1-3.fc12 has been pushed to the Fedora 12 testing repository.  If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
 su -c 'yum --enablerepo=updates-testing update smartmontools'
You can provide feedback for this update here: https://admin.fedoraproject.org/updates/smartmontools-5.39.1-3.fc12

Comment 8 Fedora Update System 2010-11-23 21:58:54 UTC
smartmontools-5.40-3.fc13 has been pushed to the Fedora 13 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 9 Fedora Update System 2010-11-23 22:01:03 UTC
smartmontools-5.40-3.fc14 has been pushed to the Fedora 14 stable repository.  If problems still persist, please make note of it in this bug report.

Comment 10 Fedora Update System 2010-11-23 22:01:35 UTC
smartmontools-5.39.1-3.fc12 has been pushed to the Fedora 12 stable repository.  If problems still persist, please make note of it in this bug report.

