Description of problem:
Segfault on startup when trying to monitor SCSI hard drives on AMD64
Version-Release number of selected component (if applicable):
# rpm -q kernel-utils
Steps to Reproduce:
1. Set up an AMD64 machine with SCSI hard drives
2. Add "/dev/sda -a -m root@x86_64-3as.test.redhat.com" to
3. Launch "smartd -d" and see the crash.
The crash only occurs when smartd is compiled with -O2 or -O1, not
when compiled with -O0.
Backtrace with -O1:
scsidevicescan (devices=0x7fbffff3f0, cfg=0x0) at smartd.c:852
852 smartd.c: No such file or directory.
#0 scsidevicescan (devices=0x7fbffff3f0, cfg=0x0) at smartd.c:852
#1 0x0000000000405772 in main (argc=0, argv=0x0) at smartd.c:2175
It crashes as soon as it tries to access cfg->, because gdb shows cfg
as NULL, yet adding a 'printf ("%p\n", cfg);' shows that cfg is not NULL.
Backtrace at a breakpoint with -O0 (there is no crash with -O0):
Breakpoint 1, scsidevicescan (devices=0x7fbffff400, cfg=0x51cfa0)
758 smartd.c: No such file or directory.
#0 scsidevicescan (devices=0x7fbffff400, cfg=0x51cfa0) at smartd.c:758
#1 0x00000000004067e4 in main (argc=2, argv=0x7fbffff908) at
I tried debugging with ElectricFence, both under- and overfencing, and
there don't seem to be any memory corruption issues.
I still don't understand why, under gdb, "cfg" would show up as NULL when
compiled with -O1 and non-NULL when compiled with -O0.
(Bear in mind that those sources were modified a bit while trying to
debug the problem.) The crash actually occurs when accessing cfg->:
// record number of device, type of device, increment device count
cfg->tryata = 0;
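To narrow down where cfg goes bad without depending on what gdb shows for optimised code, a checkpoint that reports a NULL pointer instead of dereferencing it can be placed around the suspect calls. A small sketch only: struct cfg_sketch, ptr_ok(), PTR_OK and clear_tryata() are made-up names for illustration and are not part of smartd.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for smartd's per-device config; only the field
   mentioned in this report is modelled. */
struct cfg_sketch {
    int tryata;
};

/* Report (rather than crash) if ptr is NULL at a given checkpoint.
   Returns 1 when the pointer is usable, 0 otherwise. */
int ptr_ok(const void *ptr, const char *name, const char *file, int line)
{
    assert(name != NULL && file != NULL);
    if (ptr == NULL) {
        fprintf(stderr, "%s:%d: %s is NULL\n", file, line, name);
        return 0;
    }
    return 1;
}

/* Records the expression text and source position of the checkpoint. */
#define PTR_OK(p) ptr_ok((p), #p, __FILE__, __LINE__)

/* Mirrors the failing statement, with a checkpoint before the dereference. */
int clear_tryata(struct cfg_sketch *cfg)
{
    if (!PTR_OK(cfg))
        return -1;
    cfg->tryata = 0;
    return 0;
}
```

Calling clear_tryata() with a valid pointer clears the field and returns 0; calling it with NULL prints the file and line of the checkpoint to stderr and returns -1 instead of segfaulting.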
I think it could be a compilation/optimisation problem.
There seems to be some corruption in _testunitready() in scsicmds.c.
Commenting the whole function out removes the crash.
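ElectricFence only guards heap allocations, so corruption of a stack buffer inside a function like this would not have shown up in the ElectricFence runs. A minimal sketch of a guard-byte check that can catch such an overrun follows. It is an illustration under assumptions: fill_buffer() is a hypothetical stand-in and not the real _testunitready(), and whether the actual bug was a buffer overrun is not established in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_LEN   8
#define GUARD_BYTE  0xA5
#define MAX_PAYLOAD 64

/* Hypothetical stand-in for a routine that fills a caller-supplied buffer.
   It trusts nbytes and writes that many bytes, whatever the real size is. */
void fill_buffer(unsigned char *buf, size_t nbytes)
{
    memset(buf, 0, nbytes);
}

/* Returns 1 if the guard zones before and after the payload are intact,
   0 if any guard byte was overwritten. */
int guards_intact(const unsigned char *region, size_t payload_len)
{
    size_t i;

    assert(region != NULL);
    for (i = 0; i < GUARD_LEN; i++) {
        if (region[i] != GUARD_BYTE)
            return 0;
        if (region[GUARD_LEN + payload_len + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}

/* Gives fill_buffer() a payload of payload_len bytes but tells it the
   buffer holds claimed_len bytes, then checks the guards.
   Returns 1 (intact), 0 (overrun detected) or -1 (arguments out of range,
   which keeps every write inside the local region). */
int check_fill(size_t payload_len, size_t claimed_len)
{
    unsigned char region[GUARD_LEN + MAX_PAYLOAD + GUARD_LEN];

    if (payload_len > MAX_PAYLOAD || claimed_len > payload_len + GUARD_LEN)
        return -1;

    memset(region, GUARD_BYTE, sizeof region);
    fill_buffer(region + GUARD_LEN, claimed_len);
    return guards_intact(region, payload_len);
}
```

With a 32-byte payload, check_fill(32, 32) leaves the guards intact, while check_fill(32, 36) writes four bytes into the trailing guard zone and is reported as an overrun.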
Created attachment 100584 [details]
This patch fixes the crashes on startup on AMD64.
Thanks very much for your patch. This was fixed in smartmontools
on November 19, 2003. If file scsicmds.c is version 1.65 or
greater then it incorporates this fix.
This is fixed in all smartmontools releases >= 5.26
My current suggestion is that RH/Fedora upgrade to version 5.30
of smartmontools, with a one-line patch to fix one other segv.
This problem is resolved in the next release of Red Hat Enterprise Linux. Red
Hat does not currently plan to provide a resolution for this in a Red Hat
Enterprise Linux update for currently deployed systems.
With the goal of minimizing risk of change for deployed systems, and in response
to customer and partner requirements, Red Hat takes a conservative approach when
evaluating changes for inclusion in maintenance updates for currently deployed
products. The primary objectives of update releases are to enable new hardware
platform support and to resolve critical defects.
*** Bug 143553 has been marked as a duplicate of this bug. ***