Bug 427550

Summary: dmraid segfaults on boot resulting in broken mirror
Product: Red Hat Enterprise Linux 5
Reporter: Michael Young <m.a.young>
Component: dmraid
Assignee: Ian Kent <ikent>
Status: CLOSED ERRATA
QA Contact: Corey Marthaler <cmarthal>
Severity: low
Priority: low
Version: 5.1
CC: agk, dwysocha, heinzm, mbroz, prockai
Target Milestone: rc
Hardware: x86_64
OS: Linux
Fixed In Version: RHBA-2008-0475
Doc Type: Bug Fix
Last Closed: 2008-05-21 17:21:01 UTC
Attachments:
Patch to prevent SEGV when activating raid set

Description Michael Young 2008-01-04 16:53:56 UTC
I have a CentOS 5 box with a RAID bus controller: Intel Corporation
631xESB/632xESB SATA RAID Controller (rev 09). It installs okay, but on first
boot I get the segfault
dmraid[784]: segfault at 00000000007038a0 rip 00000000007038a0 rsp
00007fff38f14958 error 15
in dmesg, after which the system boots with / mounted from /dev/sda1 rather
than the device-mapped drive, meaning that the drives aren't actually mirrored.

The offending command in the initrd appears to be
dmraid -ay -i -p "ddf1_4c53492020202020808626820000000034afad3200000a28"
and running this by hand within gdb gives
(gdb) where
#0  0x00002aaaab8aca00 in main_arena () from /lib64/libc.so.6
#1  0x00002aaaaacd000f in group_set (lc=0x42bc5a0,
    name=0x7fff50a88bb6 "ddf1_4c53492020202020808626820000000034afad3200000a28")
at metadata/metadata.c:657
#2  0x0000000000402525 in build_sets (lc=0x42bc5a0, sets=<value optimized out>)
    at toollib.c:69
#3  0x0000000000401b6a in perform (lc=0x42bc5a0, argv=<value optimized out>)
    at commands.c:664
#4  0x000000000040166e in main (argc=<value optimized out>,
    argv=0x7fff50a87428) at dmraid.c:34
(gdb) up
#1  0x00002aaaaacd000f in group_set (lc=0x42bc5a0,
    name=0x7fff50a88bb6 "ddf1_4c53492020202020808626820000000034afad3200000a28")
at metadata/metadata.c:657
657             return rd->fmt->group(lc, rd);

Comment 1 Michael Young 2008-01-04 16:54:45 UTC
I forgot to mention this is on an x86_64 machine.

Comment 2 Michael Young 2008-01-04 17:46:35 UTC
It looks like my problem might be related to the one mentioned here
http://www.redhat.com/archives/ataraid-list/2007-November/msg00011.html
though it is still a bug because the code shouldn't segfault.

Comment 8 Ian Kent 2008-01-24 11:59:14 UTC
I'll see if I can track this down.

Comment 9 Ian Kent 2008-02-06 02:06:50 UTC
Created attachment 294068 [details]
Patch to prevent SEGV when activating raid set

It turns out that, for this device, if the raid set
name given doesn't exist, doesn't match an existing
raid set name, or is a sub-set of a raid set, then
dmraid would SEGV.
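
For illustration only, here is a minimal, self-contained sketch of the kind of
NULL guard such a patch might add. It is not the attached patch: the struct
layouts, the lookup_raid_dev() helper and the error message are stand-ins
invented for this sketch; only the shape of the final rd->fmt->group(lc, rd)
call mirrors the backtrace above.

#include <stdio.h>
#include <string.h>

struct lib_context;                     /* opaque stand-in */
struct raid_dev;

struct dmraid_format {
        int (*group)(struct lib_context *lc, struct raid_dev *rd);
};

struct raid_dev {
        const char *name;
        struct dmraid_format *fmt;
};

/* Stand-in lookup: returns NULL when no set matches the given name,
 * which is the case the reporter hit. */
static struct raid_dev *lookup_raid_dev(struct raid_dev *devs, size_t n,
                                        const char *name)
{
        for (size_t i = 0; i < n; i++)
                if (!strcmp(devs[i].name, name))
                        return &devs[i];
        return NULL;
}

static int group_set(struct lib_context *lc, struct raid_dev *devs,
                     size_t n, const char *name)
{
        struct raid_dev *rd = lookup_raid_dev(devs, n, name);

        /* The guard: without it, a failed lookup leads straight to an
         * indirect call through garbage, i.e. the reported SEGV. */
        if (!rd || !rd->fmt || !rd->fmt->group) {
                fprintf(stderr, "no RAID set matching \"%s\"\n", name);
                return 0;
        }

        return rd->fmt->group(lc, rd);
}

int main(void)
{
        /* One known set; asking for an unknown name now fails cleanly
         * instead of segfaulting. */
        struct raid_dev devs[] = { { "ddf1_example", NULL } };

        group_set(NULL, devs, 1, "no_such_set");
        return 0;
}

With a guard of this shape in place, a bad or stale set name produces an error
message instead of the crash seen in the description.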

Comment 10 Ian Kent 2008-02-06 02:10:53 UTC
I'm not sure that this correction will actually resolve
the issue reported here but it does resolve the SEGV
that occurs.

Could someone give the posted patch a try, please?

Ian


Comment 11 Michael Young 2008-02-06 22:33:37 UTC
Yes, it seems to fix the segfault. I am currently getting around the changing
name of the raid device by hacking dmraid -ay -i -P p into the initrd instead of
the existing dmraid and kpartx lines.

Comment 12 RHEL Program Management 2008-02-08 04:37:27 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 13 Ian Kent 2008-02-08 04:40:33 UTC
(In reply to comment #11)
> Yes, it seems to fix the segfault. I am currently getting around the changing
> name of the raid device by hacking dmraid -ay -i -P p into the initrd instead of
> the existing dmraid and kpartx lines.

Do you mean the way dmraid won't activate a specified
individual RAID subset?

I believe we're waiting on patches for dmraid before this
support can be added.

Ian

Comment 14 Michael Young 2008-02-08 11:39:42 UTC
(In reply to comment #13)
> (In reply to comment #11)
> > Yes, it seems to fix the segfault. I am currently getting around the changing
> > name of the raid device by hacking dmraid -ay -i -P p into the initrd instead of
> > the existing dmraid and kpartx lines.
> 
> Do you mean the way dmraid won't activate a specified
> individual RAID subset?
> 
> I believe we're waiting on patches for dmraid before this
> support can be added.
Well, my main reason for moving to dmraid -ay -i -P p is that in the initrd I
couldn't use a fixed name, because for these disks the full name changes between
boots (for example from
ddf1_4c53492020202020808626820000000034db222e00000a28 to
ddf1_4c53492020202020808626820000000034dd936800000a28
after a reboot). I don't know whether
dmraid -ay -i -p "ddf1_4c53492020202020808626820000000034dd936800000a28"
or similar actually starts the raid, because the raid is already running by the
time I can test it, so that might change the result. Currently, though, running
dmraid -ay -i -p "ddf1_4c53492020202020808626820000000034dd936800000a28"
just gives
No RAID sets and with names: "ddf1_4c53492020202020808626820000000034dd936800000a28"

Comment 15 Ian Kent 2008-02-08 12:04:20 UTC
(In reply to comment #14)
> > 
> > Do you mean the way dmraid won't activate the RAID a specified
> > individual subset?
> > 
> > I believe we're waiting on patches for dmraid before this
> > support can be added.
> Well my main reason for moving to dmraid -ay -i -P p is that in the initrd I
> couldn't use a fixed name because for these disks the full name changes between
> boots (for example from
> ddf1_4c53492020202020808626820000000034db222e00000a28 to
> ddf1_4c53492020202020808626820000000034dd936800000a28
> after a reboot). I don't know whether
> dmraid -ay -i -p "ddf1_4c53492020202020808626820000000034dd936800000a28"
> or similar actually starts the raid because the raid is running by the time I
> can test it, so that might change the result, though currently I get
> dmraid -ay -i -p "ddf1_4c53492020202020808626820000000034dd936800000a28"
> No RAID sets and with names: "ddf1_4c53492020202020808626820000000034dd936800000a28"
> if I try.

I believe that's correct: assuming these are RAID subsets, that
is the message you will get. In the metadata supplied, the superset
name is .ddf1_disks, but using that will activate all the subsets,
which may not be what you want.

It's not possible to activate individual subsets at the moment.
Sorry.

But then, the superset name shouldn't change.

Ian


Comment 22 errata-xmlrpc 2008-05-21 17:21:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0475.html