Bug 523862

Summary: mdadm craps at boot
Product: [Fedora] Fedora
Component: mdadm
Version: rawhide
Hardware: All
OS: Linux
Status: CLOSED RAWHIDE
Severity: urgent
Priority: low
Reporter: Nicolas Mailhot <nicolas.mailhot>
Assignee: Doug Ledford <dledford>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: atorkhov, awilliam, bruno, dledford, hdegoede, lpoetter, sander
Doc Type: Bug Fix
Last Closed: 2009-10-03 13:10:12 EDT
Bug Blocks: 473303, 507678
Attachments: dmesg (flags: none), Backtrace (flags: none)

Description Nicolas Mailhot 2009-09-16 17:56:31 EDT
Description of problem:

see https://bugzilla.redhat.com/show_bug.cgi?id=521959#c13


Version-Release number of selected component (if applicable):

http://koji.fedoraproject.org/koji/buildinfo?buildID=132143
Comment 1 Nicolas Mailhot 2009-09-16 19:23:35 EDT
Created attachment 361397 [details]
dmesg
Comment 2 Adam Williamson 2009-09-18 12:58:19 EDT
could you be somewhat more specific? I really don't understand what's going on here.

-- 
Fedora Bugzappers volunteer triage team
https://fedoraproject.org/wiki/BugZappers
Comment 3 Nicolas Mailhot 2009-09-20 13:29:30 EDT
mdadm[800]: segfault at 0 ip 00007fe511dd71f2 sp 00007fff7e693e68 error 4 in libc-2.10.90.so[7fe511d56000+176000]

It's all in the log
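
For readers decoding the kernel line quoted in this comment, here is a minimal sketch of how its fields can be read. It is not part of mdadm or of any Fedora tool: the helper name `decode_segfault_line` and the regular expression are assumptions made for illustration. The bit meanings used for the error code are the standard x86 page-fault ones (bit 0 = protection violation, bit 1 = write access, bit 2 = user mode). Under those assumptions, the line describes a user-mode read of address 0 (a NULL pointer dereference) at a fixed offset inside libc.

```python
import re

# Hypothetical pattern for a kernel "segfault at ..." log line.
# All numeric fields except the pid are printed in hexadecimal.
SEGFAULT_RE = re.compile(
    r"(?P<comm>[^\[\s]+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault_line(line):
    """Return the interesting fields of a kernel segfault line as a dict."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m.group("err"), 16)
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "process": m.group("comm"),
        "pid": int(m.group("pid")),
        "fault_address": int(m.group("addr"), 16),
        "object": m.group("obj"),
        # Offset of the faulting instruction inside the mapped object.
        "offset_in_object": ip - base,
        # x86 page-fault error code bits.
        "protection_violation": bool(err & 0x1),
        "write": bool(err & 0x2),
        "user_mode": bool(err & 0x4),
    }


line = ("mdadm[800]: segfault at 0 ip 00007fe511dd71f2 sp 00007fff7e693e68 "
        "error 4 in libc-2.10.90.so[7fe511d56000+176000]")
info = decode_segfault_line(line)
print(hex(info["offset_in_object"]), info["user_mode"], info["write"])
```

The offset inside the library can then be looked up with the matching debuginfo to find the faulting function.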
Comment 4 Nicolas Mailhot 2009-09-20 13:33:06 EDT
On the console you see something like:

udev: '/sbin/mdadm --detail --export /dev/dm127' unexpected exit with status 0x000b
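
As a rough illustration of what status 0x000b in the udev message means, the sketch below splits a wait()-style status word into its traditional Linux fields. It assumes that udev prints the raw wait status, and the helper name `decode_wait_status` is made up for this example. Under that assumption the low seven bits give signal 11, SIGSEGV, which agrees with the segfault reported in comment 3.

```python
import signal


def decode_wait_status(status):
    """Split a 16-bit wait()-style status into exit code or signal fields.

    Stopped/continued states are not handled in this sketch.
    """
    sig = status & 0x7F               # low 7 bits: terminating signal, 0 if none
    core = bool(status & 0x80)        # bit 7: core dump flag
    exit_code = (status >> 8) & 0xFF  # high byte: exit code on normal exit
    if sig == 0:
        return {"exited": True, "exit_code": exit_code}
    return {
        "exited": False,
        "signal": sig,
        "signal_name": signal.Signals(sig).name,
        "core_dumped": core,
    }


result = decode_wait_status(0x000B)
print(result)
```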
Comment 5 Alexey Torkhov 2009-09-22 16:50:29 EDT
Created attachment 362137 [details]
Backtrace

mdadm crashes for me when simply running "mdadm --detail --scan" with the default mdadm.conf and md arrays present on disk.
Comment 6 Alexey Torkhov 2009-09-22 16:54:57 EDT
*** Bug 524381 has been marked as a duplicate of this bug. ***
Comment 7 Nicolas Mailhot 2009-09-30 14:45:14 EDT
Anything after mdadm-3.0-2.fc12.x86_64 still makes this system crash at boot and drop into the maintenance console.
Comment 8 Sander Hoentjen 2009-10-01 14:46:42 EDT
Another backtrace:

# gdb --args mdadm -A /dev/md0
GNU gdb (GDB) Fedora (6.8.91.20090930-2.fc12)
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /sbin/mdadm...Reading symbols from /usr/lib/debug/sbin/mdadm.debug...done.
done.
(gdb) run
Starting program: /sbin/mdadm -A /dev/md0

Program received signal SIGSEGV, Segmentation fault.
__strlen_sse2 () at ../sysdeps/x86_64/strlen.S:31
31		pcmpeqb	(%rdi), %xmm2
Current language:  auto
The current source language is "auto; currently asm"
(gdb) bt
#0  __strlen_sse2 () at ../sysdeps/x86_64/strlen.S:31
#1  0x0000000000435df9 in set_member_info (st=0x85af00, ent=0x85d1a0) at mapfile.c:306
#2  0x00000000004362f8 in RebuildMap () at mapfile.c:369
#3  0x0000000000436861 in map_read (melp=<value optimized out>) at mapfile.c:166
#4  0x0000000000436c2d in map_update (mpp=0x0, devnum=<value optimized out>, metadata=0x7fffffffdfb4 "0.90", uuid=0x7fffffffdf28, path=<value optimized out>) at mapfile.c:206
#5  0x000000000040f52a in Assemble (st=<value optimized out>, mddev=<value optimized out>, ident=<value optimized out>, devlist=<value optimized out>, 
    backup_file=<value optimized out>, readonly=<value optimized out>, runstop=<value optimized out>, update=<value optimized out>, homehost=<value optimized out>, 
    require_homehost=<value optimized out>, verbose=<value optimized out>, force=<value optimized out>) at Assemble.c:1004
#6  0x000000000040547d in main (argc=<value optimized out>, argv=<value optimized out>) at mdadm.c:1055
Comment 9 Hans de Goede 2009-10-02 15:59:08 EDT
The mdadm crash is fixed by mdadm-3.0.2-1.fc12.
A tag request for including this build in F-12 is here:
https://fedorahosted.org/rel-eng/ticket/2294
Comment 10 Adam Williamson 2009-10-02 22:36:46 EDT
If people could test this quickly over the weekend, that'd be great; we have a go/no-go meeting on Monday. You can get the fixed build here:

http://koji.fedoraproject.org/koji/buildinfo?buildID=134892

Comment 11 Bruno Wolff III 2009-10-03 01:03:32 EDT
I tried it out and the system boots successfully. There were no segfault or unexpected-status messages. However, I did see some warnings that I don't get with mkinitrd images, which suggests there is still some minor problem.
For instance:
Buffer I/O error on device dm-0, logical block 64
Buffer I/O error on device dm-0, logical block 65
Buffer I/O error on device dm-0, logical block 66
Buffer I/O error on device dm-0, logical block 67
Buffer I/O error on device dm-0, logical block 68
Buffer I/O error on device dm-0, logical block 69
Buffer I/O error on device dm-0, logical block 70
Buffer I/O error on device dm-0, logical block 71
device-mapper: ioctl: unable to remove open device temporary-cryptsetup-930
Buffer I/O error on device dm-0, logical block 72
Buffer I/O error on device dm-0, logical block 73
Comment 12 Hans de Goede 2009-10-03 04:38:07 EDT
(In reply to comment #11)
> I tried it out and the system boots successfully. There were no segfault or
> unexpected status messages. However I did see some warnings that I don't get
> with mkinitrd images which suggests there is still some minor problem.
> For instance:
> Buffer I/O error on device dm-0, logical block 64
> Buffer I/O error on device dm-0, logical block 65
> Buffer I/O error on device dm-0, logical block 66
> Buffer I/O error on device dm-0, logical block 67
> Buffer I/O error on device dm-0, logical block 68
> Buffer I/O error on device dm-0, logical block 69
> Buffer I/O error on device dm-0, logical block 70
> Buffer I/O error on device dm-0, logical block 71
> device-mapper: ioctl: unable to remove open device temporary-cryptsetup-930
> Buffer I/O error on device dm-0, logical block 72
> Buffer I/O error on device dm-0, logical block 73  

At least the message:
device-mapper: ioctl: unable to remove open device temporary-cryptsetup-930

is a different issue. For some reason dmcrypt creates a temporary device-mapper device, and dracut's udev rules should not probe it; otherwise you get that "unable to remove" error, because the device is busy due to the probing. We recently hit the same issue in anaconda.

I'm not currently at the computer that has the IRC logs from when we discussed this; when I'm back at that machine I'll add another comment with more info.

I think the other errors are related to, or caused by, this same issue. Either way,
I believe this is unrelated to mdraid / mdadm.
Comment 13 Sander Hoentjen 2009-10-03 06:40:48 EDT
The new mdadm fixes it for me too.
Comment 14 Bruno Wolff III 2009-10-03 12:42:11 EDT
For the other issue, I'd like to be added to whatever bug is tracking it. If there isn't one, should I start one?
Comment 15 Adam Williamson 2009-10-03 13:10:12 EDT
Closing this one, anyway, as it seems to be clearly fixed. Yes, Bruno, if there's no hardware issue behind your other problem, look for a dupe or file a new bug (I'd search for "Buffer I/O error on device dm-0" to find dupes).

Comment 16 Hans de Goede 2009-10-04 04:57:48 EDT
(In reply to comment #14)
> For the other issue I'd like to be added to whatever bug is tracking it. If
> there isn't one, I can start one?  

I don't think there is a bug tracking the temporary dmcrypt node probing from dracut; please file a bug against dracut for this. Also, please include a note there pointing to the following comment for more info:
https://bugzilla.redhat.com/show_bug.cgi?id=526699#c5
Comment 17 Bruno Wolff III 2009-10-04 12:49:05 EDT
I opened bug 527056 and made the suggested reference.
Comment 18 Hans de Goede 2009-10-04 13:01:28 EDT
(In reply to comment #17)
> I opened bug 527056 and made the suggested reference.  

Thanks!