Bug 879327 - System with Intel firmware RAID-1 does not power off or reboot
Status: CLOSED ERRATA
Product: Fedora
Classification: Fedora
Component: dracut
Version: 18
Hardware: x86_64 Linux
Priority: unspecified
Severity: unspecified
Assigned To: dracut-maint
QA Contact: Fedora Extras Quality Assurance
Duplicates: 895815
Reported: 2012-11-22 10:42 EST by SpuyMore
Modified: 2013-07-04 22:11 EDT (History)
Fixed In Version: mdadm-3.2.6-19.fc18
Doc Type: Bug Fix
Clones: 912735
Last Closed: 2013-01-24 17:35:08 EST
Type: Bug


Attachments:
Shutdown messages (70.66 KB, image/jpeg), 2012-11-22 10:43 EST, SpuyMore

Description SpuyMore 2012-11-22 10:42:35 EST
Description of problem:
When I power off or reboot the system the system starts to shut down but stalls after telling me it is unmounting file systems.

Version-Release number of selected component (if applicable):
most current FC18 with all yum updates installed

How reproducible:
Assemble a system with an Intel DQ77MK board (Q77 chipset) with the latest BIOS 00053, an Intel Core i5 3570 3.40GHz CPU with integrated Intel HD Graphics 2500 used for video (no other adapters installed at all), two onboard NICs and two identical hard disks. System info:
http://ark.intel.com/products/59044/...p-Board-DQ77MK
http://downloadmirror.intel.com/2209...leaseNotes.pdf
http://ark.intel.com/products/65702/...Cache-3_40-GHz

In the BIOS both UEFI boot and Intel firmware RAID (Intel RST) are enabled and the two hard drives in my system are configured in RAID-1 mode. I managed to successfully install FC18 in this setup, overcoming several issues regarding bug 873576 earlier; the system now boots and runs flawlessly apart from this issue.

Steps to Reproduce:
1. boot without kernel options: rhgb quiet
2. reboot, shutdown -P now or shutdown -r now
3.
  
Actual results:
The shut down stalls as soon as "Unmounting file systems." appears. The other "Unmounting /sys|dev/..." messages from my attachment do not always appear, but I guess that is because these happen in parallel?

Expected results:
Expect the system to either power off or reboot eventually.

Additional info:
I suspect this has to do with either the firmware RAID configuration, ACPI or perhaps both.

With regard to ACPI the only boot warnings I get are:
...
[ 10.964510] ACPI Warning: 0x0000000000000428-0x000000000000042f SystemIO conflicts with Region \PMIO 1 (20120711/utaddress-251)
[ 10.965234] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
[ 10.965953] ACPI Warning: 0x0000000000000500-0x000000000000053f SystemIO conflicts with Region \GPIO 1 (20120711/utaddress-251)
[ 10.966669] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
...
[ 11.029171] ACPI Warning: 0x000000000000f040-0x000000000000f05f SystemIO conflicts with Region \_SB_.PCI0.SBUS.SMBI 1 (20120711/utaddress-251)
[ 11.030132] ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver
... 

Things I tried but do not help:
- toggling Wake on LAN on either NIC on or off
- start in multi-user instead of graphical mode

What else can I try to debug this?
Comment 1 SpuyMore 2012-11-22 10:43:10 EST
Created attachment 649895 [details]
Shutdown messages
Comment 3 SpuyMore 2012-11-29 10:34:30 EST
No change with newest BIOS 0054 update and all latest yum updates.
Comment 4 SpuyMore 2012-12-03 06:00:39 EST
sync && reboot -f does reboot the system! Too bad my RAID mirror has to resync afterwards...

If I understand the information at http://freedesktop.org/wiki/Software/systemd/Debugging#Diagnosing_Shutdown_Problems correctly, this means it is not a hardware or kernel bug but may instead indicate a problem with systemd, hence the component change of this bug. I suspect it has to do with my having an Intel firmware RAID-1 set, which is why I altered the bug summary.

Does anyone have advice on how to debug this further?

For what it's worth, when the shutdown stalls like shown in the attachment and I unplug/plug some USB hardware, I still see kernel messages notifying me about the hardware changes, so the kernel is still alive at that point.
Comment 5 SpuyMore 2013-01-02 15:20:48 EST
Nobody has an idea?
Comment 6 Mark Harfouche 2013-01-02 15:49:16 EST
I think this thread https://bugzilla.redhat.com/show_bug.cgi?id=834245 talks about the same thing.
Comment 7 Jes Sorensen 2013-01-03 08:32:09 EST
This broke somewhere in the recent systemd/dracut changes, not in mdadm
Comment 8 Harald Hoyer 2013-01-03 08:53:27 EST
can you try:

# yum downgrade mdadm
# dracut -f


also have a look at 

https://bugzilla.redhat.com/show_bug.cgi?id=834245#c50
Comment 9 SpuyMore 2013-01-03 15:12:03 EST
I have:
mdadm-3.2.6-7.fc18.x86_64
kernel-3.6.11-3.fc18.x86_64
dracut-024-16.git20121220.fc18.x86_64

I downgraded to mdadm-3.2.6-1.fc18.x86_64 and ran dracut -f, but this didn't help.
Comment 10 Fedora Update System 2013-01-04 12:10:57 EST
mdadm-3.2.6-8.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/mdadm-3.2.6-8.fc18
Comment 11 Fedora Update System 2013-01-04 15:29:48 EST
Package mdadm-3.2.6-8.fc18:
* should fix your issue,
* was pushed to the Fedora 18 testing repository,
* should be available at your local mirror within two days.
Update it with:
# su -c 'yum update --enablerepo=updates-testing mdadm-3.2.6-8.fc18'
as soon as you are able to.
Please go to the following url:
https://admin.fedoraproject.org/updates/FEDORA-2013-0208/mdadm-3.2.6-8.fc18
then log in and leave karma (feedback).
Comment 12 SpuyMore 2013-01-04 15:55:48 EST
I have updated but it didn't work. The system now does not boot completely. It hangs right after "started initialized storage subsystems (RAID, LVM, etc.)" and "started monitoring of LVM2 mirrors, snapshots etc. using dmeventd or progress polling". Help! How can I boot the system now? I must add that I also installed the latest release version of dracut (yesterday's release). After upgrading mdadm and dracut I ran dracut -f. I suspect mdmon is not running.

Thanks, Dennis
Comment 13 SpuyMore 2013-01-05 05:30:01 EST
1. Pfffew... I booted with kernel option rd.shell and rd.break=pre-mount. Typing mdmon --all at the first intermediate shell tells me md is already managed. I then typed mdmon --all --takeover --offroot and that worked. I closed the shell and now the system continued to boot successfully.

2. cat /proc/mdstat shows me the mirror is syncing. When the system previously didn't continue booting, pressing ctrl+alt+delete did somehow shut down the system with lots of warnings, but apparently did not cleanly unmount the disks.

3. In the booted system I notice /etc/systemd/system/sysinit.target.wants/mdmonitor-takeover.service still exists and points to a non-existent file.

4. This all somewhat reminds me of the resolution of bug 873576. Is this a chicken-and-egg problem or is it irrelevant?
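
Item 3 above describes a symlink whose target no longer exists. A minimal sketch for spotting such links, assuming a POSIX shell; the function name list_dangling_links is made up for illustration and is not part of mdadm, dracut or systemd:

```shell
# List symbolic links in a directory whose targets no longer exist.
# $1: directory to scan, e.g. /etc/systemd/system/sysinit.target.wants
list_dangling_links() {
    for entry in "$1"/*; do
        # -L: entry is a symlink; ! -e: its target does not resolve
        if [ -L "$entry" ] && [ ! -e "$entry" ]; then
            printf '%s\n' "$entry"
        fi
    done
}
```

Running list_dangling_links /etc/systemd/system/sysinit.target.wants would print the leftover mdmonitor-takeover.service link if it is still dangling.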
Comment 14 SpuyMore 2013-01-05 08:57:57 EST
The good news is that rebooting/powering off the system with this mdadm update does work fine and my raid mirror does not need a rebuild on next boot.

So to me it seems the only things to fix are starting mdmon with parameters like I did in my pre-mount shell, and removing the systemd symbolic link to the obsolete mdmon takeover service.

For now I reverted to mdadm-3.2.6-1.fc18.
Comment 15 Jes Sorensen 2013-01-05 09:56:24 EST
Dennis,

It's bizarre that mdmon isn't launched correctly by the initrd. How many IMSM
raid devices do you have and how are they configured?

mdmon should be started from the initrd, if it isn't something is going wrong
there.

Can you provide /proc/mdstat output as well as df -h?

Thanks,
Jes
Comment 16 SpuyMore 2013-01-05 10:30:39 EST
Hi Jes, mdmon is started by initrd, hence it mentioned "... is already managed" when I typed: mdmon --all. Then I typed instead: mdmon --all --takeover --offroot and it worked, so somehow the additional two parameters made the difference.
Comment 17 SpuyMore 2013-01-05 10:33:40 EST
Oh by the way, /proc/mdstat:
Personalities : [raid1]
md126 : active raid1 sda[1] sdb[0]
      976759808 blocks super external:/md127/0 [2/2] [UU]
md127 : inactive sdb[1](S) sda[0](S)
      5288 blocks super external:imsm
unused devices: <none>

and df -h:
Filesystem                   Size  Used Avail Use% Mounted on
devtmpfs                     7.6G     0  7.6G   0% /dev
tmpfs                        7.6G     0  7.6G   0% /dev/shm
tmpfs                        7.6G  8.0M  7.6G   1% /run
tmpfs                        7.6G     0  7.6G   0% /sys/fs/cgroup
/dev/mapper/fedora-root       32G  2.9G   28G  10% /
tmpfs                        7.6G  4.3M  7.6G   1% /tmp
/dev/sdc                     151G   23G  121G  16% /backup
/dev/md126p2                 194M   36M  149M  20% /boot
/dev/md126p1                 200M  8.0M  192M   4% /boot/efi
/dev/mapper/fedora-home      7.7G  202M  7.2G   3% /home
/dev/mapper/fedora-var       9.7G  2.4G  6.8G  26% /var
/dev/mapper/fedora-misc      722G  333G  352G  49% /misc
/dev/mapper/fedora-opt        16G  890M   14G   6% /opt
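
In the mdstat output above, both md126 and md127 carry "super external:", i.e. externally managed metadata, which is the kind of array mdmon monitors. A small sketch that pulls those entries out of mdstat-style text, assuming the layout shown above; the function name external_metadata_arrays is hypothetical:

```shell
# Print "<md device> <external metadata reference>" for every array in
# mdstat-format text that uses external metadata.
# $1: file to read (defaults to /proc/mdstat)
external_metadata_arrays() {
    awk '
        /^md[0-9]+ :/ { dev = $1 }            # remember the current array
        /super external:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^external:/) {
                    sub(/^external:/, "", $i)  # keep only the reference
                    print dev, $i
                }
        }
    ' "${1:-/proc/mdstat}"
}
```

For the output in this comment it would list md126 as a member of /md127/0 and md127 as the imsm container.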
Comment 18 Harald Hoyer 2013-01-07 07:07:32 EST
(In reply to comment #16)
> Hi Jes, mdmon is started by initrd, hence it mentioned "... is already
> managed" when I typed: mdmon --all. Then I type instead: mdmon --all
> --takeover --offroot and it worked, so somehow the additional two parameters
> made the difference.

well, a
# ps ax | grep mdmon

would help to figure out the command line
Comment 19 Chema Casanova 2013-01-07 11:17:04 EST
> (In reply to comment #16)
> well, a
> # ps ax | grep mdmon
> 
> would help to figure out the command line

I confirm the same problem with mdadm-3.2.6-8.fc18.x86_64: after installing the upgrade my system wasn't able to complete the boot process. I managed to boot it using the procedure from comment #13.

Here is the output of the requested ps.

[chema@lorien ~]$ ps ax|grep dmon
  350 ?        SLsl   0:00 @dmon md127 --takeover --offroot
 2257 pts/0    S+     0:00 grep --color=auto dmon
[chema@lorien ~]$
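
The ps listing above shows mdmon as "@dmon" with its arguments. A hedged sketch that filters `ps ax` output for mdmon processes and reports whether --offroot is on the command line; it assumes the five-column `ps ax` layout shown above, and the function name mdmon_offroot_status is made up for illustration:

```shell
# Read `ps ax` output on stdin and print "<pid> offroot" or
# "<pid> no-offroot" for each process whose command is mdmon
# (shown as mdmon, @dmon, /sbin/mdmon or @sbin/mdmon).
mdmon_offroot_status() {
    awk '
        $5 ~ /(^|\/)[@m]dmon$/ {
            status = "no-offroot"
            for (i = 6; i <= NF; i++)
                if ($i == "--offroot") status = "offroot"
            print $1, status
        }
    '
}
```

On a live system this would be used as: ps ax | mdmon_offroot_status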
Comment 20 Jes Sorensen 2013-01-08 05:26:39 EST
I finally got F18 installed and tried this out. Once the
mdmonitor-takeover.service is removed, I am able to reproduce the problem
where the boot hangs. Logged in, ran mdmon as described above and it was
fine.

I have however fixed the problem with the dangling symlink reported in
comment #13. This will be fixed in mdadm-3.2.6-10.

Jes
Comment 21 SpuyMore 2013-01-08 07:30:44 EST
Somehow good to hear you can reproduce the problem. Hope it can be fixed. Thanks.

(In reply to comment #20)
> I finally got F18 installed and tried this out. Once the
> mdmonitor-takeover.service is removed, I am able to reproduce the problem
> where the boot hangs. Logged in, ran mdmon as described above and it was
> fine.
> 
> I have however fixed the problem with the dangling symlink reported in
> comment #13. This will be fixed in mdadm-3.2.6-10.
> 
> Jes
Comment 22 Fedora Update System 2013-01-09 00:47:01 EST
mdadm-3.2.6-11.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/mdadm-3.2.6-11.fc18
Comment 23 Jes Sorensen 2013-01-09 00:53:23 EST
Working with Harald yesterday, I think we finally figured out the source of
the problem here.

Basically mdmon needs to leave the udev cgroup, during the initramfs stage of
the boot process, to avoid getting killed when udev stops. Doug earlier
introduced code to handle this, however this was happening prior to mdadm
fork()'ing which meant it ended up not being valid for the child process.

Moving the cgroup handling code into mdmon itself and doing this after we
have forked the daemon process makes it do the right thing, and at least it
works for me here.

Please give mdadm-3.2.6-11 a spin and see if it works for you too.

Jes
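
For readers unfamiliar with the mechanism discussed in this comment: under the cgroup v1 layout of that era, a process is moved into a group by writing its PID to the group's "tasks" file. The sketch below only illustrates that mechanism and the ordering (start the daemon, then move the daemon's own PID). It is not the code that went into mdadm; the group path and the function name move_to_cgroup are assumptions.

```shell
# Move a process into a cgroup (v1 layout) by writing its PID to the
# group's "tasks" file.
# $1: cgroup directory (hypothetical, e.g. /sys/fs/cgroup/systemd/mdadm)
# $2: PID to move
move_to_cgroup() {
    mkdir -p "$1" && printf '%s\n' "$2" > "$1/tasks"
}

# Ordering sketch (not executed here; some_daemon is hypothetical):
#   some_daemon &
#   move_to_cgroup /sys/fs/cgroup/systemd/mdadm "$!"
```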
Comment 24 Jes Sorensen 2013-01-09 02:36:37 EST
Famous last words - we have another corner case. If one has two IMSM arrays
the boot still hangs. It looks like mdmon again gets killed when the
second array is assembled.

If you just have one array 3.2.6-11 should be fine.

Haven't figured out this one yet.
Comment 25 SpuyMore 2013-01-09 04:19:55 EST
Thanks Jes. I am happy to try out 3.2.6-11. I want to avoid having the array resync again after a boot failure since we are using the system in production already (not smart, I know). Is there a way to make sure the array is shut down cleanly even though the boot hangs? Just in case.
Comment 26 Michal Schmidt 2013-01-09 08:51:02 EST
(In reply to comment #23)
> Basically mdmon needs to leave the udev cgroup, during the initramfs stage of
> the boot process, to avoid getting killed when udev stops. Doug earlier
> introduced code to handle this, however this was happening prior to mdadm
> fork()'ing which meant it ended up not being valid for the child process.

That's odd. fork() is not expected to move the child to another cgroup.
Comment 27 Doug Ledford 2013-01-09 11:32:31 EST
I agree with Michal, fork() should keep within the same cgroup.  That should probably be investigated.

As to the rest of things, it sounds to me like we are still running into the same ordering problems we had from the very beginning.  In particular, we start mdmon off root from the initrd, but only for the root filesystem.  In the case of two imsm arrays, I suspect the issue is that one is a root filesystem (and is being started and stopped from the initrd, and hence works) while the other is a non-root filesystem being started from the root filesystem and then I suspect the shutdown sequence is out of order.  Specifically, for the non-root imsm array startup/shutdown to work with a root based mdmon, the sequence taken by systemd would need to be:

close most apps
umount non-root filesystems
close all non-root mdmons
jump back to initrd as root
umount root filesystem
close root mdmon that is running from initrd

I suspect systemd is trying to do

close all apps (including mdmon for non-root filesystems)
umount non-root filesystems
jump back to initrd as root
umount root filesystem
close root mdmon that is running from initrd

I suspect that the systemd folks, in their desire not to have to special case mdmon, have forgotten that even for non-root instances of imsm arrays, things must be shut down in the proper order.  I could be totally off base of course, but it's worth checking.  An easy test would be to start all imsm arrays from the initrd and start all mdmon's from the initrd and treat them all as offroot.  If things start working then, it would point towards this explanation of things.
Comment 28 Jes Sorensen 2013-01-10 04:17:46 EST
Doug,

Actually the problem isn't so much ordering but how mdmon is launched, and it
getting killed because it is in the wrong cgroup when udev is being shut down.

The bizarre case I am seeing now, post my changes in 3.2.6-11, is that if I
have just one IMSM array, it is launched correctly and survives the shutdown
of udev in the initrd. However when the second one is present and getting
assembled later, something kills mdmon again. The only way I can boot the
system is by doing a login and launching mdmon as the first thing post boot,
otherwise it will lock up due to all writes stalling. Why it gets killed in this
case is really baffling since it should be going to the right cgroup with the
changes I added.

I am going to investigate what it will take to launch mdmon via systemctl
instead of via fork()/execl().

Jes
Comment 29 Michal Schmidt 2013-01-10 05:07:26 EST
(In reply to comment #28)
> it getting killed because it is in the wrong cgroup

Jes, could you show any proof of the cgroup membership of the mdmon process in the case of multiple IMSM arrays? What is in /proc/$PID_OF_MDMON/cgroup?
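
One way to collect the information asked for here, as a minimal sketch: print the contents of /proc/PID/cgroup for each PID given. The function name print_cgroups is made up, and the proc root is a parameter only so the function can be tried against a scratch directory.

```shell
# Print the cgroup membership of each given PID, one line per hierarchy,
# prefixed with the PID.
# $1: proc root (normally /proc); remaining arguments: PIDs
print_cgroups() {
    proc_root=$1
    shift
    for pid in "$@"; do
        if [ -r "$proc_root/$pid/cgroup" ]; then
            sed "s/^/$pid: /" "$proc_root/$pid/cgroup"
        else
            echo "$pid: no cgroup file (process gone?)"
        fi
    done
}
```

On a live system, something like print_cgroups /proc $(pgrep -x mdmon) should work, assuming pgrep is available and the kernel-side process name is still mdmon.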
Comment 30 Michal Schmidt 2013-01-10 05:13:47 EST
And how does mdmonitor.service fit into the picture? Is mdmon sometimes spawned from udev rules and sometimes from this service?
Comment 31 Jes Sorensen 2013-01-10 05:16:53 EST
Michal,

I don't have any proof of how it happens in the multiple array case as I haven't
figured out yet just when the mdmon process gets killed. We were able to prove
that it happened with the single-array case, but I still need to try and figure
out what happens in the multiple-array case. It's rather non-obvious though
given that it's all started in weird order via systemd.

Basically mdmon is supposed to get launched by mdadm when it assembles an
array that requires mdmon (eg. IMSM and DDF - though we don't currently
support DDF).

Jes
Comment 32 Chema Casanova 2013-01-10 19:32:36 EST
I've just tested the latest version of mdmon, and I still have the same problem, and I only have one RAID volume. I'm only able to do a complete boot by including rd.break=pre-mount and executing the mdmon command.

There is something strange: if I have my external HD connected by FireWire, booting doesn't reach GDM, but if I remove it, it reaches GDM but after 1-2 seconds neither keyboard nor mouse is responsive.

If I downgrade to mdadm-3.2.6-1.fc18 the system boots normally, but cannot reboot or power off in a controlled way. So I would need to stick to the downgraded mdadm 3.2.6.

Here is my configuration:

[chema@lorien ~]$ rpm -qa|grep mda
mdadm-3.2.6-11.fc18.x86_64


$ cat /proc/mdstat 
Personalities : [raid1] 
md126 : active raid1 sdb[1] sdc[0]
      1953511424 blocks super external:/md127/0 [2/2] [UU]
      
md127 : inactive sdc[1](S) sdb[0](S)
      6056 blocks super external:imsm
       
unused devices: <none>

[chema@lorien ~]$ df -h
Filesystem       Size  Used Avail Use% Mounted on
devtmpfs         3,9G     0  3,9G   0% /dev
tmpfs            3,9G  212K  3,9G   1% /dev/shm
tmpfs            3,9G  4,8M  3,9G   1% /run
tmpfs            3,9G     0  3,9G   0% /sys/fs/cgroup
/dev/md126p2     193G  9,7G  173G   6% /
tmpfs            3,9G   36K  3,9G   1% /tmp
/dev/md126p1     985M  128M  807M  14% /boot
/dev/md126p3     1,4T  1,3T   38G  98% /home
Comment 33 Jes Sorensen 2013-01-16 03:32:44 EST
*** Bug 895815 has been marked as a duplicate of this bug. ***
Comment 34 Fedora Update System 2013-01-21 11:08:19 EST
dracut-024-23.git20130118.fc18,mdadm-3.2.6-12.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/dracut-024-23.git20130118.fc18,mdadm-3.2.6-12.fc18
Comment 35 Chema Casanova 2013-01-21 13:27:24 EST
This last package update worked for me. When I rebooted the system it didn't boot correctly, so I proceeded with rd.brea=pre-mount in the kernel parameters. I ran mdmon --all --takeover --offroot and could complete the boot.

Then I ran dracut -f again, and after that I could go through a complete reboot cycle. It worked :)
Comment 36 Mark-Jan de Jong 2013-01-22 12:55:54 EST
Confirmed. Packages from Comment 34 and procedure from Comment 35 fixed my issue as well.

Small typo correction for Comment 35... The correct kernel param is rd.break=pre-mount .
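
Since a mistyped parameter is silently ignored, it can help to confirm what the kernel was actually booted with. A minimal sketch that checks a command-line string for an exact parameter; the function name cmdline_has is made up for illustration:

```shell
# Return 0 if the kernel command line in $1 contains the exact
# whitespace-delimited parameter $2, and 1 otherwise.
cmdline_has() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

From the rd.break shell or the booted system it could be used as: cmdline_has "$(cat /proc/cmdline)" rd.break=pre-mount && echo "parameter present"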
Comment 37 SpuyMore 2013-01-23 06:16:10 EST
Confirmed, the packages from Comment 34 work.
I didn't have to run dracut -f after a reboot. I ran dracut -f right after updating the packages, rebooted the system and it came up without problems, and subsequent reboots/shutdowns work like a charm.

Many thanks for fixing this.
Comment 38 Jes Sorensen 2013-01-23 07:23:26 EST
Very pleased to hear it is working for you all. If you have a chance to leave
karma, we can get it pushed out to everybody faster.

https://admin.fedoraproject.org/updates/dracut-024-23.git20130118.fc18,mdadm-3.2.6-12.fc18

Cheers,
Jes
Comment 39 Fedora Update System 2013-01-24 17:35:11 EST
dracut-024-23.git20130118.fc18, mdadm-3.2.6-12.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
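
To check whether an installed package is at least the version named in this update, a rough sketch using GNU sort -V follows. This ordering only approximates RPM's own version comparison, and the function name vr_at_least is hypothetical.

```shell
# Return 0 if version-release string $1 is at least $2 according to
# GNU `sort -V` (an approximation of RPM ordering, fine for simple
# cases such as 3.2.6-11.fc18 versus 3.2.6-12.fc18).
vr_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example: vr_at_least "$(rpm -q --qf '%{VERSION}-%{RELEASE}' mdadm)" 3.2.6-12.fc18 && echo "mdadm is new enough"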
Comment 40 Fedora Update System 2013-02-05 11:10:16 EST
dracut-024-25.git20130205.fc18, mdadm-3.2.6-14.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/dracut-024-25.git20130205.fc18,mdadm-3.2.6-14.fc18
Comment 41 Chema Casanova 2013-02-05 13:31:58 EST
(In reply to comment #37)
> Confirmed, the packages from Comment 34 work.
> I didn't have to run dracut -f after a reboot, I ran dracut -f right after
> updating the packages, rebooted the system and it came up without problems
> and subsequent reboots/shutdowns work like a charm.

I've just tested the new packages today and they worked fine, but I found another corner case that is still present.

When I tried the first version that solved the problem, I needed to do a hard reboot because poweroff didn't work after the dracut -f. Testing the new packages today, I realized that this problem happens when the system has already been suspended once; after that, poweroff doesn't work.

I've checked the following sequence with the latest versions, dracut-024-25.git20130205.fc18 and mdadm-3.2.6-14.fc18:

- Boot the system
- Wait for GDM to start.
- Press poweroff in the GDM greeter
- The system powers off correctly.

But if I do the following sequence:

- Boot the system
- Wait for GDM to start.
- Suspend the system in GDM
- Resume the system with the keyboard
- Press poweroff in the GDM greeter
- The system poweroff stalls.
- A hard reboot is needed and the RAID is rebuilt during the next boot.

Could anybody confirm this problem with suspend?
Comment 42 Jes Sorensen 2013-02-07 13:30:21 EST
I haven't tested suspend on a raid system before since my test boxes are server
boxes. I will try and see if I can reproduce it, but I am not sure if they will
even do suspend.

After doing the suspend, can you try and check if mdmon is still running before
trying the shutdown?

ps -aux|grep dmon

Jes
Comment 43 Chema Casanova 2013-02-07 17:54:28 EST
Yes, it seems to be running:

Before the suspend:

root       283  0.0  0.1  14976 10876 ?        SLsl 10:54   0:02 @sbin/mdmon --foreground md127

After the suspend:

root       283  0.0  0.1  14976 10876 ?        SLsl 10:54   0:02 @sbin/mdmon --foreground md127
Comment 44 Jes Sorensen 2013-02-12 11:24:55 EST
ok that is puzzling, thanks for the info.
Comment 45 Tony Marchese 2013-02-15 11:31:48 EST
Hi guys,
it seems that I have another variation of this issue.

I have:

mdadm-3.2.6-12.fc18.x86_64
dracut-024-23.git20130118.fc18.x86_64
kernel-3.7.6-201.fc18.x86_64

# df -h
Filesystem      Size  Used   Avail Use% Mounted on
devtmpfs        5,9G     0    5,9G   0% /dev
tmpfs           5,9G  228K    5,9G   1% /dev/shm
tmpfs           5,9G  6,9M    5,9G   1% /run
tmpfs           5,9G     0    5,9G   0% /sys/fs/cgroup
/dev/sda2        42G  4,1G     35G  11% /
tmpfs           5,9G   32K    5,9G   1% /tmp
/dev/sda1       485M  108M    352M  24% /boot


/dev/md126 raid1 formed by 2 disks should be mounted on /home (see last line of /etc/fstab)

# mdadm -D /dev/md126
/dev/md126:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid1
     Array Size : 1953511424 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953511556 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2

          State : clean, resyncing 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

  Resync Status : 16% complete


           UUID : 6862ff21:72014ea5:67fa1e10:f1d2a26b
    Number   Major   Minor   RaidDevice State
       1       8       16        0      active sync   /dev/sdb
       0       8       32        1      active sync   /dev/sdc


The imsm container:

# mdadm -D /dev/md127
/dev/md127:
        Version : imsm
     Raid Level : container
  Total Devices : 2

Working Devices : 2


           UUID : bd9c2866:4fbc5b7b:3ba8e429:d291d6d7
  Member Arrays : /dev/md/Volume0_0

    Number   Major   Minor   RaidDevice

       0       8       32        -        /dev/sdc
       1       8       16        -        /dev/sdb



The issue is that every time I reboot/boot, the raid starts resyncing for 3/4 hours, and /dev/md126 is busy and cannot be mounted.

I can mount the raid only after issuing the command mdmon --all from a root shell.

I tried to follow the instructions contained in Comments 34 and 35, but with no result.


Please help, I have been busy with this for more than a week.

Thanks,
Tony
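
The resync state reported above can be extracted from `mdadm -D` output with a short filter. This is a sketch that assumes the "State :" and "Resync Status :" lines look like the ones in this comment; the function name array_sync_summary is made up:

```shell
# Read `mdadm -D <array>` output on stdin and print the array state,
# followed by the resync progress in parentheses when one is reported.
array_sync_summary() {
    awk -F ' : ' '
        /^ *State :/         { state = $2 }
        /^ *Resync Status :/ { resync = $2 }
        END {
            sub(/ +$/, "", state)      # drop trailing blanks
            if (resync != "") print state " (" resync ")"
            else print state
        }
    '
}
```

Usage on a live system: mdadm -D /dev/md126 | array_sync_summary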
Comment 46 Tony Marchese 2013-02-15 11:54:23 EST
I forgot to add /etc/fstab, here is:

# cat /etc/fstab 

#
# /etc/fstab
# Created by anaconda on Tue Feb 12 19:33:07 2013
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=7afe1956-b93a-4bdf-bb5b-83f0ae011a83 /                       ext4    defaults        1 1
UUID=db1a19a4-8db3-4d12-be7c-bcc28e3ce471 /boot                   ext4    defaults        1 2
UUID=3393924f-fdd0-4150-8e23-55f8fd679f1e swap                    swap    defaults        0 0
UUID=7f6f71c8-784e-4ec7-bdc0-11a48b6fa9e7 /home auto nosuid,nodev,nofail 0 0
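
The /home entry above is the only one marked nofail. A minimal sketch that lists the mount points carrying that option, assuming the usual whitespace-separated fstab layout; the function name nofail_mounts is hypothetical:

```shell
# Print the mount point of every fstab entry whose option list
# contains "nofail".
# $1: fstab file to read (defaults to /etc/fstab)
nofail_mounts() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }   # skip comments and blank/short lines
        {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "nofail") print $2
        }
    ' "${1:-/etc/fstab}"
}
```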
Comment 47 Tony Marchese 2013-02-15 18:28:19 EST
I downgraded mdadm to mdadm-3.2.6-1.fc18.x86_64 and ran dracut -f
/home is not mounted yet, but now the raid is not anymore syncing at every boot.
Still I have to issue mdmon --all in order to be able to manually mount the volume
Comment 48 Tony Marchese 2013-02-16 07:08:31 EST
UPDATE: If the raid-1 is not manually unmounted before shutdown, the volume starts syncing on the next boot.
Comment 49 Harald Hoyer 2013-02-18 06:25:49 EST
(In reply to comment #45)
> Hi guys,
> it seems that I have another variation of this issue.
> 
> I have:
> 
> mdadm-3.2.6-12.fc18.x86_64
> dracut-024-23.git20130118.fc18.x86_64
> kernel-3.7.6-201.fc18.x86_64
> 

Can you please update to mdadm and dracut from comment #40 ?

(In reply to comment #40)
> dracut-024-25.git20130205.fc18, mdadm-3.2.6-14.fc18 has been submitted as an
> update for Fedora 18.
> https://admin.fedoraproject.org/updates/dracut-024-25.git20130205.fc18,mdadm-3.2.6-14.fc18
Comment 50 Tony Marchese 2013-02-19 09:26:41 EST
I am now running:

mdadm-3.2.6-14.fc18.x86_64
dracut-024-25.git20130205.fc18.x86_64
kernel-3.7.8-202.fc18.x86_64

I overlooked that mdadm-3.2.6-14.fc18.x86_64 and dracut-024-25.git20130205.fc18.x86_64 were only available through the updates-testing repo. After installation I rebooted several times, running dracut -f and issuing the command mdmon --all --takeover --offroot in the pre-mount shell invoked through the boot parameter rd.break=pre-mount.

Here is my fstab:


# /etc/fstab
# Created by anaconda on Tue Feb 12 19:33:07 2013
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
UUID=7afe1956-b93a-4bdf-bb5b-83f0ae011a83 /                       ext4    defaults        1 1
UUID=db1a19a4-8db3-4d12-be7c-bcc28e3ce471 /boot                   ext4    defaults        1 2
UUID=3393924f-fdd0-4150-8e23-55f8fd679f1e swap                    swap    defaults        0 0
UUID=7f6f71c8-784e-4ec7-bdc0-11a48b6fa9e7 /home 		  ext4	  defaults,nofail 0 2

# mdadm -D /dev/md126
/dev/md126:
      Container : /dev/md/imsm0, member 0
     Raid Level : raid1
     Array Size : 1953511424 (1863.01 GiB 2000.40 GB)
  Used Dev Size : 1953511556 (1863.01 GiB 2000.40 GB)
   Raid Devices : 2
  Total Devices : 2

          State : clean 
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0


           UUID : 6862ff21:72014ea5:67fa1e10:f1d2a26b
    Number   Major   Minor   RaidDevice State
       1       8       16        0      active sync   /dev/sdb
       0       8       32        1      active sync   /dev/sdc

# mdadm -D /dev/md127
/dev/md127:
        Version : imsm
     Raid Level : container
  Total Devices : 2

Working Devices : 2


           UUID : bd9c2866:4fbc5b7b:3ba8e429:d291d6d7
  Member Arrays : /dev/md/Volume0_0

    Number   Major   Minor   RaidDevice

       0       8       16        -        /dev/sdb
       1       8       32        -        /dev/sdc


The behaviour now is that the system boots (the nofail in fstab helps), but the raid-1 volume is not mounted. Below is an extract from my journalctl -xb

...skipping...
feb 19 15:00:50 tonyhome kernel: md/raid1:md126: active with 2 out of 2 mirrors
feb 19 15:00:50 tonyhome kernel: md126: detected capacity change from 0 to 2000395698176
feb 19 15:00:50 tonyhome kernel:  md126: unknown partition table
feb 19 15:00:50 tonyhome kernel: asix 2-5.3:1.0 eth0: register 'asix' at usb-0000:00:1d.7-5.3, ASIX AX88772 USB 2.0 Ethernet, 00:50:b6:54:89:0c
feb 19 15:00:50 tonyhome kernel: usbcore: registered new interface driver asix
feb 19 15:00:50 tonyhome kernel: Adding 14336916k swap on /dev/sda3.  Priority:-1 extents:1 across:14336916k SS
feb 19 15:00:50 tonyhome systemd-fsck[593]: /dev/sda1: clean, 375/128016 files, 165808/512000 blocks
feb 19 15:00:50 tonyhome systemd-fsck[599]: /dev/md126 is in use.
feb 19 15:00:50 tonyhome systemd-fsck[599]: e2fsck: Cannot continue, aborting.
feb 19 15:00:50 tonyhome systemd-fsck[599]: fsck failed with error code 8.
feb 19 15:00:50 tonyhome systemd-fsck[599]: Ignoring error.
feb 19 15:00:50 tonyhome mount[606]: mount: /dev/md126 is already mounted or /home busy
feb 19 15:00:50 tonyhome kernel: EXT4-fs (sda1): mounted filesystem with ordered data mode. Opts: (null)
feb 19 15:00:50 tonyhome kernel: SELinux: initialized (dev sda1, type ext4), uses xattr
feb 19 15:00:50 tonyhome kernel: md: export_rdev(sdc)
feb 19 15:00:50 tonyhome kernel: md: export_rdev(sdb)
feb 19 15:00:50 tonyhome kernel: md: md126 switched to read-write mode.
feb 19 15:00:51 tonyhome kernel: input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input17
feb 19 15:00:51 tonyhome kernel: input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input18
feb 19 15:00:51 tonyhome kernel: input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input19
feb 19 15:00:51 tonyhome kernel: input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/0000:03:00.1/sound/card2/input20
feb 19 15:00:51 tonyhome fedora-storage-init[627]: Setting up Logical Volume Management:   No volume groups found
feb 19 15:00:51 tonyhome fedora-storage-init[627]: [  OK  ]
feb 19 15:00:51 tonyhome fedora-storage-init[635]: Setting up Logical Volume Management:   No volume groups found
feb 19 15:00:51 tonyhome fedora-storage-init[635]: [  OK  ]
feb 19 15:00:51 tonyhome lvm[642]: No volume groups found
feb 19 15:00:51 tonyhome auditd[645]: Started dispatcher: /sbin/audispd pid: 648
...skipping...

Afterwards I can log in to the system as root and manually run mount -a, which mounts the raid-1 volume on /home normally.

Feb 19 15:01:38 tonyhome kernel: [   55.531726] EXT4-fs (md126): mounted filesystem with ordered data mode. Opts: (null)

From there on the system works normally until next reboot...

I don't know whether this issue is still related to this bug or it is about something else.
Thank you for analyzing!
Comment 51 Doug Ledford 2013-02-19 09:32:39 EST
Tony, since your problem is occurring with the latest software, and given that your problem (now) is not the same as the one in this bug report, I'm cloning just your last comment into a new bug.
Comment 52 Tony Marchese 2013-02-19 09:39:06 EST
Sounds good.
Comment 53 Fedora Update System 2013-02-25 21:54:44 EST
dracut-024-25.git20130205.fc18, mdadm-3.2.6-14.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
Comment 54 Fedora Update System 2013-04-23 07:27:04 EDT
mdadm-3.2.6-18.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/mdadm-3.2.6-18.fc18
Comment 55 Fedora Update System 2013-04-24 09:54:20 EDT
mdadm-3.2.6-19.fc18 has been submitted as an update for Fedora 18.
https://admin.fedoraproject.org/updates/mdadm-3.2.6-19.fc18
Comment 56 Fedora Update System 2013-07-04 22:11:16 EDT
mdadm-3.2.6-19.fc18 has been pushed to the Fedora 18 stable repository.  If problems still persist, please make note of it in this bug report.
