Bug 1092937

Summary:

"mdadm --stop" of the root device takes a loooong time

Product:

[Fedora] Fedora

Reporter:

Harald Reindl <h.reindl>

Component:

mdadm

Assignee:

Jes Sorensen <Jes.Sorensen>

Status:

CLOSED CURRENTRELEASE

QA Contact:

Fedora Extras Quality Assurance <extras-qa>

Severity:

unspecified

Docs Contact:

Priority:

unspecified

Version:

CC:

agk, amigo.elite, dledford, dracut-maint-list, harald, jblawn, Jes.Sorensen, jonathan, sergio

Target Milestone:

---

Keywords:

Reopened

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2014-06-12 17:51:05 UTC

Type:

Bug

Attachments:

Description	Flags
photo of hanging shutdown	none
photo with echo "rd.debug" >> /run/initramfs/etc/cmdline.d/debug.conf	none
strace -fittryTv -s 111111 mdadm $_offroot -vv --stop --scan	none

Description Harald Reindl 2014-04-30 09:52:08 UTC

It looks like with dracut-037-11.git20140402.fc20.x86_64 the shutdown sometimes hangs with the last message "disassembling mdraid devices", at least in the combination dracut-037 / kernel 3.14 with /boot, rootfs and data on RAID1/RAID10.

It does not happen every time, otherwise I would not have given positive karma, but it happens far too often given that I never faced this issue in the past.

Comment 1 Harald Reindl 2014-05-01 18:39:30 UTC

Also, several (random) services sometimes take very long to stop at shutdown, some of them until the systemd timeout hits; I recently saw this on a virtual machine as well.

This sounds similar to https://bugzilla.redhat.com/show_bug.cgi?id=1073714

Comment 2 Sergio Basto 2014-05-02 20:45:40 UTC

I downgraded dracut:

rpm -q dracut
dracut-034-64.git20131205.fc20.x86_64

but that didn't fix the problem:

Mai 02 21:34:17 x systemd[1]: session-1.scope stopping timed out. Killing. 

BTW, the first time this happened was:
Abr 21 05:20:43 x systemd[1]: session-1.scope stopping timed out. Killing.

Comment 3 Harald Reindl 2014-05-07 10:47:35 UTC

Created attachment 893203 [details]
photo of hanging shutdown

Look at the picture - that is 100% for sure dracut.
what is it waiting for?

* all filesystems are unmounted
* swap is disabled
* raid devices are detached
* what is left to "disassemble mdraid" for minutes, hours and sometimes forever?

Frankly, by now I can reproduce this on 8 out of 10 shutdowns of my office workstation - the kernel is certainly alive, because CTRL+ALT+PRINT+S responds with "SysRq: Emergency Sync" and "Emergency Sync complete".

Comment 4 Sergio Basto 2014-05-07 16:42:04 UTC

Yeah, my feeling is that the difference between dracut-034 and dracut-037 is that the timeout is shorter in 034, so we don't see the 60-second wait, but the timeout is still there. That is what my test suggests; I am not sure about it.

Comment 5 Harald Reindl 2014-05-07 16:54:29 UTC

I WOL'ed my workstation in the office on Sunday to sync some data and typed "systemctl poweroff" around 15:00; 19 hours later "disassemble mdraid" was still on the screen and the machine had not powered off. That can hardly be called a "timeout" :-)

Comment 6 Sergio Basto 2014-05-07 17:17:24 UTC

In my case I just have a service timeout, not a hang: after one or two minutes at most, the system proceeds and shuts down without problems.

I saw in journalctl: systemd[1]: session-1.scope stopping timed out. Killing.

Maybe we have different problems.

Comment 7 Harald Hoyer 2014-05-12 10:47:15 UTC

(In reply to Harald Reindl from comment #5)
> I WOL'ed my workstation in the office on Sunday to sync some data and typed
> "systemctl poweroff" around 15:00; 19 hours later "disassemble mdraid" was
> still on the screen and the machine had not powered off. That can hardly be
> called a "timeout" :-)

This should be fixed with
http://git.kernel.org/cgit/boot/dracut/dracut.git/commit/?id=4e58a1ffc760e5c54e6cae5924a2439cae196848

Comment 8 Harald Hoyer 2014-05-12 10:50:04 UTC

(In reply to Harald Hoyer from comment #7)
> This should be fixed with
> http://git.kernel.org/cgit/boot/dracut/dracut.git/commit/?id=4e58a1ffc760e5c54e6cae5924a2439cae196848

s/should/might/

Comment 9 Harald Reindl 2014-05-12 10:52:25 UTC

Can we have an update for F20, or at least a scratch build? I had the same thing today (WOL'ed my workstation on Saturday from home and applied updates to keep both machines in sync) and needed to call the office to hard power off my computer so I could start syncing yesterday's work while heading to the subway :-(

Comment 10 Harald Reindl 2014-05-12 20:10:40 UTC

Since there is still no build, some additional info:

After blindly hammering CTRL+ALT+PRINT+S and other disabled or invalid SysRq combinations, the output changes from "Disassembling mdraid devices" to "shutdown : line 90: 6035 quit" and "dracut: Waiting for mdraid devices to be clean"; continuing to hammer SysRq combinations leads to a reboot.

Well, a nice workaround if you are in front of the physical machine, but unacceptable when rebooting remote machines hundreds of miles away with nobody there to call for a hard power off.

So this should be handled as *very critical*.

Shutdown / reboot in general is a problem in F20, be it systemd or dracut.
Not so long ago systemd froze ssh clients; now it blindly kills processes like VMware guests that are supposed to be suspended before shutdown. That was perfectly clean in F19, and it only started to break a long time after the switch to systemd in F15. Don't get me wrong, but the "optimizations" of the last months leave a bad taste in the mouth of admins looking for the rock-stable systems they were used to.

Comment 11 Harald Hoyer 2014-05-13 10:26:05 UTC

(In reply to Harald Reindl from comment #10)

Oh, while you are at it. Can you debug the shutdown, so that we can fix the real culprit. It could be dracut or mdadm.

    info "Waiting for mdraid devices to be clean."
    mdadm $_offroot -vv --wait-clean --scan| vinfo

Please follow:
https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#debugging-dracut-on-shutdown

Comment 12 Harald Hoyer 2014-05-13 10:30:54 UTC

(In reply to Harald Hoyer from comment #11)
> Oh, while you are at it. Can you debug the shutdown, so that we can fix the
> real culprit. It could be dracut or mdadm.
> 
>     info "Waiting for mdraid devices to be clean."
>     mdadm $_offroot -vv --wait-clean --scan| vinfo
> 
> Please follow:
> https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#debugging-dracut-on-shutdown

oh, and you might want to add "rd.debug"

# echo "rd.debug" >> /run/initramfs/etc/cmdline.d/debug.conf

Comment 13 Harald Reindl 2014-05-13 11:05:07 UTC

Where is the debug information supposed to be stored after a reboot, if I run echo "rd.debug" >> /run/initramfs/etc/cmdline.d/debug.conf beforehand?

As you can see in the photo, all filesystems are already unmounted at this point.

Comment 14 Harald Reindl 2014-05-13 12:28:32 UTC

Well, I created the script below and rebooted the machine 10 times. I see debug messages for a very short time and the machine reboots successfully without *any* delay - so the behavior with debugging on is completely different, and it is pretty clearly dracut itself.

[root@rh:~]$ cat /scripts/dracut-debug.sh 
#!/usr/bin/bash
mkdir -p /run/initramfs/etc/cmdline.d/
echo "rd.debug" >> /run/initramfs/etc/cmdline.d/debug.conf

Comment 15 Harald Hoyer 2014-05-13 13:06:00 UTC

(In reply to Harald Reindl from comment #14)
> Well, I created the script below and rebooted the machine 10 times. I see
> debug messages for a very short time and the machine reboots successfully
> without *any* delay - so the behavior with debugging on is completely
> different, and it is pretty clearly dracut itself.
> 
> [root@rh:~]$ cat /scripts/dracut-debug.sh 
> #!/usr/bin/bash
> mkdir -p /run/initramfs/etc/cmdline.d/
> echo "rd.debug" >> /run/initramfs/etc/cmdline.d/debug.conf

and the reverse is true also, that it stalls, if you don't do that?

alt-sysrq-t should display the running processes.
alt-sysrq-t should display the processes in 'D' state.

Maybe mdadm resolved the "clean" state, if there is some time in between the commands?

Comment 16 Harald Hoyer 2014-05-13 13:07:01 UTC

(In reply to Harald Hoyer from comment #15)
> alt-sysrq-t should display the processes in 'D' state.

sorry, "w" according to http://en.wikipedia.org/wiki/Magic_SysRq_key

alt-sysrq-w should display the processes in 'D' state.
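
For anyone following along from a shell instead of the console keyboard, here is a minimal sketch of the same check. It assumes procps "ps" and "awk" are available and that the system is still responsive enough to run them, which may not hold at the point where the shutdown hangs. The function name filter_d_state is made up for illustration.

```shell
# Hypothetical helper (not part of dracut or mdadm): keep only tasks whose
# STAT column starts with 'D' (uninterruptible sleep).
# Expects "PID STAT COMMAND" lines on stdin, with a header line first.
filter_d_state() {
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Usage (not executed here):
#   ps -eo pid,stat,comm | filter_d_state
# The non-keyboard equivalent of alt-sysrq-w, as root:
#   echo w > /proc/sysrq-trigger    # output lands in the kernel log
```

The filter only reads text, so it can also be run later over a saved "ps" listing.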

Comment 17 Harald Reindl 2014-05-13 13:12:37 UTC

> and the reverse is true also, that it stalls, if you don't do that?

8 out of 10 times, and mostly when I want to reboot a remote machine; that's why I wrote this bug report.

What about your comment https://bugzilla.redhat.com/show_bug.cgi?id=1092937#c8 and a scratch build? In the time we have spent discussing and trying an end-user debug initrd, I could have made 50 reboots on two physical machines between yesterday and now.

Comment 18 Harald Reindl 2014-05-13 13:25:46 UTC

BTW: did you look at the photo i attached some days ago? https://bugzilla.redhat.com/attachment.cgi?id=893203

* All file systems unmounted
* All swaps deactivated
* All loop devices detached
* All DM devices detached
* Storage is finalized

i don't get what is there to wait for?
there is nothing left to do and no unwritten data at all

Comment 19 Harald Hoyer 2014-05-13 15:33:44 UTC

(In reply to Harald Reindl from comment #18)
> BTW: did you look at the photo i attached some days ago?
> https://bugzilla.redhat.com/attachment.cgi?id=893203
> 
> * All file systems unmounted
> * All swaps deactivated
> * All loop devices detached
> * All DM devices detached
> * Storage is finalized
> 
> i don't get what is there to wait for?
> there is nothing left to do and no unwritten data at all

same thing.. most likely hanging in:

  mdadm  --wait-clean --scan

And because there is no pidof() involved in the shutdown anymore, I don't think this has anything to do with comment 8.

Comment 20 Harald Hoyer 2014-05-13 16:15:12 UTC

What you can do to test, if this is the culprit:

edit
/usr/lib/dracut/modules.d/90mdraid/md-shutdown.sh

comment out the wait-clean and recreate the initramfs with:

# dracut -f

and see, if it still hangs.
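
A minimal sketch of the "comment out the wait-clean" step, for those who prefer not to edit by hand. It assumes the script contains the line quoted in comment 11 (mdadm $_offroot -vv --wait-clean --scan| vinfo); the function name disable_wait_clean is hypothetical. It only prints the modified script to stdout and never touches the original file.

```shell
# Hypothetical helper: print the given script with every not-yet-commented
# line that mentions --wait-clean prefixed by '#'.
disable_wait_clean() {
    sed -e '/^[[:space:]]*#/!s/^\(.*--wait-clean.*\)$/#\1/' "$1"
}

# Usage (not executed here):
#   f=/usr/lib/dracut/modules.d/90mdraid/md-shutdown.sh
#   disable_wait_clean "$f" > /tmp/md-shutdown.sh
#   diff -u "$f" /tmp/md-shutdown.sh    # review before copying it back
```

After reviewing the diff and copying the result back, the initramfs still has to be recreated with "dracut -f" as described above.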

Comment 21 Harald Hoyer 2014-05-13 16:16:46 UTC

(In reply to Harald Hoyer from comment #20)
> What you can do to test, if this is the culprit:
> 
> edit
> /usr/lib/dracut/modules.d/90mdraid/md-shutdown.sh
> 
> comment out the wait-clean and recreate the initramfs with:
> 
> # dracut -f
> 
> and see, if it still hangs.

you can also "echo" some debug messages around it, which should go to the console

Comment 22 Harald Reindl 2014-05-13 19:51:10 UTC

Created attachment 895238 [details]
photo with echo "rd.debug" >> /run/initramfs/etc/cmdline.d/debug.conf

OK, now I "managed" to get a photo with debug; hopefully that gives some insight.

yes != yes looks, hmm, strange -> dracut-lib.sh@49

The same (most likely) happened on my remote machine too, while trying to reboot with rd.debug after the update to kernel-3.14.4 from koji. I thought "damned thing knows when I am far away from the office, well, photo tomorrow", but that one decided to finish the reboot after around 20 minutes.

Comment 23 Harald Hoyer 2014-05-14 14:34:02 UTC

ok, so the machine hangs in "mdadm -vv --stop --scan"

ignore the yes != yes... that is only from info()/vinfo(), which should pipe the output to the log file and the console.

reassigning to mdadm.

Don't know what changed in mdadm or the kernel.

The last change in md-shutdown was in 2012, so I don't think this is a dracut bug.

Comment 24 Harald Reindl 2014-05-14 14:52:21 UTC

hmm - the first Kernel 3.14 and dracut-37 arrived here at the same time

Maybe https://bugzilla.redhat.com/show_bug.cgi?id=1096414 (raid-check with 3.14 freezes machine) has a common root cause that I don't understand right now.

Comment 25 Vladimir Stackov 2014-05-31 21:32:43 UTC

Created attachment 901132 [details]
strace -fittryTv -s 111111 mdadm $_offroot -vv --stop --scan

Comment 26 Vladimir Stackov 2014-05-31 21:34:29 UTC

I've run into the same issue.

dracut-037-11.git20140402.fc20.x86_64
mdadm-3.3-4.fc20.x86_64
Linux version 3.14.4-200.fc20.x86_64 (mockbuild@bkernel02) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #1 SMP Tue May 13 13:51:08 UTC 2014

Here is the output from a modified /usr/lib/dracut/modules.d/90mdraid/md-shutdown.sh in which mdadm $_offroot -vv --stop --scan runs under strace -fittryTv -s 111111: https://bugzilla.redhat.com/attachment.cgi?id=901132

Comment 27 Harald Reindl 2014-06-03 19:02:52 UTC

https://bugzilla.redhat.com/show_bug.cgi?id=1096414
https://bugzilla.redhat.com/show_bug.cgi?id=1092937

*both* seem to be fixed with 3.14.5-200.fc20.x86_64,
although I am unable to find the relevant change in
the kernel upstream changelog

however, I rebooted my workstation 40 times after the
update from koji and did 4 raid-check runs without any
freeze on two different machines

if it comes back I will re-open this bug

Comment 28 Harald Reindl 2014-06-05 12:08:49 UTC

Re-opened: my co-developer's machine was hanging the whole night at "disassembling raid devices" and only decided to shut down after 6 SYSRQ+S (emergency sync) - unacceptable in the case of remote machines.

Comment 29 Harald Reindl 2014-06-07 23:50:15 UTC

The two commits below smell related:
https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.14.6
____________________________________________________________________________________________

commit 0bc4091108e8f2e65faef3082e5261f2c35cd2b4
Author: NeilBrown <neilb>
Date:   Tue May 6 09:36:08 2014 +1000

    md: avoid possible spinning md thread at shutdown.
    
    commit 0f62fb220aa4ebabe8547d3a9ce4a16d3c045f21 upstream.
    
    If an md array with externally managed metadata (e.g. DDF or IMSM)
    is in use, then we should not set safemode==2 at shutdown because:
    
    1/ this is ineffective: user-space need to be involved in any 'safemode' handling,
    2/ The safemode management code doesn't cope with safemode==2 on external metadata
       and md_check_recover enters an infinite loop.
    
    Even at shutdown, an infinite-looping process can be problematic, so this
    could cause shutdown to hang.
    
    Signed-off-by: NeilBrown <neilb>
    Signed-off-by: Greg Kroah-Hartman <gregkh>
____________________________________________________________________________________________

commit 8c7311a1c4a8d804bde91b00a2f2c1a22a954c30
Author: NeilBrown <neilb>
Date:   Mon May 5 13:34:37 2014 +1000

    md/raid10: call wait_barrier() for each request submitted.
    
    commit cc13b1d1500656a20e41960668f3392dda9fa6e2 upstream.
    
    wait_barrier() includes a counter, so we must call it precisely once
    (unless balanced by allow_barrier()) for each request submitted.
    
    Since
    commit 20d0189b1012a37d2533a87fb451f7852f2418d1
        block: Introduce new bio_split()
    in 3.14-rc1, we don't call it for the extra requests generated when
    we need to split a bio.
    
    When this happens the counter goes negative, any resync/recovery will
    never start, and  "mdadm --stop" will hang.
    
    Reported-by: Chris Murphy <lists>
    Fixes: 20d0189b1012a37d2533a87fb451f7852f2418d1
    Cc: Kent Overstreet <kmo>
    Signed-off-by: NeilBrown <neilb>
    Signed-off-by: Greg Kroah-Hartman <gregkh>
____________________________________________________________________________________________
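
The counter problem described in the second commit can be pictured with a toy model. This is only an illustration under stated assumptions, not kernel code: "pending" stands in for the barrier counter, each increment stands in for wait_barrier(), each decrement for the matching allow_barrier() on completion, and the function name simulate_pending is made up.

```shell
# Toy model (hypothetical, not kernel code) of a request split into PIECES
# sub-requests. FIXED=0 models the behaviour described as buggy (only the
# first piece takes the barrier); FIXED=1 models one wait per piece.
simulate_pending() (
    pieces=$1
    fixed=$2
    pending=0
    i=0
    while [ "$i" -lt "$pieces" ]; do
        if [ "$fixed" -eq 1 ] || [ "$i" -eq 0 ]; then
            pending=$((pending + 1))    # stands in for wait_barrier()
        fi
        pending=$((pending - 1))        # stands in for allow_barrier()
        i=$((i + 1))
    done
    echo "$pending"
)

simulate_pending 3 0    # unbalanced: prints -2
simulate_pending 3 1    # balanced: prints 0
```

In this model an unsplit request stays balanced either way, which would fit the hang showing up only on some shutdowns; that reading is an inference from the commit message, not something confirmed in this report.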

Comment 30 Harald Reindl 2014-06-12 17:51:05 UTC

Closed again - 3.14.6 really fixed it. I moved around some TB of data over several days while repeatedly running check/resync on a 4 TB RAID10: no freeze and no hang at shutdown.