Bug 1402073 - "device-mapper: remove ioctl on fedora-root failed: Device or resource busy" on shutdown
Status: NEW
Alias: None
Product: Fedora
Classification: Fedora
Component: plymouth   
Version: 28
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: low
Target Milestone: ---
Assignee: Ray Strode [halfline]
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
 
Reported: 2016-12-06 17:33 UTC by Alexander Korsunsky
Modified: 2018-09-27 14:47 UTC
CC List: 42 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
Screenshot (539.87 KB, image/jpeg), 2017-02-04 12:03 UTC, Arnaud Kleinveld
Contains the list of updated packages. (8.62 KB, text/plain), 2017-02-11 13:13 UTC, blachniom (flag: review-)
report from dracut (12.50 KB, text/plain), 2017-06-27 14:52 UTC, Henrique Martins
output of ps -axel (10.06 KB, text/plain), 2017-06-27 15:19 UTC, Henrique Martins
Initramfs pre-shutdown shell log (14.36 KB, text/plain), 2017-12-06 08:52 UTC, blachniom
Sleepy lazy debugging patch to dracut's shutdown.sh (1.56 KB, patch), 2018-02-09 16:38 UTC, Ulrik Dickow

Description Alexander Korsunsky 2016-12-06 17:33:46 UTC
Description of problem:
When shutting down, the following message appears on the screen:
device-mapper: remove ioctl on fedora-root failed: Device or resource busy



Version-Release number of selected component (if applicable):
044-78.fc25.x86_64


How reproducible:


Steps to Reproduce:
1. Shut system down
2. Observe the screen

Actual results:
The following message appears on the screen:
device-mapper: remove ioctl on fedora-root failed: Device or resource busy


Expected results:
No errors appearing on the screen


Additional info:
As suggested in bug 1305831, comment #25
> The systemd/dracut shutdown hook is trying to deactivate a device which is still in use (not an LVM/device-mapper issue).

Please change the component if you feel that dracut is not at fault.

Comment 1 Chris Horn 2016-12-22 14:00:45 UTC
Since Fedora 24 I have these messages on all of my PCs - it's annoying.

I found out that if you comment out ("#") the first line in "/etc/crypttab", the messages are gone both on screen and in "journalctl".

If you shut down or reboot with the --force option, the messages don't appear on screen or in journalctl.

Comment 2 Alexander Korsunsky 2017-01-09 13:37:27 UTC
(In reply to Chris Horn from comment #1)
> Since Fedora 24 I have these messages on all of my PCs - it's annoying.
> 
> I found out that if you comment out ("#") the first line in "/etc/crypttab",
> the messages are gone both on screen and in "journalctl".

Cannot confirm this workaround. I commented out the first line, and the messages still appear. Also, my crypttab is empty except for one empty line.


> If you shut down or reboot with the --force option, the messages don't appear on screen or in journalctl.

This I can confirm.

Comment 3 Chris Horn 2017-01-10 11:22:40 UTC
Okay.

I see that it doesn't work on my other machine either.

I tried masking the plymouth services.
I masked every plymouth service except plymouth-start.service and plymouth-read-write.service and the messages are gone; maybe it helps you too.

Best regards
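
For anyone who wants to repeat the masking experiment described above, here is a minimal sketch. Only the two unit names that were left unmasked come from the comment. The helper name units_to_mask and the use of "systemctl list-unit-files" to enumerate the units are assumptions, and the set of plymouth units differs between releases, so review the list before masking anything.

```shell
#!/bin/sh
# units_to_mask (hypothetical helper): read unit names on stdin (first field
# of each line) and print every plymouth service except the two that the
# comment above left unmasked.
units_to_mask() {
    while read -r unit _rest; do
        case "$unit" in
            plymouth-start.service|plymouth-read-write.service) ;;  # keep these
            plymouth*.service) echo "$unit" ;;
        esac
    done
}

# Usage (as root):
#   systemctl list-unit-files --no-legend 'plymouth*' | units_to_mask
#   systemctl list-unit-files --no-legend 'plymouth*' | units_to_mask | xargs -r systemctl mask
# Undo with "systemctl unmask <unit>" for each masked unit.
```

The filter is kept separate from the "systemctl mask" call so that the list can be inspected first.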

Comment 4 Harald Hoyer 2017-01-12 13:14:52 UTC
Please check whether this is a dracut issue by following the "Debugging dracut on shutdown" section:

https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#debugging-dracut-on-shutdown

Logs via serial console would of course be best.

Or photos of the screen. You may be able to scroll up to the error message with <shift>+<pageup>.

Comment 5 Harald Hoyer 2017-01-30 09:54:08 UTC
(In reply to Alexander Korsunsky from comment #2)
> (In reply to Chris Horn from comment #1)
> > Since Fedora 24 I have these messages on all of my PCs - it's annoying.
> > 
> > I found out that if you comment out ("#") the first line in "/etc/crypttab",
> > the messages are gone both on screen and in "journalctl".
> 
> Cannot confirm this workaround. I commented out the first line, and the messages
> still appear. Also, my crypttab is empty except for one empty line.
> 
> 
> > If you shut down or reboot with the --force option, the messages don't appear on screen or in journalctl.
> 
> This I can confirm.

--force will give you an unclean shutdown, please don't do that!

Comment 6 Arnaud Kleinveld 2017-02-04 12:03 UTC
Created attachment 1247677 [details]
Screenshot

Quality is unfortunately not that good. I had to be fast as the message is only visible for less than a second.

Comment 7 blachniom 2017-02-04 13:08:41 UTC
Product: 	Fedora
Version: 	25
Hardware: 	x86_64 Linux on SSD SATA drive
Kernel:         4.9.6-200.fc25.x86_64

Hi,
I can confirm the problem without LUKS-encrypted partitions.
The problem did not start right after I upgraded from 24 to 25; it took some more updates.
The message is nearly identical to the ones attached by the OP. The FS is still in use on shutdown. The screen blanks, but the computer does not power off. At all.

I managed to shut down properly after first suspending the computer.
This reminds me a bit of an SSD being locked by the OS when trying to do a Secure Erase: you need to suspend the OS first, and only then is the SSD ready to be erased.

Please let me know if you need more info.

Regards
Smirk

Comment 8 blachniom 2017-02-11 13:13 UTC
Created attachment 1249266 [details]
Contains the list of updated packages.

One (or more) of the packages solves the problem of the machine not shutting down.

Comment 9 Sebastien Chapuis 2017-04-14 20:08:23 UTC
I have also had this problem since Fedora 24.
Has anyone tried this? https://help.onapp.com/hc/en-us/articles/222048088-Workarould-for-Device-mapper-remove-ioctl-on-failed-Device-or-resource-busy-issue

Comment 10 blachniom 2017-04-15 15:29:47 UTC
Yup, just did. The messages still appear.

Comment 11 Harald Hoyer 2017-05-18 13:54:03 UTC
(In reply to Arnaud Kleinveld from comment #6)
> Created attachment 1247677 [details]
> Screenshot
> 
> Quality is unfortunately not that good. I had to be fast as the message is
> only visible for less than a second.

strange... something is keeping your root device open.

please try:

To debug the shutdown sequence on systemd systems, you can rd.break on pre-shutdown or shutdown.

To do this from an already booted system:

# mkdir -p /run/initramfs/etc/cmdline.d
# echo "rd.debug rd.break=pre-shutdown rd.break=shutdown" > /run/initramfs/etc/cmdline.d/debug.conf
# touch /run/initramfs/.need_shutdown

This will give you a dracut shell after the system has pivoted back into the initramfs.

You might see some leftover processes with
# ps ax

or find messages by scrolling up the console with <shift>+<pageup>
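
Beyond "ps ax", one way to narrow down what keeps the root device open is to scan /proc for processes whose working directory, root, executable or open files still point below /oldroot. The following is a minimal sketch and not part of dracut: the function name find_holders is made up for illustration, and it assumes that /proc is mounted and that readlink is available in the initramfs shell. It does not see memory-mapped files or references held inside the kernel (for example nested mounts), so an empty result does not prove that the device is free.

```shell
#!/bin/sh
# find_holders [PREFIX] (hypothetical helper): print "PID KIND TARGET" for
# every process that still references a path at or below PREFIX.
# PREFIX defaults to /oldroot, where dracut keeps the old root on shutdown.
find_holders() {
    prefix=${1:-/oldroot}
    for p in /proc/[0-9]*; do
        pid=${p#/proc/}
        for link in "$p/cwd" "$p/root" "$p/exe" "$p"/fd/*; do
            # readlink fails for kernel threads, unreadable entries and
            # processes that have already exited; skip those
            target=$(readlink "$link" 2>/dev/null) || continue
            kind=${link#"$p"/}
            case "$target" in
                "$prefix"|"$prefix"/*) echo "$pid $kind $target" ;;
            esac
        done
    done
    return 0
}

# Usage in the dracut shutdown shell:
#   find_holders /oldroot
```

Each output line names the PID, the kind of reference (cwd, root, exe or fd/N) and the path, which can then be matched against the "ps ax" output.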

Comment 12 Henrique Martins 2017-05-19 15:03:50 UTC
Not completely sure but:
- I only saw this on the only machine I have with an SSD disk, a laptop, not on any of my other boxes with spinning hard disks.
- It seems (rebooted a couple of times to check) that it may be gone with the sssd related rpms I installed yesterday, all tagged 1.15.2-2.fc25.x86_64.
If I see it again I'll post back here.

Comment 13 gesserat 2017-06-26 12:24:01 UTC
Can confirm that on F26 Beta. Using two drives, one SSD plus one HDD.
Running "halt" as superuser can help in gathering evidence. The shutdown dmesg is full of "Kernel not configured to use semaphores (System V IPC), not using udev rules" followed by the lines from the original report for the LUKS sub-volumes.
It does slow shutdown and restart times due to console output. Not very significant, but still.

Comment 14 Henrique Martins 2017-06-27 14:49:06 UTC
Contrary to what I said in comment #12, the problem is still there; it just doesn't show up at every single reboot, and I'm not sure what I do differently with this laptop under Linux to trigger it or not.

I followed the instructions above to drop into a dracut shell. The root file system is still mounted on /oldroot, read-only. I was able to save rdsosreport.txt, which I'll attach, but there must be a watchdog timeout of some sort, as the system rebooted before I could run ps.

Comment 15 Henrique Martins 2017-06-27 14:52 UTC
Created attachment 1292370 [details]
report from dracut

Attaching rdsosreport.txt created by the process described in comment #11

Comment 16 Henrique Martins 2017-06-27 15:19 UTC
Created attachment 1292374 [details]
output of ps -axel

Was able to repeat and capture the output of ps -axel.
Not sure which process is holding the root fs.
Note that this was not captured at the same time as the previous rdsosreport, but conditions should be very similar.

Comment 17 Henrique Martins 2017-06-27 15:30:01 UTC
(In reply to Harald Hoyer from comment #11)
> strange... something is keeping your root device open.

Not sure whether relevant, but when in the dracut shell I can unmount /oldroot

Comment 18 Harald Hoyer 2017-06-29 11:39:08 UTC
(In reply to Chris Horn from comment #3)
> Okay.
> 
> I see that it doesn't work on my other machine either.
> 
> I tried masking the plymouth services.
> I masked every plymouth service except plymouth-start.service and
> plymouth-read-write.service and the messages are gone; maybe it helps you too.
> 
> Best regards

ok, next wild guess:

Does it help if you add "plymouth.enable=0" on the kernel command line?
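
On Fedora, one way to try a kernel command-line argument persistently is grubby. This is a minimal sketch, assuming grubby manages the boot entries on the machine; the wrapper kernel_arg and the DRY_RUN variable are made up here so that the command can be inspected before it is run. For a one-off test the argument can instead be appended by editing the entry in the boot menu.

```shell
#!/bin/sh
# kernel_arg add|remove ARG (hypothetical wrapper): add or remove one kernel
# command-line argument on all installed kernels via grubby.
# With DRY_RUN=1 the grubby command is only printed, not executed.
kernel_arg() {
    case "$1" in
        add)    opt=--args ;;
        remove) opt=--remove-args ;;
        *) echo "usage: kernel_arg add|remove ARG" >&2; return 2 ;;
    esac
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "grubby --update-kernel=ALL $opt=$2"
    else
        grubby --update-kernel=ALL "$opt=$2"
    fi
}

# Examples (as root):
#   kernel_arg add plymouth.enable=0      # try the suggestion above
#   kernel_arg remove plymouth.enable=0   # undo it again
```

The change only takes effect on the next boot, so at least one full reboot and shutdown cycle is needed before judging the result.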

Comment 19 Alexander Korsunsky 2017-07-11 11:08:01 UTC
(In reply to Harald Hoyer from comment #18)
> ok, next wild guess:
> 
> Does it help if you add "plymouth.enable=0" on the kernel command line?

It does in fact help, yes! No more device-mapper messages.

However, considering the main grievance for me was the "ugliness" of the boot/shutdown process, disabling plymouth entirely is not a very satisfactory workaround.

I hope this can be resolved without disabling plymouth.

Comment 20 Henrique Martins 2017-07-11 15:56:20 UTC
Seems to work after three reboots on the laptop where I had this problem.

Disabling plymouth on that particular box is not that "ugly", as it boots so fast that the plymouth screen was up for less than a second anyway, and now shuts down just as fast. 

 (And I have two old boxes where plymouth always drops down to the text mode bar, thus I dnf erased it).

Move this bug to plymouth?

Comment 21 Harald Hoyer 2017-07-12 08:53:43 UTC
Yeah, plymouth needs to exit earlier in the shutdown process it seems.

Comment 22 Alexander Korsunsky 2017-07-12 11:38:52 UTC
Still present in Fedora 26.

Comment 23 Sebastien Chapuis 2017-07-14 21:58:09 UTC
I removed the kernel parameter rhgb and it fixed the problem.

Comment 24 Rafael José 2017-12-06 03:22:24 UTC
I've had this problem since Fedora 25.
I've been updating Fedora since then and the problem still persists on 27.

Comment 25 blachniom 2017-12-06 08:50:55 UTC
Confirmed, x86_64 F27 still suffers from it.

Removing "rhgb" had no effect at all as I didn't use it when the problem appeared.

"plymouth.enable=0" also did not resolve the issue for me. I followed the steps described in comment #11.

Attaching my "rdsosreport.txt".
When dropped to the shell, I wasn't able to umount /oldroot normally. I needed to do it "the lazy way", as the normal umount exited with "Device is busy".

Regards
Smirk

Comment 26 blachniom 2017-12-06 08:52 UTC
Created attachment 1363549 [details]
Initramfs pre-shutdown shell log

Comment 27 h3dkandi 2017-12-26 12:42:41 UTC
On fedora 27 disabling plymouth resolved the issue for me. I used this command to disable it:

# plymouth-set-default-theme -R details

Now I don't get a pretty encrypt password prompt but I can live with that. The error messages were eating at my soul every time I shut down the laptop.

Comment 28 h3dkandi 2017-12-26 12:56:54 UTC
(In reply to h3dkandi from comment #27)
> On fedora 27 disabling plymouth resolved the issue for me. I used this
> command to disable it:
> 
> # plymouth-set-default-theme -R details
> 
> Now I don't get a pretty encrypt password prompt but I can live with that.
> The error messages were eating at my soul every time I shut down the laptop.

Actually the problem is still here. I don't know if I have deluded myself or if only the first shutdown after the change didn't display the error.

Comment 29 Jedrzej Nowak 2017-12-27 17:00:26 UTC
I'm having the same problem.

Fedora 27,  no encryption, "just" lvm.

I'm pretty sure that it started to appear when I upgraded from 25 to 26, and it still persists on 27.

Comment 30 Dio Putra 2018-01-17 02:29:55 UTC
Sometimes this happens after my encrypted Debian GNU/Linux has been unmounted successfully in a chrooted environment atop Slackware, even though my Debian GNU/Linux was mounted with "-o bind" and unmounted with the -R option.

Comment 31 Ulrik Dickow 2018-02-09 16:38 UTC
Created attachment 1393856 [details]
Sleepy lazy debugging patch to dracut's shutdown.sh

The patch adds short 0.2/0.5 second sleeps between umount/device-mapper retries,
plus last-resort lazy umount with long sleep if all 41 previous umount attempts failed,
plus extra warnings and sleeps for easier further debugging of any remaining problems.
Rebuild initramfs with dracut after applying the patch (possibly after reducing/removing the very last sleep if you think it's too long).
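
The retry-with-sleep idea can be sketched in plain shell as follows. This is not the attached patch and not dracut's shutdown.sh: the helper name retry_with_sleep is hypothetical, and the fractional sleep values assume a sleep implementation that accepts them.

```shell
#!/bin/sh
# retry_with_sleep MAX DELAY CMD [ARGS...] (hypothetical helper):
# run CMD up to MAX times, sleeping DELAY seconds after each failed attempt
# except the last. Returns 0 on the first success, 1 if every attempt failed.
retry_with_sleep() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        [ "$attempt" -lt "$max" ] && sleep "$delay"
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage sketch with the numbers mentioned above (umount_a is the dracut
# function named in this bug; the fallback only runs if all attempts fail):
#   retry_with_sleep 41 0.2 umount_a || { sleep 10; umount -ARdl /oldroot; sleep 20; }
```

Keeping the lazy umount as a separate last resort means the normal path still fails loudly when something genuinely holds the file system.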

On every reboot since the patch I now only get exactly 1 device-mapper warning, i.e. it fails the first time, but then sleeps 0.5 second and always succeeds the second time.  This prevents my user-1000.journal on btrfs from losing all of its extended attributes (bug 1447750).

On the first and third shutdown after the patch, the umount loop also succeeded with only 1 error, and thus a single 0.2 second sleep, before the second try succeeded.  At the second shutdown none of the 41 umount_a attempts worked, so it fell through to the 10 second sleep, 'umount -ARdl /oldroot' and 20 more seconds of sleep; after that, the disassembly of the LUKS device and the underlying logical volume succeeded with 1 device-mapper warning as above.

To further reduce the number of warnings at shutdown -- and so far also to avoid hitting the limit of 41 umount_a failures again -- I now combine the shutdown.sh patch with Salvador Ortiz' nice little work-around script in
https://bugzilla.redhat.com/show_bug.cgi?id=1385432#c150 .  It eliminates all of the SELinux warnings/errors and usually (or always?) the need for lazy umount, but doesn't eliminate the single remaining device-mapper warning (_cnt=1 for the umount_a as well as the shutdown-hook/dm-disassembly loop at every shutdown).

Comment 32 Ulrik Dickow 2018-02-09 21:14:03 UTC
The reason for the 1 remaining device-mapper error is obvious in my case: the 'dmsetup info -c --noheadings -o name' in /usr/lib/dracut/modules.d/90dm/dm-shutdown.sh generates a device name list where my lv vg0-lv2rootall happens to appear before the luks-3ea... device built on top of it.  So on the first dm-shutdown run, the lv disassembly fails, while the luks takedown succeeds.  On the second run, the lv is no longer busy.

Such a layered setup is perfectly normal, the default for encrypted installation actually.  Closing all luks before non-luks would help here, but others may have lv's inside luks or even more layers.  That's why calling dm-shutdown.sh repeatedly makes sense.  (I only reduced the # of calls in my patch to 8+1 to avoid flooding the console with errors; the original 40+1 is ok in general).

I suggest that stderr is thrown away from the 'dmsetup [...] remove' command in at least the first many calls, or more simply all the non-final ones (argument "final" not yet set), because "Device busy" _is_ an expected outcome initially.  (Alternatively grep away only the "Device or resource busy" ones).
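
A minimal sketch of that suggestion: a small wrapper that discards stderr on every pass except the final one. The helper name quiet_unless_final is hypothetical and this is not the actual dm-shutdown.sh code; the removal command is passed in as an argument because its exact form is not spelled out here.

```shell
#!/bin/sh
# quiet_unless_final PASS CMD [ARGS...] (hypothetical helper):
# run CMD and discard its stderr unless PASS is the literal string "final".
# The exit status of CMD is preserved either way, so retry logic still works.
quiet_unless_final() {
    pass=$1
    shift
    if [ "$pass" = final ]; then
        "$@"
    else
        "$@" 2>/dev/null
    fi
}

# Usage sketch inside a shutdown hook that receives "final" as $1 on its
# last call (remove_one stands for whatever removal command the hook uses):
#   quiet_unless_final "$1" remove_one "$dev"
```

This hides all stderr output on the early passes, not only "Device or resource busy", which is the simpler of the two variants suggested above.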

Comment 33 Henrique Martins 2018-02-09 21:22:45 UTC
Obvious in your case, but I don't have any luks partitions on the laptop.
It does have the default Fedora LVM setup, but no luks.

Comment 34 Ulrik Dickow 2018-02-10 07:48:02 UTC
So when you disable plymouth -- or force it to exit earlier and/or add extra sleep -- then you have _zero_ device-mapper errors left, right?  Late plymouth + no sleep => lots of errors for all of us, usually.

Comment 35 Andrej Podzimek 2018-04-07 11:52:17 UTC
I'm still seeing this on Fedora 28. There are just fewer of those messages and hangs are less likely. :-/

Comment 36 Andrej Podzimek 2018-04-07 12:40:23 UTC
A heretical idea: If the whole error message ("Kernel not configured for semaphores (System V IPC). Not using udev synchronisation code.") and occasional freeze problem is SELinux-related, would it be feasible / helpful to put SELinux into permissive mode (only) for that final short moment of the shutdown process? If that step were taken with all services already down, it might not be that much of a risk...

Comment 37 Ondrej Kozina 2018-04-16 09:02:30 UTC
The following service should help you work around issues with stacked device-mapper devices (including both LVM2 and cryptsetup devices). It won't help with umount issues, though (where a process holds an open file descriptor).

systemctl start blk-availability.service

If it does help, you may try enabling the service on boot (systemctl enable...). Not sure why Fedora hasn't enabled the service by default...

Comment 38 Matthew Miller 2018-04-26 13:48:43 UTC
This appears to be at least part of bug #1385432.... I'd like to get it untangled. What are the next steps here?

Comment 39 Harald Hoyer 2018-05-11 09:03:04 UTC
Here is a nice summary of the plymouth "bug" involved: https://bugzilla.redhat.com/show_bug.cgi?id=1575376#c8

