1402073 – "device-mapper: remove ioctl on fedora-root failed: Device or resource busy" on shutdown

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	plymouth
Sub Component:
Version:	28
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	low
Target Milestone:	---
Assignee:	Ray Strode [halfline]
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2016-12-06 17:33 UTC by Alexander Korsunsky
Modified:	2019-08-08 16:18 UTC
CC List:	43 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2019-05-28 22:37:16 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments:
Screenshot (539.87 KB, image/jpeg) 2017-02-04 12:03 UTC, Arnaud Kleinveld
Contains the list of updated packages. (8.62 KB, text/plain) 2017-02-11 13:13 UTC, blachniom (flags: blachniom: review-)
report from dracut (12.50 KB, text/plain) 2017-06-27 14:52 UTC, Henrique Martins
output of ps -axel (10.06 KB, text/plain) 2017-06-27 15:19 UTC, Henrique Martins
Initramfs pre-shutdown shell log (14.36 KB, text/plain) 2017-12-06 08:52 UTC, blachniom
Sleepy lazy debugging patch to dracut's shutdown.sh (1.56 KB, patch) 2018-02-09 16:38 UTC, Ulrik Dickow

Description Alexander Korsunsky 2016-12-06 17:33:46 UTC

Description of problem:
When shutting down, message appears on the screen: 
device-mapper: remove ioctl on fedora-root failed: Device or resource busy



Version-Release number of selected component (if applicable):
044-78.fc25.x86_64


How reproducible:


Steps to Reproduce:
1. Shut system down
2. Observe the screen

Actual results:
Message on the screen appears:
device-mapper: remove ioctl on fedora-root failed: Device or resource busy


Expected results:
No errors appearing on the screen


Additional info:
As suggested in bug 1305831, comment #25
> The systemd/dracut shutdown hook is trying to deactivate a device which is still in use (not an LVM/device-mapper issue).

Please change the component, if you feel that dracut is not at fault.

Comment 1 Chris Horn 2016-12-22 14:00:45 UTC

Since Fedora 24 I have these messages on all of my PCs - it's annoying.

I found out that
if you comment out ("#") the first line in "/etc/crypttab",
the messages are gone on screen and in "journalctl".

If you shut down or reboot with the --force option, the messages don't appear on screen or in journalctl.

Comment 2 Alexander Korsunsky 2017-01-09 13:37:27 UTC

(In reply to Chris Horn from comment #1)
> Since Fedora 24 I have these messages on all of my PCs - it's annoying.
> 
> I found out 
> if you comment ("#") the first line in "/etc/crypttab" 
> the messages are gone on screen and in "journalctl"

Cannot confirm this workaround. I commented the first line, and the messages still appear. Also, my crypttab is empty except for one empty line.


> if you shutdown or reboot with --force option the messages don't appear on screen nor in journalctl.

This I can confirm.

Comment 3 Chris Horn 2017-01-10 11:22:40 UTC

Okay.

I see that it doesn't work on my other machine either.

I tried to mask plymouth services. 
I masked every plymouth service except plymouth-start.service and plymouth-read-write.service and the messages are gone. Maybe it helps you too.

Best regards

Comment 4 Harald Hoyer 2017-01-12 13:14:52 UTC

Please check whether this is a dracut issue by following the "Debugging dracut on shutdown" section:

https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#debugging-dracut-on-shutdown

Logs captured via a serial console would be best, of course.

Photos of the screen also work. You may need to scroll up with <shift>+<pageup> to reach the error message.

Comment 5 Harald Hoyer 2017-01-30 09:54:08 UTC

(In reply to Alexander Korsunsky from comment #2)
> (In reply to Chris Horn from comment #1)
> > Since Fedora 24 I have these messages on all of my PCs - it's annoying.
> > 
> > I found out 
> > if you comment ("#") the first line in "/etc/crypttab" 
> > the messages are gone on screen and in "journalctl"
> 
> Cannot confirm this workaround. I commented the first line, and the messages
> still appear. Also, my crypttab is empty except for one empty line.
> 
> 
> > if you shutdown or reboot with --force option the messages don't appear on screen nor in journalctl.
> 
> This I can confirm.

--force will give you an unclean shutdown, please don't do that!

Comment 6 Arnaud Kleinveld 2017-02-04 12:03:00 UTC

Created attachment 1247677 [details]
Screenshot

Quality is unfortunately not that good. I had to be fast as the message is only visible for less than a second.

Comment 7 blachniom 2017-02-04 13:08:41 UTC

Product: 	Fedora
Version: 	25
Hardware: 	x86_64 Linux on SSD SATA drive
Kernel:         4.9.6-200.fc25.x86_64

Hi,
I can confirm the problem without LUKS-encrypted partitions.
The problem did not appear right after I upgraded from 24 to 25; it took a few more updates.
The message is nearly identical to the ones attached by the OP. The file system is still in use at shutdown. The screen blanks, but the computer does not power off. At all.

I managed to shut down properly after first suspending the computer.
This reminds me a bit of the SSD drive being locked by the OS while trying to do Secure Erase. You need to suspend the OS first, only then is the SSD ready to be erased.

Please let me know if you need more info.

Regards
Smirk

Comment 8 blachniom 2017-02-11 13:13:12 UTC

Created attachment 1249266 [details]
Contains the list of updated packages.

One (or more) of these packages solves the problem of the machine not shutting down.

Comment 9 Sebastien Chapuis 2017-04-14 20:08:23 UTC

I also have this problem since Fedora 24.
Has anyone tried this? https://help.onapp.com/hc/en-us/articles/222048088-Workarould-for-Device-mapper-remove-ioctl-on-failed-Device-or-resource-busy-issue

Comment 10 blachniom 2017-04-15 15:29:47 UTC

Yup, just did. The messages still appear.

Comment 11 Harald Hoyer 2017-05-18 13:54:03 UTC

(In reply to Arnaud Kleinveld from comment #6)
> Created attachment 1247677 [details]
> Screenshot
> 
> Quality is unfortunately not that good. I had to be fast as the message is
> only visible for less than a second.

Strange... something is keeping your root device open.

Please try:

To debug the shutdown sequence on systemd systems, you can set rd.break to pre-shutdown or shutdown.

To do this from an already booted system:

# mkdir -p /run/initramfs/etc/cmdline.d
# echo "rd.debug rd.break=pre-shutdown rd.break=shutdown" > /run/initramfs/etc/cmdline.d/debug.conf
# touch /run/initramfs/.need_shutdown

This will give you a dracut shell after the system has pivoted back into the initramfs.

You might then see some leftover processes with
# ps ax

or find messages by scrolling up the console with <shift>+<pageup>.
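
If "ps ax" alone does not show the culprit, a rough sketch like the one below could list which processes still have files open under a directory such as /oldroot. It assumes /proc is mounted and that readlink is available in the initramfs shell (not verified for every dracut image); list_holders is a made-up helper name. It does not see memory-mapped files or kernel-side users of the device, so an empty result does not prove the device is free.

```sh
# list_holders DIR: print "PID COMM PATH" for every cwd, root, exe or open
# file descriptor of any process that points at DIR or something below it.
# Hypothetical helper, not part of dracut.
list_holders() {
    dir=${1%/}
    for p in /proc/[0-9]*; do
        pid=${p#/proc/}
        # The process may have exited in the meantime; skip it if so.
        comm=$(cat "$p/comm" 2>/dev/null) || continue
        for link in "$p"/cwd "$p"/root "$p"/exe "$p"/fd/*; do
            target=$(readlink "$link" 2>/dev/null) || continue
            case $target in
                "$dir"|"$dir"/*)
                    printf '%s %s %s\n' "$pid" "$comm" "$target" ;;
            esac
        done
    done
}

# In the dracut shell one would run, for example:
#   list_holders /oldroot
```

Run as root, otherwise the fd directories of other users' processes are not readable and are silently skipped.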

Comment 12 Henrique Martins 2017-05-19 15:03:50 UTC

Not completely sure but:
- I only saw this on the only machine I have with an SSD disk, a laptop, not on any of my other boxes with spinning hard disks.
- It seems (rebooted a couple of times to check) that it may be gone with the sssd related rpms I installed yesterday, all tagged 1.15.2-2.fc25.x86_64.
If I see it again I'll post back here.

Comment 13 gesserat 2017-06-26 12:24:01 UTC

Can confirm this on F26 Beta, using two drives, one SSD plus one HDD. 
Running the "halt" command as superuser can help with gathering evidence. The shutdown dmesg is full of "Kernel not configured to use semaphores (System V IPC), not using udev rules" followed by the OP's lines for the LUKS volumes. 
It does slow shutdown and restart times due to console output. Not very significant, but still.

Comment 14 Henrique Martins 2017-06-27 14:49:06 UTC

Seems like, contrary to what I said in comment #12, the problem is still there; it just doesn't show up at every single reboot, and I'm not sure what I do differently with this laptop when on Linux to cause it or not.

I followed the instructions above to drop into a dracut shell. The root file system is still mounted on /oldroot, read only. I was able to save rdsosreport.txt, which I'll attach, but there must be a watchdog timeout of some sort, as the system rebooted before I could do a ps.

Comment 15 Henrique Martins 2017-06-27 14:52:12 UTC

Created attachment 1292370 [details]
report from dracut

Attaching rdsosreport.txt created by the process described in comment #11

Comment 16 Henrique Martins 2017-06-27 15:19:23 UTC

Created attachment 1292374 [details]
output of ps -axel

Was able to repeat and capture the output of ps -axel.
Not sure which process is holding the root fs. 
Note that this was not captured at the same time as the previous rdsosreport, but conditions should be very similar.

Comment 17 Henrique Martins 2017-06-27 15:30:01 UTC

(In reply to Harald Hoyer from comment #11)
> strange... something is keeping your root device open.

Not sure whether relevant, but when in the dracut shell I can unmount /oldroot

Comment 18 Harald Hoyer 2017-06-29 11:39:08 UTC

(In reply to Chris Horn from comment #3)
> Okay.
> 
> I see that it doesn't work on my other machine either.
> 
> I tried to mask plymouth services. 
> I masked every plymouth service except plymouth-start.service and
> plymouth-read-write.service and the messages are gone. Maybe it helps you too.
> 
> Best regards

ok, next wild guess:

Does it help if you add "plymouth.enable=0" on the kernel command line?
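
For anyone trying this, a minimal sketch of how the parameter could be checked and set. The has_karg helper is a made-up name; the grubby invocations in the comments assume grubby is installed (as it usually is on Fedora) and must be run as root.

```sh
# has_karg PARAM [CMDLINE]: succeed if PARAM appears as a whole word on the
# kernel command line. CMDLINE defaults to the running kernel's /proc/cmdline.
# Hypothetical helper for illustration only.
has_karg() {
    param=$1
    cmdline=${2-$(cat /proc/cmdline 2>/dev/null)}
    case " $cmdline " in
        *" $param "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Add the parameter to all installed kernels (takes effect on the next boot):
#   grubby --update-kernel=ALL --args="plymouth.enable=0"
# Remove it again:
#   grubby --update-kernel=ALL --remove-args="plymouth.enable=0"

if has_karg plymouth.enable=0; then
    echo "plymouth.enable=0 is set for this boot"
else
    echo "plymouth.enable=0 is not set for this boot"
fi
```

For a one-off test, the parameter can instead be appended to the kernel line from the boot loader's edit screen.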

Comment 19 Alexander Korsunsky 2017-07-11 11:08:01 UTC

(In reply to Harald Hoyer from comment #18)
> ok, next wild guess:
> 
> Does it help if you add "plymouth.enable=0" on the kernel command line?

It does in fact help, yes! No more device-mapper messages.

However, considering the main grievance for me was the "ugliness" of the boot/shutdown process, disabling plymouth entirely is not a very satisfactory workaround.

I hope this can be resolved without disabling plymouth.

Comment 20 Henrique Martins 2017-07-11 15:56:20 UTC

Seems to work after three reboots on the laptop where I had this problem.

Disabling plymouth on that particular box is not that "ugly", as it boots so fast that the plymouth screen was up for less than a second anyway, and now shuts down just as fast. 

(And I have two old boxes where plymouth always drops down to the text mode bar, thus I dnf erased it).

Move this bug to plymouth?

Comment 21 Harald Hoyer 2017-07-12 08:53:43 UTC

Yeah, plymouth needs to exit earlier in the shutdown process it seems.

Comment 22 Alexander Korsunsky 2017-07-12 11:38:52 UTC

Still present in Fedora 26.

Comment 23 Sebastien Chapuis 2017-07-14 21:58:09 UTC

I removed the kernel parameter rhgb and it fixed the problem.

Comment 24 Rafael José 2017-12-06 03:22:24 UTC

I've had this problem since Fedora 25.
I've been updating Fedora since then and the problem still persists on 27.

Comment 25 blachniom 2017-12-06 08:50:55 UTC

Confirmed, x86_64 F27 still suffers from it.

Removing "rhgb" had no effect at all as I didn't use it when the problem appeared.

"plymouth.enable=0" also did not resolve the issue for me. I've followed the steps in comment #11 (https://bugzilla.redhat.com/show_bug.cgi?id=1402073#c11).

Attaching my "rdsosreport.txt".
When dropped to the shell, I wasn't able to umount /oldroot normally. I needed to do it "the lazy way", as the normal umount exited with "Device is busy".

Regards
Smirk

Comment 26 blachniom 2017-12-06 08:52:48 UTC

Created attachment 1363549 [details]
Initramfs pre-shutdown shell log

Comment 27 h3dkandi 2017-12-26 12:42:41 UTC

On Fedora 27 disabling plymouth resolved the issue for me. I used this command to disable it:

# plymouth-set-default-theme -R details

Now I don't get a pretty encryption password prompt but I can live with that. The error messages were eating at my soul every time I shut down the laptop.

Comment 28 h3dkandi 2017-12-26 12:56:54 UTC

(In reply to h3dkandi from comment #27)
> On fedora 27 disabling plymouth resolved the issue for me. I used this
> command to disable it:
> 
> # plymouth-set-default-theme -R details
> 
> Now I don't get a pretty encrypt password prompt but I can live with that.
> The error messages were eating at my soul every time I shut down the laptop.

Actually the problem is still here. I don't know if I have deluded myself or if only the first shutdown after the change didn't display the error.

Comment 29 Jedrzej Nowak 2017-12-27 17:00:26 UTC

I'm having the same problem.

Fedora 27, no encryption, "just" LVM.

I'm pretty sure that it started to appear when I upgraded from 25 to 26 and it still persists on 27.

Comment 30 Dio Putra 2018-01-17 02:29:55 UTC

Sometimes this happens after my encrypted Debian GNU/Linux has been unmounted successfully in a chrooted environment atop Slackware, even though my Debian GNU/Linux was mounted with "-o bind" and unmounted with the -R option.

Comment 31 Ulrik Dickow 2018-02-09 16:38:52 UTC

Created attachment 1393856 [details]
Sleepy lazy debugging patch to dracut's shutdown.sh

The patch adds short 0.2/0.5 second sleeps between umount/device-mapper retries,
plus last-resort lazy umount with long sleep if all 41 previous umount attempts failed,
plus extra warnings and sleeps for easier further debugging of any remaining problems.
Rebuild initramfs with dracut after applying the patch (possibly after reducing/removing the very last sleep if you think it's too long).

On every reboot since the patch I now only get exactly 1 device-mapper warning, i.e. it fails the first time, but then sleeps 0.5 second and always succeeds the second time.  This prevents my user-1000.journal on btrfs from losing all of its extended attributes (bug 1447750).

On the first and third shutdown after the patch, the umount loop also succeeded with only 1 error and thus a single 0.2 second sleep before the second try succeeded.  At the second shutdown none of the 41 umount_a attempts worked, so it fell through to the 10 second sleep, 'umount -ARdl /oldroot', 20 more seconds sleep and then the disassembly of LUKS device and underlying logical volume succeeded with 1 device-mapper warning as above.

To further reduce the number of warnings at shutdown -- and, so far, also to avoid hitting the 41 umount_a failure limit again -- I now combine the shutdown.sh patch with Salvador Ortiz' nice little work-around script in
https://bugzilla.redhat.com/show_bug.cgi?id=1385432#c150 .  It eliminates all of the SELinux warnings/errors and usually (or always?) the need for lazy umount, but doesn't eliminate the single remaining device-mapper warning (_cnt=1 for the umount_a as well as the shutdown-hook/dm-disassembly loop at every shutdown).
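
For readers who do not want to open the attachment, the retry-with-sleep idea can be sketched roughly as below. This is not the attached patch and not dracut's actual shutdown.sh; retry is a made-up helper, and the attempt count and delays in the usage comment are only the numbers mentioned above.

```sh
# retry MAX DELAY CMD...: run CMD up to MAX times, sleeping DELAY seconds
# after each failed attempt except the last. Returns 0 on the first success
# and 1 if every attempt failed. Hypothetical helper for illustration.
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        "$@" && return 0
        echo "attempt $attempt/$max failed: $*" >&2
        if [ "$attempt" -lt "$max" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Possible use in a shutdown script, with a lazy umount as the last resort:
#   retry 41 0.2 umount -R /oldroot || { sleep 10; umount -ARdl /oldroot; }
```

Fractional delays such as 0.2 rely on the sleep implementation accepting them, which GNU sleep does; a minimal initramfs sleep may not.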

Comment 32 Ulrik Dickow 2018-02-09 21:14:03 UTC

The reason for the 1 remaining device-mapper error is obvious in my case: the 'dmsetup info -c --noheadings -o name' in /usr/lib/dracut/modules.d/90dm/dm-shutdown.sh generates a device name list where my lv vg0-lv2rootall happens to appear before the luks-3ea... device built on top of it.  So on the first dm-shutdown run, the lv disassembly fails, while the luks takedown succeeds.  On the second run, the lv is no longer busy.

Such a layered setup is perfectly normal, the default for encrypted installation actually.  Closing all luks before non-luks would help here, but others may have lv's inside luks or even more layers.  That's why calling dm-shutdown.sh repeatedly makes sense.  (I only reduced the # of calls in my patch to 8+1 to avoid flooding the console with errors; the original 40+1 is ok in general).

I suggest that stderr is thrown away from the 'dmsetup [...] remove' command in at least the first many calls, or more simply all the non-final ones (argument "final" not yet set), because "Device busy" _is_ an expected outcome initially.  (Alternatively grep away only the "Device or resource busy" ones).
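
The "grep away" variant could look roughly like this. It is only a sketch of the idea, not a tested change to dm-shutdown.sh; quiet_busy is a made-up helper name, and it assumes mktemp and grep are available in the initramfs.

```sh
# quiet_busy CMD...: run CMD, drop only the stderr lines that contain
# "Device or resource busy", pass every other stderr line through, and
# return CMD's own exit status. Hypothetical helper for illustration.
quiet_busy() {
    errfile=$(mktemp) || return 1
    "$@" 2>"$errfile"
    status=$?
    grep -v 'Device or resource busy' "$errfile" >&2
    rm -f "$errfile"
    return "$status"
}

# Possible use: filter on the early passes, show everything on the final one.
#   if [ "$1" = "final" ]; then
#       dmsetup remove "$name"
#   else
#       quiet_busy dmsetup remove "$name"
#   fi
```

Because the exit status is preserved, the caller can still decide whether another pass is needed.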

Comment 33 Henrique Martins 2018-02-09 21:22:45 UTC

Obvious in your case, but I don't have any luks partitions on the laptop.
It does have the default Fedora LVM setup, but no luks.

Comment 34 Ulrik Dickow 2018-02-10 07:48:02 UTC

So when you disable plymouth -- or force it to exit earlier and/or add extra sleep -- then you have _zero_ device-mapper errors left, right?  Late plymouth + no sleep => lots of errors for all of us, usually.

Comment 35 Andrej Podzimek 2018-04-07 11:52:17 UTC

I'm still seeing this on Fedora 28. There are just fewer of those messages and hangs are less likely. :-/

Comment 36 Andrej Podzimek 2018-04-07 12:40:23 UTC

A heretical idea: If the whole error message ("Kernel not configured for semaphores (System V IPC). Not using udev synchronisation code.") and occasional freeze problem is SELinux-related, would it be feasible / helpful to set SELinux into permissive mode (only) for that final short moment of the shutdown process? If that step was taken with all services already down, it may not be that much of a risk...

Comment 37 Ondrej Kozina 2018-04-16 09:02:30 UTC

The following service should help you work around issues with stacked device-mapper devices (including both LVM2 and cryptsetup devices). It won't help with umount issues though (where a process holds an open file descriptor).

systemctl start blk-availability.service

If it does help, you may try to enable the service on boot (systemctl enable...). Not sure why Fedora hasn't enabled the service by default...

Comment 38 Matthew Miller 2018-04-26 13:48:43 UTC

This appears to be at least part of bug #1385432.... I'd like to get it untangled. What are the next steps here?

Comment 39 Harald Hoyer 2018-05-11 09:03:04 UTC

Here is a nice summary of the plymouth "bug" involved: https://bugzilla.redhat.com/show_bug.cgi?id=1575376#c8

Comment 40 Ben Cotton 2019-05-02 20:49:23 UTC

This message is a reminder that Fedora 28 is nearing its end of life.
On 2019-May-28 Fedora will stop maintaining and issuing updates for
Fedora 28. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 28 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 41 Ben Cotton 2019-05-28 22:37:16 UTC

Fedora 28 changed to end-of-life (EOL) status on 2019-05-28. Fedora 28 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.
