Created attachment 517001
smartctl output

Description of problem:
I filed this bug against systemd because it is shutdown-related, but that was a pure guess.

On shutdown, I recently noticed that three messages always appear on screen for just a second or two (sorry, I don't have the full text; they disappear too quickly to read completely):

cryptsetup /some/long/path1 cannot umount/remove/..: resource is busy
cryptsetup /some/long/path2 cannot umount/remove/..: resource is busy
cryptsetup /some/long/path3 cannot umount/remove/..: resource is busy

These correspond to my three encrypted hard disk partitions (/, /home, swap). When I boot the system afterwards, the journal is always recovered, so the filesystems weren't unmounted cleanly. At some point they apparently became so badly corrupted that I got disk read errors like the one below. These went away completely after an fsck run in maintenance mode (which also left some files lost forever, so actual data loss!):

[ 1418.029152] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 1418.032714] ata1.00: BMDMA stat 0x25
[ 1418.036078] ata1.00: failed command: READ DMA EXT
[ 1418.039440] ata1.00: cmd 25/00:06:56:03:2d/00:00:12:00:00/e0 tag 0 dma 3072 in
[ 1418.039443]          res 51/40:00:56:03:2d/40:00:12:00:00/e0 Emask 0x9 (media error)
[ 1418.046305] ata1.00: status: { DRDY ERR }
[ 1418.049716] ata1.00: error { UNC }
[ 1418.062550] end_request: I/O error, dev sda, sector 304939862
[ 1418.066023] Buffer I/O error on device dm-2, logical block 122683401
[ 1418.069467] Buffer I/O error on device dm-2, logical block 122683402
[ 1418.072831] Buffer I/O error on device dm-2, logical block 122683403

Although that fsck run fixed the filesystem and made the errors go away for now, the underlying problem persists, and I fear running into new data loss and read errors like the above soon unless it is fixed one way or another.
The failing unmount occurs regardless of whether I use "init 0", "init 6" or "reboot" to shut down.

Version-Release number of selected component (if applicable):
bash-4.2$ systemctl --version
systemd 26
fedora
+PAM +LIBWRAP +AUDIT +SELINUX +SYSVINIT +LIBCRYPTSETUP
bash-4.2$ uname -a
Linux jth 2.6.40-4.fc15.i686 #1 SMP Fri Jul 29 18:54:39 UTC 2011 i686 i686 i386 GNU/Linux

How reproducible:
Always

Steps to Reproduce:
1. Shutdown
2. Boot

Actual results:
On shutdown, the three errors above are printed. During boot, the journal is recovered. After many boots, I get read errors and other issues until I run fsck, which clearly shows a borked filesystem.

Expected results:
On shutdown, none of the above errors are printed. During a normal boot, everything is fine: no journal recovery and no other indication of an unclean unmount.

Additional info:
The smartctl output is attached in case this is related to hard disk failure. The read errors it lists date from the point where the filesystem was so badly trashed; after the fsck repair they are all gone. I started the hard disk self-test, which ran for an hour, afterwards, so its result is very recent and up to date (and, as far as I can see, pretty much OK).

If you need more info, just ask and I will see whether I can gather it.
By the way, in case it is related, I have this in /etc/rc.local:

echo 1500 > /proc/sys/vm/dirty_writeback_centisecs
Since this is a production system, some advice or a workaround other than not rebooting (which is what I have relied on so far) would be nice.
I got a better look at the error message now. Sorry, I don't have the numbers; it isn't on the screen for very long:

[...numbers....] systemd-cryptsetup[number]: failed to deactivate: device or resource is busy
[ 1418.029152] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 1418.032714] ata1.00: BMDMA stat 0x25
[ 1418.036078] ata1.00: failed command: READ DMA EXT
[ 1418.039440] ata1.00: cmd 25/00:06:56:03:2d/00:00:12:00:00/e0 tag 0 dma 3072 in
[ 1418.039443]          res 51/40:00:56:03:2d/40:00:12:00:00/e0 Emask 0x9 (media error)
[ 1418.046305] ata1.00: status: { DRDY ERR }
[ 1418.049716] ata1.00: error { UNC }

This is a hardware/driver problem and is unrelated to systemd.

If / is encrypted, we cannot detach it on shutdown in F15 (or any older Fedora version), since we cannot unmount the root file system. In F16, for the first time, we will be able to jump back into the initrd, which then unmounts the root fs and detaches all remaining crypto disks afterwards.

The fact that we cannot detach/unmount the root fs is not a problem, however, since we sync everything to disk, and the kernel will do so again.

So there's no systemd problem here. Please file a new bug about your ATA media error problem, against the kernel.
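For anyone reading the log above: the first two hex bytes of the "res 51/40:..." line are the ATA status and error registers, and decoding them by hand gives the same flags the kernel prints on the following two lines. The following is only a minimal sketch in POSIX shell. The function names flag and decode_ata_res are made up for illustration, and only the flags that libata itself prints are covered.

```shell
#!/bin/sh
# Hypothetical helper: print " NAME" if bit MASK is set in VALUE.
flag() {
    if [ $(( $1 & $2 )) -ne 0 ]; then
        printf ' %s' "$3"
    fi
}

# Hypothetical helper: decode the status/error byte pair from a
# libata "res SS/EE:..." line. Arguments are two hex bytes without
# the 0x prefix, e.g. "51" and "40".
decode_ata_res() {
    status=$((0x$1))
    error=$((0x$2))

    # Status register bits (standard ATA layout).
    printf 'status:'
    flag "$status" 0x80 BSY
    flag "$status" 0x40 DRDY
    flag "$status" 0x20 DF
    flag "$status" 0x08 DRQ
    flag "$status" 0x01 ERR

    # Error register bits (standard ATA layout).
    printf ' error:'
    flag "$error" 0x80 ICRC
    flag "$error" 0x40 UNC
    flag "$error" 0x10 IDNF
    flag "$error" 0x04 ABRT
    printf '\n'
}

# Values taken from the "res 51/40:..." line quoted above.
decode_ata_res 51 40   # prints "status: DRDY ERR error: UNC"
```

UNC is the drive reporting an uncorrectable read of the requested sector, which matches the "(media error)" annotation in the same log line.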
Just to be sure: it seems to me that /home and swap aren't cleanly unmounted either. Is that also normal for Fedora 15, and is it really not able to cause any file system corruption? Note also that the subsequent data loss and file system corruption happened on /home, not on /.