Description of problem:
I'm going back and forth between my main fedora 12 partition and my
fedora 13 alpha partition, rebooting to test f13 from time to time.
Several times now when I have used the normal shutdown button from
the gnome menus while logged into a fedora 13 gnome session, when I
boot back to fedora 12, it says "recovering journal" on the fedora 13
partition, and has to clean up 1 or 2 orphan inodes.
If the standard shutdown button is supposed to shut down cleanly, it
doesn't appear to be doing so.
Version-Release number of selected component (if applicable):
How reproducible:
Definitely somewhat random; I have shut down from f13 and come back to f12
with no journal problems.
Steps to Reproduce:
Additional info:
I have rhgb turned off, so I can watch the messages during boot and shutdown,
and I can't remember seeing anything unusual during shutdown (though the
messages don't stay on the screen long :-).
Since this is f13, I'm often doing updates just before rebooting. I sometimes
wonder if updates can install changes incompatible with what was there
when I booted, and cause an unclean shutdown just that one time.
I submit this against upstart just as a guess for a good component to
use in the bug report.
There is an associated thread in the test list on this as well:
Running in runlevel 3, shutting down with the halt command, I sometimes (maybe 10-20% of the time) see
mount: / is busy
just before power down. On the next reboot there are often a few orphan inodes. It's been happening for quite a while.
If you stuff a 'lsof' in /etc/init.d/halt near the end, what does it say is staying open?
What sort of partitioning do you have?
I saw this again in Rawhide after applying today's updates (I think it happens in both Rawhide and F13, and may even have happened before F12 Final). It seems to be strongly correlated with applying a lot of updates. Where exactly should the lsof go - before the $kexec_command attempt, or before exec $command? And will the list be short enough that it's actually readable within the second or so that the message appears before power down?
My partitioning is the simplest possible - I'm using VirtualBox, and when installing I told it to use the entire drive.
Either place is fine; it's debatable how long it will take. You can halt without powering off if you just want to leave the messages up.
An interesting thing to test would be if you can duplicate it by upgrading glibc while the system is up, and not duplicate it otherwise.
I included the lsof just before the $kexec_command attempt. Upon running "halt -f", there was just one line printed saying "System halted.", and no apparent output from lsof. (This seems to be the only way to avoid the guest window closing - without "-f", it powers off and closes the window.) I tried to downgrade glibc but it said there was no downgrade available. I'll use "halt -f" from now on in both F13 and Rawhide and see if anything appears.
Meanwhile, I have rebooted my f13 partition several times
recently while fiddling with configuring bridge networking, virtualization,
and wot-not, and have not seen the journal errors. I also have not done
any updates before any of these shutdowns, so maybe this is something that
only happens due to certain updates.
I don't believe "halt -f" is invoking /etc/init.d/halt at all, since nothing is ever printed except "System halted.". I can use regular halt, and pause the guest when the lsof messages appear. There are many pages of them with no way to view them all. I can reproduce the "mount: / is busy" message by either "yum downgrade glibc\*" and "halt", or "yum update glibc\*" and "halt". It always prints
Unmounting pipe file systems:
Unmounting file systems:
mount: / is busy
after I removed the lsof from the halt script.
Andre - can you reproduce it when a glibc upgrade (or downgrade) is *not* involved?
Well, I just did today's batch of F13 updates, not including glibc, and there was no such message. I'll have to watch it for the next few weeks (and Rawhide as well).
I just did a text install of x86_64 Beta.TC1 in VirtualBox. Then I enabled the network, installed yum-presto, and updated everything except glibc\*, then halted. The "mount: / is busy" warning appeared again, so it can appear without a glibc update. Then I booted again, and updated glibc\*, then halted, and saw the warning again. So it can happen without a glibc version change, but seems to happen reliably with a glibc change.
Further checking shows that updates or downgrades of either dbus-libs or glibc\* trigger the problem; no other packages in the minimal install of Beta.TC1 seem to be involved.
FWIW, this is not unique to Fedora. I have a lot of multiboot systems. It happens to me with Mandriva and openSUSE as well, not only after updating, but sometimes after booting a distro that has since had its / mounted by some other distro.
OK, so earlier in /etc/init.d/halt we have:
# Tell init to re-exec itself.
kill -TERM 1
This is supposed to make upstart's init re-exec itself against the newly upgraded libdbus and libc, so it's not holding open inodes. Obviously, it's not working. This will require more serious debugging. Thanks for helping narrow this down!
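One way to check this theory without paging through the full lsof output is to look for mappings that the kernel marks "(deleted)" in /proc/<pid>/maps. The sketch below is only an illustration: the function name deleted_maps and its proc_root argument are made up for this example (the argument exists so the function can be pointed at a test tree), and reading /proc/1/maps on a live system needs root.

```shell
#!/bin/sh
# Sketch: list processes that still map files which were deleted or
# replaced on disk (for example, libraries swapped out by a package update).
# deleted_maps and proc_root are hypothetical names for this example.
deleted_maps() {
    proc_root=${1:-/proc}
    for maps in "$proc_root"/[0-9]*/maps; do
        # Skip entries we cannot read (or an unmatched glob pattern).
        [ -r "$maps" ] || continue
        pid=${maps%/maps}
        pid=${pid##*/}
        # A maps line is: address perms offset dev inode pathname.
        # The kernel appends " (deleted)" once the backing file is unlinked.
        # Deleted shared-memory segments show up here as well.
        awk -v pid="$pid" '
            $NF == "(deleted)" && $6 ~ /^\// {
                path = $6
                for (i = 7; i < NF; i++) path = path " " $i
                if (!seen[path]++) print pid, path
            }' "$maps" 2>/dev/null
    done
}
```

Run as root after a glibc or dbus-libs update, `deleted_maps | grep '^1 '` would show whether init (PID 1) is among the processes still holding the old files. The output is one "pid path" line per deleted file, so it should fit on one screen where the raw lsof listing did not.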
I just got this same error rebooting fedora 12, and I notice that the last
batch of fedora 12 updates included an update to upstart itself, so I guess that's
another thing that can cause the problem in addition to the libraries.
After applying today's Rawhide updates including gcc and related packages, I saw the problem again, so gcc is probably another package that triggers it. (I wasn't able to do a gcc downgrade to be sure.)
This is because upstream dropped the SIGTERM handler.
Created attachment 410629
re-exec on SIGTERM
Here's a forward-ported patch from 0.3.x to re-exec on SIGTERM.
It does not attempt to save state. Passes very minimal testing.
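For readers unfamiliar with the idea, here is a toy shell sketch of what "re-exec on SIGTERM" means. It is not the attached patch and not upstart's C code; the function name reexec_demo and the GEN variable are invented for this illustration. Like the patch, it carries no real state across the exec (GEN is only there so the demo stops after one round).

```shell
#!/bin/sh
# Toy illustration only: a shell process that re-executes itself on SIGTERM.
# reexec_demo and GEN are hypothetical names, not part of upstart.
reexec_demo() {
    demo=$(mktemp) || return 1
    cat > "$demo" <<'EOF'
gen=${GEN:-0}
echo "pid=$$ gen=$gen"
# Second generation: the re-exec happened, so stop here.
[ "$gen" -ge 1 ] && exit 0
# On SIGTERM, replace this process image with a fresh copy of the script.
# exec keeps the PID, and the new image opens its files and libraries anew.
trap 'GEN=$((gen + 1)); export GEN; exec /bin/sh "$0"' TERM
kill -TERM $$
# Only reached if the trap did not fire.
exit 1
EOF
    /bin/sh "$demo"
    status=$?
    rm -f "$demo"
    return "$status"
}
```

Calling reexec_demo prints two lines of the form "pid=N gen=0" and "pid=N gen=1" with the same N: the process survives the signal under the same PID but as a freshly loaded image, which is what lets init drop its references to the old libdbus and libc.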
upstart-0.6.5-5.fc13 has been submitted as an update for Fedora 13.
upstart-0.6.5-5.fc13 has been pushed to the Fedora 13 testing repository. If problems still persist, please make note of it in this bug report.
If you want to test the update, you can install it with
su -c 'yum --enablerepo=updates-testing update upstart'
You can provide feedback for this update here: http://admin.fedoraproject.org/updates/upstart-0.6.5-5.fc13
upstart-0.6.5-5.fc13 has been pushed to the Fedora 13 stable repository. If problems still persist, please make note of it in this bug report.