Created attachment 499088 [details]
config file

Description of problem:
Using Fedora 14 as host, one of my LXC domains died on startup (that's bug 694963) but my Fedora 15 LXC domain started OK. Now that I'm using Fedora 15 as a host, even my Fedora 15 LXC domain dies on startup.

Version-Release number of selected component (if applicable):
libvirt-0.8.8-4.fc15.i686

How reproducible:
Always

Steps to Reproduce:
1. virsh --connect lxc:// define fedora.xml
2. virsh --connect lxc:// start fedora
3. virsh --connect lxc:// list --all

Actual results:
Fedora domain is shown as "shut off"

Expected results:
Domain should be started

Additional info:
kernel-PAE-2.6.38.5-24.fc15.i686
Created attachment 503895 [details] Config file with filesystem tag
Created attachment 503896 [details] Log file when a filesystem tag is defined
Same problem trying to run a very basic LXC container based on busybox. Attached the configuration file and the logs where libvirt complains about not being able to read from fd 7. If I remove the filesystem tag, it works. Looks like the "chroot" is the problem for me.
Forgot to mention I'm using:
kernel: 2.6.38.7-30.fc15.x86_64
libvirt: 0.8.8-4.fc15.x86_64
The problem is that the container startup process is closing /dev/console and re-opening it. Unfortunately libvirt uses the closing of /dev/console to detect the quit condition. This is fixed upstream with

commit 4e3117ae50efc0fcbd5ce485cd610dfab7f5c625
Author: Daniel P. Berrange <berrange>
AuthorDate: Tue Feb 22 17:35:06 2011 +0000
Commit: Daniel P. Berrange <berrange>
CommitDate: Tue Mar 15 12:12:53 2011 +0000

    Make LXC container startup/shutdown/I/O more robust

but not in the 0.8.8 version of F15. It can likely be backported.
I believe there may actually be a problem with systemd here causing LXC startup failure with libvirt. In particular it appears to be impossible for LXC to unmount the systemd autofs filesystems when inside its namespace.

To test whether this is your root cause, you can disable all the systemd autofs mounts:

for i in `systemctl --full | grep automount | awk '{print $1}'`
do
    systemctl stop $i
done

The goal is that /proc/mounts must *not* show any 'autofs' filesystems from systemd.
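To confirm that the loop above had the intended effect, something like the following could be used. This is only a minimal sketch: the function names `list_autofs_mounts` and `check_no_autofs` are made up for illustration, and it assumes the usual /proc/mounts layout where the third field is the filesystem type.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of libvirt or systemd).

# Print the mount point of every autofs entry in a mounts table.
# Fields in /proc/mounts: device mountpoint fstype options dump pass
list_autofs_mounts() {
    mounts_file="${1:-/proc/mounts}"
    awk '$3 == "autofs" { print $2 }' "$mounts_file"
}

# Return 0 if no autofs mounts remain, 1 otherwise.
check_no_autofs() {
    remaining=$(list_autofs_mounts "$1")
    if [ -n "$remaining" ]; then
        echo "autofs mounts still present:"
        echo "$remaining"
        return 1
    fi
    echo "no autofs mounts found"
    return 0
}
```

Running `check_no_autofs` with no argument inspects /proc/mounts on the host; passing a file name lets you try it against a saved copy of the mounts table.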
Argh!
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
(In reply to comment #6)
> I believe there may actually be a problem with systemd here causing LXC startup
> failure with libvirt. In particular it appears to be impossible for LXC to
> unmount the systemd autofs filesystems when inside its namespace.
>
> To test to see if this is your root cause problem, you can disable all the
> systemd autofs mounts
>
> for i in `systemctl --full | grep automount | awk '{print $1}'`
> do
> systemctl stop $i
> done
>
> The goal is that /proc/mounts must *not* show any filesystems 'autofs' from
> systemd.

We've since worked around that upstream; someone should investigate whether backporting this commit would help matters:

commit 878cc33a6ad63efc3bb9332deda6c0ddbaff8b95
Author: Daniel P. Berrange <berrange>
Date: Tue Nov 1 12:56:53 2011 +0000

    Workaround for broken kernel autofs mounts

    The kernel automounter is mostly broken wrt to containers. Most
    notably if you start a new filesystem namespace and then attempt
    to unmount any autofs filesystem, it will typically fail with a
    weird error message like

      Failed to unmount '/.oldroot/sys/kernel/security':Too many levels of symbolic links

    Attempting to detach the autofs mount using umount2(MNT_DETACH)
    will also fail with the same error. Therefore if we get any error on
    unmount()ing a filesystem from the old root FS when starting a
    container, we must immediately break out and detach the entire
    old root filesystem (ignoring any mounts below it).
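For anyone trying to picture the fallback the commit message describes, here is a rough shell sketch of the control flow. It is not libvirt's code (that is C using umount()/umount2()); the function name `unmount_old_root` and the `UMOUNT` override variable are assumptions made purely for illustration, and `umount -l` stands in for umount2(MNT_DETACH).

```shell
#!/bin/sh
# Hypothetical illustration only; libvirt implements this logic in C.
# UMOUNT is an assumed override so the flow can be exercised without root.
UMOUNT="${UMOUNT:-umount}"

# unmount_old_root OLDROOT MOUNTPOINT...
# Try to unmount each mount point in the order given. On the first failure,
# stop and lazily detach OLDROOT as a whole, ignoring any mounts below it.
unmount_old_root() {
    oldroot="$1"
    shift
    for mnt in "$@"; do
        if ! "$UMOUNT" "$mnt"; then
            echo "failed to unmount $mnt; detaching $oldroot" >&2
            # umount -l is the lazy (detach) unmount
            "$UMOUNT" -l "$oldroot"
            return $?
        fi
    done
    return 0
}
```

A call might look like `unmount_old_root /.oldroot /.oldroot/proc /.oldroot/sys/kernel/security /.oldroot`, with the mount points listed deepest first. Only try this inside a throwaway mount namespace, since it really does unmount things.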
Yes, that commit will definitely improve matters
Well, since this bug hasn't killed anybody over the course of F15, I don't feel that compelled to do a backport with < 3 weeks left in the F15 life cycle. Closing this as WONTFIX.

If anyone is still seeing similar issues in F16+, please reopen.