Bug 704953 - Fedora 15 LXC domain dies immediately on startup
Summary: Fedora 15 LXC domain dies immediately on startup
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: libvirt
Version: 15
Hardware: i686
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Libvirt Maintainers
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2011-05-16 06:48 UTC by Robin Green
Modified: 2012-06-07 00:56 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-07 00:56:29 UTC
Type: ---
Embargoed:


Attachments
config file (826 bytes, text/xml)
2011-05-16 06:48 UTC, Robin Green
Config file with filesystem tag (703 bytes, text/xml)
2011-06-09 13:47 UTC, Jean-Marc ANDRE
Log file when a filesystem tag is defined (10.00 KB, text/x-log)
2011-06-09 13:49 UTC, Jean-Marc ANDRE

Description Robin Green 2011-05-16 06:48:26 UTC
Created attachment 499088 [details]
config file

Description of problem:
Using Fedora 14 as host, one of my LXC domains died on startup (that's bug 694963) but my Fedora 15 LXC domain started OK. Now that I'm using Fedora 15 as a host, even my Fedora 15 LXC domain dies on startup.

Version-Release number of selected component (if applicable):
libvirt-0.8.8-4.fc15.i686

How reproducible:
Always

Steps to Reproduce:
1. virsh --connect lxc:// define fedora.xml
2. virsh --connect lxc:// start fedora
3. virsh --connect lxc:// list --all

Actual results:
Fedora domain is shown as "shut off"

Expected results:
Domain should be started

Additional info:
kernel-PAE-2.6.38.5-24.fc15.i686
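
The reproduction steps above can be scripted. The following is only a sketch: the helper name `check_domain_running` and the `VIRSH` override variable are hypothetical, introduced here for illustration, and the domain name `fedora` is taken from the steps above. It relies on `virsh domstate`, which prints the domain state (for example "running" or "shut off").

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named LXC domain is running.
# VIRSH may be overridden (an assumption made for testing); it defaults to virsh.
check_domain_running() {
    # virsh domstate prints the state, e.g. "running" or "shut off".
    state=$(${VIRSH:-virsh} --connect lxc:// domstate "$1") || return 2
    [ "$state" = "running" ]
}
```

After defining and starting the domain as in steps 1 and 2, `check_domain_running fedora || echo "domain is not running"` would report the failure described in this bug.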

Comment 1 Jean-Marc ANDRE 2011-06-09 13:47:28 UTC
Created attachment 503895 [details]
Config file with filesystem tag

Comment 2 Jean-Marc ANDRE 2011-06-09 13:49:13 UTC
Created attachment 503896 [details]
Log file when a filesystem tag is defined

Comment 3 Jean-Marc ANDRE 2011-06-09 13:54:09 UTC
Same problem trying to run a very basic LXC container based on busybox.
Attached the configuration file and the logs where libvirt complains about not being able to read from fd 7.
If I remove the filesystem tag, it works. Looks like the "chroot" is the problem for me.

Comment 4 Jean-Marc ANDRE 2011-06-09 13:55:29 UTC
Forgot to mention I'm using

kernel: 2.6.38.7-30.fc15.x86_64
libvirt: 0.8.8-4.fc15.x86_64

Comment 5 Daniel Berrangé 2011-06-09 14:09:27 UTC
The problem is that the container startup process is closing /dev/console and re-opening it. Unfortunately libvirt uses the closing of /dev/console to detect the quit condition. This is fixed upstream with

commit 4e3117ae50efc0fcbd5ce485cd610dfab7f5c625
Author:     Daniel P. Berrange <berrange>
AuthorDate: Tue Feb 22 17:35:06 2011 +0000
Commit:     Daniel P. Berrange <berrange>
CommitDate: Tue Mar 15 12:12:53 2011 +0000

    Make LXC container startup/shutdown/I/O more robust


but not in the 0.8.8 version shipped in F15. It can likely be backported.

Comment 6 Daniel Berrangé 2011-07-13 16:07:40 UTC
I believe there may actually be a problem with systemd here causing LXC startup failure with libvirt. In particular it appears to be impossible for LXC to unmount the systemd autofs filesystems when inside its namespace.

To test to see if this is your root cause problem, you can disable all the systemd autofs mounts

 # Stop every systemd automount unit
 for i in $(systemctl --full | grep automount | awk '{print $1}')
  do
     systemctl stop "$i"
  done

The goal is that /proc/mounts must *not* show any 'autofs' filesystems from systemd.
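
One way to check that condition is sketched below. The function names `list_autofs_mounts` and `no_autofs_mounts` are hypothetical, made up for this example. The sketch assumes the usual /proc/mounts layout, where the second field is the mount point and the third is the filesystem type. It reports every autofs mount and does not distinguish those created by systemd from any others.

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every autofs entry in a
# mounts table (defaults to /proc/mounts when no file is given).
list_autofs_mounts() {
    awk '$3 == "autofs" { print $2 }' "${1:-/proc/mounts}"
}

# Hypothetical helper: exit status 0 only when no autofs mounts remain.
no_autofs_mounts() {
    [ -z "$(list_autofs_mounts "$1")" ]
}
```

For example, `no_autofs_mounts || list_autofs_mounts` prints whatever autofs mount points are still present after stopping the automount units.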

Comment 7 Albert Strasheim 2011-07-20 15:37:05 UTC
Argh!

Comment 8 Fedora Admin XMLRPC Client 2011-09-22 17:55:10 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 9 Fedora Admin XMLRPC Client 2011-09-22 17:58:38 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 10 Fedora Admin XMLRPC Client 2011-11-30 20:06:07 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 11 Fedora Admin XMLRPC Client 2011-11-30 20:06:18 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 12 Fedora Admin XMLRPC Client 2011-11-30 20:09:46 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 13 Fedora Admin XMLRPC Client 2011-11-30 20:10:02 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 14 Eric Blake 2011-12-01 18:20:20 UTC
(In reply to comment #6)
> I believe there may actually be a problem with systemd here causing LXC startup
> failure with libvirt. In particular it appears to be impossible for LXC to
> unmount the systemd autofs filesystems when inside its namespace.
> 
> To test to see if this is your root cause problem, you can disable all the
> systemd autofs mounts
> 
>  for i in `systemctl --full | grep automount | awk '{print $1}'`
>   do
>      systemctl stop $i
>   done
> 
> The goal is that /proc/mounts must *not* show any filesystems 'autofs' from
> systemd.

We've since worked around that upstream; someone should investigate whether backporting this commit would help matters:

commit 878cc33a6ad63efc3bb9332deda6c0ddbaff8b95
Author: Daniel P. Berrange <berrange>
Date:   Tue Nov 1 12:56:53 2011 +0000

    Workaround for broken kernel autofs mounts
    
    The kernel automounter is mostly broken wrt to containers. Most
    notably if you start a new filesystem namespace and then attempt
    to unmount any autofs filesystem, it will typically fail with a
    weird error message like
    
      Failed to unmount '/.oldroot/sys/kernel/security':Too many levels of symbolic links
    
    Attempting to detach the autofs mount using umount2(MNT_DETACH)
    will also fail with the same error. Therefore if we get any error on
    unmount()ing a filesystem from the old root FS when starting a
    container, we must immediately break out and detach the entire
    old root filesystem (ignoring any mounts below it).

Comment 15 Daniel Berrangé 2011-12-01 20:04:17 UTC
Yes, that commit will definitely improve matters

Comment 16 Cole Robinson 2012-06-07 00:56:29 UTC
Well since this bug hasn't killed anybody over the course of F15, I don't feel that compelled to do a backport with < 3 weeks left in F15 life cycle. Closing this WONTFIX

If anyone is still seeing similar issues in F16+, please reopen.

