Bug 592170
| Summary: | [LXC] The related process still exist after os container has been shutdown. | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | dyuan |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | medium | Docs Contact: | |
| Priority: | low | ||
| Version: | 6.0 | CC: | ajia, berrange, dallan, jdenemar, llim, moli, mzhan, xen-maint, yoyzhang |
| Target Milestone: | rc | ||
| Target Release: | --- | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-0.9.1-1.el6 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2011-12-06 10:44:07 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 621776, 693512 | ||
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Major release. This request is not yet committed for inclusion.
Should be fixed by 7af5f4689f63bc6ffec441178166c562fee28bc6 upstream commit.
7af5f4689f63bc6ffec441178166c562fee28bc6 only fixes one scenario. It deals with cleanup when libvirt_lxc shuts down cleanly. If you run 'virsh destroy' though, libvirt_lxc just gets SIGKILL, so can't then kill off the container init process. We need to explicitly kill every single PID in the $CGROUP/tasks file really.
The following two changes are required to fully solve this
http://www.redhat.com/archives/libvir-list/2011-February/msg01005.html
http://www.redhat.com/archives/libvir-list/2011-February/msg01006.html
Fixed upstream by v0.8.8-76-g33191b4 and v0.8.8-179-g4e3117a:
commit 33191b419c8c8b17af7c6100997e64ed18bd5f62
Author: Daniel P. Berrange <berrange>
Date: Tue Feb 22 17:33:59 2011 +0000
Add APIs for killing off processes inside a cgroup
The virCgroupKill method kills all PIDs found in a cgroup
The virCgroupKillRecursively method does this recursively
for child cgroups.
The virCgroupKillPainfully method does a recursive kill
several times in a row until everything has really died
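The three kill levels described in this commit message can be sketched roughly as follows. This is not libvirt's C implementation: the Python function names, the `attempts` and `delay` defaults, and the assumption that child cgroups appear as subdirectories each holding their own `tasks` file are all illustrative choices.

```python
import os
import signal
import time


def read_pids(tasks_path):
    """Return the IDs listed in a cgroup 'tasks' file, one per line."""
    try:
        with open(tasks_path) as f:
            return [int(line) for line in f if line.strip()]
    except FileNotFoundError:
        return []  # the cgroup has already been removed


def kill_cgroup(cgroup_dir, sig=signal.SIGKILL):
    """Signal every task in one cgroup; return how many signals were sent."""
    sent = 0
    for pid in read_pids(os.path.join(cgroup_dir, "tasks")):
        try:
            os.kill(pid, sig)
            sent += 1
        except ProcessLookupError:
            pass  # the task exited after the file was read
    return sent


def kill_cgroup_recursively(cgroup_dir, sig=signal.SIGKILL):
    """Signal this cgroup, then every child cgroup (each subdirectory)."""
    sent = kill_cgroup(cgroup_dir, sig)
    try:
        children = [e.path for e in os.scandir(cgroup_dir)
                    if e.is_dir(follow_symlinks=False)]
    except FileNotFoundError:
        children = []
    for child in children:
        sent += kill_cgroup_recursively(child, sig)
    return sent


def kill_cgroup_painfully(cgroup_dir, attempts=15, delay=0.2):
    """Repeat the recursive SIGKILL until a pass finds nothing to signal."""
    for _ in range(attempts):
        if kill_cgroup_recursively(cgroup_dir) == 0:
            return True    # no tasks left
        time.sleep(delay)  # give the kernel time to tear the tasks down
    return False
```

Repeating the pass matters because a process can fork between the moment the `tasks` file is read and the moment the signal arrives, so a single sweep may miss newly created children.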
commit 4e3117ae50efc0fcbd5ce485cd610dfab7f5c625
Author: Daniel P. Berrange <berrange>
Date: Tue Feb 22 17:35:06 2011 +0000
Make LXC container startup/shutdown/I/O more robust
The current LXC I/O controller looks for HUP to detect
when a guest has quit. This isn't reliable as during
initial bootup it is possible that 'init' will close
the console and let mingetty re-open it. The shutdown
of containers was also flakey because it only killed
the libvirt I/O controller and expected container
processes to gracefully follow.
Change the I/O controller such that when it sees HUP
or an I/O error, it uses kill($PID, 0) to see if the
process has really quit.
Change the container shutdown sequence to use the
virCgroupKillPainfully function to ensure everything
really goes away
This change makes the use of the 'cpu', 'devices'
and 'memory' cgroups controllers compulsory with
LXC
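The kill($PID, 0) probe mentioned in this commit message can be illustrated with a short sketch. The helper name `process_exists` is hypothetical; the commit itself does this in C inside the LXC I/O controller.

```python
import os


def process_exists(pid):
    """Probe a PID with signal 0: no signal is delivered, only the
    existence and permission checks are performed."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # ESRCH: no such process
    except PermissionError:
        return True   # EPERM: the process exists but belongs to another user
    return True
```

Note that a zombie that has not yet been reaped by its parent still answers this probe, so the check reports "alive" until the parent collects the exit status.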
Verified that this bug passes with libvirt-0.9.1-1.el6.x86_64
1. create a root filesystem with febootstrap
# febootstrap --group-install="base" rawhide /tmp/rawhide
2. define an OS container with the following XML:
virsh # dumpxml fedora-rawhide
<domain type='lxc' id='30013'>
<name>fedora-rawhide</name>
<uuid>6222c8db-8764-9c54-8fed-2646b8c4ef78</uuid>
<memory>32768</memory>
<currentMemory>32768</currentMemory>
<vcpu>1</vcpu>
<os>
<type arch='x86_64'>exe</type>
<init>/sbin/init</init>
</os>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/libvirt_lxc</emulator>
<filesystem type='mount'>
<source dir='/tmp/rawhide'/>
<target dir='/'/>
</filesystem>
<interface type='network'>
<mac address='52:54:00:73:6b:43'/>
<source network='default'/>
<target dev='veth1'/>
</interface>
<console type='pty'>
<target port='0'/>
</console>
</devices>
</domain>
3. start container
virsh # start fedora-rawhide
Domain fedora-rawhide started
4. check cgroup filesystem and processes
# cat /cgroup/cpu/libvirt/lxc/fedora-rawhide/tasks
27936
27961
27991
27996
# ps axu|grep 27936
root 27936 0.0 0.0 38388 1268 ? Ss 21:58 0:00 /usr/libexec/libvirt_lxc --name fedora-rawhide --console 22 --background --veth veth3
# ps axu|grep 27961
root 27961 0.0 0.0 34124 3580 pts/0 Ss+ 21:58 0:00 /sbin/init
# ps axu|grep 27991
root 27991 0.0 0.0 14844 992 ? Ss 21:58 0:00 /sbin/udevd
5. destroy container
virsh # destroy fedora-rawhide
Domain fedora-rawhide destroyed
virsh # list --all
Id Name State
----------------------------------
- fedora-rawhide shut off
6. # cat /cgroup/cpu/libvirt/lxc/fedora-rawhide/tasks
cat: /cgroup/cpu/libvirt/lxc/fedora-rawhide/tasks: No such file or directory
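The manual check in steps 4 and 6 could be automated along these lines. This is a sketch only: `leftover_tasks` is a hypothetical helper, and its defaults simply mirror the `/cgroup/cpu/libvirt/lxc/<name>/tasks` path seen in this report, which may differ on other hosts.

```python
import os


def leftover_tasks(name, controller="cpu", root="/cgroup"):
    """Return the IDs still listed for a container's cgroup.

    An empty list means the cgroup is gone or has no tasks left,
    which is the expected state after 'virsh destroy'.
    """
    path = os.path.join(root, controller, "libvirt", "lxc", name, "tasks")
    try:
        with open(path) as f:
            return [int(line) for line in f if line.strip()]
    except FileNotFoundError:
        return []  # matches the "No such file or directory" result above


if __name__ == "__main__":
    pids = leftover_tasks("fedora-rawhide")
    print("clean" if not pids else "leftover tasks: %s" % pids)
```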
Move to Verified according to Comment #10
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2011-1513.html
Description of problem:
The related processes still exist after the OS container has been shut down. After rebooting the host, the related processes no longer exist.
Version-Release number of selected component (if applicable):
libvirt-0.8.1-3.el6.i686
kernel-2.6.32-25.el6.i686
How reproducible:
always
Steps to Reproduce:
1. create a root filesystem with febootstrap
# febootstrap --group-install="base" rawhide /tmp/rawhide
2. define an OS container with the following XML:
virsh # dumpxml fedora-rawhide
<domain type='lxc' id='30013'>
<name>fedora-rawhide</name>
<uuid>6222c8db-8764-9c54-8fed-2646b8c4ef78</uuid>
<memory>32768</memory>
<currentMemory>32768</currentMemory>
<vcpu>1</vcpu>
<os>
<type arch='x86_64'>exe</type>
<init>/sbin/init</init>
</os>
<clock offset='utc'/>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
<emulator>/usr/libexec/libvirt_lxc</emulator>
<filesystem type='mount'>
<source dir='/tmp/rawhide'/>
<target dir='/'/>
</filesystem>
<interface type='network'>
<mac address='52:54:00:73:6b:43'/>
<source network='default'/>
<target dev='veth1'/>
</interface>
<console type='pty'>
<target port='0'/>
</console>
</devices>
</domain>
3. start container
virsh # start fedora-rawhide
Domain fedora-rawhide started
4. check cgroup filesystem and processes
# cat /cgroup/cpu/libvirt/lxc/fedora-rawhide/tasks
2883
2889
2991
4222
4223
4224
4225
4251
4297
4342
4351
4360
4371
# ps aux|grep 2883
root 2883 0.0 0.0 4092 944 ? Ss 13:22 0:00 /usr/libexec/libvirt_lxc --name fedora-rawhide --console 17 --background --veth veth0
# ps aux|grep 2889
root 2889 0.0 0.0 2768 1136 pts/0 Ss+ 13:22 0:00 /sbin/init
# ps aux|grep 2991
root 2991 0.0 0.0 2428 556 ? S<s 13:22 0:00 /sbin/udevd -d
5. shutdown container
virsh # shutdown fedora-rawhide
Domain fedora-rawhide is being shutdown
virsh # list --all
Id Name State
----------------------------------
- fedora-rawhide shut off
6. check cgroup filesystem and processes
# cat /cgroup/cpu/libvirt/lxc/fedora-rawhide/tasks
2889
2991
4222
4223
4224
4225
4251
4297
4342
4351
4360
4371
# ps aux|grep 2889
root 2889 0.0 0.0 2824 1408 ? Ss 13:22 0:00 /sbin/init
# ps aux|grep 2991
root 2991 0.0 0.0 2428 556 ? S<s 13:22 0:00 /sbin/udevd -d
7. start again
# virsh start fedora-rawhide
Domain fedora-rawhide started
# cat /cgroup/cpu/libvirt/lxc/fedora-rawhide/tasks
2889
2991
4222
4223
4224
4225
4251
4297
4342
4351
4360
4371
4478
4499
4505
4527
4601
4942
# ps aux|grep 2889
root 2889 0.0 0.0 2824 1408 ? Ss 13:22 0:00 /sbin/init
# ps aux|grep 2991
root 2991 0.0 0.0 2428 556 ? S<s 13:22 0:00 /sbin/udevd -d
# ps aux|grep 4478
root 4478 0.0 0.0 4092 940 ? Ss 13:29 0:00 /usr/libexec/libvirt_lxc --name fedora-rawhide --console 17 --background --veth veth0
# ps aux|grep 4499
root 4499 0.0 0.0 2768 1136 pts/0 Ss+ 13:29 0:00 /sbin/init
Actual results:
The related processes still exist after the OS container has been shut down. After rebooting the host, the 'fedora-rawhide' entry no longer exists in the cgroup filesystem and the processes no longer appear in the ps output.
Expected results:
The related processes and cgroup directory entries should be gone once the OS container has been shut down, without needing to reboot the host.
Additional info:
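To make the symptom concrete, the `tasks` listings from steps 4 and 6 of this description can be compared as sets. The sketch below only re-reads the numbers already shown in the report; `parse_tasks` is a hypothetical helper.

```python
def parse_tasks(text):
    """Turn the whitespace-separated content of a 'tasks' file into ints."""
    return [int(token) for token in text.split()]


# Listings copied from steps 4 (before shutdown) and 6 (after shutdown).
before = parse_tasks(
    "2883 2889 2991 4222 4223 4224 4225 4251 4297 4342 4351 4360 4371")
after = parse_tasks(
    "2889 2991 4222 4223 4224 4225 4251 4297 4342 4351 4360 4371")

exited = sorted(set(before) - set(after))     # tasks that went away
survivors = sorted(set(before) & set(after))  # tasks that outlived shutdown

print("exited:", exited)
print("survivors:", len(survivors))
```

The only ID that disappears is 2883, the libvirt_lxc controller process, while the container's init (2889), udevd (2991) and the remaining tasks survive the shutdown.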