Bug 1787093 - lxc in Rawhide (Dec 2019) does not start a container in its default config
Summary: lxc in Rawhide (Dec 2019) does not start a container in its default config
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: lxc
Version: 32
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Thomas Moschny
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-12-31 01:52 UTC by Ryutaroh Matsumoto
Modified: 2021-05-25 17:17 UTC (History)

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2021-05-25 17:17:30 UTC
Type: Bug
Embargoed:


Description Ryutaroh Matsumoto 2019-12-31 01:52:14 UTC
Description of problem:

1. LXC in rawhide needs lxcbr0 to start a container but it is disabled
in the initial config.

2. We need to manually start ovsdb-server.service and ovs-vswitchd.service.

3. Container config always needs
lxc.cgroup.devices.allow =
lxc.cgroup.devices.deny =
Otherwise no container can start.

Version-Release number of selected component (if applicable):

# dnf list --installed | grep lxc
lxc.x86_64  3.2.1-1.fc32                           @@commandline      
lxc-libs.x86_64   3.2.1-1.fc32                           @@commandline      
lxc-templates.x86_64  3.2.1-1.fc32                           @@commandline      


How reproducible: Always

Steps to Reproduce:

# dnf -y install https://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/x86_64/os/Packages/l/lxc-libs-3.2.1-1.fc32.x86_64.rpm
# dnf -y install https://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/x86_64/os/Packages/l/lxc-3.2.1-1.fc32.x86_64.rpm
# dnf -y install https://dl.fedoraproject.org/pub/fedora/linux/development/rawhide/Everything/x86_64/os/Packages/l/lxc-templates-3.2.1-1.fc32.x86_64.rpm
# lxc-create -n fedora31 -t download -- -d fedora -r 31 -a amd64

Reboot the system.

# lxc-start -F -n fedora31

lxc-start: fedora31: network.c: lxc_ovs_attach_bridge: 2500 Failed to attach "lxcbr0" to openvswitch bridge "vethTHK3FT": ovs-vsctl: no bridge named lxcbr0
lxc-start: fedora31: network.c: instantiate_veth: 395 Operation not permitted - Failed to attach "vethTHK3FT" to bridge "lxcbr0"
lxc-start: fedora31: network.c: lxc_create_network_priv: 3264 Failed to create network device
lxc-start: fedora31: start.c: lxc_spawn: 1846 Failed to create the network
lxc-start: fedora31: start.c: __lxc_start: 2036 Failed to spawn container "fedora31"
lxc-start: fedora31: tools/lxc_start.c: main: 329 The container failed to start
lxc-start: fedora31: tools/lxc_start.c: main: 334 Additional information can be obtained by setting the --logfile and --logpriority options

lxc-net.service is responsible for bringing up lxcbr0, but
/etc/sysconfig/lxc has
USE_LXC_BRIDGE="false"  # overridden in lxc-net
and there is no /etc/sysconfig/lxc-net, so there is no lxcbr0 in its default
config, preventing lxc from starting a container.
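
To see which value the two files yield together, the sourcing order shown in /etc/sysconfig/lxc can be replayed in a subshell. This is only a sketch: the function name effective_use_lxc_bridge is made up for illustration, and it assumes the helper behind lxc-net.service reads the files in this order.

```shell
# Print the USE_LXC_BRIDGE value that results from reading the main sysconfig
# file and then the optional override file.
# NOTE: effective_use_lxc_bridge is a hypothetical helper, not part of LXC.
# $1: main sysconfig file (default /etc/sysconfig/lxc)
# $2: optional override file (default /etc/sysconfig/lxc-net)
effective_use_lxc_bridge() (
    main=${1:-/etc/sysconfig/lxc}
    override=${2:-/etc/sysconfig/lxc-net}
    USE_LXC_BRIDGE=
    # The function body is a subshell, so the caller's variables are untouched.
    [ ! -f "$main" ] || . "$main"
    [ ! -f "$override" ] || . "$override"
    printf '%s\n' "${USE_LXC_BRIDGE:-unset}"
)

# Typical use:
#   effective_use_lxc_bridge
```

With the packaged defaults described above (no /etc/sysconfig/lxc-net), this should print the value from /etc/sysconfig/lxc.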

It also seems that lxc needs ovsdb-server.service and ovs-vswitchd.service
to be running, and I had to start both of them manually.

After bringing up lxcbr0 and OVS, another error prevents
a container from starting:

# lxc-start -F -n fedora31
lxc-start: fedora31: cgroups/cgfsng.c: cg_legacy_set_data: 2298 Failed to setup limits for the "devices" controller. The controller seems to be unused by "cgfsng" cgroup driver or not enabled on the cgroup hierarchy
lxc-start: fedora31: start.c: lxc_spawn: 1883 Failed to setup legacy device cgroup controller limits
lxc-start: fedora31: start.c: __lxc_start: 2036 Failed to spawn container "fedora31"
lxc-start: fedora31: tools/lxc_start.c: main: 329 The container failed to start
lxc-start: fedora31: tools/lxc_start.c: main: 334 Additional information can be obtained by setting the --logfile and --logpriority options

The above error can be fixed as follows:

# cat >>/var/lib/lxc/fedora31/config <<'EOF'
> lxc.cgroup.devices.allow =
> lxc.cgroup.devices.deny =
> EOF
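
Running the heredoc above twice appends the two lines twice. A possible idempotent variant is sketched below; the helper names ensure_empty_key and clear_device_cgroup_keys are invented for this example, and the sketch only recognizes lines written exactly as "key =".

```shell
# Append "key =" (an empty assignment) to a container config unless that
# exact line is already present.
# NOTE: both function names are hypothetical helpers, not LXC commands.
# $1: container config path, $2: key name
ensure_empty_key() {
    grep -Fxq "$2 =" "$1" 2>/dev/null ||
        printf '%s =\n' "$2" >> "$1"
}

# Clear both legacy (cgroup v1) device-controller keys for one container.
# $1: container config path, e.g. /var/lib/lxc/fedora31/config
clear_device_cgroup_keys() {
    ensure_empty_key "$1" 'lxc.cgroup.devices.allow' &&
    ensure_empty_key "$1" 'lxc.cgroup.devices.deny'
}

# Typical use as root:
#   clear_device_cgroup_keys /var/lib/lxc/fedora31/config
```

The lines are appended at the end of the file so that they come after any earlier settings in the same config.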


Actual results:

See above.

Expected results:

Fedora 31 can start in an lxc container...

Additional info:

Items 1 and 2 above are different from
https://bugzilla.redhat.com/show_bug.cgi?id=1765821

Item 3 above seems to have some overlap with
https://bugzilla.redhat.com/show_bug.cgi?id=1765821

Comment 1 Ryutaroh Matsumoto 2020-01-01 05:43:40 UTC
For the record, starting from
Fedora Rawhide Server Installer DVD minimum installation
(choosing "Fedora Custom Operating System" with no add-ons),
the following steps are necessary to start an LXC container

(1)
lxc-create -n fedora31 -t download -- -d fedora -r 31 -a amd64
requires "dnf install wget tar".

(2) lxc-start requires

(2-1) "dnf install openvswitch dnsmasq"

(2-2) We need the following modifications:

--- etc/sysconfig/lxc-orig	2019-12-15 20:41:04.000000000 +0900
+++ etc/sysconfig/lxc	2020-01-01 14:00:29.856450559 +0900
@@ -23,6 +23,6 @@
 #	If you want to kill containers fast, use -k
 STOPOPTS="-a -A -s"
 
-USE_LXC_BRIDGE="false"  # overridden in lxc-net
+USE_LXC_BRIDGE="true"  # overridden in lxc-net
 
 [ ! -f /etc/sysconfig/lxc-net ] || . /etc/sysconfig/lxc-net
--- lib/systemd/system/lxc-net.service-orig	2019-12-15 20:41:03.000000000 +0900
+++ lib/systemd/system/lxc-net.service	2020-01-01 14:04:36.796450990 +0900
@@ -3,6 +3,7 @@
 After=network-online.target
 Wants=network-online.target
 Before=lxc.service
+Wants=openvswitch.service
 
 [Service]
 Type=oneshot


(2-3) We need systemctl enable --now lxc-net.service

(3-1) To start Fedora 31 in an LXC container, we have to add
lxc.cgroup.devices.allow =
lxc.cgroup.devices.deny =
to the container config, because cgroup v1 controllers are
unavailable on Fedora 31 and newer hosts.

(3-2) With
lxc.cgroup.devices.allow =
lxc.cgroup.devices.deny =
Ubuntu Eoan (19.10) in LXC container fails with the error message

Failed to mount cgroup at /sys/fs/cgroup/systemd: Operation not permitted
[!!!!!!] Failed to mount API filesystems.
Exiting PID 1...

This is a bug in the upstream pointed out at
https://github.com/lxc/lxc/issues/3183#issuecomment-560307314

To address this bug (3-2), newer pre-release source of LXC is required.
This has been fixed in the github branches (master and stable-3.0), but
no official release fixes this bug.

With this bug fixed, "lxc.mount.auto = cgroup:rw:force" config item
enables Ubuntu Eoan to start in an LXC container on a Fedora Rawhide host.

I believe that the packaging of LXC on Fedora can become more user-friendly, and
I hope that using LXC on Fedora 32 will become easier than it is now.
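
Step (2-2) above patches the packaged lxc-net.service file directly, and a package update may overwrite that change. One alternative is a systemd drop-in carrying the same Wants= line. The sketch below assumes the usual /etc/systemd/system override location; the function name write_lxc_net_ovs_dropin and the file name openvswitch.conf are arbitrary choices for this example. Whether Open vSwitch is needed at all is questioned later in this report.

```shell
# Write a drop-in that adds Wants=openvswitch.service to lxc-net.service.
# NOTE: the function name and the drop-in file name are hypothetical.
# $1: systemd directory for local overrides (default /etc/systemd/system)
write_lxc_net_ovs_dropin() (
    unit_dir=${1:-/etc/systemd/system}
    dropin_dir=$unit_dir/lxc-net.service.d
    # The body is a subshell, so "exit" only ends this function.
    mkdir -p "$dropin_dir" || exit 1
    cat > "$dropin_dir/openvswitch.conf" <<'EOF'
[Unit]
Wants=openvswitch.service
EOF
)

# Typical use as root:
#   write_lxc_net_ovs_dropin && systemctl daemon-reload
```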

Comment 2 Ryutaroh Matsumoto 2020-01-01 11:57:50 UTC
I am sorry, I was wrong.

> To address this bug (3-2), newer pre-release source of LXC is required.

The above was incorrect. Fedora Rawhide LXC package (3.2.1-1.fc32)
can start Ubuntu Eoan in its container by adding

lxc.cgroup.devices.allow =
lxc.cgroup.devices.deny =
lxc.init.cmd = /sbin/init systemd.unified_cgroup_hierarchy=1

to the container config file.

Comment 3 Ryutaroh Matsumoto 2020-01-03 00:03:41 UTC
The cause of the first half of these symptoms is that
/etc/lxc/default.conf has
lxc.net.0.type = veth
and /etc/sysconfig/lxc has USE_LXC_BRIDGE="false".
These two settings logically contradict each other.

If the package default is no networking in LXC containers,
/etc/lxc/default.conf should have
lxc.net.0.type = empty
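
The contradiction described above can be checked mechanically. The following is a rough sketch only: the function name veth_without_bridge is invented here, and it ignores any override in /etc/sysconfig/lxc-net.

```shell
# Return 0 if the defaults contradict each other: new containers get a veth
# interface while the LXC bridge is disabled.
# NOTE: veth_without_bridge is a hypothetical helper, not part of LXC.
# $1: default container config, e.g. /etc/lxc/default.conf
# $2: sysconfig file, e.g. /etc/sysconfig/lxc
veth_without_bridge() {
    grep -Eq '^[[:space:]]*lxc\.net\.0\.type[[:space:]]*=[[:space:]]*veth[[:space:]]*$' "$1" &&
    grep -Eq '^[[:space:]]*USE_LXC_BRIDGE="?false"?' "$2"
}

# Typical use:
#   veth_without_bridge /etc/lxc/default.conf /etc/sysconfig/lxc &&
#       echo "veth is configured but the LXC bridge is disabled"
```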

Comment 4 Thomas Moschny 2020-01-04 16:47:13 UTC
Some comments:

CGroups v2 support: Indeed one has to unset lxc.cgroup.devices.allow and lxc.cgroup.devices.deny, these are v1 only. The corresponding v2 mechanism uses BPF and is afaik not present in any released LXC version. This should indeed be documented, maybe here: https://fedoraproject.org/wiki/Common_F31_bugs#Docker_package_no_longer_available_and_will_not_run_by_default_.28due_to_switch_to_cgroups_v2.29
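
Whether a host runs a pure cgroup v2 hierarchy can be read from the filesystem type mounted at /sys/fs/cgroup. The sketch below relies on GNU coreutils stat; the function names classify_cgroup_fs and host_cgroup_layout are made up for illustration, and the "v1" label also covers hybrid layouts.

```shell
# Map the filesystem type mounted at /sys/fs/cgroup to a cgroup layout name.
# NOTE: both function names are hypothetical helpers.
# $1: filesystem type as printed by `stat -fc %T`
classify_cgroup_fs() {
    case $1 in
        cgroup2fs) echo v2 ;;       # unified hierarchy only
        tmpfs)     echo v1 ;;       # legacy or hybrid layout
        *)         echo unknown ;;
    esac
}

# Report the layout of the running host.
host_cgroup_layout() {
    classify_cgroup_fs "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"
}

# Typical use:
#   host_cgroup_layout
```

On a host that reports v2, the lxc.cgroup.devices.* keys are the ones that have to be emptied in the container config.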

Networking: Open vSwitch is not needed, I think, at least according to my experiments. Setting USE_LXC_BRIDGE=true in /etc/sysconfig/lxc-net (creating that file, if not present) is sufficient to get LXC networking. However (as with most services in Fedora), the lxc-net service is not automatically enabled, so systemctl enable lxc-net has to be issued once. One could think of providing /etc/sysconfig/lxc-net pre-populated.
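
The networking setup described in this comment could be scripted roughly as follows. This is a sketch, not a tested packaging change: the function name enable_lxc_bridge is invented, and the file is deliberately left alone if it already exists.

```shell
# Create the override file with the bridge enabled, but never overwrite an
# existing file.
# NOTE: enable_lxc_bridge is a hypothetical helper, not part of LXC.
# $1: override file path (default /etc/sysconfig/lxc-net)
enable_lxc_bridge() (
    override=${1:-/etc/sysconfig/lxc-net}
    if [ -e "$override" ]; then
        echo "$override already exists, leaving it alone" >&2
        exit 0
    fi
    printf 'USE_LXC_BRIDGE="true"\n' > "$override"
)

# Typical use as root:
#   enable_lxc_bridge && systemctl enable --now lxc-net.service
```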

Comment 5 Thomas Moschny 2020-01-04 17:54:09 UTC
And I can confirm that LXC 3.2.1 can run Eoan with lxc.init.cmd = /sbin/init systemd.unified_cgroup_hierarchy=1. That however I would consider a bug to be reported upstream.

Comment 6 Ryutaroh Matsumoto 2020-01-04 22:15:01 UTC
(In reply to Thomas Moschny from comment #5)
> And I can confirm that LXC 3.2.1 can run Eoan with lxc.init.cmd = /sbin/init
> systemd.unified_cgroup_hierarchy=1. That however I would consider a bug to
> be reported upstream.

Thank you for paying attention and giving comments.

* The fact that "lxc.cgroup.devices.* =" can start Ubuntu Trusty but cannot start
  Fedora 30- or Ubuntu Eoan is/was
  discussed at https://github.com/lxc/lxc/issues/3183
  The conclusion there seems to be to add "lxc.mount.auto = cgroup:rw:force" to the container config,
  which triggers another bug (fixed in the latest LXC github branches), so
  neither LXC 3.2.1 nor 3.0.4 can start Fedora 30-/Ubuntu Eoan containers that way.
  If you do not like that conclusion by the upstream developers, could you express
  your opinion there?
  IMHO the LXC upstream developers should have thoroughly tested their product
  with systemd.unified_cgroup_hierarchy=1 before claiming "support" of CGroup V2.

* I verified that virt-install --memory 2048  --connect lxc:/ --os-variant ubuntu19.10 --filesystem /var/lib/lxc/ubuntueoan/rootfs,/ --network none   --transient --import --name ubuntueoan
  can start Ubuntu Eoan (and also Fedora 30-) with the following versions of libvirt-daemon-lxc.
  It seems that the same bug (of LXC upstream?) is handled/worked around by libvirt upstream.

[root@localhost ~]# dnf list --installed | grep lxc
libvirt-daemon-driver-lxc.x86_64 5.10.0-2.fc32 @rawhide 
libvirt-daemon-lxc.x86_64        5.10.0-2.fc32 @rawhide 
lxc.x86_64                       3.2.1-1.fc32  @rawhide 
lxc-libs.x86_64                  3.2.1-1.fc32  @rawhide 
lxc-templates.x86_64             3.2.1-1.fc32  @rawhide

Comment 7 Ryutaroh Matsumoto 2020-01-05 02:06:52 UTC
(In reply to Thomas Moschny from comment #5)
> And I can confirm that LXC 3.2.1 can run Eoan with lxc.init.cmd = /sbin/init
> systemd.unified_cgroup_hierarchy=1. That however I would consider a bug to
> be reported upstream.

lxc.init.cmd = /sbin/init also allows LXC 3.0.4 shipped with Fedora 31 to
start Ubuntu Eoan & Fedora 30- containers.

(In reply to Thomas Moschny from comment #4)
> Networking: Open vSwitch is not needed, I think, at least according to my
> experiments.

I partially agree. If bridge-utils is not installed, lxc_bridge_attach() in
https://github.com/lxc/lxc/blob/master/src/lxc/network.c 
tries to use Open VSwitch as

	if (is_ovs_bridge(bridge))
		return lxc_ovs_attach_bridge(bridge, ifname);

For networking in LXC containers, lxc (weakly) depends on either bridge-utils or
Open vSwitch. I verified that Open vSwitch can also be used instead of bridge-utils for LXC.
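
A kernel bridge exposes a "bridge" directory under /sys/class/net/<name>, so sysfs can tell whether lxcbr0 exists and what kind of device it is. The sketch below is a diagnostic aid only; the function name bridge_kind is invented. It is an assumption here, not something confirmed in this report, that is_ovs_bridge() decides along similar lines, which would explain why a missing lxcbr0 ends up in the ovs-vsctl error path.

```shell
# Classify a network interface by looking at sysfs.
# NOTE: bridge_kind is a hypothetical helper, not part of LXC.
# $1: interface name, $2: sysfs net directory (default /sys/class/net)
bridge_kind() (
    dir=${2:-/sys/class/net}/$1
    if [ ! -e "$dir" ]; then
        echo missing                # no such interface
    elif [ -d "$dir/bridge" ]; then
        echo linux-bridge           # kernel bridge device
    else
        echo not-a-linux-bridge     # exists, but is not a kernel bridge
    fi
)

# Typical use:
#   bridge_kind lxcbr0
```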

According to http://rpmfind.net/linux/RPM/fedora/devel/rawhide/x86_64/l/lxc-3.2.1-1.fc32.x86_64.html,
the 3.2.1 RPM includes pam_cgfs, which is completely useless in a pure cgroup v2 hierarchy.
For details, please have a look at

https://github.com/lxc/lxc/issues/3198
and
https://bugzilla.redhat.com/show_bug.cgi?id=1787097

Comment 8 Ben Cotton 2020-02-11 17:36:13 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 32 development cycle.
Changing version to 32.

Comment 9 Thomas Moschny 2020-11-14 17:14:31 UTC
An update to 4.0.5 has been created for F32: https://bodhi.fedoraproject.org/updates/FEDORA-2020-f44601cdfd .

Please retry with that version.

Comment 10 Fedora Program Management 2021-04-29 16:51:24 UTC
This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 32 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 11 Ben Cotton 2021-05-25 17:17:30 UTC
Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

