2012838 – Setting the default maximum container root partition size for Overlay with CRI-O stops working


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Node
Sub Component:
Version:	4.7
Hardware:	x86_64
OS:	All
Priority:	high
Severity:	urgent
Target Milestone:	---
Target Release:	4.10.0
Assignee:	Qi Wang
QA Contact:	MinLi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	2020716 2020724 2020729

Reported:	2021-10-11 12:14 UTC by Franck Grosjean
Modified:	2022-03-10 16:19 UTC
CC List:	11 users
Fixed In Version:
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Clones:	2020716 2020724 2020729
Environment:
Last Closed:	2022-03-10 16:18:42 UTC
Target Upstream Version:
Embargoed:


Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	cri-o cri-o pull 5423	0	None	open	Bug 2012838: fix override storage options from storage.conf	2021-10-25 20:43:24 UTC
Red Hat Product Errata	RHSA-2022:0056	0	None	None	None	2022-03-10 16:19:10 UTC

Description Franck Grosjean 2021-10-11 12:14:14 UTC

Description of problem:
Setting the default maximum container root partition size for Overlay with CRI-O, as described in the documentation [1], no longer works.

Reference
[1] https://docs.openshift.com/container-platform/4.7/post_installation_configuration/machine-configuration-tasks.html#set-the-default-max-container-root-partition-size-for-overlay-with-crio_post-install-machine-configuration-tasks

The machine-config daemon works and creates the storage.conf file as expected, but the setting has no effect in the pod.

Version-Release number of selected component (if applicable):
4.7 and 4.8

How reproducible:
Apply the configuration as described in [1]
Check in container with df -h

Steps to Reproduce:
1. Apply the configuration as described in [1]
2. Check in container with df -h

Actual results:
The container sees the full size of the underlying storage

Expected results:
Only the specified size should be seen from the pod
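
The check in step 2 can be narrowed to the one value that matters: the Size column of the filesystem mounted on /. The sketch below is illustrative only. The function name root_fs_size is hypothetical, and it assumes the usual df -h layout in which the mount point is the last whitespace-separated field.

```shell
# Hypothetical helper (not part of oc or CRI-O): read `df -h` output on
# stdin and print the Size column of the filesystem mounted on "/".
# Assumption: the mount point is the last whitespace-separated field.
root_fs_size() {
  awk '$NF == "/" { print $2; exit }'
}
```

Piping the container's df -h output into root_fs_size (for example via oc rsh <pod> df -h) should print a value close to the configured overlaySize when the limit is in effect, and the size of the node's disk when it is not.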

Additional info:
An upstream change altered the configuration file format, and the deployed configuration does not appear to comply with the new format used by CRI-O:
https://github.com/containers/storage/commit/ff125a5657075bc14048f2f3742a08db11287c0a
It appears to have been merged both upstream and downstream.
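
To see which table of storage.conf a size key actually sits in on a node, a small scan like the following may help. This is a sketch under assumptions: the function name size_tables is hypothetical, the parsing is line-based rather than a real TOML parser, and which table a given CRI-O or c/storage build honors is not established by this report.

```shell
# Hypothetical helper: print the name of every TOML table in storage.conf
# that contains a `size = ...` key. Reads the named files, or stdin if none.
# Line-based scan only; commented-out keys (lines starting with #) are ignored.
size_tables() {
  awk '
    /^[ \t]*\[/ {
      # Remember the current table name without brackets or padding.
      section = $0
      sub(/^[ \t]*\[/, "", section)
      sub(/\][ \t]*$/, "", section)
      next
    }
    /^[ \t]*size[ \t]*=/ { print section }
  ' "$@"
}
```

Run as size_tables /etc/containers/storage.conf on the node. An output of storage.options means the key sits in the generic options table, while storage.options.overlay means it sits in the overlay driver's own table.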

Comment 2 Peter Hunt 2021-10-11 23:50:55 UTC

Qi, can you take a look please? I am having trouble reassigning this to you. This seems to be an issue with ContainerRuntimeConfig and newer versions of c/storage.

Comment 15 Qi Wang 2021-10-25 20:44:31 UTC

I have linked the PR for the fix: https://github.com/cri-o/cri-o/pull/5423

Comment 16 Pablo Alonso Rodriguez 2021-10-27 15:30:36 UTC

*** Bug 2017756 has been marked as a duplicate of this bug. ***

Comment 18 MinLi 2021-11-17 03:39:20 UTC

Reproduced. Also, the overlay size does not appear in /etc/crio/crio.conf.d/00-default.

$ oc get clusterversion 
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2021-11-15-034648   True        False         18h     Cluster version is 4.10.0-0.nightly-2021-11-15-034648

$ oc get containerruntimeconfig -o yaml 
...
  spec:
    containerRuntimeConfig:
      logLevel: debug
      overlaySize: 9G
    machineConfigPoolSelector:
      matchLabels:
        custom-crio-overlay: overlay-size
...

$ oc debug node/minmli11164101-qgs26-worker-0-28hqn
sh-4.4# chroot /host 
sh-4.4# head -n 7 /etc/containers/storage.conf 
[storage]
  driver = "overlay"
  runroot = "/var/run/containers/storage"
  graphroot = "/var/lib/containers/storage"
  [storage.options]
    additionalimagestores = []
    size = "9G"

sh-4.4# grep -i overlay /etc/crio/crio.conf.d/00-default
storage_driver = "overlay"
    "overlay.override_kernel_check=1",
sh-4.4# 
sh-4.4# df -h 
Filesystem      Size  Used Avail Use% Mounted on
overlay          40G  7.5G   33G  19% /
tmpfs            64M     0   64M   0% /dev
tmpfs           3.9G     0  3.9G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
tmpfs           3.9G   50M  3.9G   2% /host/run
/dev/vda4        40G  7.5G   33G  19% /host
tmpfs           3.9G     0  3.9G   0% /host/sys/fs/cgroup
devtmpfs        3.9G     0  3.9G   0% /host/dev


Check pod status:
$ oc get pod
NAME                                        READY   STATUS             RESTARTS      AGE
hello-openshift-minmli                      1/1     Running            0             46m

$ oc rsh hello-openshift-minmli
/ # df -h 
Filesystem                Size      Used Available Use% Mounted on
overlay                  39.5G      7.0G     32.5G  18% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                     3.9G         0      3.9G   0% /sys/fs/cgroup
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                     3.9G     49.3M      3.8G   1% /etc/resolv.conf
tmpfs                     3.9G     49.3M      3.8G   1% /etc/hostname
/dev/vda4                39.5G      7.0G     32.5G  18% /tmp
/dev/vda4                39.5G      7.0G     32.5G  18% /etc/hosts
/dev/vda4                39.5G      7.0G     32.5G  18% /dev/termination-log
tmpfs                     3.9G     49.3M      3.8G   1% /run/secrets
tmpfs                     6.7G     20.0K      6.7G   0% /var/run/secrets/kubernetes.io/serviceaccount
tmpfs                     3.9G         0      3.9G   0% /proc/acpi
tmpfs                    64.0M         0     64.0M   0% /proc/kcore
tmpfs                    64.0M         0     64.0M   0% /proc/keys
tmpfs                    64.0M         0     64.0M   0% /proc/timer_list
tmpfs                    64.0M         0     64.0M   0% /proc/sched_debug
tmpfs                     3.9G         0      3.9G   0% /proc/scsi
tmpfs                     3.9G         0      3.9G   0% /sys/firmware
/ # exit

Comment 19 Qi Wang 2021-11-17 16:41:35 UTC

I checked the CRI-O version in 4.10.0-0.nightly-2021-11-15-034648.
It is 1.23.0-12.rhaos4.10.git6ee64e9.el8. The fix does not appear to be in this build yet: the commit history at https://github.com/cri-o/cri-o/commits/6ee64e9 does not include the fix PR.

@minmli Could you verify the fix once CRI-O has been built with it? When verifying, note that /etc/crio/crio.conf.d/00-default is not expected to be overwritten; only df -h inside the container will show the correct overlay size.

Comment 20 Peter Hunt 2021-11-22 16:09:55 UTC

This should be in the nightlies now.

Comment 22 MinLi 2021-11-23 09:07:15 UTC

Verified on 4.10.0-0.nightly-2021-11-22-195410

$ oc get pod 
NAME                                        READY   STATUS    RESTARTS   AGE
hello-openshift-buxybox                     1/1     Running   0          20s

$ oc rsh hello-openshift-buxybox
/ # df -h 
Filesystem                Size      Used Available Use% Mounted on
overlay                   9.0G     12.0K      9.0G   0% /
tmpfs                    64.0M         0     64.0M   0% /dev
tmpfs                     3.9G         0      3.9G   0% /sys/fs/cgroup
shm                      64.0M         0     64.0M   0% /dev/shm
tmpfs                     3.9G     51.7M      3.8G   1% /etc/resolv.conf
tmpfs                     3.9G     51.7M      3.8G   1% /etc/hostname
tmpfs                     3.9G     51.7M      3.8G   1% /run/.containerenv
/dev/vda4                39.5G      7.5G     31.9G  19% /tmp
/dev/vda4                39.5G      7.5G     31.9G  19% /etc/hosts
/dev/vda4                39.5G      7.5G     31.9G  19% /dev/termination-log
tmpfs                     3.9G     51.7M      3.8G   1% /run/secrets
tmpfs                     6.7G     20.0K      6.7G   0% /var/run/secrets/kubernetes.io/serviceaccount
tmpfs                     3.9G         0      3.9G   0% /proc/acpi
tmpfs                    64.0M         0     64.0M   0% /proc/kcore
tmpfs                    64.0M         0     64.0M   0% /proc/keys
tmpfs                    64.0M         0     64.0M   0% /proc/timer_list
tmpfs                    64.0M         0     64.0M   0% /proc/sched_debug
tmpfs                     3.9G         0      3.9G   0% /proc/scsi
tmpfs                     3.9G         0      3.9G   0% /sys/firmware

Comment 33 errata-xmlrpc 2022-03-10 16:18:42 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0056
