Bug 1903553 - systemd container renders node NotReady after deleting it
Summary: systemd container renders node NotReady after deleting it
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.6
Hardware: x86_64
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: Giuseppe Scrivano
QA Contact: Weinan Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-12-02 11:16 UTC by Zvonko Kosic
Modified: 2021-07-27 22:34 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-27 22:34:25 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Product Errata RHSA-2021:2438 (last updated 2021-07-27 22:34:49 UTC)

Description Zvonko Kosic 2020-12-02 11:16:21 UTC
Description of problem:

Creating and then deleting the following YAML renders the node NotReady. Only
a reboot brings the node back to a normal state.


apiVersion: v1
kind: Pod
metadata:
  name: systemd
spec:
  containers:
    - name: systemd-ubi8
      image: registry.access.redhat.com/ubi8/ubi
      command: ["/sbin/init"]
      securityContext:
        privileged: true
      volumeMounts:
        - name: run-dummy
          mountPath: /run/dummy
          mountPropagation: Bidirectional
  volumes:
    - name: run-dummy
      hostPath:
        path: /run/dummy



Without the volume mount it works. We need the privileged context to have the bidirectional mount propagation. 

Version-Release number of selected component (if applicable):

cri-o://1.19.0-22.rhaos4.6.gitc0306f1.el8

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.3     True        False         20d     Cluster version is 4.6.3


Steps to Reproduce:

oc create -f <name>.yaml
oc delete pod <name> --wait=false

Wait 1-2 minutes and the node where the pod was running becomes NotReady.
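
For reference, a couple of commands that can be used to confirm the state after the delete (the node name is whatever node the pod was scheduled on):

oc get pods -o wide                     # note which node the pod ran on
oc get nodes                            # the node flips to NotReady after 1-2 minutes
oc describe node <node-name> | grep -A 10 'Conditions:'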


Additional info:

Looks like crio is killing host processes: 

Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 10071 (kworker/0:8-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 11643 (kworker/1:0-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 11874 (kworker/0:1-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 11886 (kworker/u4:1…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 11891 (kworker/1:3-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13130 (kworker/1:1-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13269 (kworker/0:0-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13340 (kworker/1:2-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13341 (kworker/1:4-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13342 (kworker/u4:0…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13343 (kworker/1:5-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13345 (kworker/u4:3…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 13352 (kworker/0:2-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14359 (kworker/0:3-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14360 (kworker/0:4-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14361 (kworker/0:5-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14362 (kworker/0:6-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14363 (kworker/0:7-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14364 (kworker/0:9-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14365 (kworker/0:10…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14366 (kworker/0:11…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14367 (kworker/0:12…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14368 (kworker/0:13…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14369 (kworker/0:14…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14370 (kworker/0:15…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14371 (kworker/0:16…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14389 (kworker/0:17…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14390 (kworker/0:18…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14391 (kworker/0:19…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14392 (kworker/0:20…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14393 (kworker/0:21…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14394 (kworker/0:22…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14395 (kworker/0:23…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14396 (kworker/0:24…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14397 (kworker/0:25…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14398 (kworker/0:26…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14399 (kworker/0:27…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14416 (kworker/0:28…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14417 (kworker/0:29…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14418 (kworker/0:30…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14419 (kworker/0:31…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14420 (kworker/0:32…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14421 (kworker/0:33…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14422 (kworker/0:34…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14423 (kworker/0:35…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14424 (kworker/0:36…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14425 (kworker/0:37…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14426 (kworker/0:38…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14427 (kworker/0:39…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14428 (kworker/0:40…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14429 (kworker/0:41…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14430 (kworker/0:42…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14431 (kworker/0:43…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14432 (kworker/0:44…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14433 (kworker/0:45…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14434 (kworker/0:46…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14435 (kworker/0:47…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14436 (kworker/0:48…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14437 (kworker/0:49…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14438 (kworker/0:50…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14439 (kworker/0:51…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14440 (kworker/0:52…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14441 (kworker/0:53…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14442 (kworker/0:54…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14443 (kworker/0:55…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14444 (kworker/0:56…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14445 (kworker/0:57…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14446 (kworker/0:58…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14479 (kworker/1:6-…) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14601 (systemd-udevd) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Killing process 14218 (systemd-journal) with signal SIGKILL.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: systemd-journald.service: Main process exited, code=killed, status=9/KILL
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: systemd-journald.service: Failed with result 'signal'.
Dec 02 11:00:22 ip-10-0-208-51 systemd[1]: systemd-journald.service: Consumed 283ms CPU time
Dec 02 11:00:23 ip-10-0-208-51 kernel: audit: type=1130 audit(1606906823.438:443): pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=systemd-journald comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=?>
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Still around after SIGKILL. Ignoring.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: crio-d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.scope: Failed with result 'timeout'.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Stopped libcontainer container d1440f6299679e69f5a994357e3d98ffc4b8d39e4164436f08f37b7acce7ac47.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Removed slice libcontainer container kubepods-besteffort-podf4a7d0b7_7f1f_4647_8f8b_e3daa65904ea.slice.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: kubepods-besteffort-podf4a7d0b7_7f1f_4647_8f8b_e3daa65904ea.slice: Consumed 818ms CPU time
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Removed slice libcontainer container kubepods-besteffort.slice.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: kubepods-besteffort.slice: Consumed 1.299s CPU time
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Removed slice libcontainer container kubepods.slice.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: kubepods.slice: Consumed 35.868s CPU time
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Reached target Shutdown.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Reached target Final Step.
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Starting Power-Off...
Dec 02 11:00:53 ip-10-0-208-51 systemd[1]: Shutting down.

Comment 1 Giuseppe Scrivano 2020-12-03 09:31:47 UTC
Any reason for using mountPropagation: Bidirectional?  Could it be on a path different from /run on the host?


All the mounts created in the container on /run/dummy will be propagated to the host, and that can confuse systemd.

Can you show the output for `findmnt -R /` on the host when you hit this issue?
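
If there is no direct SSH access to the node, something like this should work as well (oc debug starts a debug pod on the node):

oc debug node/<node-name>
# inside the debug shell:
chroot /host
findmnt -R /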

Comment 2 Zvonko Kosic 2020-12-03 16:22:52 UTC
Bidirectional because the use case is to share files that are created at runtime, via a prestart hook, with other containers (NVIDIA).

We do not want files created by the container to survive a reboot; that is why we're using /run.

I need to recreate the issue and add a node with SSH access. Will update as soon as possible.

Comment 3 Giuseppe Scrivano 2020-12-03 16:48:42 UTC
The bidirectional flag means that mounts happening in the container are propagated to the host (internally it translates to the shared mount flag).  If you just want to share files, you don't need the mount to be bidirectional.

A bidirectional mount can be quite dangerous, as the container can create mounts in the host mount namespace.

Aren't files persisted if you drop "mountPropagation: Bidirectional"?

Comment 4 Zvonko Kosic 2020-12-03 18:05:21 UTC
That is the intent: to have the mounts from the container visible on the host, because the CRI-O prestart hook can only mount files into the container from the host.

How would this work from your POV? Let's say container A builds some binaries and libraries that need to be mounted via the prestart hook into container B before the command/entrypoint starts.


The key point here is that we have to use a prestart hook.


Persisted to what? We do not want persistent files; they should be gone when the container is gone.

Comment 5 Giuseppe Scrivano 2020-12-03 18:44:04 UTC
If you'd like, we can take a look at it together.  I'd expect that sharing files from one container to the other using the same bind mount should be enough (i.e. using the same volume from two different containers).  It should not require the bidirectional mount propagation flag.
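
For illustration, a rough sketch of that approach (pod/container names are made up here; an emptyDir is used, but a plain hostPath without mountPropagation would work similarly):

apiVersion: v1
kind: Pod
metadata:
  name: shared-files
spec:
  containers:
    - name: producer
      image: registry.access.redhat.com/ubi8/ubi
      # writes the files that need to be shared
      command: ["sh", "-c", "touch /shared/ready && sleep infinity"]
      volumeMounts:
        - name: shared
          mountPath: /shared
    - name: consumer
      image: registry.access.redhat.com/ubi8/ubi
      # sees the same files without any mount propagation
      command: ["sh", "-c", "sleep infinity"]
      volumeMounts:
        - name: shared
          mountPath: /shared
  volumes:
    - name: shared
      emptyDir: {}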

Comment 7 Giuseppe Scrivano 2021-01-20 17:50:26 UTC
Does it make any difference if you set up `/run/dummy` to be a mount on the host first (mount --bind /run/dummy /run/dummy)?

Comment 8 Giuseppe Scrivano 2021-02-03 18:42:53 UTC
> Without the volume mount it works. We need the privileged context to have the bidirectional mount propagation. 

Do you also drop the privileged context when you try without the bidirectional mount propagation?

A potential issue is that the privileged container is running systemd and systemd inside the container might try to manage more than it should.

Comment 9 Zvonko Kosic 2021-02-08 17:33:52 UTC
I cannot drop the privileged context; it is a driver container, so I need elevated privileges to modprobe the modules.



"does it make any difference if you setup `/run/dummy` to be a mount on the host first (mount --bind /run/dummy /run/dummy)?"

You mean first mkdir /run/dummy and then bind mount it? 

The /run/nvidia/driver is a host mount.

Comment 10 Zvonko Kosic 2021-02-08 17:34:35 UTC
Could be that the privileged context is doing more than intended. It looks like it is also trying to stop the services on the node rather than only stopping the services in the container.

Comment 11 Giuseppe Scrivano 2021-02-09 07:59:56 UTC
So, is /run/nvidia/driver created before the container runs?

If you can reproduce the issue just by having "securityContext: privileged: true", then we can be sure it is not the bind mount but systemd touching the services on the node.
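
For the record, that isolation test would roughly be the manifest from the description with the volume removed, e.g.:

apiVersion: v1
kind: Pod
metadata:
  name: systemd-privileged-only
spec:
  containers:
    - name: systemd-ubi8
      image: registry.access.redhat.com/ubi8/ubi
      command: ["/sbin/init"]
      securityContext:
        privileged: true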

Comment 12 Zvonko Kosic 2021-02-09 20:32:20 UTC
No, it is not; the container creates it and then mounts the container's root file system to /run/nvidia/driver.

I can only reproduce this with the bidirectional bind mount and privileged context.

A privileged context with a bidirectional mount but without systemd works perfectly.

The combination of privileged context, bidirectional bind mount, and systemd breaks it.

I tagged you on another PR that discusses systemd and mounts; it could be related.

https://github.com/openshift/enhancements/pull/593/files#r573214375

Comment 13 Giuseppe Scrivano 2021-02-10 16:32:15 UTC
We can debug it further but I am not sure how CRI-O+OCI runtime can help here, as the issue seems related to a privileged systemd container messing with the host.  Or do you think something should have been configured differently?

A potential long term fix is to permit bidirectional mounts without privileged.  Bidirectional mounts are a privileged operation, but they are in no way connected to the container running with additional capabilities.

A few suggestions:

- Try a different path for the mount (e.g. /tmp)?
- Could the container not use systemd?
- Restrict the capabilities when inside the container with capsh   // this will invalidate the systemd detection code in CRI-O though :(

An interesting test would be to kill -9 the systemd in the container, so it has no time to clean up, and see if it is still messing with the host.
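
Assuming crictl is available on the node, one way to do that could be (container name taken from the manifest above; the PID has to be read from the inspect output):

CTR=$(crictl ps -q --name systemd-ubi8)
crictl inspect "$CTR" | grep '"pid"'    # host PID of the container's systemd
kill -9 <pid>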

Comment 14 Giuseppe Scrivano 2021-02-15 13:50:12 UTC
found a fix for CRI-O: https://github.com/cri-o/cri-o/pull/4575

Comment 15 Giuseppe Scrivano 2021-03-19 12:53:16 UTC
backported to crio 1.19

Comment 20 errata-xmlrpc 2021-07-27 22:34:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438

