Bug 1758500
Summary: | Restarting the crio service removes config.json for podman containers. | | |
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Praveen Kumar <prkumar> |
Component: | Containers | Assignee: | Peter Hunt <pehunt> |
Status: | CLOSED ERRATA | QA Contact: | weiwei jiang <wjiang> |
Severity: | medium | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.2.0 | CC: | aos-bugs, bbaude, cfergeau, dwalsh, jokerman, mheon, pehunt, scuppett |
Target Milestone: | --- | ||
Target Release: | 4.3.0 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | | Doc Type: | Bug Fix |
Doc Text: |
Cause:
CRI-O did not properly filter out podman containers on a restore (i.e., a stop and start of CRI-O).
Consequence:
On startup, CRI-O saw podman containers in shared storage. Since podman containers lack CRI-O-specific metadata, CRI-O mistook them for incorrectly created CRI-O containers and asked the storage library to delete them.
Fix:
Properly filter podman containers on CRI-O restore
Result:
podman containers are no longer deleted from storage as a consequence of CRI-O starting
|
Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2020-05-13 21:26:47 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
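The failure mode described in the Doc Text above can be sketched in a few lines of Go: on restore, CRI-O must skip containers in shared storage that it does not own, rather than deleting anything it fails to recognize. This is a minimal hypothetical sketch, not CRI-O's actual code; the `StorageContainer` type and the `pod-id` metadata key are illustrative stand-ins for the real `containers/storage` records.

```go
package main

import "fmt"

// StorageContainer stands in for a container record from the shared
// containers/storage library used by both CRI-O and podman.
// The field names here are illustrative, not the real API.
type StorageContainer struct {
	ID       string
	Metadata map[string]string // engine-specific metadata
}

// isManagedByCrio reports whether the container carries the metadata a
// CRI-O restore expects (hypothetical key name). Podman containers do
// not set CRI-O metadata, so they fail this check.
func isManagedByCrio(c StorageContainer) bool {
	_, ok := c.Metadata["pod-id"]
	return ok
}

// restore filters the storage listing down to containers CRI-O owns.
// Before the fix, the unrecognized branch asked the storage library to
// delete the container; after the fix it is simply skipped, leaving
// podman containers untouched.
func restore(all []StorageContainer) (restored []StorageContainer) {
	for _, c := range all {
		if isManagedByCrio(c) {
			restored = append(restored, c)
		}
		// else: skip, do not delete — it belongs to another engine.
	}
	return restored
}

func main() {
	containers := []StorageContainer{
		{ID: "crio-1", Metadata: map[string]string{"pod-id": "abc"}},
		{ID: "podman-1", Metadata: map[string]string{}}, // e.g. the httpd:alpine container below
	}
	for _, c := range restore(containers) {
		fmt.Println(c.ID) // → crio-1
	}
}
```

With the buggy behavior, `podman-1` would have been handed to the storage library for deletion during CRI-O startup; with the filter, only CRI-O's own containers are restored.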
Description
Praveen Kumar
2019-10-04 10:24:45 UTC
I think this might be related to CRI-O wipe.

Ah, while it may seem as though it could be crio-wipe, it's actually crio failing to restore the container and deciding to remove it (despite not really owning it). This is definitely a bug. Looking into a fix now.

Checked with the following version, and the issue is fixed now.

```
[root@qe-wj-6lx69-worker-g7pvh core]# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:376b65e856d3e939fbe6f4440ad3d670ca9b5a9f90483e25f2baf7b0095da68f
              CustomOrigin: Managed by machine-config-operator
                   Version: 43.80.20191014.2 (2019-10-14T23:57:48Z)

[root@qe-wj-6lx69-worker-g7pvh core]# rpm -qa | grep -i -E "cri-o|podman"
podman-manpages-1.6.2-0.3.gita8993ba.el8.noarch
podman-1.6.2-0.3.gita8993ba.el8.x86_64
cri-o-1.14.11-0.23.dev.rhaos4.2.gitc41de67.el8.x86_64

[root@qe-wj-6lx69-worker-g7pvh core]# podman run -d httpd:alpine
Trying to pull registry.access.redhat.com/httpd:alpine...
  name unknown: Repo not found
Trying to pull docker.io/library/httpd:alpine...
Getting image source signatures
Copying blob 9d48c3bd43c5 done
Copying blob d3565940ff69 done
Copying blob 17877ce0de23 done
Copying blob 1cc6c921162a done
Copying blob 4e10ed3cf6fc done
Copying config 141bb8d01f done
Writing manifest to image destination
Storing signatures
01747085de95d1d28e0ddd873d84c523f9e1f0b25b364bc5232cd0ce24ac84b5

[root@qe-wj-6lx69-worker-g7pvh core]# podman ps
CONTAINER ID  IMAGE                           COMMAND           CREATED        STATUS           PORTS  NAMES
01747085de95  docker.io/library/httpd:alpine  httpd-foreground  2 seconds ago  Up 1 second ago         brave_gould

[root@qe-wj-6lx69-worker-g7pvh core]# systemctl restart crio
[root@qe-wj-6lx69-worker-g7pvh core]# systemctl status crio
● crio.service - Open Container Initiative Daemon
   Loaded: loaded (/usr/lib/systemd/system/crio.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/crio.service.d
           └─10-default-env.conf
   Active: active (running) since Tue 2019-10-15 08:47:05 UTC; 14s ago
     Docs: https://github.com/cri-o/cri-o
 Main PID: 173974 (crio)
    Tasks: 20
   Memory: 56.1M
      CPU: 1.460s
   CGroup: /system.slice/crio.service
           └─173974 /usr/bin/crio --enable-metrics=true --metrics-port=9537

Oct 15 08:47:04 qe-wj-6lx69-worker-g7pvh systemd[1]: Starting Open Container Initiative Daemon...
Oct 15 08:47:05 qe-wj-6lx69-worker-g7pvh systemd[1]: Started Open Container Initiative Daemon.

[root@qe-wj-6lx69-worker-g7pvh core]# podman ps
CONTAINER ID  IMAGE                           COMMAND           CREATED         STATUS             PORTS  NAMES
01747085de95  docker.io/library/httpd:alpine  httpd-foreground  38 seconds ago  Up 38 seconds ago         brave_gould

[root@qe-wj-6lx69-worker-g7pvh core]# podman inspect 01747085de95
[
    {
        "Id": "01747085de95d1d28e0ddd873d84c523f9e1f0b25b364bc5232cd0ce24ac84b5",
        "Created": "2019-10-15T08:46:53.934441274Z",
        "Path": "httpd-foreground",
        "Args": [
            "httpd-foreground"
        ],
        "State": {
            "OciVersion": "1.0.1-dev",
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
[...]
```

Also checked with the 4.2 version; also fixed.

```
[root@preserve-42stg-5wt9r-worker-zmgnz core]# rpm -qa | grep -i -E "cri-o|podman"
podman-manpages-1.4.2-5.el8.noarch
cri-o-1.14.11-0.23.dev.rhaos4.2.gitc41de67.el8.x86_64
podman-1.4.2-5.el8.x86_64

[root@preserve-42stg-5wt9r-worker-zmgnz core]# rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:db2b9ac6cd5ae6eb30b1b2c5f9739734edc7b628862072fb7399b4377684265b
              CustomOrigin: Managed by machine-config-operator
                   Version: 42.80.20191010.0 (2019-10-10T20:18:10Z)

[root@preserve-42stg-5wt9r-worker-zmgnz core]# podman run -d httpd:alpine
Trying to pull registry.access.redhat.com/httpd:alpine...
ERRO[0000] Error pulling image ref //registry.access.redhat.com/httpd:alpine: Error initializing source docker://registry.access.redhat.com/httpd:alpine: Error reading manifest alpine in registry.access.redhat.com/httpd: name unknown: Repo not found
Failed
Trying to pull docker.io/library/httpd:alpine...
Getting image source signatures
Copying blob d3565940ff69 done
Copying blob 4e10ed3cf6fc done
Copying blob 1cc6c921162a done
Copying blob 9d48c3bd43c5 done
Copying blob 17877ce0de23 done
Copying config 141bb8d01f done
Writing manifest to image destination
Storing signatures
73da05b0be6142883f89ad770de50a8317352b0ae4de91acb1f79e7d0646a6f1

[root@preserve-42stg-5wt9r-worker-zmgnz core]# podman ps
CONTAINER ID  IMAGE                           COMMAND           CREATED        STATUS            PORTS  NAMES
73da05b0be61  docker.io/library/httpd:alpine  httpd-foreground  7 seconds ago  Up 6 seconds ago         hungry_kepler

[root@preserve-42stg-5wt9r-worker-zmgnz core]# systemctl restart crio
[root@preserve-42stg-5wt9r-worker-zmgnz core]# systemctl status !$
systemctl status crio
● crio.service - Open Container Initiative Daemon
   Loaded: loaded (/usr/lib/systemd/system/crio.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/crio.service.d
           └─10-default-env.conf
   Active: active (running) since Tue 2019-10-15 08:56:15 UTC; 15s ago
     Docs: https://github.com/cri-o/cri-o
 Main PID: 67998 (crio)
    Tasks: 21
   Memory: 61.1M
      CPU: 1.777s
   CGroup: /system.slice/crio.service
           └─67998 /usr/bin/crio --enable-metrics=true --metrics-port=9537

Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: with error: exit status 1]"
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: time="2019-10-15 08:56:15.449802346Z" level=error msg="error loading cached network config: network "multus-cni-network" not found in CNI cache"
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: time="2019-10-15 08:56:15.451606368Z" level=error msg="Error while checking pod to CNI network "multus-cni-network": neither IPv4 nor IPv6 found when retrieving network status: [Unexpected command output nsen>
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: with error: exit status 1 Unexpected command output nsenter: cannot open /proc/4527/ns/net: No such file or directory
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: with error: exit status 1]"
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: time="2019-10-15 08:56:15.452316959Z" level=error msg="error loading cached network config: network "multus-cni-network" not found in CNI cache"
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: time="2019-10-15 08:56:15.454277401Z" level=error msg="Error while checking pod to CNI network "multus-cni-network": neither IPv4 nor IPv6 found when retrieving network status: [Unexpected command output nsen>
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: with error: exit status 1 Unexpected command output nsenter: cannot open /proc/4485/ns/net: No such file or directory
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz crio[67998]: with error: exit status 1]"
Oct 15 08:56:15 preserve-42stg-5wt9r-worker-zmgnz systemd[1]: Started Open Container Initiative Daemon.

[root@preserve-42stg-5wt9r-worker-zmgnz core]# podman ps
CONTAINER ID  IMAGE                           COMMAND           CREATED         STATUS             PORTS  NAMES
73da05b0be61  docker.io/library/httpd:alpine  httpd-foreground  37 seconds ago  Up 36 seconds ago         hungry_kepler

[root@preserve-42stg-5wt9r-worker-zmgnz core]# podman inspect 73da05b0be61
[
    {
        "Id": "73da05b0be6142883f89ad770de50a8317352b0ae4de91acb1f79e7d0646a6f1",
        "Created": "2019-10-15T08:55:59.385370338Z",
        "Path": "httpd-foreground",
        "Args": [
            "httpd-foreground"
        ],
        "State": {
            "OciVersion": "1.0.1-dev",
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
[...]
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062