Bug 1708605

Summary: Container runtime error after upgrading a RHEL 7 worker node
Product: OpenShift Container Platform
Reporter: Weihua Meng <wmeng>
Component: Containers
Assignee: Mrunal Patel <mpatel>
Status: CLOSED ERRATA
QA Contact: weiwei jiang <wjiang>
Severity: high
Priority: high
Version: 4.1.0
CC: aos-bugs, dwalsh, jokerman, juzhao, mmccomas, mpatel, nagrawal, smunilla, sponnaga, tbielawa, vlaad, xtian
Target Milestone: ---
Keywords: Reopened
Target Release: 4.1.0
Hardware: Unspecified
OS: Unspecified
Last Closed: 2019-06-04 10:48:39 UTC
Type: Bug
Bug Depends On: 1710226    

Description Weihua Meng 2019-05-10 10:35:47 UTC
Description of problem:
The container runtime fails to create pod sandboxes after a RHEL 7 worker node is upgraded.

Version-Release number of selected component (if applicable):


How reproducible:
N/A

Steps to Reproduce:
1. Upgrade the RHEL 7 worker node.

Before the upgrade:
[root@dell-r730-068 ~]# rpm -qa | grep openshift
openshift-clients-4.1.0-201904021330.git.0.4ab0784.el7.x86_64
openshift-hyperkube-4.1.0-201904021330.git.0.4ab0784.el7.x86_64
[root@dell-r730-068 ~]# rpm -q cri-o 
cri-o-1.13.4-1.rhaos4.1.git30006b3.el7.x86_64

After the upgrade:
[root@dell-r730-068 ~]# rpm -qa | grep openshift
openshift-hyperkube-4.1.0-201905080833.git.0.4f60fbe.el7.x86_64
openshift-clients-4.1.0-201905080833.git.0.4f60fbe.el7.x86_64
[root@dell-r730-068 ~]# rpm -q cri-o
cri-o-1.13.6-1.dev.rhaos4.1.gitee2e748.el7.x86_64


Actual results:
$ oc get nodes 
NAME                                         STATUS     ROLES    AGE     VERSION
--
dell-r730-068.dsal.lab.eng.rdu2.redhat.com   NotReady   worker   4h2m    v1.13.4+4f60fbe4a


  Warning  FailedCreatePodSandBox  3m8s (x8 over 4m23s)   kubelet, dell-r730-068.dsal.lab.eng.rdu2.redhat.com  Failed create pod sandbox: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_multus-64qqg_openshift-multus_73d358a1-72eb-11e9-a243-801844ef10ac_1": image not known
  Normal   SandboxChanged          2m56s (x9 over 4m23s)  kubelet, dell-r730-068.dsal.lab.eng.rdu2.redhat.com  Pod sandbox changed, it will be killed and re-created.

Expected results:
Containers are running and the node is Ready.

Additional info:
# journalctl -f -u kubelet |grep -i "remote_runtime.go" 

May 10 06:32:09 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:09.202674   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_ovs-zvl8f_openshift-sdn_c04f0a7d-7301-11e9-b139-801844ef10ac_1": image not known
May 10 06:32:11 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:11.202693   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_node-exporter-w2fg6_openshift-monitoring_73d07ead-72eb-11e9-a243-801844ef10ac_1": image not known
May 10 06:32:13 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:13.202887   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_multus-64qqg_openshift-multus_73d358a1-72eb-11e9-a243-801844ef10ac_1": image not known
May 10 06:32:13 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:13.204072   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_machine-config-daemon-rp9pq_openshift-machine-config-operator_d8bf5471-7303-11e9-b67e-801844ef10ac_1": image not known
May 10 06:32:15 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:15.202616   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_tuned-7p98l_openshift-cluster-node-tuning-operator_73d58b3d-72eb-11e9-a243-801844ef10ac_1": image not known
May 10 06:32:17 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:17.203060   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_sdn-v8h6m_openshift-sdn_96962f11-7301-11e9-b139-801844ef10ac_1": image not known
May 10 06:32:23 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:23.202626   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_node-exporter-w2fg6_openshift-monitoring_73d07ead-72eb-11e9-a243-801844ef10ac_1": image not known
May 10 06:32:24 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:24.203930   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_multus-64qqg_openshift-multus_73d358a1-72eb-11e9-a243-801844ef10ac_1": image not known
May 10 06:32:24 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:24.204210   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_ovs-zvl8f_openshift-sdn_c04f0a7d-7301-11e9-b139-801844ef10ac_1": image not known
May 10 06:32:24 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:24.204531   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_machine-config-daemon-rp9pq_openshift-machine-config-operator_d8bf5471-7303-11e9-b67e-801844ef10ac_1": image not known
May 10 06:32:28 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:28.203213   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_tuned-7p98l_openshift-cluster-node-tuning-operator_73d58b3d-72eb-11e9-a243-801844ef10ac_1": image not known
May 10 06:32:29 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[13461]: E0510 06:32:29.202820   13461 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_sdn-v8h6m_openshift-sdn_96962f11-7301-11e9-b139-801844ef10ac_1": image not known
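
The same handful of pods repeats throughout the log above, so a per-pod count can make the output easier to read. The following is a minimal Python sketch, not part of the original report. It assumes the sandbox name follows the pattern k8s_<pod>_<namespace>_<pod-uid>_<attempt>, which is inferred from the log lines here rather than taken from CRI-O documentation. The names SANDBOX_ERROR and count_failed_sandboxes are hypothetical.

```python
import re
from collections import Counter

# Assumed layout of the sandbox name, inferred from the kubelet log lines:
#   k8s_<pod>_<namespace>_<pod-uid>_<attempt>
SANDBOX_ERROR = re.compile(
    r'error creating pod sandbox with name '
    r'"k8s_(?P<pod>[^_"]+)_(?P<namespace>[^_"]+)_(?P<uid>[^_"]+)_(?P<attempt>\d+)"'
    r': image not known'
)


def count_failed_sandboxes(lines):
    """Count "image not known" sandbox failures per (namespace, pod)."""
    failures = Counter()
    for line in lines:
        match = SANDBOX_ERROR.search(line)
        if match:
            failures[(match.group("namespace"), match.group("pod"))] += 1
    return failures
```

One way to use it is to save the journalctl output to a file and pass the open file object to count_failed_sandboxes, since the function accepts any iterable of lines.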

Comment 9 Weihua Meng 2019-05-14 10:27:51 UTC
cri-o-1.13.9-1 hits the same error.

[root@dell-r730-068 ~]# journalctl -f -u kubelet |grep -i "remote_runtime.go"
May 14 06:21:37 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[16838]: E0514 06:21:37.103383   16838 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_machine-config-daemon-6x9pm_openshift-machine-config-operator_2bbf2e7a-75f5-11e9-b937-801844eeec50_1": image not known
May 14 06:21:40 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[16838]: E0514 06:21:40.103562   16838 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_tuned-w8bz9_openshift-cluster-node-tuning-operator_19cde302-75f5-11e9-b937-801844eeec50_1": image not known
May 14 06:21:41 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[16838]: E0514 06:21:41.106127   16838 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_node-exporter-6ctwc_openshift-monitoring_19d02aee-75f5-11e9-b937-801844eeec50_1": image not known
May 14 06:21:41 dell-r730-068.dsal.lab.eng.rdu2.redhat.com hyperkube[16838]: E0514 06:21:41.106852   16838 remote_runtime.go:96] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = error creating pod sandbox with name "k8s_ovs-ctgzr_openshift-sdn_19d03f43-75f5-11e9-b937-801844eeec50_1": image not known
^C
[root@dell-r730-068 ~]# rpm -q cri-o
cri-o-1.13.9-1.rhaos4.1.gitd70609a.el7.x86_64

Comment 10 Mrunal Patel 2019-05-14 17:03:27 UTC
Which version did you start with? 
If you start with 1.13.6 and upgrade to 1.13.9, it should work fine. 
However, if you start with 1.13.4, 1.13.5, 1.13.7 or 1.13.8 it will run into this issue.
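
As an illustration only, the rule above can be written as a small check against the output of `rpm -q cri-o` taken before the upgrade. This Python sketch encodes just the starting versions named in this comment. It is not an authoritative compatibility list, and the names AFFECTED_START_VERSIONS, crio_version, and start_version_is_affected are hypothetical.

```python
import re

# Starting versions this comment names as running into the issue on upgrade.
AFFECTED_START_VERSIONS = {"1.13.4", "1.13.5", "1.13.7", "1.13.8"}


def crio_version(package):
    """Extract the x.y.z version from an `rpm -q cri-o` result string."""
    match = re.match(r"cri-o-(\d+\.\d+\.\d+)-", package)
    if match is None:
        raise ValueError(f"unrecognized cri-o package name: {package!r}")
    return match.group(1)


def start_version_is_affected(package):
    """Return True if the pre-upgrade cri-o version is in the affected set."""
    return crio_version(package) in AFFECTED_START_VERSIONS
```

For example, passing cri-o-1.13.4-1.rhaos4.1.git30006b3.el7.x86_64 returns True, while a 1.13.6 package returns False.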

Comment 11 Weihua Meng 2019-05-15 00:36:17 UTC
(In reply to Mrunal Patel from comment #10)
> Which version did you start with? 
> If you start with 1.13.6 and upgrade to 1.13.9, it should work fine. 
> However, if you start with 1.13.4, 1.13.5, 1.13.7 or 1.13.8 it will run into
> this issue.

It was cri-o://1.13.4-1.rhaos4.1.git30006b3.el7 before the upgrade.

So it is a known issue. Thanks for the info.

I will try another upgrade path next time.

Comment 13 Weihua Meng 2019-05-17 10:14:25 UTC
Fixed.

openshift-ansible-4.1.0-201905161641.git.158.458bd44.el7.noarch

From 
cri-o-1.13.6-1.dev.rhaos4.1.gitee2e748.el7.x86_64
to
cri-o-1.13.9-1.rhaos4.1.gitd70609a.el7.x86_64

Comment 15 errata-xmlrpc 2019-06-04 10:48:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758