Bug 1506813 - No event when pod fails because pause/ose-pod image doesn't exist
Summary: No event when pod fails because pause/ose-pod image doesn't exist
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.7.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 3.7.z
Assignee: Seth Jennings
QA Contact: DeShuai Ma
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-10-26 20:22 UTC by Eric Paris
Modified: 2018-04-05 09:31 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-04-05 09:30:40 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0636 0 None None None 2018-04-05 09:31:32 UTC

Description Eric Paris 2017-10-26 20:22:56 UTC
If the ose-pod image cannot be pulled, pods will (obviously) not launch, but the cause is not easy to detect.

oc describe on a pod that is unable to start because of this looks like:
```
Events:
  FirstSeen	LastSeen	Count	From					SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----					-------------	--------	------		-------
  26m		26m		1	default-scheduler					Normal		Scheduled	Successfully assigned frontend-3v3gk to ip-172-31-55-154.ec2.internal
  26m		1m		112	kubelet, ip-172-31-55-154.ec2.internal			Warning		FailedSync	Error syncing pod
```

In the node logs you might see a message like:
```
remote_runtime.go:86] RunPodSandbox from runtime service failed: rpc error: code = 2 desc = unable to pull sandbox image "registry.reg-aws.openshift.com:443/openshift3/ose-pod:v3.6.173.0.49": manifest unknown: manifest unknown
```
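Because the event itself gives no hint, the node log is the only place the cause shows up. Below is a minimal Python sketch for pulling the failing sandbox image out of such log lines. It assumes the message wording matches the sample line above; the helper name `find_sandbox_pull_failures` is hypothetical, and the pattern is derived only from that one sample, so other kubelet versions may need a different expression.

```python
import re

# Pattern derived from the single sample log line in this report; other
# kubelet versions may word the message differently (assumption).
SANDBOX_PULL_RE = re.compile(
    r'unable to pull sandbox image "(?P<image>[^"]+)": (?P<error>.*)$'
)


def find_sandbox_pull_failures(lines):
    """Return (image, error) tuples for sandbox image pull failures."""
    failures = []
    for line in lines:
        match = SANDBOX_PULL_RE.search(line)
        if match:
            failures.append((match.group("image"), match.group("error").strip()))
    return failures


if __name__ == "__main__":
    sample = (
        'remote_runtime.go:86] RunPodSandbox from runtime service failed: '
        'rpc error: code = 2 desc = unable to pull sandbox image '
        '"registry.reg-aws.openshift.com:443/openshift3/ose-pod:v3.6.173.0.49": '
        'manifest unknown: manifest unknown'
    )
    for image, error in find_sandbox_pull_failures([sample]):
        print(image, "->", error)
```

The function takes any iterable of log lines, so it could be fed from a file or from piped journal output.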

We found this because we had the node-config.yaml pointing to a registry which didn't contain the ose-pod image:
```
imageConfig:
  format: registry.something.com:443/openshift3/ose-${component}:${version}
  latest: false
```
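To see which image name a given `format` value produces, the substitution can be sketched in Python. This is only an illustration of the `${component}` / `${version}` expansion, not OpenShift's actual implementation; the function name `expand_image_format` is hypothetical, and the component name `pod` is an assumption inferred from the `ose-pod` image name in the log line.

```python
from string import Template

# Format string copied from the node-config.yaml excerpt above.
IMAGE_FORMAT = "registry.something.com:443/openshift3/ose-${component}:${version}"


def expand_image_format(fmt, component, version):
    """Substitute ${component} and ${version} in an imageConfig format string."""
    return Template(fmt).substitute(component=component, version=version)


if __name__ == "__main__":
    # "pod" and the version tag are taken from the log line in this report.
    print(expand_image_format(IMAGE_FORMAT, "pod", "v3.6.173.0.49"))
```

With these inputs the result is `registry.something.com:443/openshift3/ose-pod:v3.6.173.0.49`, which is the image that has to exist in the configured registry for any pod on the node to start.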

Comment 1 Seth Jennings 2017-10-27 02:31:09 UTC
I believe that this PR here will improve that situation:
https://github.com/kubernetes/kubernetes/pull/48589

Backported to Origin in this pick set:
https://github.com/openshift/origin/pull/16865

Basically, the generic "Error syncing pod" message, which tells the user nothing, will be replaced with an event with Reason "FailedCreatePodSandBox" and Message "Failed create pod sandbox.".

The error from the runtime is still eaten though.

Derek, you were just in this part of the code.  What do you think?
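
For anyone who wants to check for either event from a script, one option is to filter a pod's events by reason. A minimal Python sketch, assuming the input is the parsed JSON of `oc get events -o json`; the helper name `sandbox_failure_events` is hypothetical, the sample data is hand-written rather than captured from a cluster, and the reason strings are the two quoted in this bug. Note that "FailedSync" is generic and can also be reported for unrelated failures.

```python
import json

# Reasons quoted in this bug: the generic one before the fix, the specific one after.
SANDBOX_REASONS = {"FailedSync", "FailedCreatePodSandBox"}


def sandbox_failure_events(event_list, pod_name):
    """Return (reason, message) for Warning events on pod_name with a listed reason."""
    hits = []
    for event in event_list.get("items", []):
        involved = event.get("involvedObject", {})
        if (
            involved.get("kind") == "Pod"
            and involved.get("name") == pod_name
            and event.get("type") == "Warning"
            and event.get("reason") in SANDBOX_REASONS
        ):
            hits.append((event.get("reason"), event.get("message", "")))
    return hits


if __name__ == "__main__":
    # Hand-written sample shaped like an EventList; not captured from a real cluster.
    sample = json.loads("""
    {"items": [
      {"type": "Normal", "reason": "Scheduled", "message": "Successfully assigned",
       "involvedObject": {"kind": "Pod", "name": "frontend-3v3gk"}},
      {"type": "Warning", "reason": "FailedCreatePodSandBox",
       "message": "Failed create pod sandbox.",
       "involvedObject": {"kind": "Pod", "name": "frontend-3v3gk"}}
    ]}
    """)
    print(sandbox_failure_events(sample, "frontend-3v3gk"))
```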

Comment 2 Seth Jennings 2017-11-28 18:14:07 UTC
Kube PR:
https://github.com/kubernetes/kubernetes/pull/56506

Comment 3 Seth Jennings 2018-01-05 18:14:30 UTC
Origin PR:
https://github.com/openshift/origin/pull/18002

OSE 3.7 PR:
https://github.com/openshift/ose/pull/985

Comment 5 weiwei jiang 2018-01-25 07:54:02 UTC
Checked with:
# openshift version 
openshift v3.7.26
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8

With this build, a FailedCreatePodSandBox event is displayed when describing the pod, so marking this issue as verified.

Comment 9 errata-xmlrpc 2018-04-05 09:30:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0636

