Bug 1656119 - exec liveness probe fails because docker exec fails
Status: CLOSED DUPLICATE of bug 1655214
Alias: None
Product: OKD
Classification: Red Hat
Component: Containers
Version: 3.x
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Urvashi Mohnani
QA Contact: Xiaoli Tian
URL:
Whiteboard:
Keywords:
Depends On:
Blocks:
Reported: 2018-12-04 17:41 UTC by Marc Boorshtein
Modified: 2018-12-05 17:40 UTC
Clone Of:
Last Closed: 2018-12-05 17:40:12 UTC



Description Marc Boorshtein 2018-12-04 17:41:59 UTC
Description of problem:

docker exec into the etcd pod fails:

docker exec -ti cc1195d1292c bash
rpc error: code = 2 desc = oci runtime error: exec failed: container_linux.go:247: starting container process caused "process_linux.go:110: decoding init error from pipe caused \"read parent: connection reset by peer\""

This is causing the OKD master to kill the etcd pod repeatedly, because the pod's exec liveness probe fails in the same way.
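A minimal sketch of how the failure above could be recognized from a script, assuming the error text matches the message quoted in this report. The function name is_oci_exec_failure is hypothetical; it only inspects text and does not call docker itself.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) when the given text contains
# the "oci runtime error: exec failed" signature quoted in this report.
is_oci_exec_failure() {
    case "$1" in
        *"oci runtime error: exec failed"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage sketch. Assumptions: docker is installed, CONTAINER_ID is set by the
# caller, and the container image provides a "true" binary.
#   if ! out=$(docker exec "$CONTAINER_ID" true 2>&1) && is_oci_exec_failure "$out"; then
#       echo "docker exec is failing with the OCI runtime error"
#   fi
```

The usage sketch omits -ti on purpose, since a script normally has no terminal to attach.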

See the detailed debugging in https://github.com/openshift/origin/issues/21609

CC vrutkovs@redhat.com


Version-Release number of selected component (if applicable):

OKD 3.11
oc version
oc v3.11.0+62803d0-1
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://os-int.tremolo.local:443
openshift v3.11.0+06cfa24-67
kubernetes v1.11.0+d4cacc0

Docker 1.13.1
 docker version
Client:
 Version:         1.13.1
 API version:     1.26
 Package version: docker-1.13.1-84.git07f3374.el7.centos.x86_64
 Go version:      go1.10.2
 Git commit:      07f3374/1.13.1
 Built:           Fri Nov 30 02:48:45 2018
 OS/Arch:         linux/amd64

Server:
 Version:         1.13.1
 API version:     1.26 (minimum version 1.12)
 Package version: docker-1.13.1-84.git07f3374.el7.centos.x86_64
 Go version:      go1.10.2
 Git commit:      07f3374/1.13.1
 Built:           Fri Nov 30 02:48:45 2018
 OS/Arch:         linux/amd64
 Experimental:    false

CentOS 7.6
cat /etc/redhat-release 
CentOS Linux release 7.6.1810 (Core) 


How reproducible:
Very reproducible; multiple users are reporting the issue.

Steps to Reproduce:
1. Install OKD on CentOS 7.6

Actual results:


Expected results:


Additional info:

Comment 1 Vadim Rutkovsky 2018-12-04 17:47:34 UTC
Doesn't seem to be happening on OKD CI with docker-1.13.1-84.git07f3374.el7.centos.x86_64

Comment 2 Vadim Rutkovsky 2018-12-04 17:49:44 UTC
Note that OKD CI (openshift-ansible's PRs in particular) uses CentOS 7.5 as a base and only installs docker.

Comment 3 Marc Boorshtein 2018-12-05 01:26:53 UTC
Ran "sudo yum downgrade docker-1.13.1-75.git8633870.el7.centos.x86_64 docker-client-1.13.1-75.git8633870.el7.centos.x86_64 docker-common-1.13.1-75.git8633870.el7.centos.x86_64" on each host, and everything returned to normal.
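A rough sketch for checking which docker build a host is running, relative to the build this comment reports as working. The helper names and the KNOWN_GOOD_RELEASE variable are hypothetical, and the comparison is an assumption: the report only establishes that release 75 worked and release 84 failed on CentOS 7.6, and says nothing about the builds in between.

```shell
#!/bin/sh
# Release number that this comment reports as working after the downgrade.
KNOWN_GOOD_RELEASE=75

# Hypothetical helper: print the leading numeric release of a package string
# such as "docker-1.13.1-84.git07f3374.el7.centos.x86_64".
docker_release_number() {
    _rel=${1##*-}                  # keep the text after the last "-": "84.git07f3374..."
    printf '%s\n' "${_rel%%.*}"    # keep the text before the first ".": "84"
}

# Succeeds when the package is a newer build than the known-good release.
is_newer_than_known_good() {
    [ "$(docker_release_number "$1")" -gt "$KNOWN_GOOD_RELEASE" ]
}

# Usage sketch (assumes an RPM-based host with docker installed):
#   if is_newer_than_known_good "$(rpm -q docker)"; then
#       echo "docker is newer than the build reported working in this bug"
#   fi
```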

Comment 4 Scott Dodson 2018-12-05 17:40:12 UTC

*** This bug has been marked as a duplicate of bug 1655214 ***

