Bug 1310576 - Sometimes can't start atomic-openshift-master.service in container install env for docker 1.9
Status: CLOSED DEFERRED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Assigned To: Jhon Honce
QA Contact: DeShuai Ma
Depends On:
Blocks:
Reported: 2016-02-22 04:51 EST by DeShuai Ma
Modified: 2016-02-23 15:31 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-02-23 15:31:28 EST
Type: Bug


Attachments: None
Description DeShuai Ma 2016-02-22 04:51:22 EST
Description of problem:
In a containerized install environment with Docker 1.9, atomic-openshift-master.service sometimes gets into a state where every restart fails with the error "Could not find container for entity id <id>".

Version-Release number of selected component (if applicable):
openshift v3.1.1.904
kubernetes v1.2.0-alpha.7-703-gbc4550d
etcd 2.2.5
docker version: 1.9.1

How reproducible:
Sometimes

Steps to Reproduce:
1. Restart atomic-openshift-master.service:
$ systemctl restart atomic-openshift-master

Error logs:
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com systemd[1]: Failed to start atomic-openshift-master.service.
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com systemd[1]: Unit atomic-openshift-master.service entered failed state.
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com systemd[1]: atomic-openshift-master.service failed.
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com systemd[1]: atomic-openshift-master.service holdoff time over, scheduling restart.
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com systemd[1]: Starting atomic-openshift-master.service...
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com docker[16971]: Error response from daemon: no such id: atomic-openshift-master
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com docker[16971]: Error: failed to remove containers: [atomic-openshift-master]
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com docker[16976]: Error response from daemon: Could not find container for entity id 9c5f29934c463c4cd51b440b1c35cc3dcc276e06df3396369eabbbeebcc07c56
Feb 22 17:10:34 openshift-135.lab.sjc.redhat.com systemd[1]: atomic-openshift-master.service: main process exited, code=exited, status=1/FAILURE
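The failure signature in the logs above can be checked for mechanically. A minimal sketch, assuming a POSIX shell with grep; the function name has_entity_id_error is hypothetical and is not part of OpenShift or Docker:

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if the log text on stdin contains the
# "Could not find container for entity id" error quoted in this report.
has_entity_id_error() {
    grep -q 'Could not find container for entity id'
}

# Example use against the unit's journal (needs journalctl on the host):
#   journalctl -u atomic-openshift-master --no-pager | has_entity_id_error \
#       && echo "entity id error found"
```

A match only shows that the same error text was logged; it does not by itself confirm the same root cause.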

Actual results:
1. The restart fails with the errors shown above. Once this occurs, removing "/var/lib/docker/linkgraph.db" and then restarting the master succeeds.
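
The workaround can be sketched as a small script. This is only a sketch under assumptions: the report says the file was removed, while the script below moves it to a .bak copy instead (my choice, to keep a backup); the run and apply_workaround helpers and the DRY_RUN and LINKGRAPH_DB variables are hypothetical names. The report does not say whether the docker daemon itself must be stopped or restarted, so that is left out. By default the script only prints what it would do:

```shell
#!/bin/sh
# Sketch of the workaround described above. Dry run by default;
# set DRY_RUN=0 (as root) to actually execute the commands.
LINKGRAPH_DB="${LINKGRAPH_DB:-/var/lib/docker/linkgraph.db}"  # path from the report
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

apply_workaround() {
    # Assumption: keep a backup rather than deleting the file outright.
    run mv "$LINKGRAPH_DB" "$LINKGRAPH_DB.bak"
    run systemctl restart atomic-openshift-master
}

apply_workaround
```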

Expected results:
1. Restarting atomic-openshift-master.service succeeds.

Additional info:
Related upstream issues:
https://github.com/docker/docker/issues/17691
https://github.com/kubernetes/kubernetes/issues/20904
Comment 1 Jhon Honce 2016-02-23 15:31:28 EST
Fixed in Docker 1.10.2
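
To see whether a host already has a Docker version at or above 1.10.2, a version comparison along these lines may help. A minimal sketch, assuming a sort that supports -V (GNU coreutils); version_ge is a hypothetical helper name, and vendor suffixes in the version string are not handled specially:

```shell
#!/bin/sh
# Hypothetical helper: true if version $1 >= version $2.
# Relies on "sort -V" (version sort): the smaller of the two
# versions sorts first, so $1 >= $2 exactly when $2 comes first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use (needs a running docker daemon):
#   version_ge "$(docker version --format '{{.Server.Version}}')" 1.10.2 \
#       && echo "Docker is at or above 1.10.2"
```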
