Bug 1570163 - Master api pod does not start
Status: CLOSED DUPLICATE of bug 1568583
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.10.0
Assignee: Antonio Murdaca
QA Contact: DeShuai Ma
Depends On:
Reported: 2018-04-20 19:04 UTC by Vikas Laad
Modified: 2018-04-24 15:37 UTC

Fixed In Version:
Doc Text:
Clone Of:
Last Closed: 2018-04-24 15:37:37 UTC
Target Upstream Version:

Attachments
ansible log with -vvv (1.30 MB, text/plain)
2018-04-20 19:08 UTC, Vikas Laad
master logs (2.35 MB, application/zip)
2018-04-20 19:08 UTC, Vikas Laad
docker, node, complete journal (7.29 MB, application/x-gzip)
2018-04-23 13:32 UTC, Scott Dodson

Description Vikas Laad 2018-04-20 19:04:44 UTC
Description of problem:
Running the installation playbook today fails: the master API pod does not start. Please see the attached ansible and master logs.

Version-Release number of the following components:
Running the ansible playbook at HEAD c0714203aa9274e2c2a5c6dd97180e7606c3b096

rpm -q ansible

ansible --version

How reproducible:

Steps to Reproduce:
1. Run the prerequisite playbook, then run the deploy_cluster playbook.

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

Expected results:
Playbook completes and the master API pod starts

Additional info:
Please attach logs from ansible-playbook with the -vvv flag
Attached inventory, ansible output and master logs

Comment 2 Vikas Laad 2018-04-20 19:08:20 UTC
Created attachment 1424649
ansible log with -vvv

Comment 3 Vikas Laad 2018-04-20 19:08:45 UTC
Created attachment 1424650
master logs

Comment 4 Scott Dodson 2018-04-23 13:32:34 UTC
Created attachment 1425669
docker, node, complete journal

When I looked at this environment, it was unable to pull images from the registry, so I am assigning this to the containers team.

Apr 20 19:52:39 ip-172-31-40-92.us-west-2.compute.internal dockerd-current[25575]: time="2018-04-20T19:52:39.186670765Z" level=error msg="Error trying v2 registry: failed to register layer: lstat /var/lib/docker/overlay2/2e9ee9d77ecd642d7bb35037b4144928c181eb0e9a4efe9ace56a9f5050ed72e: no such file or directory"
Apr 20 19:52:39 ip-172-31-40-92.us-west-2.compute.internal dockerd-current[25575]: time="2018-04-20T19:52:39.186713381Z" level=error msg="Attempting next endpoint for pull after error: failed to register layer: lstat /var/lib/docker/overlay2/2e9ee9d77ecd642d7bb35037b4144928c181eb0e9a4efe9ace56a9f5050ed72e: no such file or directory"
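
To check whether the same layer path recurs throughout the attached journal, a short Python sketch along these lines could pull the missing overlay2 paths out of the dockerd messages. The function name missing_layers and the regular expression are illustrative assumptions, matched only against the two lines quoted above; other dockerd versions may word the error differently.

```python
import re

# Assumed pattern, derived only from the two dockerd lines quoted above.
LSTAT_RE = re.compile(
    r"failed to register layer: lstat "
    r"(?P<path>/var/lib/docker/overlay2/[0-9a-f]+): "
    r"no such file or directory"
)


def missing_layers(journal_lines):
    """Return the unique overlay2 paths reported as missing, in first-seen order."""
    seen = []
    for line in journal_lines:
        match = LSTAT_RE.search(line)
        if match and match.group("path") not in seen:
            seen.append(match.group("path"))
    return seen
```

Feeding it the lines of the journal (for example, `missing_layers(open(path))` with a hypothetical path to the extracted attachment) gives one entry per distinct missing layer directory, which makes it easier to compare against the messages in bug 1568583.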

Comment 5 Mike Fiedler 2018-04-24 13:26:24 UTC
@vlaad Is this different from the other control plane/node start issue? https://bugzilla.redhat.com/show_bug.cgi?id=1568583

Does this problem go away after a reboot?

The messages are different, but both point to lstat issues as the root problem.

Comment 6 Vikas Laad 2018-04-24 13:40:42 UTC

The problem did not go away after many restarts, and the error messages were different. It could be the same problem with different symptoms.

Comment 7 Mike Fiedler 2018-04-24 15:37:37 UTC
Marking this as a duplicate of bug 1568583. @vlaad will reopen if he sees the issue again on a system running as configured in https://bugzilla.redhat.com/show_bug.cgi?id=1568450#c7

*** This bug has been marked as a duplicate of bug 1568583 ***
