Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1570163

Summary: Master api pod does not start
Product: OpenShift Container Platform
Reporter: Vikas Laad <vlaad>
Component: Containers
Assignee: Antonio Murdaca <amurdaca>
Status: CLOSED DUPLICATE
QA Contact: DeShuai Ma <dma>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, dwalsh, hongkliu, jokerman, mifiedle, mmccomas, vgoyal, vlaad, wmeng
Target Milestone: ---
Target Release: 3.10.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-04-24 15:37:37 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
- ansible log with -vvv (flags: none)
- master logs (flags: none)
- docker, node, complete journal (flags: none)

Description Vikas Laad 2018-04-20 19:04:44 UTC
Description of problem:
Running the installation playbook today does not work, master api pod does not start. Please see ansible and master logs attached.

Version-Release number of the following components:
Running ansible playbook with head c0714203aa9274e2c2a5c6dd97180e7606c3b096

rpm -q ansible
ansible-2.4.3.0-1.el7ae.noarch

How reproducible:
Always

Steps to Reproduce:
1. Run prerequisite playbook and then run deploy_cluster playbook
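
For reference, the two-step reproduction above corresponds to two ansible-playbook runs; the following is a minimal sketch, where the inventory path and playbook file locations are assumptions, not taken from the attached logs:

```shell
# Hedged sketch of the reproduction steps; the inventory path and
# playbook file names are assumed, not quoted from this bug.
ansible-playbook -i /path/to/inventory playbooks/prerequisites.yml
ansible-playbook -i /path/to/inventory playbooks/deploy_cluster.yml -vvv
```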

Actual results:
Attached

Expected results:
Playbook completes successfully and the master api pod starts.

Additional info:
Attached inventory, ansible output and master logs

Comment 2 Vikas Laad 2018-04-20 19:08:20 UTC
Created attachment 1424649 [details]
ansible log with -vvv

Comment 3 Vikas Laad 2018-04-20 19:08:45 UTC
Created attachment 1424650 [details]
master logs

Comment 4 Scott Dodson 2018-04-23 13:32:34 UTC
Created attachment 1425669 [details]
docker, node, complete journal

When I looked at this environment it was unable to pull images from the registry so assigning this to containers team.

Apr 20 19:52:39 ip-172-31-40-92.us-west-2.compute.internal dockerd-current[25575]: time="2018-04-20T19:52:39.186670765Z" level=error msg="Error trying v2 registry: failed to register layer: lstat /var/lib/docker/overlay2/2e9ee9d77ecd642d7bb35037b4144928c181eb0e9a4efe9ace56a9f5050ed72e: no such file or directory"
Apr 20 19:52:39 ip-172-31-40-92.us-west-2.compute.internal dockerd-current[25575]: time="2018-04-20T19:52:39.186713381Z" level=error msg="Attempting next endpoint for pull after error: failed to register layer: lstat /var/lib/docker/overlay2/2e9ee9d77ecd642d7bb35037b4144928c181eb0e9a4efe9ace56a9f5050ed72e: no such file or directory"
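
The journal lines above can be scanned mechanically to pull out the overlay2 layer path that lstat reports missing. A minimal sketch in Python, assuming only the dockerd-current log format shown above (the helper name `missing_layers` is hypothetical):

```python
import re

# Match the level=... msg="..." fields of a dockerd-current journal line.
LOG_RE = re.compile(r'level=(?P<level>\w+) msg="(?P<msg>[^"]*)"')
# Match the overlay2 layer path that lstat failed on inside the message.
LAYER_RE = re.compile(r"lstat (/var/lib/docker/overlay2/\w+):")

def missing_layers(journal_lines):
    """Return the set of overlay2 layer paths reported missing by lstat."""
    layers = set()
    for line in journal_lines:
        m = LOG_RE.search(line)
        if not m or m.group("level") != "error":
            continue
        lm = LAYER_RE.search(m.group("msg"))
        if lm:
            layers.add(lm.group(1))
    return layers
```

Run against the two quoted lines, both errors resolve to the same missing layer directory, which is consistent with a single corrupted layer in /var/lib/docker/overlay2 breaking every pull attempt.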

Comment 5 Mike Fiedler 2018-04-24 13:26:24 UTC
@vlaad Is this different from the other control plane/node start issue?  https://bugzilla.redhat.com/show_bug.cgi?id=1568583

Does this problem go away after reboot?   

The messages are different, but both point to lstat issues as the root problem.

Comment 6 Vikas Laad 2018-04-24 13:40:42 UTC
Mike,

The problem did not go away after restarting many times, and the error messages were different. It could be the same problem with different symptoms.

Comment 7 Mike Fiedler 2018-04-24 15:37:37 UTC
Marking this as a duplicate of bug 1568583. @vlaad will reopen if he sees the issue again on a system running as configured in https://bugzilla.redhat.com/show_bug.cgi?id=1568450#c7

*** This bug has been marked as a duplicate of bug 1568583 ***