Bug 1426279

Summary: Master's apiserver becomes unavailable, stays that way for over an hour (or logs are dropped)
Product: OpenShift Container Platform Reporter: Marek Mahut <mmahut>
Component: Master Assignee: Jordan Liggitt <jliggitt>
Status: CLOSED INSUFFICIENT_DATA QA Contact: Chuan Yu <chuyu>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.4.1 CC: aos-bugs, ccoleman, jgoulding, jokerman, mfojtik, mmahut, mmccomas, twiest
Target Milestone: --- Keywords: OpsBlocker
Target Release: --- Flags: mfojtik: needinfo? (mmahut)
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2017-03-27 19:46:37 UTC Type: Bug

Description Marek Mahut 2017-02-23 15:32:24 UTC
Description of problem:

We have seen a problem twice today on preview: DNS resolution provided
by SkyDNS became unavailable and was restored only by restarting the master API.

We have not been able to find any hint of the root cause.

Comment 5 Clayton Coleman 2017-02-25 21:26:08 UTC
This has nothing to do with SkyDNS. The master apiserver is down for an extended period, and it either isn't restarted, can't restart, is restarted but fails and drops its logs out of journald, or is blocking on something in the OS for very long periods.

Comment 7 Clayton Coleman 2017-02-28 16:47:08 UTC
Has this recurred?

Comment 9 Michal Fojtik 2017-03-27 12:14:04 UTC
Is this still a problem?

Comment 10 Thomas Wiest 2017-03-27 19:46:37 UTC
Closing, as we haven't seen this recur and aren't able to reproduce it.