Bug 1574849

Summary: static pods should be controlled by systemd wrapper
Product: OpenShift Container Platform
Reporter: Aleksandar Kostadinov <akostadi>
Component: Installer
Assignee: Scott Dodson <sdodson>
Status: CLOSED NOTABUG
QA Contact: Johnny Liu <jialiu>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 3.10.0
CC: aos-bugs, hongli, jiajliu, jokerman, mmccomas, wmeng
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-07 12:49:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Aleksandar Kostadinov 2018-05-04 07:28:35 UTC
Description of problem:
In 3.10, the master and etcd services run as static pods [1]. The lifecycle of these pods is controlled through an external script, `master-restart`.

I see this as a major divergence from the Red Hat Enterprise Linux UX and from the way our customers are used to working. I think it would be a source of frustration.

The reasons for the change given in the email are:

> better align us with the future converged platform and reduce the amount of supported installation paths for OpenShift
> reduce the number of ways that the platform can be configured 
> guide customers in paths that allow us to automatically update components
> beginning our transition away from installing bits on the host to a self-hosted configuration that leverages Kubernetes to do the busy work and prepares us to better align with an immutable host

I don't see the advantages of having a script control the lifecycle of the static pods compared to having a systemd service, and I don't see how a systemd service would prevent us from achieving the goals listed above. Systemd services are flexible enough to allow interaction with the system in whatever way we see fit. For example, they are used to control and report on D-Bus services.

As an implementation, we could have a systemd service that uses the kubelet API to restart and (un)deploy the static pods. It could also use the kubelet API to report the status of those pods. Such a service doesn't have to be synchronous or guarantee overall cluster availability. This approach might even be more flexible than the existing static pod definitions in files.
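
A minimal sketch of the status-reporting half only, assuming the wrapper is handed the pod list that the kubelet serves (for example from its `/pods` endpoint) already parsed into a dict. The function names, the three state labels, and the pod names in the usage note are hypothetical and chosen for illustration; the only thing taken from Kubernetes itself is the `kubernetes.io/config.source` annotation, which the kubelet sets to `file` on pods loaded from manifest files. Restarting or (un)deploying pods is not covered here.

```python
# Annotation the kubelet sets on every pod; "file" marks a pod that was
# loaded from a manifest file on disk, i.e. a static pod.
CONFIG_SOURCE = "kubernetes.io/config.source"


def static_pod_states(pod_list):
    """Map each file-based static pod to a systemd-like state label.

    pod_list is a parsed PodList document. The labels "active",
    "activating" and "failed" are illustrative, not a kubelet or
    systemd contract.
    """
    states = {}
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        if meta.get("annotations", {}).get(CONFIG_SOURCE) != "file":
            continue  # scheduled through the API server, not a static pod
        name = meta.get("name", "")
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        containers = status.get("containerStatuses", [])
        all_ready = bool(containers) and all(c.get("ready") for c in containers)
        if phase == "Running" and all_ready:
            states[name] = "active"
        elif phase in ("Pending", "Running"):
            states[name] = "activating"  # started, but not every container is ready
        else:
            states[name] = "failed"
    return states


def exit_code(states, wanted):
    """Return 0 when every pod named in wanted is active, otherwise 1."""
    return 0 if all(states.get(name) == "active" for name in wanted) else 1
```

A status command run by the unit could call `exit_code(static_pod_states(pods), ["master-api-node1"])` and exit with the result, so that tooling which already understands process exit codes sees a failed or missing static pod as a non-zero status.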

I believe this would preserve a consistent RHEL experience with OpenShift while still letting us move ahead with our future goals.

[1] http://post-office.corp.redhat.com/archives/aos-devel/2018-April/msg00015.html

Version-Release number of selected component (if applicable):
3.10

Comment 1 Scott Dodson 2018-05-07 12:49:42 UTC
Aleksandar,

I agree, but I think we should drive this as a discussion on the mailing lists. Can you bring the topic back up there? Once there's consensus that we should make a change, we can re-open this bug.