Bug 1809668

Summary: Router and default exposed frontends (oauth and console) should gracefully terminate
Product: OpenShift Container Platform
Reporter: Clayton Coleman <ccoleman>
Component: Networking
Assignee: Clayton Coleman <ccoleman>
Networking sub component: router
QA Contact: Hongan Li <hongli>
Status: CLOSED WONTFIX
Docs Contact:
Severity: high
Priority: medium
CC: amcdermo, aos-bugs, bbennett, chuffman, hongli, jeder, mmasters, sgreene, tnozicka, vlaad, wking
Version: 4.3.z
Keywords: Reopened
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1809667
Environment:
Last Closed: 2021-01-18 21:18:48 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1809667
Bug Blocks: 1805690, 1809742, 1818104, 1819147, 1868486

Description Clayton Coleman 2020-03-03 16:17:49 UTC
+++ This bug was initially created as a clone of Bug #1809667 +++

+++ This bug was initially created as a clone of Bug #1809665 +++

The router, console, and oauth endpoints should all terminate gracefully, without dropping traffic, when their pods are marked for deletion.

Console and oauth can use simple "wait before shutdown" logic because they do not execute long-running transactions.  The router, as a service load balancer, needs to wait longer, then instruct HAProxy to terminate gracefully, wait up to a limit, and finally shut down.

In combination, these fixes will ensure that end users see no disruption of the control plane, the web console, or their frontend web applications during upgrade.

Comment 1 Miciah Dashiel Butler Masters 2020-03-03 20:06:30 UTC
I'm deleting bug 1809665 from this report's "Depends On" (but keeping bug 1809667, which in turn depends on bug 1809665) in order to satisfy openshift-ci-robot, which is currently issuing the following complaint:

    expected dependent Bugzilla bug 1809665 to target the "4.4.0" release, but it targets "4.5.0" instead

https://github.com/openshift/cluster-ingress-operator/pull/369#issuecomment-594036425
https://github.com/openshift/cluster-ingress-operator/pull/371#issuecomment-594139624

Comment 3 Ben Bennett 2020-05-08 19:59:37 UTC
Waiting for the master work to complete.

Comment 4 Andrew McDermott 2020-05-28 16:05:14 UTC
Per comment #3 - Waiting for the master work to complete.

Comment 5 Andrew McDermott 2020-07-09 12:13:18 UTC
I’m adding UpcomingSprint, because I was occupied by fixing bugs with
higher priority/severity, developing new features with higher
priority, or developing new features to improve stability at a macro
level. I will revisit this bug next sprint.

Comment 6 Andrew McDermott 2020-07-30 10:13:09 UTC
I’m adding UpcomingSprint, because I was occupied by fixing bugs with
higher priority/severity, developing new features with higher
priority, or developing new features to improve stability at a macro
level. I will revisit this bug next sprint.

Comment 7 Miciah Dashiel Butler Masters 2020-08-20 16:01:16 UTC
*** Bug 1850074 has been marked as a duplicate of this bug. ***

Comment 8 Miciah Dashiel Butler Masters 2020-08-21 05:11:53 UTC
We'll continue tracking this issue in the upcoming sprint.

Comment 9 Miciah Dashiel Butler Masters 2020-08-24 06:13:42 UTC
*** Bug 1869785 has been marked as a duplicate of this bug. ***

Comment 10 Andrew McDermott 2020-09-10 11:55:56 UTC
I’m adding UpcomingSprint, because I was occupied by fixing bugs with
higher priority/severity, developing new features with higher
priority, or developing new features to improve stability at a macro
level. I will revisit this bug next sprint.

Comment 11 Miciah Dashiel Butler Masters 2020-10-26 05:37:18 UTC
The remaining known issue is being tracked in https://issues.redhat.com/browse/NE-348 (graceful termination for LoadBalancer-type services using the "Local" external traffic policy); no backport of NE-348 is planned for 4.3.z.

Comment 12 Scott Dodson 2020-11-05 18:23:26 UTC
The 4.4 bug this depends on says it won't be fixed. Marking this as CLOSED DEFERRED and closing the PR as 4.3 is now EOL.

Comment 13 Andrew McDermott 2020-12-08 17:23:17 UTC
Closing per comment #12.