Bug 1749448

Summary: [upgrade] During 4.1 to 4.2 upgrade the load balancer availability test reported a failure
Product: OpenShift Container Platform
Component: Networking
Sub component: router
Version: 4.2.0
Target Release: 4.2.0
Hardware: Unspecified
OS: Unspecified
Reporter: Clayton Coleman <ccoleman>
Assignee: Miciah Dashiel Butler Masters <mmasters>
QA Contact: Hongan Li <hongli>
CC: aos-bugs, bbennett
Status: CLOSED DUPLICATE
Severity: high
Priority: medium
Type: Bug
Last Closed: 2019-09-05 17:19:38 UTC

Description Clayton Coleman 2019-09-05 15:58:54 UTC
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade-4.1-to-4.2/336

Sep  5 08:47:09.319: INFO: Poke("http://ab0696926cfb611e9b2000a76ca579a6-1017163478.us-east-1.elb.amazonaws.com:80/echo?msg=hello"): Get http://ab0696926cfb611e9b2000a76ca579a6-1017163478.us-east-1.elb.amazonaws.com:80/echo?msg=hello: EOF
Sep  5 08:47:09.319: INFO: Could not reach HTTP service through ab0696926cfb611e9b2000a76ca579a6-1017163478.us-east-1.elb.amazonaws.com:80 after 2m0s

This test verifies that pods behind a service load balancer remain reachable during an upgrade. There are two pods, and they have a PodDisruptionBudget (PDB) that should ensure at least one pod is available at all times. The test does not flake in 4.1 z-stream upgrades (which also reboot nodes).
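
For context, a PDB that provides this guarantee looks roughly like the following. This is a minimal sketch, not the actual test manifest: the name and label selector are hypothetical, and policy/v1beta1 is assumed as the PDB API version current in the 4.1/4.2 timeframe.

apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: service-test-pdb        # hypothetical name
spec:
  minAvailable: 1               # voluntary disruptions must leave >= 1 pod available
  selector:
    matchLabels:
      app: service-test         # hypothetical label matching the two test pods

With minAvailable: 1, the eviction API (which node drains use during the upgrade) refuses to evict a pod if doing so would leave zero available, so at least one backend should remain reachable behind the load balancer the whole time.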

This is a release blocker: something serious is broken in the load balancer, the nodes, machine-config, or PDB handling, such that the pods behind the service aren't reachable.

Comment 1 Ben Bennett 2019-09-05 17:19:38 UTC

*** This bug has been marked as a duplicate of bug 1749446 ***