Bug 1825967 - etcd quorum guard: should not use hostNetwork
Summary: etcd quorum guard: should not use hostNetwork
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Etcd
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.5.0
Assignee: Sam Batschelet
QA Contact: ge liu
URL:
Whiteboard:
Depends On:
Blocks: 1825976
Reported: 2020-04-20 15:24 UTC by Sam Batschelet
Modified: 2020-07-13 17:29 UTC (History)
0 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1825976
Environment:
Last Closed: 2020-07-13 17:29:09 UTC
Target Upstream Version:
Embargoed:


Attachments
4.5-fixed.png (52.60 KB, image/png)
2020-04-20 15:41 UTC, Sam Batschelet
4.5-before-fix.png (144.27 KB, image/png)
2020-04-20 15:42 UTC, Sam Batschelet


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift machine-config-operator pull 1732 | 0 | None | closed | Bug 1825967: revert: etcd quorum guard: don't set hostNetwork | 2020-11-27 20:00:46 UTC
Red Hat Product Errata | RHBA-2020:2409 | 0 | None | None | None | 2020-07-13 17:29:23 UTC

Description Sam Batschelet 2020-04-20 15:24:22 UTC
Description of problem: Because quorum-guard uses hostNetwork, all network traffic is counted as container traffic, so some pods showed > 4MBps network in/out in the console.


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results: quorum-guard performs localhost checks against host 127.0.0.1 and the traffic is counted as container traffic.


Expected results: quorum-guard performs localhost checks against host 127.0.0.1 without the traffic being counted as container traffic.


Additional info:
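A minimal sketch, not taken from the actual fix, of how one could check whether a workload manifest sets hostNetwork. The field path spec.template.spec.hostNetwork is the standard Kubernetes one; the two trimmed manifests and the function name uses_host_network are hypothetical and only for illustration.

```python
from typing import Any, Dict


def uses_host_network(manifest: Dict[str, Any]) -> bool:
    """Return True if a Pod or pod-templated workload sets hostNetwork: true."""
    spec = manifest.get("spec", {})
    # Deployments, DaemonSets, etc. nest the pod spec under spec.template.spec;
    # a bare Pod carries it directly under spec.
    pod_spec = spec.get("template", {}).get("spec", spec)
    return bool(pod_spec.get("hostNetwork", False))


# Hypothetical, heavily trimmed Deployments (not the real quorum-guard manifest).
with_host_network = {
    "kind": "Deployment",
    "metadata": {"name": "etcd-quorum-guard"},
    "spec": {"template": {"spec": {"hostNetwork": True, "containers": []}}},
}
without_host_network = {
    "kind": "Deployment",
    "metadata": {"name": "etcd-quorum-guard"},
    "spec": {"template": {"spec": {"containers": []}}},
}

print(uses_host_network(with_host_network))     # True
print(uses_host_network(without_host_network))  # False
```

A pod with hostNetwork set shares the node's network namespace, which is why its network counters can reflect host traffic rather than only its own.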

Comment 1 Sam Batschelet 2020-04-20 15:40:16 UTC
Verification steps.

Stand up Prometheus using the Prometheus data file from the following PR run:

https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/1552/pull-ci-openshift-machine-config-operator-master-e2e-aws/6758

Query:
container_network_transmit_bytes_total{name="k8s_POD_etcd-quorum-guard-f9fc8d4c-6x4g5_openshift-machine-config-operator_a7a8ac88-81ac-4b37-bdc9-e148c88a53e7_0"}

The result is a max of about 483.9 k of traffic.

See the attached file (4.5-fixed.png).

Before this patch you can see container traffic in the hundreds of MB; see the attached file (4.5-before-fix.png).
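
For anyone repeating the check without the console, a rough sketch of running the same query through the Prometheus HTTP API (the /api/v1/query_range endpoint) and taking the max sample. The localhost:9090 address, the time range, the step, and the sample response are assumptions made up for illustration, not data from this bug; range_query_url and max_sample are hypothetical helper names.

```python
import json
from urllib.parse import urlencode

# Query from the verification steps above.
QUERY = (
    'container_network_transmit_bytes_total{name="k8s_POD_etcd-quorum-guard-'
    'f9fc8d4c-6x4g5_openshift-machine-config-operator_'
    'a7a8ac88-81ac-4b37-bdc9-e148c88a53e7_0"}'
)


def range_query_url(base_url: str, query: str, start: float, end: float, step: str) -> str:
    """Build a URL for the Prometheus HTTP API range-query endpoint."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def max_sample(response_body: str) -> float:
    """Return the largest sample value in a 'matrix' range-query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    values = [
        float(value)
        for series in payload["data"]["result"]
        for _timestamp, value in series["values"]
    ]
    if not values:
        raise ValueError("query returned no samples")
    return max(values)


# Assumed address of the local Prometheus; the time range is made up.
print(range_query_url("http://localhost:9090", QUERY, 1587300000, 1587303600, "30s"))

# Hand-written response in the API's matrix format, not real data from the bug.
sample = json.dumps({
    "status": "success",
    "data": {"resultType": "matrix", "result": [
        {"metric": {}, "values": [[1587300000, "1200"], [1587300030, "4800"]]}
    ]},
})
print(max_sample(sample))  # 4800.0
```

The counter is cumulative, so the max over the range is the total transmitted so far for that pod sandbox.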

Comment 2 Sam Batschelet 2020-04-20 15:41:34 UTC
Created attachment 1680326 [details]
4.5-fixed.png

Comment 3 Sam Batschelet 2020-04-20 15:42:49 UTC
Created attachment 1680327 [details]
4.5-before-fix.png

Comment 4 ge liu 2020-04-20 15:45:55 UTC
Verified with the steps Sam provided, thanks!

Comment 9 errata-xmlrpc 2020-07-13 17:29:09 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2409

