Bug 1898097 - mDNS floods the baremetal network
Summary: mDNS floods the baremetal network
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.6
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Ben Nemec
QA Contact: Oleg Sher
URL:
Whiteboard:
Duplicates: 1893670 1898101
Depends On:
Blocks: 1936539
 
Reported: 2020-11-16 11:30 UTC by Yuval Kashtan
Modified: 2021-07-11 16:06 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The library used to provide mDNS services in the cluster did not properly implement the mDNS protocol. Consequence: Excessive multicast traffic was generated. Fix: Limit multicast frequency to once per second. Result: Significantly reduced multicast traffic.
Clone Of:
Environment:
Last Closed: 2021-02-24 15:33:27 UTC
Target Upstream Version:
Embargoed:


Attachments
mdns tcpdump capture (6.25 MB, application/vnd.tcpdump.pcap)
2020-11-16 11:30 UTC, Yuval Kashtan


Links
System ID Private Priority Status Summary Last Updated
Github openshift mdns-publisher pull 21 0 None closed Bug 1898097: Update zeroconf to pull in rate-limiting fix 2021-02-17 02:04:41 UTC
Red Hat Product Errata RHSA-2020:5633 0 None None None 2021-02-24 15:33:55 UTC

Description Yuval Kashtan 2020-11-16 11:30:52 UTC
Created attachment 1729743
mdns tcpdump capture

Version: 4.6

$ openshift-install version
openshift-baremetal-install 4.6.0

Platform: IPI

What happened?
The PnT Lab had to disconnect us from the network because excessive multicast traffic was flooding it.
All of these multicasts are mDNS traffic.
See the attached tcpdump capture.

What did you expect to happen?
Since all our servers have proper DNS records, we don't need mDNS.
I'd like an install-config parameter to disable mDNS.

How to reproduce it (as minimally and precisely as possible)?
Just install several clusters on the same broadcast domain
and observe the traffic with tcpdump.

Comment 1 Andrea Fasano 2020-11-24 17:50:56 UTC
*** Bug 1898101 has been marked as a duplicate of this bug. ***

Comment 2 Yuval Kashtan 2020-11-25 11:09:14 UTC
The reason I opened two distinct BZs is that
for the flood bug, even changing the frequency would be a solution,
but for the scalability one
we really need a different solution (than mDNS, or at least its current implementation).

Also, the other BZ includes some (very) good discussion which we're missing here.

Comment 3 Sai Sindhur Malleni 2020-12-02 16:52:21 UTC
Even the temporary fix mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1898101#c4 does not fix the issue. Please see: https://bugzilla.redhat.com/show_bug.cgi?id=1898101#c19


Also, can we use https://bugzilla.redhat.com/show_bug.cgi?id=1898101 instead of this one as the main tracking BZ, as there is a lot of history in it?

Comment 5 Sai Sindhur Malleni 2020-12-04 20:27:31 UTC
I had two nodes in the cluster that did not pick up the machineconfig change due to bad node selectors, and those were enough to DDoS the network :). After fixing that, I see a drastic drop in the number of multicast packets.

Comment 9 Ben Nemec 2021-02-02 17:32:49 UTC
*** Bug 1893670 has been marked as a duplicate of this bug. ***

Comment 12 errata-xmlrpc 2021-02-24 15:33:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

