Bug 1309886 - Diagnostic Enhancement: check MTU size is valid across the Openshift Cluster
Status: CLOSED DEFERRED
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Assigned To: Eric Paris
QA Contact: Meng Bo
Keywords: UpcomingRelease
Depends On:
Blocks: OSOPS_V3
Reported: 2016-02-18 16:46 EST by Matt Woodson
Modified: 2016-04-12 15:23 EDT

Doc Type: Bug Fix
Last Closed: 2016-04-12 15:23:32 EDT
Type: Bug


Description Matt Woodson 2016-02-18 16:46:19 EST
Description of problem:

When installing clusters, we have found inconsistent MTU sizes between interfaces: eth0 would have an MTU of 9000 while tun0 would have an MTU of 1500.  This has caused problems with pods being deployed.  It is also a very hard bug to diagnose, taking hours to understand and pinpoint.

It would be helpful if OpenShift could alert the administrator to MTU differences and/or offer to help correct the MTU configuration.  This could prevent the problems that would otherwise arise.
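
As an illustration of the kind of check requested here, the following is a minimal sketch that reads interface MTUs on a Linux host and warns when they differ. The function names (read_interface_mtus, mtu_warnings) and the choice to ignore the loopback interface are assumptions made for this example; this is not an existing OpenShift diagnostic.

```python
from pathlib import Path


def read_interface_mtus(sys_net="/sys/class/net"):
    """Return {interface: mtu} read from Linux sysfs (empty if unavailable)."""
    mtus = {}
    root = Path(sys_net)
    if not root.is_dir():
        return mtus
    for iface in sorted(root.iterdir()):
        try:
            mtus[iface.name] = int((iface / "mtu").read_text().strip())
        except (OSError, ValueError):
            continue  # skip entries that have no readable MTU
    return mtus


def mtu_warnings(mtus, ignore=("lo",)):
    """Return a warning if the non-ignored interfaces do not share one MTU."""
    checked = {name: mtu for name, mtu in mtus.items() if name not in ignore}
    if len(set(checked.values())) <= 1:
        return []
    summary = ", ".join(f"{name}={mtu}" for name, mtu in sorted(checked.items()))
    return [f"MTU mismatch between interfaces: {summary}"]


if __name__ == "__main__":
    for warning in mtu_warnings(read_interface_mtus()):
        print(warning)
```

With the values from this report, mtu_warnings({"eth0": 9000, "tun0": 1500}) returns a single warning naming both interfaces. A real check would probably allow the tunnel MTU to be smaller than the physical MTU by the encapsulation overhead rather than require strict equality.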



Version-Release number of selected component (if applicable):

atomic-openshift-master-3.1.1.6-2.git.10.15b47fc.el7aos.x86_64
atomic-openshift-node-3.1.1.6-2.git.10.15b47fc.el7aos.x86_64
Comment 1 Clayton Coleman 2016-02-18 16:51:22 EST
At a minimum we could try a TLS connection and, if it fails, suggest MTU misconfiguration as a possible cause.  But MTU checking is even better.
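
A rough sketch of the TLS probe idea follows. It relies on a common symptom of an MTU problem: the small packets of the TCP handshake get through, but the larger TLS handshake packets are dropped, so the handshake hangs. The function name probe_tls, the default timeout, and the wording of the messages are assumptions for this example, and a timeout is only a hint, not proof of an MTU fault.

```python
import socket
import ssl


def probe_tls(host, port=443, timeout=5.0):
    """Try a TLS handshake and return a short, human-readable diagnosis."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError as exc:
        # No TCP connection at all: the problem is not specific to MTU.
        return f"TCP connect failed ({exc}); not specific to MTU"
    with sock:
        try:
            ctx = ssl.create_default_context()
            with ctx.wrap_socket(sock, server_hostname=host):
                return "TLS handshake succeeded"
        except socket.timeout:
            # TCP worked but the handshake stalled: large packets may be dropped.
            return ("TLS handshake timed out after TCP connect; "
                    "possible MTU misconfiguration")
        except (ssl.SSLError, OSError) as exc:
            return f"TLS handshake failed ({exc}); likely not an MTU problem"
```

For example, print(probe_tls("master.example.com", 8443)) would report one of the three outcomes; the host name and port here are placeholders.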
Comment 2 Ben Bennett 2016-04-12 15:23:32 EDT
This was fixed in the Ansible installer, which now detects the interface MTU and assigns the SDN MTU appropriately.
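
The installer's actual logic is not shown in this bug. As a sketch of what "assigns the SDN MTU appropriately" could mean, the example below assumes a VXLAN-based SDN with 50 bytes of encapsulation overhead, so the SDN MTU is the interface MTU minus 50. The name sdn_mtu_for and the overhead constant are assumptions for illustration.

```python
VXLAN_OVERHEAD = 50  # assumed encapsulation overhead in bytes


def sdn_mtu_for(interface_mtu, overhead=VXLAN_OVERHEAD):
    """Return the largest SDN MTU that fits inside the given interface MTU."""
    if interface_mtu <= overhead:
        raise ValueError(
            f"interface MTU {interface_mtu} leaves no room for {overhead} bytes of overhead"
        )
    return interface_mtu - overhead


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"interface MTU {mtu} -> SDN MTU {sdn_mtu_for(mtu)}")
```

Under this assumption, a 1500-byte interface gives an SDN MTU of 1450 and a 9000-byte interface gives 8950.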

Further work should be done in the network diagnostics tool to help catch this case. We already gather the right data across the cluster, but we should flag the discrepancy.
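
A minimal sketch of how such a discrepancy could be flagged once the per-node data is gathered: for each interface name, take the most common MTU across nodes as the expected value and report the nodes that deviate. The input shape ({node: {interface: mtu}}) and the name find_mtu_outliers are assumptions for this example, not the diagnostics tool's real data model.

```python
from collections import Counter


def find_mtu_outliers(cluster_mtus):
    """Return (interface, node, mtu, expected) for nodes off the majority MTU."""
    # Regroup {node: {iface: mtu}} into {iface: {node: mtu}}.
    by_iface = {}
    for node, ifaces in cluster_mtus.items():
        for iface, mtu in ifaces.items():
            by_iface.setdefault(iface, {})[node] = mtu

    outliers = []
    for iface, per_node in sorted(by_iface.items()):
        # The most common value across nodes is treated as the expected MTU.
        expected, _count = Counter(per_node.values()).most_common(1)[0]
        for node, mtu in sorted(per_node.items()):
            if mtu != expected:
                outliers.append((iface, node, mtu, expected))
    return outliers


if __name__ == "__main__":
    sample = {
        "node1": {"eth0": 9000, "tun0": 8950},
        "node2": {"eth0": 9000, "tun0": 8950},
        "node3": {"eth0": 9000, "tun0": 1500},
    }
    for iface, node, mtu, expected in find_mtu_outliers(sample):
        print(f"{node}: {iface} MTU is {mtu}, most nodes use {expected}")
```

A majority vote is only a heuristic: if most nodes are misconfigured the same way, the correct node is the one reported, so the output should be read as "these nodes disagree" rather than "these nodes are wrong".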

Closing this because the work is tracked in https://trello.com/c/HtaFZbiR/68-13-network-diagnostics-utility-supportability
