Bug 1309886 - Diagnostic Enhancement: check MTU size is valid across the Openshift Cluster
Summary: Diagnostic Enhancement: check MTU size is valid across the Openshift Cluster
Keywords:
Status: CLOSED DEFERRED
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 3.1.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Eric Paris
QA Contact: Meng Bo
URL:
Whiteboard:
Depends On:
Blocks: OSOPS_V3
 
Reported: 2016-02-18 21:46 UTC by Matt Woodson
Modified: 2016-04-12 19:23 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-04-12 19:23:32 UTC
Target Upstream Version:
Embargoed:



Description Matt Woodson 2016-02-18 21:46:19 UTC
Description of problem:

When installing clusters, we found inconsistent MTU sizes between interfaces: eth0 would have an MTU of 9000, while tun0 would have an MTU of 1500.  This caused problems with pods being deployed.  It is also a very hard bug to diagnose, taking hours to understand and pinpoint.

It would be helpful if OpenShift could alert the administrator that there are MTU differences and/or offer to help correct the MTU configuration.  This could help prevent the problems that would otherwise arise.
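
To illustrate the kind of alert requested above, here is a minimal Python sketch that reads each interface's MTU from Linux sysfs (/sys/class/net/<iface>/mtu) and reports when the values differ. The function names read_mtus and mtu_warning, and the choice to ignore lo, are assumptions made for illustration and are not part of OpenShift. Whether a given difference is actually harmful depends on the SDN configuration (an overlay interface is normally somewhat smaller than the physical one), so this only surfaces the values for an administrator to review.

```python
from pathlib import Path


def read_mtus(sysfs_net="/sys/class/net"):
    """Return {interface_name: mtu} for every interface under sysfs_net."""
    mtus = {}
    for iface_dir in sorted(Path(sysfs_net).iterdir()):
        mtu_file = iface_dir / "mtu"
        # Skip entries that are not interfaces (no readable "mtu" file).
        if mtu_file.is_file():
            mtus[iface_dir.name] = int(mtu_file.read_text().strip())
    return mtus


def mtu_warning(mtus, ignore=("lo",)):
    """Return a warning string if the checked interfaces disagree on MTU, else None."""
    # Hypothetical policy: loopback is excluded because its MTU is unrelated.
    checked = {name: mtu for name, mtu in mtus.items() if name not in ignore}
    if len(set(checked.values())) <= 1:
        return None
    details = ", ".join(f"{name}={mtu}" for name, mtu in sorted(checked.items()))
    return f"MTU mismatch between interfaces: {details}"
```

On a node, calling mtu_warning(read_mtus()) would return either None or a one-line message listing each checked interface and its MTU.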



Version-Release number of selected component (if applicable):

atomic-openshift-master-3.1.1.6-2.git.10.15b47fc.el7aos.x86_64
atomic-openshift-node-3.1.1.6-2.git.10.15b47fc.el7aos.x86_64

Comment 1 Clayton Coleman 2016-02-18 21:51:22 UTC
At a minimum, we could try a TLS connection and, if it fails, suggest MTU misconfiguration as a possible cause.  Checking the MTU directly would be even better.

Comment 2 Ben Bennett 2016-04-12 19:23:32 UTC
This was fixed in the Ansible installer, which detects the interface MTU and assigns the SDN MTU appropriately.

Further work should be done in the network diagnostics tool to help catch this case. We already gather the right data across the cluster, but should flag the discrepancy.

Closing this because the work is tracked in https://trello.com/c/HtaFZbiR/68-13-network-diagnostics-utility-supportability
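
As a rough sketch of what flagging the discrepancy could look like once per-node data has been gathered, the Python below groups MTU values by interface name and returns only the interfaces whose MTU differs between nodes. The input shape (node name mapped to {interface: mtu}) and the function name find_mtu_discrepancies are assumptions for illustration; the actual diagnostics tool may collect and structure this data differently.

```python
from collections import defaultdict


def find_mtu_discrepancies(cluster_mtus):
    """Map each interface whose MTU differs between nodes to {mtu: [nodes]}."""
    by_iface = defaultdict(lambda: defaultdict(list))
    for node, ifaces in cluster_mtus.items():
        for iface, mtu in ifaces.items():
            by_iface[iface][mtu].append(node)
    # Keep only interfaces that were seen with more than one distinct MTU.
    return {
        iface: {mtu: sorted(nodes) for mtu, nodes in groups.items()}
        for iface, groups in by_iface.items()
        if len(groups) > 1
    }


# Hypothetical gathered data; node names and MTU values are made up.
gathered = {
    "node1": {"eth0": 9000, "tun0": 8950},
    "node2": {"eth0": 9000, "tun0": 1450},
    "node3": {"eth0": 9000, "tun0": 8950},
}
print(find_mtu_discrepancies(gathered))
```

Grouping nodes by MTU value makes the outlier easy to spot: in the made-up data, only node2 reports a different tun0 MTU.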

