
Bug 1455711

Summary: [RFE] OSD: Add heartbeat message for Jumbo Frames(MTU 9000)
Product: [Red Hat Storage] Red Hat Ceph Storage Reporter: Vikhyat Umrao <vumrao>
Component: RADOS Assignee: Josh Durgin <jdurgin>
Status: CLOSED ERRATA QA Contact: Manohar Murthy <mmurthy>
Severity: medium Docs Contact: Bara Ancincova <bancinco>
Priority: medium    
Version: 1.3.3 CC: ceph-eng-bugs, dzafman, edonnell, hnallurv, icolle, jbautist, jdurgin, jquinn, kchai, linuxkidd, mhackett, vikumar, vumrao
Target Milestone: rc Keywords: FutureFeature
Target Release: 3.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: RHEL: ceph-12.1.4-1.el7cp Ubuntu: ceph_12.1.4-2redhat1xenial Doc Type: Bug Fix
Doc Text:
.A heartbeat message for jumbo frames has been added
Previously, if a network used jumbo frames and the maximum transmission unit (MTU) was not configured consistently on every device in the path, problems such as slow requests and stuck peering and backfilling occurred. In addition, the OSD logs did not include any heartbeat timeout messages, because heartbeat packets were smaller than 1500 bytes and therefore still passed through the misconfigured devices. This update adds a heartbeat message for jumbo frames.
Story Points: ---
Clone Of:
: 1461581 (view as bug list) Environment:
Last Closed: 2017-12-05 23:33:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1494421    

Description Vikhyat Umrao 2017-05-25 22:27:12 UTC
Description of problem:
[RFE] OSD: Add heartbeat message for Jumbo Frames (MTU 9000)
http://tracker.ceph.com/issues/20087

- When jumbo frames are enabled on the cluster network, all interconnecting network gear must also have jumbo frames enabled. If any device in the path is misconfigured for jumbo frames, we see many issues, such as stuck peering, slow requests, and backfilling that does not progress.

- In this situation we do not see heartbeat timeout messages in the OSD logs, because the heartbeat message packet size is below 1500 bytes.

- We checked the communication issue with the following command:

~~~
# ping -W 2 -I <interface> -M do -s <pkt size> <IP address>
~~~
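
For reference, here is a minimal sketch of how the `<pkt size>` value could be chosen for a given MTU. It assumes IPv4 without IP options (20-byte IP header) and the 8-byte ICMP echo header. The function names `icmp_payload_for_mtu` and `probe_mtu` are hypothetical and not part of any Ceph tooling.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of Ceph or iputils.

# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes (IPv4 header, no options) minus 8 bytes (ICMP header).
icmp_payload_for_mtu() {
    local mtu=$1
    echo $(( mtu - 20 - 8 ))
}

# probe_mtu <interface> <peer-ip> <mtu>
# Sends three pings with the don't-fragment bit set (-M do), sized to fill <mtu>.
# Returns ping's exit status: 0 if at least one reply was received.
probe_mtu() {
    local iface=$1 peer=$2 mtu=$3
    ping -c 3 -W 2 -I "$iface" -M do \
        -s "$(icmp_payload_for_mtu "$mtu")" "$peer"
}
```

With these assumptions, `icmp_payload_for_mtu 9000` yields 8972, so a call such as `probe_mtu <interface> <IP address> 9000` would send the same kind of probe as the command above with `-s 8972`.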


Version-Release number of selected component (if applicable):
Red Hat Ceph Storage 1.3.2

Comment 9 Manohar Murthy 2017-11-08 10:41:38 UTC
Hi Vikhyat,

Can you please provide steps to recreate this bug, along with verification steps?



Thanks,
Manohar

Comment 10 Michael J. Kidd 2017-11-08 22:14:49 UTC
Manohar, reproduction steps are as follows:

* Configure OSD and MON nodes to use jumbo frames (typically a 9000-byte MTU)
* Configure interconnecting switch gear to *NOT* allow jumbo frames (typically configured for a 1500-byte MTU)
* Start MON and OSD processes
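
One way to confirm that the setup above really reproduces the mismatch is to compare a standard-size probe with a jumbo-size probe between two nodes. The sketch below assumes IPv4 and the typical 1500/9000 byte MTUs mentioned in the steps (payloads of 1472 and 8972 bytes). The function names `diagnose` and `check_peer` are hypothetical, and this is not how the OSD heartbeat itself performs the check.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of Ceph.

# diagnose <small_rc> <jumbo_rc>
# Classifies the exit codes of a 1500-byte-MTU probe and a 9000-byte-MTU probe.
diagnose() {
    local small_rc=$1 jumbo_rc=$2
    if [ "$small_rc" -ne 0 ]; then
        echo "unreachable"      # even standard-size packets fail
    elif [ "$jumbo_rc" -ne 0 ]; then
        echo "mtu-mismatch"     # small packets pass, jumbo packets are dropped
    else
        echo "ok"               # the path carries jumbo frames end to end
    fi
}

# check_peer <interface> <peer-ip>
# Runs both probes with the don't-fragment bit set and prints the diagnosis.
check_peer() {
    local iface=$1 peer=$2 small_rc=0 jumbo_rc=0
    ping -c 1 -W 2 -I "$iface" -M do -s 1472 "$peer" >/dev/null 2>&1 || small_rc=$?
    ping -c 1 -W 2 -I "$iface" -M do -s 8972 "$peer" >/dev/null 2>&1 || jumbo_rc=$?
    diagnose "$small_rc" "$jumbo_rc"
}
```

In the misconfigured setup described above, `check_peer <interface> <IP address>` would be expected to print `mtu-mismatch`, which corresponds to the case where small heartbeat packets get through while larger traffic stalls.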

Comment 16 errata-xmlrpc 2017-12-05 23:33:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3387