Bug 1028583

Summary: gluster-swift with default TCP configuration incurs thousands of errors while running catalyst - please consider modifying the default state of one or more TCP tunables
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Nick Dokos <ndokos>
Component: gluster-swift
Assignee: Luis Pabón <lpabon>
Status: CLOSED WONTFIX
QA Contact: SATHEESARAN <sasundar>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 2.1
CC: madam, ndokos, ppai, rhs-bugs, sasundar
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-11-20 06:14:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments: Number of sockets in TW state vs time (flags: none)

Description Nick Dokos 2013-11-08 19:53:06 UTC
Created attachment 821745 [details]
Number of sockets in TW state vs time

Description of problem:

We ran into this problem while running catalyst with a subset of the
standard workload. The configuration consists of 8 clients with 64
threads each and 6 servers in the "standard" configuration. The
workload is 100K small files: each fileset of 10K files consists
mostly of small files (5 bytes to about 10 KB) plus a single somewhat
larger file (3 MB).

We ran the PUT phase in order to create the files on the servers and
then ran the GET phase repeatedly, varying the setting of some TCP
tunables. We did not drop cache between the runs: all the files are
served from the servers' page cache.

In the default configuration, the run completes but incurs about 38,000
errors, so only about 60% of the GETs succeed. This behavior is consistent
across clients.

Turning on a couple of TCP tunables, net.ipv4.tcp_tw_reuse and
net.ipv4.tcp_tw_recycle, which modify the kernel's handling of
TIME_WAIT sockets, makes a marked difference: when either (or both) of
them is turned on, the number of errors drops to 0 and the time to
complete the run drops by about 25%.
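
For anyone who wants to experiment, here is a minimal sketch (a hypothetical
helper, not part of gluster-swift) of flipping the two tunables at runtime;
it needs root on the node and is equivalent to running
"sysctl -w net.ipv4.tcp_tw_reuse=1 net.ipv4.tcp_tw_recycle=1":

    # Hypothetical helper: enable the TIME_WAIT tunables by writing the
    # procfs files directly. Requires root.
    TUNABLES = {
        "/proc/sys/net/ipv4/tcp_tw_reuse": "1",
        "/proc/sys/net/ipv4/tcp_tw_recycle": "1",
    }

    def enable_tw_tunables():
        for path, value in TUNABLES.items():
            with open(path, "w") as f:
                f.write(value + "\n")

    if __name__ == "__main__":
        enable_tw_tunables()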

The accompanying graph shows the number of sockets in TIME_WAIT state
during the runs and the subsequent recovery as the sockets transition
out of that state. The vertical lines mark the (rough) completion times
of the GETs for three of the configurations. The fourth configuration,
with both tunables turned on, roughly coincides with the one where only
tcp_tw_recycle was turned on; all three of the "good" configurations
completed the GETs within a couple of seconds of each other.
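
Counts like those in the graph can be sampled with a small script along the
following lines (a rough sketch that assumes the standard Linux IPv4
/proc/net/tcp layout, where TCP state 06 is TIME_WAIT and a local address
starting with 0100007F is 127.0.0.1); it also breaks out loopback sockets,
which is relevant to the point below:

    #!/usr/bin/env python
    # Rough sketch: periodically count IPv4 sockets in TIME_WAIT (state 06
    # in /proc/net/tcp), with a separate count for loopback sockets.
    import time

    def count_time_wait():
        total = loopback = 0
        with open("/proc/net/tcp") as f:
            next(f)                                       # skip the header line
            for line in f:
                fields = line.split()
                if fields[3] == "06":                     # st column: 06 == TIME_WAIT
                    total += 1
                    if fields[1].startswith("0100007F"):  # local addr is 127.0.0.1
                        loopback += 1
        return total, loopback

    if __name__ == "__main__":
        while True:
            total, loopback = count_time_wait()
            print("%s TIME_WAIT total=%d loopback=%d"
                  % (time.strftime("%H:%M:%S"), total, loopback))
            time.sleep(5)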

Most if not all of the sockets are localhost-only: they connect the
local Swift proxy workers to the local Swift object workers.
Unfortunately, the tunables affect *all* TCP sockets on the system.
Nevertheless, we think the default setting of at least one of these
tunables should be changed; otherwise gluster-swift falls down rather
badly.

Setting tcp_tw_reuse is probably the most conservative option: it keeps
the behavior closer to the default, but still allows TIME_WAIT sockets
to be reused when necessary.
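
If only tcp_tw_reuse were changed, making it persistent amounts to a single
sysctl.conf entry. A minimal sketch (hypothetical, needs root, and assumes
that appending to /etc/sysctl.conf is acceptable on the affected nodes):

    # Hypothetical sketch: persist only the conservative tunable. Equivalent
    # to adding "net.ipv4.tcp_tw_reuse = 1" to /etc/sysctl.conf by hand.
    ENTRY = "net.ipv4.tcp_tw_reuse = 1\n"

    def persist_tw_reuse(conf_path="/etc/sysctl.conf"):
        with open(conf_path) as f:
            if "net.ipv4.tcp_tw_reuse" in f.read():
                return              # already configured; leave the file alone
        with open(conf_path, "a") as f:
            f.write(ENTRY)
        # Takes effect on reboot, or immediately after running `sysctl -p`.

    if __name__ == "__main__":
        persist_tw_reuse()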

Version-Release number of selected component (if applicable):


How reproducible:
Always.

Steps to Reproduce:
1. Please contact me if you need to reproduce.

Actual results:


Expected results:


Additional info:

Comment 3 Prashanth Pai 2015-11-20 06:14:43 UTC
Closing this bug as RHS 2.1 is EOL.
If this bug persists in recent versions, it should be opened against RHGS 3.1.x.