Bug 1676430

Summary: distribute: Perf regression in mkdir path
Product: [Community] GlusterFS
Component: distribute
Version: mainline
Status: CLOSED NEXTRELEASE
Severity: unspecified
Priority: unspecified
Reporter: Susant Kumar Palai <spalai>
Assignee: bugs <bugs>
CC: bugs, guillaume.pavese
Keywords: Reopened
Hardware: Unspecified
OS: Unspecified
Clone Of: 1676429
Bug Depends On: 1676429
Last Closed: 2019-03-06 03:24:43 UTC

Description Susant Kumar Palai 2019-02-12 09:32:42 UTC
Description of problem:
There seems to be a performance regression of around 30% in the mkdir path with the patch https://review.gluster.org/#/c/glusterfs/+/21062/.

Here are the results from gbench, which runs the smallfile tool internally (values are mkdir operations per second):
Without patch: 3187.402238, 2544.658604, 2400.662029
With patch:    2439.311086, 1654.222631, 1634.522184
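
As a quick check of the "around 30%" figure, the following minimal sketch averages the two sets of numbers quoted above and computes the relative drop. The function names (`mean`, `percent_drop`, `mkdir_regression_percent`) are made up for this illustration and are not part of gbench or smallfile.

```c
#include <stddef.h>

/* Arithmetic mean of n throughput samples (mkdir per second). */
double mean(const double *samples, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += samples[i];
    return n ? sum / (double)n : 0.0;
}

/* Relative drop from baseline to patched, in percent. */
double percent_drop(double baseline, double patched)
{
    return (baseline - patched) / baseline * 100.0;
}

/* The three values quoted above for each configuration. */
static const double without_patch[] = {3187.402238, 2544.658604, 2400.662029};
static const double with_patch[]    = {2439.311086, 1654.222631, 1634.522184};

/* Regression between the means of the two sets of values. */
double mkdir_regression_percent(void)
{
    return percent_drop(mean(without_patch, 3), mean(with_patch, 3));
}
```

Calling `mkdir_regression_percent()` yields roughly 29.6, which is consistent with the estimate of around 30%.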

This bug was created to track the revert of the above commit.

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
Run gbench

Comment 1 Worker Ant 2019-02-12 09:39:27 UTC
REVIEW: https://review.gluster.org/22202 (Revert "dht: Operate internal fops with negative pid") posted (#1) for review on master by Susant Palai

Comment 2 Susant Kumar Palai 2019-02-20 10:16:43 UTC
Update:
After taking statedumps, the io-threads xlator showed a difference in latency. The code path responsible is:

<<<<
int
iot_schedule (call_frame_t *frame, xlator_t *this, call_stub_t *stub)
{
        int             ret = -1;
        iot_pri_t       pri = IOT_PRI_MAX - 1;
        iot_conf_t      *conf = this->private;

        if ((frame->root->pid < GF_CLIENT_PID_MAX) && conf->least_priority) {
                pri = IOT_PRI_LEAST;
                goto out;
        }
>>>>

It seems that requests with a negative pid (pid < GF_CLIENT_PID_MAX) get the least priority (IOT_PRI_LEAST) whenever least_priority is enabled.
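
To make the effect of that check concrete, here is a minimal standalone sketch that does not use the real GlusterFS headers. Everything prefixed with `sketch_` / `SKETCH_` is a hypothetical stand-in: the ordering of the priority enum and the value 0 for the pid bound are assumptions, and `fop_pri` stands for whatever priority the rest of `iot_schedule` would otherwise pick.

```c
#include <stdbool.h>

/* Stand-in for the io-threads priority levels; the ordering is an assumption. */
typedef enum {
    SKETCH_PRI_HI = 0,
    SKETCH_PRI_NORMAL,
    SKETCH_PRI_LO,
    SKETCH_PRI_LEAST,
    SKETCH_PRI_MAX
} sketch_pri_t;

/* Assumed bound: pids below this value are treated as internal clients. */
#define SKETCH_CLIENT_PID_MAX 0

/*
 * Mirrors the early exit in the excerpt above: an internal (negative) pid
 * is forced to the least priority when the least-priority option is on;
 * otherwise the priority chosen for the fop is kept.
 */
sketch_pri_t sketch_pick_priority(int pid, bool least_priority,
                                  sketch_pri_t fop_pri)
{
    if (pid < SKETCH_CLIENT_PID_MAX && least_priority)
        return SKETCH_PRI_LEAST;
    return fop_pri;
}
```

Under these assumptions, once DHT's internal fops carry a negative pid, a call such as `sketch_pick_priority(-4, true, SKETCH_PRI_NORMAL)` returns `SKETCH_PRI_LEAST`, while the same call with `least_priority` set to `false` returns `SKETCH_PRI_NORMAL`.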


With performance.enable-least-priority turned off, the results return to normal. Here is the summary.

Numbers are in files/sec.

Post-patch, performance.enable-least-priority on:   5448.965051804044, 5382.812519425897, 5358.221152245441
Post-patch, performance.enable-least-priority off:  6589.996990998271, 6458.350431426266, 6568.009725869085
Pre-patch:                                          6387.711992865287, 6412.12706152037, 6570.547263693283



I will send a patch to prioritize ops with the no-root-squash pid.

Susant
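
One way the prioritization proposed in Comment 2 could look is sketched below. This is an illustration only, not the content of the patch under review: the `sketch_` / `SKETCH_` names are hypothetical, and the numeric values (0 for the pid bound, -4 for the no-root-squash pid) are placeholders assumed for the example rather than taken from the GlusterFS headers.

```c
#include <stdbool.h>

#define SKETCH_CLIENT_PID_MAX       0   /* assumed bound for internal pids */
#define SKETCH_PID_NO_ROOT_SQUASH (-4)  /* assumed placeholder value */

/*
 * Returns true when a request should be demoted to the least priority.
 * Internal (negative) pids are still demoted when the option is on,
 * except for the no-root-squash pid, which keeps its normal priority.
 */
bool sketch_gets_least_priority(int pid, bool least_priority)
{
    if (!least_priority)
        return false;
    if (pid == SKETCH_PID_NO_ROOT_SQUASH)
        return false;
    return pid < SKETCH_CLIENT_PID_MAX;
}
```

With this predicate, other internal pids (for example -1) are still scheduled at the least priority, while fops issued with the no-root-squash pid are scheduled like ordinary client requests even when performance.enable-least-priority is on.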

Comment 3 Worker Ant 2019-02-20 11:08:39 UTC
REVIEW: https://review.gluster.org/22238 (io-threads: Prioritize fops with NO_ROOT_SQUASH pid) posted (#1) for review on master by Susant Palai

Comment 4 Worker Ant 2019-03-05 13:56:00 UTC
REVIEW: https://review.gluster.org/22238 (io-threads: Prioritize fops with NO_ROOT_SQUASH pid) merged (#3) on master by Ravishankar N

Comment 5 Worker Ant 2019-03-06 03:16:06 UTC
REVIEW: https://review.gluster.org/22304 (io-threads: Prioritize fops with NO_ROOT_SQUASH pid) posted (#1) for review on release-6 by Susant Palai

Comment 6 Worker Ant 2019-03-06 03:18:29 UTC
REVISION POSTED: https://review.gluster.org/22304 (io-threads: Prioritize fops with NO_ROOT_SQUASH pid) posted (#2) for review on release-6 by Susant Palai