Bug 762592 - (GLUSTER-860) GNFS is so slow to respond to client while running SFS2008 that sfs crashes.
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: nfs
Version: nfs-beta
Hardware: All / OS: Linux
Priority: medium / Severity: medium
Assigned To: Shehjar Tikoo
Reported: 2010-04-27 02:54 EDT by Prithu Tiwari
Modified: 2015-12-01 11:45 EST
CC: 3 users

Doc Type: Bug Fix
Regression: RTNR
Mount Type: nfs

Attachments
sfslog (7.72 KB, application/octet-stream)
2010-04-27 01:08 EDT, Prithu Tiwari

Description Shehjar Tikoo 2010-04-27 00:00:38 EDT
Log copied to dev:/share/tickets/860
Comment 1 Prithu Tiwari 2010-04-27 01:08:47 EDT
Created attachment 187
yet another test program.
Comment 2 Shehjar Tikoo 2010-04-27 01:23:28 EDT
The gnfs trace log file does not hint at any timeout on the NFS side: requests are being served right up to the last line of the log. It is possible that the timeout referred to in the SFS log is a timeout in the communication between the SFS manager and the SFS client. We're going to test with a higher timeout value.
Comment 3 Prithu Tiwari 2010-04-27 02:54:52 EDT
We tried to run SFS2008 in one-server, one-client mode. For low load values (target IOPS in SFS) the run does complete, though the CPU load of the glusterfs process on the server side is very high.

Detailed set-up:
US server: brickX
GNFS: GlusterNFS-beta-rc1
Export directory: /exports/gnfs (JBOD)
volfile:

---------------------------------------------------------------------

volume localdisk-posix
        type storage/posix
        option directory /exports/gnfs
end-volume

volume localdisk-ac
        type features/access-control
        subvolumes localdisk-posix
end-volume

volume localdisk
        type features/locks
        subvolumes localdisk-ac
end-volume

volume brick
        type performance/io-threads
        option thread-count 8
        subvolumes localdisk
end-volume


volume nfsd
        type nfs/server
        subvolumes brick
        option rpc-auth.addr.allow *
end-volume

-------------------------------------------------------------------------
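(As a quick sanity check of the export outside of SFS. This is a sketch only, not part of the original run: the server name "brick5" is taken from the MNT_POINTS line further below, the mount point /mnt/gnfs-test is arbitrary, and gNFS exports the nfs/server subvolume by name, so the export path here is /brick.)

showmount -e brick5
mount -t nfs -o vers=3,proto=tcp,mountproto=tcp brick5:/brick /mnt/gnfs-test

If this mounts and basic I/O works, the SFS failure is not a plain export/mount problem.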


US client: clientYY

SFS2008, patched to use mount-proto "tcp".

The sfs_nfs_rc file is as follows:

--------------------------------------------------------------------------


##############################################################################
#
#       @(#)sfs_nfs_rc  $Revision: 1.13 $
#
# Specify SFS parameters for sfs runs in this file.
#
# The following parameters are configurable within the SFS run and
# reporting rules.
#
# See below for details.
#
# Example shows an NFS V3 run of 100 to 1000 ops/sec
#
LOAD="1000"
INCR_LOAD=1000
NUM_RUNS=4
PROCS=1
CLIENTS="client08"
MNT_POINTS="brick5:/brick"
BIOD_MAX_WRITES=2
BIOD_MAX_READS=2
IPV6_ENABLE="off"
FS_PROTOCOL="nfs"
SFS_DIR="/home/prithu/sfs/bin"
SUFFIX=""
WORK_DIR="result"
PRIME_MON_SCRIPT=""
PRIME_MON_ARGS=""
INIT_TIMEOUT=8000
# Leaving BLOCK_SIZE un-set is the default. This will permit auto-negotiation.
# If you over-ride this and set it to a particular value, you must
# add the value that you used to the Other Notes section of the 
# submission/disclosure.
BLOCK_SIZE=
# SFS_NFS_USER_ID only needed if running NFS load on Windows client 
# Its value should match the UID of the user's account on the NFS server.
SFS_NFS_USER_ID=500
# SFS_NFS_GROUP_ID only needed if running NFS load on Windows client 
# Its value should match the GID of the user's account on the NFS server.
SFS_NFS_GROUP_ID=500
#
# The following parameters are strictly defined within the SFS
# run and reporting rules and may not be changed.
#
RUNTIME=300
WARMUP_TIME=300
MIXFILE=""
ACCESS_PCNT=30
APPEND_PCNT=70
BLOCK_FILE=""
DIR_COUNT=30
FILE_COUNT=
SYMLINK_COUNT=20
TCP="on"
#
# The following parameters are useful for debugging or general system
# tuning.  They may not be used during a reportable SFS run.
#
DEBUG=""
DUMP=
POPULATE=
LAT_GRAPH=
PRIME_SLEEP=0
PRIME_TIMEOUT=0

----------------------------------------------------------------------------
(The rest of the file is comments.)
----------------------------------------------------------------------------

SFS run command:

java SfsManager -r sfs_nfs_rc -s junk
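(As I read the standard SFS2008 load scaling, LOAD is the first requested op rate, INCR_LOAD the step, and NUM_RUNS the number of load points, so the rc file above asks for four load points of 1000, 2000, 3000 and 4000 target ops/sec, driven by a single load-generating process (PROCS=1) on client08.)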


The SFS output file is attached; it can be seen that the run crashed due to a response timeout.

The server-side trace log is on dev.gluster.com at ~prithu/gntr.l.bz2.
Comment 4 Shehjar Tikoo 2010-04-28 00:53:10 EDT
Prithu has verified that the run finishes after increasing the timeout value for SFS. Closing.
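(For anyone hitting the same failure: the fix was on the SFS side, not in gNFS. The bug does not record which parameter was raised or by how much; the timeout-related knobs visible in the attached sfs_nfs_rc are INIT_TIMEOUT and PRIME_TIMEOUT, so the change was presumably along these lines, with illustrative values only:

INIT_TIMEOUT=80000      # was 8000 in the run above
PRIME_TIMEOUT=600       # was 0 (default) in the run above

followed by re-running "java SfsManager -r sfs_nfs_rc -s junk".)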
