Bug 1174466

Summary: RDMA: GFAPI benchmark segfaults when run with more than 2 threads; no segfaults are seen over TCP
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Ben Turner <bturner>
Component: rdma Assignee: Raghavendra G <rgowdapp>
Status: CLOSED WONTFIX QA Contact: Ben Turner <bturner>
Severity: high Docs Contact:
Priority: urgent    
Version: rhgs-3.0 CC: chrisw, nbalacha, nlevinki, rcyriac, rwheeler, sankarshan, smohan
Target Milestone: --- Keywords: ZStream
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 1176543 (view as bug list) Environment:
Last Closed: 2018-04-16 18:01:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1176543    
Bug Blocks:    

Description Ben Turner 2014-12-15 21:48:56 UTC
Description of problem:

When running BenE's GFAPI benchmark over RDMA, we see segfaults whenever it is run with more than two threads.

Version-Release number of selected component (if applicable):

glusterfs-3.6.0.38-1.el6rhs.x86_64

How reproducible:

Every time we tested with more than 2 threads.

Steps to Reproduce:
1.  git clone https://github.com/bengland2/parallel-libgfapi
2.  gcc -pthread -g -O0  -Wall --pedantic -o gfapi_perf_test -I /usr/include/glusterfs/api gfapi_perf_test.c  -lgfapi -lrt
3.  export GFAPI_HOSTNAME=gqas015
4.  export GFAPI_VOLNAME=testvol
5.  PGFAPI_PROCESSES=8 PGFAPI_FILES=1 PGFAPI_RECSZ=16384 PGFAPI_TOPDIR=/ben PGFAPI_FILESIZE=4194304 PGFAPI_MOUNTPOINT=/gluster-mount PGFAPI_LOAD=seq-wr PGFAPI_DIRECT=1 ./parallel_gfapi_test.sh 
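
To narrow down the count at which the crash starts, the run in step 5 could be repeated at several values of PGFAPI_PROCESSES. The following is only a rough sketch, not part of the benchmark: the function names, the SWEEP_LOGDIR variable, and the sweep.N.log file names are made up for illustration, and a run is treated as failed when its output contains the kind of messages the driver and shell print on a crash ("Segmentation fault" or "exited with status <non-zero>").

```shell
#!/bin/bash
# Sketch only: rerun the benchmark driver at several process counts and
# report which counts show a crash in the captured output.

# Succeeds if the log contains a shell "Segmentation fault" message or a
# process exiting with a non-zero status.
run_failed() {
    grep -Eq 'Segmentation fault|exited with (error )?status [1-9]' "$1"
}

# $1            = driver command (for this bug: ./parallel_gfapi_test.sh)
# remaining args = process counts to try
sweep_process_counts() {
    local driver="$1"; shift
    local logdir="${SWEEP_LOGDIR:-.}"   # SWEEP_LOGDIR is a hypothetical name
    local n
    for n in "$@"; do
        # Only PGFAPI_PROCESSES is varied; every other PGFAPI_* setting is
        # taken from the caller's environment.
        PGFAPI_PROCESSES="$n" "$driver" > "$logdir/sweep.$n.log" 2>&1
        if run_failed "$logdir/sweep.$n.log"; then
            echo "$n: FAILED"
        else
            echo "$n: ok"
        fi
    done
}
```

With the other PGFAPI_* variables from step 5 exported beforehand, this might be invoked as `sweep_process_counts ./parallel_gfapi_test.sh 1 2 3 4 8`, once with the volume on RDMA and once on TCP for comparison.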

Actual results:

The GFAPI benchmark app segfaults.

Expected results:

Over RDMA, the GFAPI benchmark should behave as it does over TCP and complete without segfaults.

Additional info:

Comment 1 Ben Turner 2014-12-15 21:53:32 UTC
Here is what happens when I run the benchmark:

[root@gqac022 parallel-libgfapi-master]# PGFAPI_PROCESSES=8 PGFAPI_FILES=1 PGFAPI_RECSZ=16384 PGFAPI_TOPDIR=/ben PGFAPI_FILESIZE=4194304 PGFAPI_MOUNTPOINT=/gluster-mount PGFAPI_LOAD=seq-wr PGFAPI_DIRECT=1 ./parallel_gfapi_test.sh 
/usr/local/bin/gfapi_perf_test
volume name: testvol
Gluster server in the volume: gqas015
workload: seq-wr
list of clients in file: clients.list
record size (KB): 64
file size (KB): 4194304
files per thread: 1
processes per client: 8
threads per process: 1
test driver glusterfs mountpoint: /gluster-mount
top directory within Gluster volume: /ben
each thread (process) runs program at: gfapi_perf_test
using direct I/O
log files for each libgfapi process at /tmp/parallel_gfapi_logs.6286
starting gun timeout = 12
removing any previous files
./parallel_gfapi_test.sh: line 163:  6319 Segmentation fault      (core dumped) GFAPI_LOAD=unlink GFAPI_FUSE=0 GFAPI_FILES=1 GFAPI_BASEDIR=/ben/smf-gfapi-localhost.06 GFAPI_VOLNAME=testvol GFAPI_HOSTNAME=gqas015 GFAPI_THREADS_PER_PROC=1 gfapi_perf_test > /tmp/unlink.localhost.06.log 2>&1
log directory is /tmp/par-for-all.6425

Mon Dec 15 16:47:20 EST 2014: starting 1 clients ... localhost 
ls: cannot access /gluster-mount//ben/*.ready: No such file or directory
Mon Dec 15 16:47:25 EST 2014: clients are all ready
Mon Dec 15 16:47:25 EST 2014 : clients should all start running within a few seconds
process 6509 exited with status 255
process 6526 exited with status 255
process 6560 exited with status 255
Mon Dec 15 16:48:38 EST 2014: clients completed
ERROR: at least one process exited with error status 255
per-thread results in /tmp/parallel_gfapi_logs.6286/result.csv
5 threads finished out of 8 
transfer-rate: 357.00 MBytes/s
file-rate:  5712.12 files/sec
IOPS:    0.00 requests/sec
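
When comparing several runs, the failure indicators in a transcript like the one above can be pulled out with a small helper. This is a sketch under assumptions: `summarize_run` is a made-up name, the transcript is assumed to have been saved to a file, and the patterns simply match the message formats that appear in this comment.

```shell
#!/bin/bash
# Sketch only: summarize the failure indicators in a saved benchmark transcript.
# $1 = path to the captured output of parallel_gfapi_test.sh
summarize_run() {
    local log="$1"
    local segv failed finished
    # Lines where the shell reported a segfault.
    segv=$(grep -c 'Segmentation fault' "$log" || true)
    # Per-process lines with a non-zero exit status.
    failed=$(grep -Ec '^process [0-9]+ exited with status [1-9]' "$log" || true)
    # The driver's own "N threads finished out of M" line, if present.
    finished=$(grep -Eo '[0-9]+ threads finished out of [0-9]+' "$log" | head -n 1)
    echo "segfault messages: $segv"
    echo "processes with non-zero exit: $failed"
    echo "completion: ${finished:-not reported}"
}
```

Run against the output above, this would report one segfault message, three processes with a non-zero exit, and "5 threads finished out of 8".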