+++ This bug was initially created as a clone of Bug #1188242 +++

Description of problem:
=======================
The volume was FUSE-mounted on the client, and iozone was run on 10 files in parallel using the command below. The gluster client process crashed, and a subsequent cd into the mount point fails with a "Transport endpoint is not connected" message.

for i in `seq 1 10`; do /opt/iozone3_430/src/current/iozone -az -i0 -i1 & done

Version-Release number of selected component (if applicable):
=============================================================
glusterfs 3.7dev built on Feb 2 2015 01:04:49

Package Information:
====================
Downloaded from : http://download.gluster.org/pub/gluster/glusterfs/nightly/glusterfs/epel-6-x86_64/glusterfs-3.7dev-0.555.gite927623.autobuild/

How reproducible:
=================
100%

Steps to Reproduce:
===================
1. Create a FUSE mount.
2. Run iozone in parallel:
   for i in `seq 1 10`; do ./iozone3_430/src/current/iozone -az -i0 -i1 & done

Number of volumes:
==================
1

Volume Names:
=============
testvol

Volume on which the particular issue is seen [ if applicable ]:
===============================================================
testvol

Type of volumes:
================
Disperse (1 x (4 + 2))

Volume options if available:
============================
[root@dhcp37-178 ~]# gluster volume get testvol all
Option                                  Value
------                                  -----
cluster.lookup-unhashed                 on
cluster.min-free-disk                   10%
cluster.min-free-inodes                 5%
cluster.rebalance-stats                 off
cluster.subvols-per-directory           (null)
cluster.readdir-optimize                off
cluster.rsync-hash-regex                (null)
cluster.extra-hash-regex                (null)
cluster.dht-xattr-name                  trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid    off
cluster.local-volume-name               (null)
cluster.weighted-rebalance              on
cluster.switch-pattern                  (null)
cluster.entry-change-log                on
cluster.read-subvolume                  (null)
cluster.read-subvolume-index            -1
cluster.read-hash-mode                  1
cluster.background-self-heal-count      16
cluster.metadata-self-heal              on
cluster.data-self-heal                  on
cluster.entry-self-heal                 on
cluster.self-heal-daemon                on
cluster.self-heal-window-size           1
cluster.data-change-log                 on
cluster.metadata-change-log             on
cluster.data-self-heal-algorithm        (null)
cluster.eager-lock                      on
cluster.quorum-type                     none
cluster.quorum-count                    (null)
cluster.choose-local                    true
cluster.self-heal-readdir-size          1KB
cluster.post-op-delay-secs              1
cluster.ensure-durability               on
cluster.stripe-block-size               128KB
cluster.stripe-coalesce                 true
diagnostics.latency-measurement         off
diagnostics.dump-fd-stats               off
diagnostics.count-fop-hits              off
diagnostics.brick-log-level             INFO
diagnostics.client-log-level            INFO
diagnostics.brick-sys-log-level         CRITICAL
diagnostics.client-sys-log-level        CRITICAL
diagnostics.brick-logger                (null)
diagnostics.client-logger               (null)
diagnostics.brick-log-format            (null)
diagnostics.client-log-format           (null)
diagnostics.brick-log-buf-size          5
diagnostics.client-log-buf-size         5
diagnostics.brick-log-flush-timeout     120
diagnostics.client-log-flush-timeout    120
performance.cache-max-file-size         0
performance.cache-min-file-size         0
performance.cache-refresh-timeout       1
performance.cache-priority
performance.cache-size                  32MB
performance.io-thread-count             16
performance.high-prio-threads           16
performance.normal-prio-threads         16
performance.low-prio-threads            16
performance.least-prio-threads          1
performance.enable-least-priority       on
performance.least-rate-limit            0
performance.cache-size                  128MB
performance.flush-behind                on
performance.nfs.flush-behind            on
performance.write-behind-window-size    1MB
performance.nfs.write-behind-window-size 1MB
performance.strict-o-direct             off
performance.nfs.strict-o-direct         off
performance.strict-write-ordering       off
performance.nfs.strict-write-ordering   off
performance.lazy-open                   yes
performance.read-after-open             no
performance.read-ahead-page-count       4
performance.md-cache-timeout            1
features.encryption                     off
encryption.master-key                   (null)
encryption.data-key-size                256
encryption.block-size                   4096
network.frame-timeout                   1800
network.ping-timeout                    42
network.tcp-window-size                 (null)
features.lock-heal                      off
features.grace-timeout                  10
network.remote-dio                      disable
network.tcp-window-size                 (null)
network.inode-lru-limit                 16384
auth.allow                              *
auth.reject                             (null)
transport.keepalive                     (null)
server.allow-insecure                   (null)
server.root-squash                      off
server.anonuid                          65534
server.anongid                          65534
server.statedump-path                   /var/run/gluster
server.outstanding-rpc-limit            64
features.lock-heal                      off
features.grace-timeout                  (null)
server.ssl                              (null)
auth.ssl-allow                          *
server.manage-gids                      off
client.send-gids                        on
server.gid-timeout                      2
server.own-thread                       (null)
performance.write-behind                on
performance.read-ahead                  on
performance.readdir-ahead               off
performance.io-cache                    on
performance.quick-read                  on
performance.open-behind                 on
performance.stat-prefetch               on
performance.client-io-threads           off
performance.nfs.write-behind            on
performance.nfs.read-ahead              off
performance.nfs.io-cache                off
performance.nfs.quick-read              off
performance.nfs.stat-prefetch           off
performance.nfs.io-threads              off
performance.force-readdirp              true
features.file-snapshot                  off
features.uss                            off
features.snapshot-directory             .snaps
features.show-snapshot-directory        off
network.compression                     off
network.compression.window-size         -15
network.compression.mem-level           8
network.compression.min-size            0
network.compression.compression-level   -1
network.compression.debug               false
features.limit-usage                    (null)
features.quota-timeout                  0
features.default-soft-limit             80%
features.soft-timeout                   60
features.hard-timeout                   5
features.alert-time                     86400
features.quota-deem-statfs              off
geo-replication.indexing                off
geo-replication.indexing                off
geo-replication.ignore-pid-check        off
geo-replication.ignore-pid-check        off
features.quota                          on
debug.trace                             off
debug.log-history                       no
debug.log-file                          no
debug.exclude-ops                       (null)
debug.include-ops                       (null)
debug.error-gen                         off
debug.error-failure                     (null)
debug.error-number                      (null)
debug.random-failure                    off
debug.error-fops                        (null)
nfs.enable-ino32                        no
nfs.mem-factor                          15
nfs.export-dirs                         on
nfs.export-volumes                      on
nfs.addr-namelookup                     off
nfs.dynamic-volumes                     off
nfs.register-with-portmap               on
nfs.outstanding-rpc-limit               16
nfs.port                                2049
nfs.rpc-auth-unix                       on
nfs.rpc-auth-null                       on
nfs.rpc-auth-allow                      all
nfs.rpc-auth-reject                     none
nfs.ports-insecure                      off
nfs.trusted-sync                        off
nfs.trusted-write                       off
nfs.volume-access                       read-write
nfs.export-dir
nfs.disable                             false
nfs.nlm                                 on
nfs.acl                                 on
nfs.mount-udp                           off
nfs.mount-rmtab                         /var/lib/glusterd/nfs/rmtab
nfs.rpc-statd                           /sbin/rpc.statd
nfs.server-aux-gids                     off
nfs.drc                                 off
nfs.drc-size                            0x20000
nfs.read-size                           (1 * 1048576ULL)
nfs.write-size                          (1 * 1048576ULL)
nfs.readdir-size                        (1 * 1048576ULL)
features.read-only                      off
features.worm                           off
storage.linux-aio                       off
storage.batch-fsync-mode                reverse-fsync
storage.batch-fsync-delay-usec          0
storage.owner-uid                       -1
storage.owner-gid                       -1
storage.node-uuid-pathinfo              off
storage.health-check-interval           30
storage.build-pgfid                     off
storage.bd-aio                          off
cluster.server-quorum-type              off
cluster.server-quorum-ratio             0
changelog.changelog                     off
changelog.changelog-dir                 (null)
changelog.encoding                      ascii
changelog.rollover-time                 15
changelog.fsync-interval                5
changelog.changelog-barrier-timeout     120
features.barrier                        disable
features.barrier-timeout                120
locks.trace                             disable
cluster.disperse-self-heal-daemon       enable
[root@dhcp37-178 ~]#

Output of gluster volume info:
==============================
[root@dhcp37-178 ~]# gluster v info

Volume Name: testvol
Type: Disperse
Volume ID: ad1a31fb-2e69-4d5d-9ae0-d057879b8fd5
Status: Started
Number of Bricks: 1 x (4 + 2) = 6
Transport-type: tcp
Bricks:
Brick1: dhcp37-120:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick1/b1
Brick2: dhcp37-208:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick2/b1
Brick3: dhcp37-178:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick3/b1
Brick4: dhcp37-183:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick4/b1
Brick5: dhcp37-120:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick5/b2
Brick6: dhcp37-208:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick6/b2
Options Reconfigured:
features.uss: off
features.quota: on
[root@dhcp37-178 ~]#

Output of gluster volume status:
================================
[root@dhcp37-178 ~]# gluster v status
Status of volume: testvol
Gluster process                                         Port    Online  Pid
------------------------------------------------------------------------------
Brick dhcp37-120:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick1/b1      49156   Y       3225
Brick dhcp37-208:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick2/b1      49167   Y       3238
Brick dhcp37-178:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick3/b1      49166   Y       3192
Brick dhcp37-183:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick4/b1      49166   Y       3173
Brick dhcp37-120:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick5/b2      49157   Y       3236
Brick dhcp37-208:/var/run/gluster/snaps/1e9ced492e2048cf9f906f45a4869238/brick6/b2      49168   Y       3249
NFS Server on localhost                                 2049    Y       3206
Quota Daemon on localhost                               N/A     Y       3221
NFS Server on dhcp37-208                                2049    Y       3262
Quota Daemon on dhcp37-208                              N/A     Y       3276
NFS Server on dhcp37-183                                2049    Y       3186
Quota Daemon on dhcp37-183                              N/A     Y       3199
NFS Server on 10.70.37.120                              2049    Y       3250
Quota Daemon on 10.70.37.120                            N/A     Y       3263

Task Status of Volume testvol
------------------------------------------------------------------------------
There are no active volume tasks

[root@dhcp37-178 ~]#

Actual results:
===============
The gluster client process crashed.

Expected results:
=================
The client should not crash.

Additional info:
================
Attaching the client mount log.
--- Additional comment from Bhaskarakiran on 2015-02-24 06:33:12 EST ---

--- Additional comment from Bhaskarakiran on 2015-02-24 06:34:39 EST ---

Log snippet:
============
pending frames:
frame : type(1) op(LOOKUP)
frame : type(1) op(LOOKUP)
frame : type(1) op(FTRUNCATE)
frame : type(0) op(0)
frame : type(1) op(UNLINK)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(1) op(FLUSH)
frame : type(1) op(STAT)
frame : type(0) op(0)
patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-02-24 11:41:47
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.7dev
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x306ae20aa6]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x306ae3bdcf]
/lib64/libc.so.6[0x342d4326a0]
/usr/lib64/glusterfs/3.7dev/xlator/cluster/distribute.so(dht_writev_cbk+0x268)[0x7f300993cbf8]
/usr/lib64/libglusterfs.so.0(default_writev_cbk+0xcc)[0x306ae2e5ec]
/usr/lib64/glusterfs/3.7dev/xlator/cluster/disperse.so(ec_manager_writev+0x10d)[0x7f3009b8647d]
/usr/lib64/glusterfs/3.7dev/xlator/cluster/disperse.so(__ec_manager+0x34)[0x7f3009b6a654]
/usr/lib64/glusterfs/3.7dev/xlator/cluster/disperse.so(ec_resume+0x91)[0x7f3009b6a461]
/usr/lib64/glusterfs/3.7dev/xlator/cluster/disperse.so(ec_combine+0x196)[0x7f3009b88fa6]
/usr/lib64/glusterfs/3.7dev/xlator/cluster/disperse.so(ec_writev_cbk+0x27b)[0x7f3009b844bb]
/usr/lib64/glusterfs/3.7dev/xlator/protocol/client.so(client3_3_writev_cbk+0x6cc)[0x7f3009de301c]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x306aa0ea65]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x142)[0x306aa0ff02]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x306aa0b5f8]
/usr/lib64/glusterfs/3.7dev/rpc-transport/socket.so(+0x9759)[0x7f30103fc759]
/usr/lib64/glusterfs/3.7dev/rpc-transport/socket.so(+0xb1bd)[0x7f30103fe1bd]
/usr/lib64/libglusterfs.so.0[0x306ae78ffc]
/lib64/libpthread.so.0[0x342d8079d1]
/lib64/libc.so.6(clone+0x6d)[0x342d4e89dd]
---------

--- Additional comment from Ashish Pandey on 2015-03-03 23:59:25 EST ---

dht_fsync_cbk() is called with op_ret = -1, op_errno = 2 (ENOENT), and both prebuf and postbuf NULL. Inside dht_fsync_cbk(), the error-handling check skips the op_errno = ENOENT case (if (op_ret == -1 && !dht_inode_missing(op_errno))), which lets control fall through to if (IS_DHT_MIGRATION_PHASE1 (postbuf)). The IS_DHT_MIGRATION_PHASE1 macro accesses the file attributes through the postbuf pointer, which is NULL; this leads to the crash.

The change for bug 960843 excluded op_errno = ENOENT from the error handling. We need to investigate the reason for skipping the ENOENT case, and also modify the macro definitions to handle NULL pointers properly.

--- Additional comment from Pranith Kumar K on 2015-03-09 02:44:05 EDT ---

Ashish,
I just realized that fsync on an active fd should never return ESTALE/ENOENT, as the fd is already open on the file. Why is EC returning this error? Could this be an EC bug after all?
Pranith

--- Additional comment from Anand Avati on 2015-04-09 08:21:47 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#1) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-04-13 07:19:29 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#2) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-04-13 07:19:32 EDT ---

REVIEW: http://review.gluster.org/10218 (Comments implemeted) posted (#1) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-04-14 05:45:23 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#3) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-04-28 02:06:23 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#4) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-05-01 11:04:55 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#5) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-05-03 07:46:40 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#6) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-05-04 07:37:02 EDT ---

REVIEW: http://review.gluster.org/10176 (cluster/ec: Use fd instead of loc for get_size_version) posted (#7) for review on master by Ashish Pandey (aspandey)

--- Additional comment from Anand Avati on 2015-05-04 22:43:51 EDT ---

COMMIT: http://review.gluster.org/10176 committed in master by Pranith Kumar Karampuri (pkarampu)

------

commit 582b252e3a418ee332cf3d4b1a415520e242b599
Author: Ashish Pandey <aspandey>
Date:   Thu Apr 9 17:27:46 2015 +0530

    cluster/ec: Use fd instead of loc for get_size_version

    Change-Id: Ia7d43cb3b222db34ecb0e35424f1766715ed8e6a
    BUG: 1188242
    Signed-off-by: Ashish Pandey <aspandey>
    Reviewed-on: http://review.gluster.org/10176
    Reviewed-by: Xavier Hernandez <xhernandez>
    Tested-by: Gluster Build System <jenkins.com>
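The control-flow flaw Ashish describes above can be sketched as a small Python model of the C logic; the flag value and return strings below are illustrative, not the actual GlusterFS source:

```python
import errno

# Illustrative flag; the real macro checks DHT migration state in the iatt.
DHT_MIGRATION_PHASE1 = 0x1

def is_dht_migration_phase1(postbuf):
    # Models IS_DHT_MIGRATION_PHASE1: it dereferences postbuf
    # unconditionally, so a NULL (here: None) postbuf crashes.
    return bool(postbuf["ia_flags"] & DHT_MIGRATION_PHASE1)

def dht_fsync_cbk(op_ret, op_errno, postbuf):
    # The check from bug 960843 exempts ENOENT/ESTALE from error
    # handling (modeling !dht_inode_missing(op_errno)) ...
    if op_ret == -1 and op_errno not in (errno.ENOENT, errno.ESTALE):
        return "unwind-error"
    # ... so control reaches the migration check with postbuf == None.
    # Guarding the pointer, as the comment suggests, avoids the crash.
    if postbuf is not None and is_dht_migration_phase1(postbuf):
        return "retry-migration"
    return "unwind-ok"
```

Without the `postbuf is not None` guard, the ENOENT path would index into `None`, the Python analogue of the reported SIGSEGV.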
REVIEW: http://review.gluster.org/10626 (Adding 64 bits in "version" key of extended attributes. First 64 bits (Left) represents Data version. Last 64 bits (right) represents Meta Data version.) posted (#1) for review on release-3.7 by Ashish Pandey (aspandey)
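The xattr layout described in the review above (first 64 bits for the data version, last 64 bits for the metadata version) can be sketched as follows; the helper names are hypothetical and big-endian packing is an assumption:

```python
import struct

def pack_version_xattr(data_version, metadata_version):
    # Hypothetical helper: 16-byte xattr value, first 8 bytes = data
    # version, last 8 bytes = metadata version (big-endian assumed).
    return struct.pack(">QQ", data_version, metadata_version)

def unpack_version_xattr(value):
    # Inverse of pack_version_xattr: split the 16-byte value back
    # into the two 64-bit counters.
    data_version, metadata_version = struct.unpack(">QQ", value)
    return data_version, metadata_version
```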
REVIEW: http://review.gluster.org/10625 (cluster/ec: Use fd instead of loc for get_size_version) posted (#3) for review on release-3.7 by Ashish Pandey (aspandey)
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.0, please open a new bug report.

glusterfs-3.7.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/10939
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user
Patch http://review.gluster.org/#/c/11326/ fixes this issue in the quota xlator.
COMMIT: http://review.gluster.org/11326 committed in release-3.7 by Raghavendra G (rgowdapp)

------

commit 4673b50ecf8ed55b7d8bde55e9580cfde748ef0a
Author: vmallika <vmallika>
Date:   Thu Jun 18 12:02:50 2015 +0530

    quota: allow writes when with ENOENT/ESTALE on active fd

    This is a backport of http://review.gluster.org/#/c/11307/

    > We may get ENOENT/ESTALE in case of below scenario
    >     fd = open file.txt
    >     unlink file.txt
    >     write on fd
    > Here build_ancestry can fail as the file is removed.
    > For now ignore ENOENT/ESTALE on active fd with
    > writev and fallocate.
    > We need to re-visit this code once we understand
    > how other file-system behave in this scenario
    >
    > Below patch fixes the issue in DHT:
    > http://review.gluster.org/#/c/11097
    >
    > Change-Id: I7be683583b808c280e3ea2ddd036c1558a6d53e5
    > BUG: 1188242
    > Signed-off-by: vmallika <vmallika>

    Change-Id: Ic836d200689fe6f27d4675bc0ff89063b7dc3882
    BUG: 1219358
    Signed-off-by: vmallika <vmallika>
    Reviewed-on: http://review.gluster.org/11326
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Raghavendra G <rgowdapp>
    Tested-by: Raghavendra G <rgowdapp>
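The scenario quoted in the commit message relies on standard POSIX behaviour: a write on an already-open fd succeeds even after the file is unlinked, while path-based lookups fail with ENOENT. A minimal sketch of that behaviour on a local filesystem:

```python
import os
import tempfile

# On POSIX, unlink removes the directory entry but the inode survives
# while an fd holds it open, so the write itself succeeds; only
# path-based lookups (as in build_ancestry) see ENOENT.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "file.txt")
fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o600)
os.unlink(path)                            # name is gone ...
written = os.write(fd, b"still writable")  # ... but the fd still works
os.close(fd)
os.rmdir(tmpdir)
```

This is why the patch treats ENOENT/ESTALE on an active fd as ignorable for writev and fallocate rather than failing the fop.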
This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.7.3, please open a new bug report.

glusterfs-3.7.3 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://thread.gmane.org/gmane.comp.file-systems.gluster.devel/12078
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user