Bug 1122120

Summary: Bricks crashing after disable and re-enabled quota on a volume
Product: [Community] GlusterFS
Reporter: Peter Auyeung <pauyeung>
Component: core
Assignee: Krutika Dhananjay <kdhananj>
Status: CLOSED EOL
QA Contact:
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.5.1
CC: bugs, hgowtham, kdhananj, pauyeung
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: disable and re-enabled quota on a volume Consequence: bricks keep crashing on that volume Fix: Result:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-17 16:24:04 UTC
Type: Bug
Regression: ---
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1144315
Bug Blocks:

Description Peter Auyeung 2014-07-22 14:57:44 UTC
Description of problem:
When a gluster volume quota command was issued, we got the following error:
[2014-07-18 21:12:59.706929] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-sas02-client-2: remote operation failed: No such file or directory. Path: <gfid:abd01245-73e1-4ef6-aba6-dc087cf0bccd> (abd01245-73e1-4ef6-aba6-dc087cf0bccd)

After we disabled and re-enabled quota on a volume, the bricks on that volume kept crashing and dumping core:

[2014-07-22 14:35:49.270983] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
[2014-07-22 14:35:49.281775] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
[2014-07-22 14:35:49.406603] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
[2014-07-22 14:35:49.410136] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
pending frames:
frame : type(0) op(40)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(27)
frame : type(0) op(40)
frame : type(0) op(17)
frame : type(0) op(17)
frame : type(0) op(33)
frame : type(0) op(40)
frame : type(0) op(27)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-07-22 14:36:19configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.1
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f26cb95b4a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x163321)[0x7f26cba88321]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_loc_fill_from_name+0x89)[0x7f26c6475c59]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_readdir_cbk+0x21f)[0x7f26c647673f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir_cbk+0xc2)[0x7f26cc34d072]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_readdir_cbk+0xc2)[0x7f26c6895c02]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir_cbk+0xc2)[0x7f26c6cc6ca2]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_do_readdir+0x1b8)[0x7f26c72fd8c8]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_readdir+0x13)[0x7f26c72fdd43]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f26cc356a38]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir+0x23c)[0x7f26c6cc908c]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f26cc356a38]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_readdir_wrapper+0x150)[0x7f26c6899a80]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x1c5)[0x7f26cc36bcd5]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_worker+0x146)[0x7f26c689da66]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f26cbcece9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f26cba193fd]
---------
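
For triage, the backtrace above can be split into module, symbol and offset so that the first frame inside a given translator stands out. The following is a minimal Python sketch and is not part of the original report: the helper names `parse_frame` and `first_frame_in` are hypothetical, and the regular expression assumes only the `module(symbol+offset)[address]` layout visible in the log lines above.

```python
import re

# Matches frames such as:
#   /path/to/marker.so(mq_readdir_cbk+0x21f)[0x7f26c647673f]
# The symbol may be empty, as in the libc frames: libc.so.6(+0x364a0)[...]
FRAME_RE = re.compile(
    r"^(?P<module>\S+?)\((?P<symbol>[^+()]*)\+(?P<offset>0x[0-9a-fA-F]+)\)"
    r"\[(?P<address>0x[0-9a-fA-F]+)\]$"
)


def parse_frame(line):
    """Return a dict for one backtrace line, or None if it is not a frame."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return {
        "module": m.group("module").rsplit("/", 1)[-1],  # file name only
        "symbol": m.group("symbol") or None,             # None if unnamed
        "offset": int(m.group("offset"), 16),
        "address": int(m.group("address"), 16),
    }


def first_frame_in(lines, module_name):
    """Return the first parsed frame that belongs to module_name, if any."""
    for line in lines:
        frame = parse_frame(line)
        if frame and frame["module"] == module_name:
            return frame
    return None


# Sample lines copied from the top of the backtrace above.
sample = [
    "/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f26cb95b4a0]",
    "/lib/x86_64-linux-gnu/libc.so.6(+0x163321)[0x7f26cba88321]",
    "/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_loc_fill_from_name+0x89)[0x7f26c6475c59]",
    "/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_readdir_cbk+0x21f)[0x7f26c647673f]",
]

frame = first_frame_in(sample, "marker.so")
print(frame["symbol"], hex(frame["offset"]))
```

On the sample lines this prints `mq_loc_fill_from_name 0x89`, the topmost marker.so frame in the backtrace above.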

We restarted the bricks with "gluster volume start $vol force"

The bricks restarted, but they still crash every once in a while.

We also noticed that quota usage on some directories climbed past 100% of the limit.

Right now we have to keep watching the system, because a brick can die at any time and all we can do is keep restarting the volume.

The core file is 145MB and we were not able to upload it.

Version-Release number of selected component (if applicable):


How reproducible:
N/A

Steps to Reproduce:
1. gluster volume start $vol force
2. brick crashed
3. gluster volume start $vol force

Actual results:


Expected results:


Additional info:

Comment 1 Peter Auyeung 2014-07-22 15:00:04 UTC
gdb --core=core

[2014-07-22 07:36:33.901295] W [marker-quota.c:1404:mq_release_parent_lock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_lookup_cbk+0xd9) [0x7f8890c7f609] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_lookup_cbk+0xd9) [0x7f8896736f39] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_update_inode_contribution+0x44f) [0x7f889086080f]))) 0-sas02-marker: An operation during quota updation of path (/HyperionBackupTempSata02/hyperionprod005.shopzilla.laxhq/home/hyperion/Oracle/Middleware/EPMSystem11R1/common/JRE/Sun/1.6.0/lib/zi/Africa) failed (Invalid argument)
[2014-07-22 07:37:01.342914] W [quota.c:3669:quota_statfs_validate_cbk] 0-sas02-quota: quota context is not present in inode (gfid:00000000-0000-0000-0000-000000000001)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(20)
frame : type(0) op(27)
frame : type(0) op(20)
frame : type(0) op(40)
frame : type(0) op(17)
frame : type(0) op(17)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(29)
frame : type(0) op(29)
frame : type(0) op(29)
frame : type(0) op(30)
frame : type(0) op(29)
frame : type(0) op(29)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-07-22 07:37:09configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.1
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f8895d424a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x163321)[0x7f8895e6f321]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_loc_fill_from_name+0x89)[0x7f889085cc59]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_readdir_cbk+0x21f)[0x7f889085d73f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir_cbk+0xc2)[0x7f8896734072]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_readdir_cbk+0xc2)[0x7f8890c7cc02]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir_cbk+0xc2)[0x7f88910adca2]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_do_readdir+0x1b8)[0x7f88916e48c8]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_readdir+0x13)[0x7f88916e4d43]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f889673da38]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir+0x23c)[0x7f88910b008c]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f889673da38]
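
The pending-frames lists in these dumps are long and repetitive, so a per-op tally is easier to compare between crashes. The following is a minimal Python sketch and is not from the original report: `tally_pending_frames` is a hypothetical helper, the sample lines reproduce the pending-frames list from the first crash dump in the description, and the op numbers are left undecoded because mapping them to operation names would require the GlusterFS headers for this version.

```python
import re
from collections import Counter

# Matches lines such as: frame : type(0) op(40)
PENDING_RE = re.compile(r"^frame : type\((\d+)\) op\((\d+)\)$")


def tally_pending_frames(lines):
    """Count pending frames per (type, op) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        m = PENDING_RE.match(line.strip())
        if m:
            counts[(int(m.group(1)), int(m.group(2)))] += 1
    return counts


# The 16 pending frames from the first crash dump, in their original order.
sample = (
    ["frame : type(0) op(40)"]
    + ["frame : type(0) op(0)"] * 8
    + [f"frame : type(0) op({op})" for op in (27, 40, 17, 17, 33, 40, 27)]
)

counts = tally_pending_frames(sample)
for (frame_type, op), n in sorted(counts.items()):
    print(f"type({frame_type}) op({op}): {n}")
```

Feeding the function the lines of a whole brick log works as well, since anything that is not a `frame :` line is skipped.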

Comment 2 Peter Auyeung 2014-07-22 15:01:10 UTC
volume status
http://pastie.org/9412255

gdb --core=core
http://pastie.org/9411618

brick log

http://pastie.org/9403564
http://pastie.org/9412323

Comment 3 Peter Auyeung 2014-07-23 02:05:36 UTC
Gluster seems to have become unusable: whenever a write starts, a brick crashes.
http://pastie.org/9413749

We also noticed these entries in glustershd.log:
http://pastie.org/9413757

Comment 4 Krutika Dhananjay 2014-09-28 04:53:32 UTC
The same crash was fixed as part of https://bugzilla.redhat.com/show_bug.cgi?id=1144315 in 3.5 and the fix will be available in glusterfs-3.5.3.
Hence, moving the state of the bug to MODIFIED.

Comment 5 Niels de Vos 2016-06-17 16:24:04 UTC
This bug is getting closed because the 3.5 release is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.