Bug 1122120 - Bricks crashing after disable and re-enabled quota on a volume
Summary: Bricks crashing after disable and re-enabled quota on a volume
Keywords:
Status: CLOSED EOL
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: 3.5.1
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Krutika Dhananjay
QA Contact:
URL:
Whiteboard:
Depends On: 1144315
Blocks:
Reported: 2014-07-22 14:57 UTC by Peter Auyeung
Modified: 2016-06-17 16:24 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: quota was disabled and then re-enabled on a volume. Consequence: bricks on that volume keep crashing. Fix: Result:
Clone Of:
Environment:
Last Closed: 2016-06-17 16:24:04 UTC
Regression: ---
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Description Peter Auyeung 2014-07-22 14:57:44 UTC
Description of problem:
When a gluster volume quota command was issued, we got the following errors:
[2014-07-18 21:12:59.706929] W [client-rpc-fops.c:2772:client3_3_lookup_cbk] 0-sas02-client-2: remote operation failed: No such file or directory. Path: <gfid:abd01245-73e1-4ef6-aba6-dc087cf0bccd> (abd01245-73e1-4ef6-aba6-dc087cf0bccd)

After we disabled and re-enabled quota on a volume, the bricks on that volume kept crashing and dumping core:

[2014-07-22 14:35:49.270983] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
[2014-07-22 14:35:49.281775] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
[2014-07-22 14:35:49.406603] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
[2014-07-22 14:35:49.410136] W [marker-quota.c:1270:mq_get_parent_inode_local] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/locks.so(pl_common_inodelk+0x2af) [0x7f26c6abb4ef] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_inodelk_cbk+0xb9) [0x7f26c6895ac9] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_inodelk_cbk+0xc5) [0x7f26c6479ef5]))) 0-sas02-marker: failed to build parent loc of <gfid:ac3de7cf-2bfa-4ed7-95e7-7d0e1dc151d1>/home
pending frames:
frame : type(0) op(40)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(27)
frame : type(0) op(40)
frame : type(0) op(17)
frame : type(0) op(17)
frame : type(0) op(33)
frame : type(0) op(40)
frame : type(0) op(27)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-07-22 14:36:19
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.1
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f26cb95b4a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x163321)[0x7f26cba88321]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_loc_fill_from_name+0x89)[0x7f26c6475c59]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_readdir_cbk+0x21f)[0x7f26c647673f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir_cbk+0xc2)[0x7f26cc34d072]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_readdir_cbk+0xc2)[0x7f26c6895c02]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir_cbk+0xc2)[0x7f26c6cc6ca2]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_do_readdir+0x1b8)[0x7f26c72fd8c8]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_readdir+0x13)[0x7f26c72fdd43]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f26cc356a38]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir+0x23c)[0x7f26c6cc908c]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f26cc356a38]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_readdir_wrapper+0x150)[0x7f26c6899a80]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x1c5)[0x7f26cc36bcd5]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_worker+0x146)[0x7f26c689da66]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a)[0x7f26cbcece9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f26cba193fd]
---------
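
To make the crash path in the backtrace above easier to read, here is a minimal Python sketch that pulls the library, symbol, and offset out of each frame. It assumes every frame has the shape `path(symbol+0xoffset)[0xaddress]`, as the lines above do; the names `FRAME_RE`, `parse_frames`, and `first_named_frame` are hypothetical helpers for illustration, not part of GlusterFS.

```python
import re

# Assumed frame shape, taken from the lines above:
#   /path/to/lib.so(symbol+0xoff)[0xaddr]
# The symbol may be empty, as in the libc.so.6(+0x364a0) frames.
FRAME_RE = re.compile(
    r"^(?P<lib>\S+?)\((?P<sym>[^()+]*)\+(?P<off>0x[0-9a-f]+)\)"
    r"\[(?P<addr>0x[0-9a-f]+)\]$"
)


def parse_frames(lines):
    """Return (library basename, symbol or None, offset) for each frame line."""
    frames = []
    for line in lines:
        m = FRAME_RE.match(line.strip())
        if m:
            lib = m.group("lib").rsplit("/", 1)[-1]
            frames.append((lib, m.group("sym") or None, m.group("off")))
    return frames


def first_named_frame(frames):
    """Return the innermost frame that carries a symbol name, or None."""
    for frame in frames:
        if frame[1]:
            return frame
    return None
```

Applied to the first four frames of the backtrace above, `first_named_frame` returns the `mq_loc_fill_from_name` frame in `marker.so`, and the next named frame is its caller `mq_readdir_cbk`. The two libc frames above them carry no symbol name.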

We restarted the bricks with "gluster volume start $vol force".

The bricks restarted but still keep crashing once in a while.

We also noticed that quota usage on some directories ran up past 100%.

Right now we have to keep watching the system, as bricks can die at any time and we just have to keep restarting the volume.

The core file is 145MB and we are not able to upload it.

Version-Release number of selected component (if applicable):


How reproducible:
N/A

Steps to Reproduce:
1. gluster volume start $vol force
2. brick crashed
3. gluster volume start $vol force
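
For reference, the sequence described in this report (quota disabled and re-enabled, followed by forced restarts) can be sketched in Python as below. This is only a sketch under assumptions: the volume name is a placeholder, the helper names are hypothetical, and `gluster volume quota <vol> disable` may ask for interactive confirmation, which this sketch does not handle.

```python
import subprocess


def quota_toggle_commands(volume):
    """Commands that disable and then re-enable quota on a volume."""
    return [
        ["gluster", "volume", "quota", volume, "disable"],
        ["gluster", "volume", "quota", volume, "enable"],
    ]


def force_start_command(volume):
    """Command used in this report to bring crashed bricks back up."""
    return ["gluster", "volume", "start", volume, "force"]


def run_all(commands, runner=subprocess.run):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        runner(cmd, check=True)
```

Calling `run_all(quota_toggle_commands("testvol"))` followed by `run_all([force_start_command("testvol")])` would issue the same commands by hand; the `runner` parameter exists only so the sequence can be exercised without a live cluster.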

Actual results:


Expected results:


Additional info:

Comment 1 Peter Auyeung 2014-07-22 15:00:04 UTC
gdb --core=core

[2014-07-22 07:36:33.901295] W [marker-quota.c:1404:mq_release_parent_lock] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_lookup_cbk+0xd9) [0x7f8890c7f609] (-->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_lookup_cbk+0xd9) [0x7f8896736f39] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_update_inode_contribution+0x44f) [0x7f889086080f]))) 0-sas02-marker: An operation during quota updation of path (/HyperionBackupTempSata02/hyperionprod005.shopzilla.laxhq/home/hyperion/Oracle/Middleware/EPMSystem11R1/common/JRE/Sun/1.6.0/lib/zi/Africa) failed (Invalid argument)
[2014-07-22 07:37:01.342914] W [quota.c:3669:quota_statfs_validate_cbk] 0-sas02-quota: quota context is not present in inode (gfid:00000000-0000-0000-0000-000000000001)
pending frames:
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(27)
frame : type(0) op(27)
frame : type(0) op(20)
frame : type(0) op(27)
frame : type(0) op(20)
frame : type(0) op(40)
frame : type(0) op(17)
frame : type(0) op(17)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(0)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(31)
frame : type(0) op(29)
frame : type(0) op(29)
frame : type(0) op(29)
frame : type(0) op(30)
frame : type(0) op(29)
frame : type(0) op(29)

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2014-07-22 07:37:09
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.1
/lib/x86_64-linux-gnu/libc.so.6(+0x364a0)[0x7f8895d424a0]
/lib/x86_64-linux-gnu/libc.so.6(+0x163321)[0x7f8895e6f321]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_loc_fill_from_name+0x89)[0x7f889085cc59]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/marker.so(mq_readdir_cbk+0x21f)[0x7f889085d73f]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir_cbk+0xc2)[0x7f8896734072]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/performance/io-threads.so(iot_readdir_cbk+0xc2)[0x7f8890c7cc02]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir_cbk+0xc2)[0x7f88910adca2]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_do_readdir+0x1b8)[0x7f88916e48c8]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/storage/posix.so(posix_readdir+0x13)[0x7f88916e4d43]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f889673da38]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.1/xlator/features/access-control.so(posix_acl_readdir+0x23c)[0x7f88910b008c]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_readdir+0x88)[0x7f889673da38]
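
The "pending frames" list in the dump above is long, so a tally per (type, op) pair is easier to compare between crashes. The following Python sketch assumes each entry has the shape `frame : type(N) op(M)` as shown above; `PENDING_RE` and `tally_pending` are hypothetical names, and the sketch makes no attempt to map op numbers to operation names.

```python
import re
from collections import Counter

# Assumed line shape, copied from the dump above: "frame : type(0) op(31)"
PENDING_RE = re.compile(r"^frame : type\((\d+)\) op\((\d+)\)$")


def tally_pending(lines):
    """Count pending frames per (type, op) pair, ignoring other lines."""
    counts = Counter()
    for line in lines:
        m = PENDING_RE.match(line.strip())
        if m:
            counts[(int(m.group(1)), int(m.group(2)))] += 1
    return counts
```

Feeding the brick log lines to `tally_pending` and calling `.most_common()` on the result lists the op codes that had the most frames in flight at the time of the crash.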

Comment 2 Peter Auyeung 2014-07-22 15:01:10 UTC
volume status
http://pastie.org/9412255

gdb --core=core
http://pastie.org/9411618

brick log

http://pastie.org/9403564
http://pastie.org/9412323

Comment 3 Peter Auyeung 2014-07-23 02:05:36 UTC
Gluster seems to have become unusable: whenever a write starts, a brick crashes.
http://pastie.org/9413749

We also noticed these entries in glustershd.log:
http://pastie.org/9413757

Comment 4 Krutika Dhananjay 2014-09-28 04:53:32 UTC
The same crash was fixed as part of https://bugzilla.redhat.com/show_bug.cgi?id=1144315 in 3.5 and the fix will be available in glusterfs-3.5.3.
Hence, moving the state of the bug to MODIFIED.

Comment 5 Niels de Vos 2016-06-17 16:24:04 UTC
This bug is getting closed because the 3.5 release is marked End-Of-Life. There will be no further updates to this version. Please open a new bug against a version that still receives bugfixes if you are still facing this issue in a more current release.
