Bug 791087 - [glusterfs-3.3.0qa22]: glusterfs server crashed due to assert in marker_get_xattr since gfid was NULL
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: GlusterFS
Classification: Community
Component: quota
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Assignee: Raghavendra Bhat
QA Contact: Raghavendra Bhat
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2012-02-16 07:05 UTC by Raghavendra Bhat
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-04-20 06:26:54 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Raghavendra Bhat 2012-02-16 07:05:36 UTC
Description of problem:
Created a replicate volume with replica count 2, with 1 fuse mount and 1 nfs mount. Ran the sanity script on the fuse mount and other tools on the nfs mount. While the tests were running, added 2 more bricks (making the setup 2x2 distributed replicate) and rebalanced the volume. Enabled quota and profiling. Brought some bricks down and then back up to trigger self-heal, and also ran volume set operations.

The glusterfs server on one of the peers crashed on an assert because loc->gfid was NULL. This is the backtrace.

Core was generated by `/usr/local/sbin/glusterfsd -s localhost --volfile-id mirror.10.1.11.144.export-'.
Program terminated with signal 6, Aborted.
#0  0x000000390f432905 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.25.el6_1.3.x86_64 libgcc-4.4.5-6.el6.x86_64
(gdb) bt
#0  0x000000390f432905 in raise () from /lib64/libc.so.6
#1  0x000000390f4340e5 in abort () from /lib64/libc.so.6
#2  0x000000390f42b9be in __assert_fail_base () from /lib64/libc.so.6
#3  0x000000390f42ba80 in __assert_fail () from /lib64/libc.so.6
#4  0x00007fad979badc4 in mq_get_xattr (frame=0x7fad9f95ed70, cookie=0x7fada05c7a60, this=0x18abfd0, op_ret=0, op_errno=0)
    at ../../../../../xlators/features/marker/src/marker-quota.c:1165
#5  0x00007fad97bd9bab in iot_inodelk_cbk (frame=0x7fada05c7a60, cookie=0x7fada053e0a4, this=0x18aad70, op_ret=0, op_errno=0)
    at ../../../../../xlators/performance/io-threads/src/io-threads.c:1978
#6  0x00007fad97df73dc in pl_common_inodelk (frame=0x7fada053e0a4, this=0x18a9b60, volume=0x7fad5ce858b0 "mirror-marker", inode=0x448f50c, 
    cmd=7, flock=0x7fad9f37c334, loc=0x7fad9f37c2ec, fd=0x0) at ../../../../../xlators/features/locks/src/inodelk.c:653
#7  0x00007fad97df745a in pl_inodelk (frame=0x7fada053e0a4, this=0x18a9b60, volume=0x7fad5ce858b0 "mirror-marker", loc=0x7fad9f37c2ec, 
    cmd=7, flock=0x7fad9f37c334) at ../../../../../xlators/features/locks/src/inodelk.c:663
#8  0x00007fad97bd9e10 in iot_inodelk_wrapper (frame=0x7fada05c7a60, this=0x18aad70, volume=0x7fad5ce858b0 "mirror-marker", 
    loc=0x7fad9f37c2ec, cmd=7, lock=0x7fad9f37c334) at ../../../../../xlators/performance/io-threads/src/io-threads.c:1987
#9  0x00007fada18e3fff in call_resume_wind (stub=0x7fad9f37c2ac) at ../../../libglusterfs/src/call-stub.c:2419
#10 0x00007fada18eb31c in call_resume (stub=0x7fad9f37c2ac) at ../../../libglusterfs/src/call-stub.c:3938
#11 0x00007fad97bcc8cd in iot_worker (data=0x18b5a00) at ../../../../../xlators/performance/io-threads/src/io-threads.c:138
#12 0x000000390fc077e1 in start_thread () from /lib64/libpthread.so.0
#13 0x000000390f4e577d in clone () from /lib64/libc.so.6
(gdb) f 4
#4  0x00007fad979badc4 in mq_get_xattr (frame=0x7fad9f95ed70, cookie=0x7fada05c7a60, this=0x18abfd0, op_ret=0, op_errno=0)
    at ../../../../../xlators/features/marker/src/marker-quota.c:1165
1165            GF_UUID_ASSERT (local->loc.gfid);
(gdb) p local->loc
$1 = {path = 0x7fad5ce85850 "/playground/linux-2.6.31.1/include/linux", name = 0x7fad5ce85873 "linux", inode = 0x448f50c, 
  parent = 0x448da0c, gfid = '\000' <repeats 15 times>, pargfid = "t\347\n\265n\303H\270\270 \364\347߸~@"}
(gdb) p *local->loc.inode
$2 = {table = 0x18c23a0, gfid = '\000' <repeats 15 times>, lock = 1, nlookup = 0, ref = 2, ia_type = IA_INVAL, fd_list = {next = 0x448f53c, 
    prev = 0x448f53c}, dentry_list = {next = 0x448f54c, prev = 0x448f54c}, hash = {next = 0x448f55c, prev = 0x448f55c}, list = {
    next = 0x448da6c, prev = 0x448e38c}, _ctx = 0x448f5c0}
(gdb)  l
1160            }
1161
1162            if (uuid_is_null (local->loc.gfid))
1163                    uuid_copy (local->loc.gfid, local->loc.inode->gfid);
1164
1165            GF_UUID_ASSERT (local->loc.gfid);
1166
1167            STACK_WIND (frame, mq_check_n_set_inode_xattr, FIRST_CHILD(this),
1168                        FIRST_CHILD(this)->fops->lookup, &local->loc, xattr_req);
1169
(gdb) 

gfid of the inode itself is NULL.
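
The gdb listing shows why the fallback at line 1163 does not help: when the inode's own gfid is all zero, copying it into loc.gfid leaves loc.gfid null and the assert at line 1165 aborts the process. The following is a minimal sketch of that check with an error return in place of the abort. The types and function names (sketch_inode, sketch_loc, gfid_is_null, loc_fill_gfid) are hypothetical stand-ins for illustration, not the GlusterFS definitions.

```c
#include <string.h>

/* Hypothetical stand-ins for the real types; a gfid is a 16-byte UUID. */
typedef unsigned char gfid_t[16];

struct sketch_inode { gfid_t gfid; };
struct sketch_loc   { gfid_t gfid; struct sketch_inode *inode; };

/* Returns 1 when all 16 bytes are zero, mirroring what uuid_is_null() tests. */
static int gfid_is_null(const gfid_t g)
{
    static const gfid_t zero = {0};
    return memcmp(g, zero, sizeof(gfid_t)) == 0;
}

/*
 * Fill loc->gfid from the inode when it is unset (the same fallback as
 * lines 1162-1163 of the listing). Returns 0 when loc->gfid ends up valid
 * and -1 when it is still null, so the caller can fail the operation
 * instead of aborting the whole process.
 */
static int loc_fill_gfid(struct sketch_loc *loc)
{
    if (gfid_is_null(loc->gfid) && loc->inode != NULL)
        memcpy(loc->gfid, loc->inode->gfid, sizeof(gfid_t));

    return gfid_is_null(loc->gfid) ? -1 : 0;
}
```

With the state printed above (both loc.gfid and inode->gfid zeroed), loc_fill_gfid() returns -1. Whether returning an error is the right behaviour for the marker translator at this point is an assumption of the sketch.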





Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.
  
Actual results:

glusterfs server crashed.
Expected results:

glusterfs server should not crash (that is, the loc->gfid field should not be null)

Additional info:
gluster volume info
 
Volume Name: mirror
Type: Distributed-Replicate
Volume ID: 770f045a-6e32-44e6-be2b-a9c9fb827dcc
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.1.11.130:/export-xfs/mirror
Brick2: 10.1.11.131:/export-xfs/mirror
Brick3: 10.1.11.144:/export-xfs/mirror
Brick4: 10.1.11.145:/export-xfs/mirror
Options Reconfigured:
geo-replication.indexing: on
performance.io-cache: off
performance.client-io-threads: on
diagnostics.count-fop-hits: on
diagnostics.latency-measurement: on
features.limit-usage: /playground:33GB
features.quota: on
performance.write-behind: off

[2012-02-15 08:49:32.335513] I [server3_1-fops.c:318:server_entrylk_cbk] 0-mirror-server: 1100578: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-15 08:49:32.336154] I [server3_1-fops.c:318:server_entrylk_cbk] 0-mirror-server: 1100580: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-15 08:49:32.340796] I [server3_1-fops.c:318:server_entrylk_cbk] 0-mirror-server: 1100587: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-15 08:49:32.341161] I [server3_1-fops.c:318:server_entrylk_cbk] 0-mirror-server: 1100588: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-15 08:49:32.507749] I [server3_1-fops.c:318:server_entrylk_cbk] 0-mirror-server: 1100593: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-15 08:49:32.508353] I [server3_1-fops.c:318:server_entrylk_cbk] 0-mirror-server: 1100594: ENTRYLK (null) (--) ==> -1 (No such file or directory)
[2012-02-15 08:49:32.583330] E [marker-quota-helper.c:230:mq_dict_set_contribution] (-->/usr/local/lib/glusterfs/3.3.0qa22/xlator/debug/io-stats.so(io_stats_lookup+0x28c) [0x7fad977959d9] (-->/usr/local/lib/glusterfs/3.3.0qa22/xlator/features/marker.so(marker_lookup+0x1a3) [0x7fad979b426d] (-->/usr/local/lib/glusterfs/3.3.0qa22/xlator/features/marker.so(mq_req_xattr+0x123) [0x7fad979bf21b]))) 0-marker: invalid argument: loc->parent
[2012-02-15 08:49:32.583576] W [marker-quota.c:2039:mq_inspect_directory_xattr] 0-mirror-marker: cannot add a new contribution node
[2012-02-15 08:49:32.632551] W [posix-handle.c:527:posix_handle_soft] 0-mirror-posix: symlink ../../74/e7/74e70ab5-6ec3-48b8-b820-f4e7dfb87e40/linux -> /export-xfs/mirror/.glusterfs/05/e8/05e8225c-d6c0-4043-8c71-ac16596ac87c failed (File exists)
[2012-02-15 08:49:32.632583] E [posix.c:908:posix_mkdir] 0-mirror-posix: setting gfid on /export-xfs/mirror/.glusterfs/d6/ee/d6eea5b4-bac6-4b7d-8fb7-6c6b962a35ec/include/linux failed
[2012-02-15 08:49:33.122634] W [inode.c:866:inode_lookup] (-->/usr/local/lib/glusterfs/3.3.0qa22/xlator/features/marker.so(marker_lookup_cbk+0x23d) [0x7fad979b3fcd] (-->/usr/local/lib/glusterfs/3.3.0qa22/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x262) [0x7fad97790c4a] (-->/usr/local/lib/glusterfs/3.3.0qa22/xlator/protocol/server.so(server_lookup_cbk+0x5a2) [0x7fad97566ef8]))) 0-mirror-server: inode not found

Comment 1 Raghavendra Bhat 2012-02-16 07:07:33 UTC
A kernel untar from the fuse mount and an rm -rf of the untarred kernel directory from the nfs client were running in parallel.

Comment 2 Amar Tumballi 2012-03-12 09:46:55 UTC
Please update these bugs with respect to 3.3.0qa27; we need to work on them as per the target milestone set.

Comment 3 Raghavendra Bhat 2012-04-20 06:26:54 UTC
This bug has not been observed again. Please re-open if it is found again.

Comment 4 Anand Avati 2012-05-14 21:45:11 UTC
CHANGE: http://review.gluster.com/3323 (features/marker: use the gfid from the stat structure instead of inode) merged in master by Anand Avati (avati)
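
The change title says the marker translator now takes the gfid from the stat structure instead of from the inode. The patch itself is not quoted in this report, so the following is only a minimal sketch of that idea under stated assumptions: a lookup reply carries a stat-like buffer with its own gfid, and that buffer is populated even when the inode's gfid is still zero, as it was in the core above. All names here (demo_iatt, demo_inode, demo_loc, demo_loc_set_gfid_from_stat) are hypothetical.

```c
#include <string.h>

/* Hypothetical stand-ins; a gfid is a 16-byte UUID. */
typedef unsigned char demo_gfid_t[16];

struct demo_iatt  { demo_gfid_t ia_gfid; };  /* stat-like buffer from a lookup reply */
struct demo_inode { demo_gfid_t gfid; };     /* may still be all zero */
struct demo_loc   { demo_gfid_t gfid; struct demo_inode *inode; };

/*
 * Take the gfid from the stat buffer rather than from loc->inode.
 * Returns 0 on success and -1 when the stat buffer's gfid is null too,
 * in which case loc->gfid is left untouched.
 */
static int demo_loc_set_gfid_from_stat(struct demo_loc *loc,
                                       const struct demo_iatt *buf)
{
    static const demo_gfid_t zero = {0};

    if (memcmp(buf->ia_gfid, zero, sizeof(demo_gfid_t)) == 0)
        return -1;

    memcpy(loc->gfid, buf->ia_gfid, sizeof(demo_gfid_t));
    return 0;
}
```

In this sketch the inode's gfid is never read, so a zeroed inode->gfid no longer propagates into loc->gfid.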
