| Summary: | [glusterfs-3.2.3]: glusterfs server crashed | | |
|---|---|---|---|
| Product: | [Community] GlusterFS | Reporter: | Raghavendra Bhat <rabhat> |
| Component: | quota | Assignee: | Raghavendra G <rgowdapp> |
| Status: | CLOSED WORKSFORME | QA Contact: | |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 3.2.3 | CC: | amarts, gluster-bugs |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-02-01 05:17:58 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
Hi Jhonny,

Is this on version 3.2.2? This seems to be a case of memory corruption. After 3.2.2, the following patches aimed at fixing corruption have gone in:

- e559ea5f8056
- 276142d543f61296
- 0564d1198bd7fa9

Can you take a release which has these fixes and rerun the tests?

regards,
Raghavendra.

It was found on 3.2.3 only. Since Bugzilla did not have a 3.2.3 entry in the release field, I marked it as 3.2.2. You can see that the subject contains 3.2.3.

CHANGE: http://review.gluster.com/390 (Change-Id: I060e62c1fbb288179063a6d64d73bad1a6572661) merged in master by Vijay Bellur (vijay)

CHANGE: http://review.gluster.com/389 (Change-Id: Idb31e845bc876f46b476d8fa769d67d8db89e4a1) merged in release-3.2 by Vijay Bellur (vijay)

A patch is in, but the actual root cause still needs to be solved.

Hi Raghu,

Most likely this is a duplicate of bug #765356. Can you check whether http://review.gluster.com/#patch,sidebyside,538,1,xlators/features/marker/src/marker-quota.c fixes the issue? Alternatively, you can just use the latest release-3.2.

regards,
Raghavendra.

With later releases this works fine. Please file a new bug if it happens again.
glusterfs server crashes while running some tests.

Setup:

- Replicate setup with replica count 2
- 1 fuse client and 1 nfs client
- quota and profile enabled, with a quota limit set on a directory

Both the fuse client and the nfs client were running sanity scripts inside the directory on which the quota limit is set. One of the servers was brought down, left down for some time, and then brought back up; on the other server, volume set operations were running in a loop. Both clients were running `find <mount_point> | xargs stat` to trigger self-heal whenever the downed server comes back up. After the server came back up, both servers had died.

This is the backtrace of the core:

```
Core was generated by `/usr/local/sbin/glusterfsd --xlator-option mirror-server.listen-port=24010 -s l'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000030b8e7288e in free () from /lib64/libc.so.6
(gdb) bt
#0  0x00000030b8e7288e in free () from /lib64/libc.so.6
#1  0x00002acd2cd6cac2 in dict_destroy (this=0x2aaab000d080) at ../../../libglusterfs/src/dict.c:404
#2  0x00002acd2cd6cb81 in dict_unref (this=0x2aaab000d080) at ../../../libglusterfs/src/dict.c:430
#3  0x00002acd2cda3345 in call_stub_destroy_wind (stub=0x2acd2e191c34) at ../../../libglusterfs/src/call-stub.c:3540
#4  0x00002acd2cda3857 in call_stub_destroy (stub=0x2acd2e191c34) at ../../../libglusterfs/src/call-stub.c:3832
#5  0x00002acd2cda3979 in call_resume (stub=0x2acd2e191c34) at ../../../libglusterfs/src/call-stub.c:3865
#6  0x00002aaaab883199 in iot_worker (data=0x136ea60) at ../../../../../xlators/performance/io-threads/src/io-threads.c:129
#7  0x00000030b960673d in start_thread () from /lib64/libpthread.so.0
#8  0x00000030b8ed44bd in clone () from /lib64/libc.so.6
(gdb) info thr
 11 Thread 6093  0x00000030b8ed48a8 in epoll_wait () from /lib64/libc.so.6
 10 Thread 6094  0x00000030b960e838 in do_sigwait () from /lib64/libpthread.so.0
  9 Thread 6095  0x00000030b8e9a541 in nanosleep () from /lib64/libc.so.6
  8 Thread 6102  0x00000030b960b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7 Thread 6103  0x00000030b960b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6 Thread 6104  0x00000030b960d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
  5 Thread 6105  0x00000030b960b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4 Thread 6106  0x00000030b960b732 in ?? () from /lib64/libpthread.so.0
  3 Thread 6107  0x00000030b960b150 in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2 Thread 6153  0x00000030b960b737 in ?? () from /lib64/libpthread.so.0
* 1 Thread 8783  0x00000030b8e7288e in free () from /lib64/libc.so.6
(gdb) t 2
[Switching to thread 2 (Thread 6153)]#0  0x00000030b960b737 in ?? () from /lib64/libpthread.so.0
(gdb) bt
#0  0x00000030b960b737 in ?? () from /lib64/libpthread.so.0
#1  0x00002aaaabaada3f in quota_update_inode_contribution (frame=0x2acd2df58650, cookie=0x2acd2dcdc374, this=0x1368fe0, op_ret=0, op_errno=22, inode=0x2aaaac3c0d54, buf=0x419cee30, dict=0x2aaab80077e0, postparent=0x419cedc0) at ../../../../../xlators/features/marker/src/marker-quota.c:1626
#2  0x00002aaaab88340f in iot_lookup_cbk (frame=0x2acd2dcdc374, cookie=0x2acd2dc8a418, this=0x1367f20, op_ret=0, op_errno=22, inode=0x2aaaac3c0d54, buf=0x419cee30, xattr=0x2aaab80077e0, postparent=0x419cedc0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:199
#3  0x00002aaaab6720cb in pl_lookup_cbk (frame=0x2acd2dc8a418, cookie=0x2acd2dcdb7ec, this=0x1366f20, op_ret=0, op_errno=22, inode=0x2aaaac3c0d54, buf=0x419cee30, dict=0x2aaab80077e0, postparent=0x419cedc0) at ../../../../../xlators/features/locks/src/posix.c:1452
#4  0x00002aaaab45a2f9 in posix_acl_lookup_cbk (frame=0x2acd2dcdb7ec, cookie=0x2acd2dcc5460, this=0x1365de0, op_ret=0, op_errno=22, inode=0x2aaaac3c0d54, buf=0x419cee30, xattr=0x2aaab80077e0, postparent=0x419cedc0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:708
#5  0x00002aaaab23d4a4 in posix_lookup (frame=0x2acd2dcc5460, this=0x1364c40, loc=0x2acd2e183278, xattr_req=0x2aaab8021dd0) at ../../../../../xlators/storage/posix/src/posix.c:616
#6  0x00002aaaab45a60c in posix_acl_lookup (frame=0x2acd2dcdb7ec, this=0x1365de0, loc=0x2acd2e183278, xattr=0x2aaab8021dd0) at ../../../../../xlators/system/posix-acl/src/posix-acl.c:753
#7  0x00002aaaab672575 in pl_lookup (frame=0x2acd2dc8a418, this=0x1366f20, loc=0x2acd2e183278, xattr_req=0x2aaab8021dd0) at ../../../../../xlators/features/locks/src/posix.c:1491
#8  0x00002aaaab883629 in iot_lookup_wrapper (frame=0x2acd2dcdc374, this=0x1367f20, loc=0x2acd2e183278, xattr_req=0x2aaab8021dd0) at ../../../../../xlators/performance/io-threads/src/io-threads.c:209
#9  0x00002acd2cd9d358 in call_resume_wind (stub=0x2acd2e183240) at ../../../libglusterfs/src/call-stub.c:2408
#10 0x00002acd2cda3954 in call_resume (stub=0x2acd2e183240) at ../../../libglusterfs/src/call-stub.c:3859
#11 0x00002aaaab883199 in iot_worker (data=0x136ea60) at ../../../../../xlators/performance/io-threads/src/io-threads.c:129
#12 0x00000030b960673d in start_thread () from /lib64/libpthread.so.0
#13 0x00000030b8ed44bd in clone () from /lib64/libc.so.6
(gdb) f 1
#1  0x00002aaaabaada3f in quota_update_inode_contribution (frame=0x2acd2df58650, cookie=0x2acd2dcdc374, this=0x1368fe0, op_ret=0, op_errno=22, inode=0x2aaaac3c0d54, buf=0x419cee30, dict=0x2aaab80077e0, postparent=0x419cedc0) at ../../../../../xlators/features/marker/src/marker-quota.c:1626
1626            LOCK (&contribution->lock);
(gdb) p *contribution
$1 = {contri_list = {next = 0x100000000, prev = 0x200000004}, contribution = 46912720142304,
  gfid = "@\321\001\270\252*\000\000\000\000\000\000\000\000\000", lock = -1}
(gdb) p contribution->contri_list
$3 = {next = 0x100000000, prev = 0xa}
(gdb) p contribution->contri_list->next
$4 = (struct list_head *) 0x100000000
(gdb) p *contribution->contri_list->next
Cannot access memory at address 0x100000000
(gdb)
```