Description of problem:
=======================
While snapshot creation was in progress, glusterd crashed on one of the nodes in the cluster.

Version-Release number of selected component (if applicable):
============================================================
glusterfs 3.6.0.27

How reproducible:
=================
1/1

Steps to Reproduce:
==================
12 node cluster, 6x2 dist-rep volume

1. Fuse and NFS mount the volume and generate IO:
   for i in {21..100}; do cp -rvf /etc nfs_etc.$i; done
   for i in {21..100}; do cp -rvf /etc fuse_etc.$i; done
   for i in {1..50}; do dd if=/dev/urandom of=fuse_file"$i" bs=1024M count=1; done
   Tried a few snapshot operations: create, list, delete and restore.
2. Created some more snapshots in a loop.
3. After some time a snapshot command failed because it took more than 2 minutes, crossing the CLI timeout, and the remaining snapshots failed with "Another transaction in progress".
4. Waited for some time until the lock would be released and executed the snapshot create command again; it failed with "quorum not met".
5.
[root@dhcp-8-29-179 ~]# service glusterd status
glusterd dead but pid file exists

Actual results:
===============
glusterd crashed

Expected results:
================
There should be no crash observed.

Additional info:
==============
Content of the req object:

{trans = 0x195a790, svc = 0x194b7f0, prog = 0x1952620, xid = 2349, prognum = 1238433, progver = 3, procnum = 3, type = 0, uid = 0, gid = 0, pid = 0, lk_owner = {len = 4, data = '\000' <repeats 1023 times>}, gfs_id = 0, auxgids = 0x4543fd7c, auxgidsmall = {0 <repeats 128 times>}, auxgidlarge = 0x0, auxgidcount = 0, msg = {{iov_base = 0x7f1dac276040, iov_len = 4548}, {iov_base = 0x0, iov_len = 0} <repeats 15 times>}, count = 1, iobref = 0x4543f810, rpc_status = 0, rpc_err = 0, auth_err = 0, txlist = {next = 0x454400ac, prev = 0x454400ac}, payloadsize = 0, cred = {flavour = 390039, datalen = 24, authdata = '\000' <repeats 19 times>, "\004", '\000' <repeats 379 times>}, verf = {flavour = 0, datalen = 0, authdata = '\000' <repeats 399 times>}, synctask = _gf_true, private = 0x0, trans_private = 0x0, hdr_iobuf = 0x41e22920, reply = 0x0}

And following is the content of the trans object:

{ops = 0x7f1dae1db5c0, listener = 0x194be20, private = 0x193ac00, xl_private = 0x0, xl = 0x1943b30, mydata = 0x194b7f0, lock = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}, refcount = 4188, outstanding_rpc_count = 0, ctx = 0x1912010, options = 0x0, name = 0x1ade460 "socket.management", dnscache = 0x0, drc_client = 0x0, buf = 0x0, init = 0x7f1dadfd0f10 <init>, fini = 0x7f1dadfd3880 <fini>, reconfigure = 0, notify = 0x3802809ac0 <rpcsvc_notify>, notify_data = 0x0, peerinfo = {sockaddr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, sockaddr_len = 16, identifier = "10.8.29.222:1022", '\000' <repeats 91 times>, max_op_version = 0, min_op_version = 0, volname = '\000' <repeats 1023 times>}, myinfo =
{sockaddr = {ss_family = 2, __ss_align = 0, __ss_padding = '\000' <repeats 111 times>}, sockaddr_len = 16, identifier = "10.8.29.179:24007", '\000' <repeats 90 times>, max_op_version = 0, min_op_version = 0, volname = '\000' <repeats 1023 times>}, total_bytes_read = 24303292, total_bytes_write = 357927, list = {next = 0xbabebabe, prev = 0xcafecafe}, bind_insecure = 0, dl_handle = 0x0}

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

(gdb) bt
#0  0x0000003a26e32925 in raise () from /lib64/libc.so.6
#1  0x0000003a26e34105 in abort () from /lib64/libc.so.6
#2  0x0000003a26e70837 in __libc_message () from /lib64/libc.so.6
#3  0x0000003a26e76166 in malloc_printerr () from /lib64/libc.so.6
#4  0x0000003a26e78ca3 in _int_free () from /lib64/libc.so.6
#5  0x000000380244d694 in mem_put (ptr=0x4543f92c) at mem-pool.c:526
#6  0x0000003802808e56 in rpcsvc_submit_generic (req=0x4543f92c, proghdr=0x45640870, hdrcount=<value optimized out>, payload=0x0, payloadcount=0, iobref=0x7f1d60153fb0) at rpcsvc.c:1266
#7  0x00000038028092f6 in rpcsvc_error_reply (req=0x4543f92c) at rpcsvc.c:1285
#8  0x000000380280936b in rpcsvc_check_and_reply_error (ret=-1, frame=<value optimized out>, opaque=0x4543f92c) at rpcsvc.c:547
#9  0x000000380245b9ea in synctask_wrap (old_task=<value optimized out>) at syncop.c:335
#10 0x0000003a26e43bf0 in ?? () from /lib64/libc.so.6
#11 0x0000000000000000 in ?? ()

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[root@dhcp-8-29-222 ~]# gluster peer status
Number of Peers: 11

Hostname: 10.8.29.193
Uuid: 13b1df2a-808f-4a6e-9008-f11b36a54023
State: Peer in Cluster (Connected)

Hostname: 10.8.29.179
Uuid: f79c05c5-f74b-4677-a61b-1a09bb387288
State: Peer in Cluster (Disconnected)

Hostname: 10.8.29.190
Uuid: 8c50ee5b-0479-4180-b87e-55b64332e990
State: Peer in Cluster (Connected)

Hostname: 10.8.29.36
Uuid: 73312a82-d515-4845-b677-7cf4613d757f
State: Peer in Cluster (Connected)

Hostname: 10.8.30.229
Uuid: 25ce55db-d67a-45f3-865c-116086aa774b
State: Peer in Cluster (Connected)

Hostname: 10.8.30.126
Uuid: c1ddd33b-3022-4e40-9fd2-371d98d1f814
State: Peer in Cluster (Connected)

Hostname: 10.8.29.210
Uuid: 38639900-aa13-4397-b404-3688cec05d2d
State: Peer in Cluster (Connected)

Hostname: 10.8.29.220
Uuid: 4b25874f-e3f6-4f03-b53c-a50babe8714b
State: Peer in Cluster (Connected)

Hostname: 10.8.29.209
Uuid: c6a8c9a2-8f20-4442-902b-071e8255084e
State: Peer in Cluster (Connected)

Hostname: 10.8.29.212
Uuid: add11505-cbb5-4bf8-bfb2-1ff84293a909
State: Peer in Cluster (Connected)

Hostname: 10.8.29.203
Uuid: 52eed23f-9bfa-4745-a033-28d7d486a942
State: Peer in Cluster (Connected)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

[root@dhcp-8-29-222 ~]# gluster v i
Volume Name: Volume0_Scale
Type: Distributed-Replicate
Volume ID: 7d8ca209-0706-4fc2-aa09-039647c42c75
Status: Started
Snap Volume: no
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: 10.8.29.222:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick1/b1
Brick2: 10.8.29.193:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick2/b1
Brick3: 10.8.29.179:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick3/b1
Brick4: 10.8.29.190:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick4/b1
Brick5: 10.8.30.229:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick5/b1
Brick6: 10.8.30.126:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick6/b1
Brick7: 10.8.29.220:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick7/b1
Brick8: 10.8.29.209:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick8/b1
Brick9: 10.8.29.212:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick9/b1
Brick10: 10.8.29.203:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick10/b1
Brick11: 10.8.29.36:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick11/b1
Brick12: 10.8.29.210:/var/run/gluster/snaps/f02c456eae9743d4a16886fe420c418d/brick12/b1
Options Reconfigured:
features.barrier: disable
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256
I was able to hit the crash, but the bt looks different. I followed steps similar to those in "Steps to Reproduce", additionally trying to create another volume (which was immediately interrupted) while snapshot creation was in progress, and glusterd crashed immediately.

Connection failed. Please check if gluster daemon is operational.

real 0m0.098s
user 0m0.077s
sys 0m0.019s

Connection failed. Please check if gluster daemon is operational.

I tried similar steps on a 4 node cluster but did not hit the issue. Retried on the 12 node cluster and hit the crash.

bt:
===
#0  __gf_free (free_ptr=0x504e280) at mem-pool.c:252
#1  0x000000380244d694 in mem_put (ptr=0x504e29c) at mem-pool.c:526
#2  0x0000003802808e56 in rpcsvc_submit_generic (req=0x504e29c, proghdr=0xfd25770, hdrcount=<value optimized out>, payload=0x0, payloadcount=0, iobref=0x7ff8d0663820) at rpcsvc.c:1266
#3  0x00000038028092f6 in rpcsvc_error_reply (req=0x504e29c) at rpcsvc.c:1285
#4  0x000000380280936b in rpcsvc_check_and_reply_error (ret=-1, frame=<value optimized out>, opaque=0x504e29c) at rpcsvc.c:547
#5  0x000000380245b9ea in synctask_wrap (old_task=<value optimized out>) at syncop.c:335
#6  0x0000003a26e43bf0 in ?? () from /lib64/libc-2.12.so
#7  0x0000000000000000 in ?? ()
(gdb)
The steps are already in the Description; nevertheless, reposting them:

12 node cluster, 6x2 dist-rep volume

1. Fuse and NFS mount the volume and generate IO:
   for i in {101..140}; do dd if=/dev/urandom of=file_"$i" bs=1024M count=1; done
   for i in {101..140}; do dd if=/dev/urandom of=file_nfs_"$i" bs=1024M count=1; done
   Tried a few snapshot operations: create, list, delete and restore.
2. Created some more snapshots in a loop.
3. While snapshot creation was in progress, tried to create a new volume with existing bricks (which was immediately interrupted), and glusterd crashed immediately. (The snapshot command failed because it took more than 2 minutes, crossing the CLI timeout, and the remaining snapshots failed with the error "Connection failed. Please check if gluster daemon is operational".)
4. [root@dhcp-8-29-179 ~]# service glusterd status
   glusterd dead but pid file exists
Following is the initial analysis of the bug:

The crash happens during a network disconnect, while the brick-op handler is sending a response back to the originator node. The response is sent by calling glusterd_submit_reply(), which deletes the req object upon completion. The function returns an error code (-1) if it fails to send the reply, but it deletes the req object anyway. The brick-op handler is called from the sync-op framework, which calls rpcsvc_check_and_reply_error() on error; that in turn calls glusterd_submit_reply() to send the error back to the originator, deleting the req object a second time. Therefore, if a network disconnect happens during the brick-op handler, the req object can be deleted twice, leading to the crash.
Version: glusterfs-3.7.1-11.el6rhs.x86_64

Retried the steps mentioned in the Description on an 8 node cluster and did not observe any crash. Marking the bug 'Verified'.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2015-1495.html