
Bug 763651 (GLUSTER-1919)

Summary: segfault while stopping and starting volume again
Product: [Community] GlusterFS
Reporter: Harshavardhana <fharshav>
Component: glusterd
Assignee: Vijay Bellur <vbellur>
Status: CLOSED DUPLICATE
QA Contact:
Severity: medium
Docs Contact:
Priority: urgent
Version: 3.1.0
CC: cww, gluster-bugs, vijay
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: RTNR
Mount Type: ---
Documentation: DNR
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:

Description Harshavardhana 2010-10-11 17:11:40 EDT
(In reply to comment #0)
> #0  __gf_free (free_ptr=0x455f504f5f444727) at mem-pool.c:262
> #1  0x00007f63eb08f5e5 in glusterd_store_handle_destroy (handle=0x1f48808)
>     at glusterd-store.c:648
> #2  0x00007f63eb090a4b in glusterd_store_delete_brick (
>     volinfo=<value optimized out>, brickinfo=0x1f47358) at glusterd-store.c:224
> #3  0x00007f63eb091bbb in glusterd_store_update_volume (volinfo=0x1f46c58)
>     at glusterd-store.c:1181
> #4  0x00007f63eb082a44 in glusterd_op_stop_volume (req=<value optimized out>)
>     at glusterd-op-sm.c:3816
> #5  glusterd_op_commit_perform (req=<value optimized out>) at
> glusterd-op-sm.c:4888
> #6  0x00007f63eb08ce11 in glusterd3_1_commit_op (frame=<value optimized out>, 

OK, reproducing this is simple:

Stop the running volume.

Move the "info" file from "/etc/glusterd/vols/<volumename>/info" to some other location, e.g. /tmp. You will still be able to start the server side of the volume, but not the NFS side.

Start the volume <volumename>.

Stop the volume <volumename>; glusterd segfaults.

Are we going to keep volume information only in a file? If "/etc/glusterd" is corrupt, is my whole volume hosed?

Is there a way to recover the volume "info" file if the volume information is still present?

In any case, we can't segfault.
Comment 1 Harshavardhana 2010-10-11 17:14:06 EDT
> 
> Move the "info" file from "/etc/glusterd/vols/<volumename>/info" to some other
> location eg: /tmp, you will still be able to start the volume the server parts
> but not NFS part. 
> 
> Start the volume <volumename>
> 
> Stop the volume <volumename>, glusterd segfaults.
> 
> Are we going to keep information always in a filename? if "/etc/glusterd" is
> corrupt is my whole volume hosed? 
> 
> is there way to recover back the volume "info" file if the volume information
> is still present? 
> 
> In any case we can't segfault.

Even moving the "info" file back into the directory results in

[root@platform test1]# gluster volume info test1

Volume Name: test1
Type: Distribute
Status: Stopped
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: :
Brick2: :

But the info file has its content intact:

[root@platform test1]# less info 
type=0
count=2
status=2
sub_count=0
version=2
transport-type=0
volume-id=19401435-f3b0-4258-abd9-8a9a99c84bca
brick-0=10.1.10.202:-storage
brick-1=10.1.10.202:-data

So I can't start my volume anymore.
Comment 2 Harshavardhana 2010-10-11 19:46:49 EDT
#0  __gf_free (free_ptr=0x455f504f5f444727) at mem-pool.c:262
#1  0x00007f63eb08f5e5 in glusterd_store_handle_destroy (handle=0x1f48808)
    at glusterd-store.c:648
#2  0x00007f63eb090a4b in glusterd_store_delete_brick (
    volinfo=<value optimized out>, brickinfo=0x1f47358) at glusterd-store.c:224
#3  0x00007f63eb091bbb in glusterd_store_update_volume (volinfo=0x1f46c58)
    at glusterd-store.c:1181
#4  0x00007f63eb082a44 in glusterd_op_stop_volume (req=<value optimized out>)
    at glusterd-op-sm.c:3816
#5  glusterd_op_commit_perform (req=<value optimized out>) at glusterd-op-sm.c:4888
#6  0x00007f63eb08ce11 in glusterd3_1_commit_op (frame=<value optimized out>, 
    this=0x1f40df8, data=<value optimized out>) at glusterd3_1-mops.c:1302
#7  0x00007f63eb078625 in glusterd_op_ac_send_commit_op (
    event=<value optimized out>, ctx=<value optimized out>) at glusterd-op-sm.c:4146
#8  0x00007f63eb07ee99 in glusterd_op_sm () at glusterd-op-sm.c:5174
#9  0x00007f63eb08b246 in glusterd_handle_rpc_msg (req=0x7f63eafc803c)
    at glusterd3_1-mops.c:1491
#10 0x00007f63ecc0dfa1 in rpcsvc_handle_rpc_call (svc=0x1f3cc08, 
    trans=<value optimized out>, msg=0x1f4a838) at rpcsvc.c:992
#11 0x00007f63ecc0e0c3 in rpcsvc_notify (trans=0x1f4b188, 
    mydata=0x455f504f5f444727, event=<value optimized out>, data=0xffffffffffffffa8)
    at rpcsvc.c:1088
#12 0x00007f63ecc0e6bd in rpc_transport_notify (this=0x455f504f5f444727, 
    event=<value optimized out>, data=<value optimized out>) at rpc-transport.c:1142
#13 0x00007f63eadc1584 in socket_event_poll_in (this=0x1f4b188) at socket.c:1619
#14 0x00007f63eadc173d in socket_event_handler (fd=<value optimized out>, idx=2, 
    data=0x1f4b188, poll_in=1, poll_out=0, poll_err=<value optimized out>)
    at socket.c:1733
#15 0x00007f63ece52ad2 in event_dispatch_epoll_handler (i=<value optimized out>, 
    events=<value optimized out>, event_pool=<value optimized out>) at event.c:812
#16 event_dispatch_epoll (i=<value optimized out>, events=<value optimized out>, 
    event_pool=<value optimized out>) at event.c:876
#17 0x0000000000405194 in main (argc=1, argv=0x7fffb5f7fc18) at glusterfsd.c:141



-----------

(gdb) p *handle
$5 = {path = 0x455f504f5f444727 <Address 0x455f504f5f444727 out of bounds>, 
  fd = 1414415702, read = 0x274343415f4547}
(gdb) 


-------
Comment 3 Vijay Bellur 2010-10-28 05:02:01 EDT

*** This bug has been marked as a duplicate of bug 1757 ***
Comment 4 Vijay Bellur 2010-10-28 05:03:19 EDT

*** This bug has been marked as a duplicate of bug 1754 ***