Bug 852191

Summary: Healing failed....not sure why?
Product: [Community] GlusterFS
Reporter: Rob.Hendelman
Component: replicate
Assignee: Pranith Kumar K <pkarampu>
Status: CLOSED DUPLICATE
Severity: medium
Priority: medium
Version: 3.3.0
CC: gluster-bugs
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Last Closed: 2012-08-29 06:47:45 UTC
Type: Bug
Attachments:
Description                                     Flags
evprodglx01# gluster system:: fsm log output    none
evprodglx02# gluster system:: fsm log output    none
drglx01# gluster system:: fsm log output        none

Description Rob.Hendelman 2012-08-27 19:46:05 UTC
Description of problem:
Our setup:
gluster> volume info all
 
Volume Name: data
Type: Distributed-Replicate
Volume ID: cd9dbea9-6e6d-4b7f-a492-55e4baed6c49
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: evprodglx01:/mnt/gluster/bricks/1
Brick2: evprodglx02:/mnt/gluster/bricks/1
Brick3: drglx01:/mnt/gluster/bricks/1
Brick4: evprodglx01:/mnt/gluster/bricks/2
Brick5: evprodglx02:/mnt/gluster/bricks/2
Brick6: drglx01:/mnt/gluster/bricks/2
Options Reconfigured:
cluster.min-free-disk: 1%
cluster.stripe-block-size: 2560k
nfs.disable: on
performance.write-behind-window-size: 1MB
performance.io-thread-count: 16
performance.cache-size: 32MB
cluster.self-heal-window-size: 512
cluster.data-self-heal-algorithm: full
cluster.self-heal-daemon: on
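
A note for reading the brick list above: in a Distributed-Replicate volume, consecutive bricks in the listed order form one replica set, so "2 x 3 = 6" should mean two replica sets of three bricks each (Brick1-3 and Brick4-6). The sketch below only groups the pasted text accordingly. The function name replica_sets is made up for illustration and is not part of any gluster tool or API.

```python
import re

# Excerpt of the "volume info" output pasted above.
VOLUME_INFO = """\
Number of Bricks: 2 x 3 = 6
Brick1: evprodglx01:/mnt/gluster/bricks/1
Brick2: evprodglx02:/mnt/gluster/bricks/1
Brick3: drglx01:/mnt/gluster/bricks/1
Brick4: evprodglx01:/mnt/gluster/bricks/2
Brick5: evprodglx02:/mnt/gluster/bricks/2
Brick6: drglx01:/mnt/gluster/bricks/2
"""


def replica_sets(volume_info):
    """Group bricks into replica sets (hypothetical helper).

    Assumes consecutive bricks, in listed order, replicate each other.
    """
    layout = re.search(
        r"Number of Bricks:\s*(\d+)\s*x\s*(\d+)\s*=\s*(\d+)", volume_info
    )
    if layout is None:
        raise ValueError("no 'Number of Bricks: D x R = N' line found")
    dist, replica, total = (int(g) for g in layout.groups())
    bricks = re.findall(r"^Brick\d+:\s*(\S+)$", volume_info, flags=re.MULTILINE)
    if len(bricks) != total or dist * replica != total:
        raise ValueError("brick list does not match the declared layout")
    return [bricks[i:i + replica] for i in range(0, total, replica)]


for index, members in enumerate(replica_sets(VOLUME_INFO), start=1):
    print(f"replica set {index}: {', '.join(members)}")
```

Under that reading, only bricks within the same replica set are expected to hold the same files; the two sets hold different parts of the namespace.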

All servers on 10 Gigabit ethernet.  Client on 1Gbit bonded.


Version-Release number of selected component (if applicable):


How reproducible:
Not sure.

Steps to Reproduce:
1. Set up the cluster as above.
2. Tried to rsync a few times.
3. Bricks got out of whack (a "find /mnt/brickX -type f | wc -l" revealed different numbers of files).
4. Tried to heal.  The command starts at first, and then no commands work.
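
For anyone repeating the file-count comparison in step 3, here is a rough Python equivalent of the find command. It is only a sketch: the helper name count_files is hypothetical, and skipping the ".glusterfs" directory by default is my assumption (it holds GlusterFS internal metadata rather than user files). Passing skip_dirs=() mirrors the plain find command.

```python
import os


def count_files(brick_root, skip_dirs=(".glusterfs",)):
    """Count regular files under brick_root (hypothetical helper).

    Directories named in skip_dirs are skipped at the top level only.
    Symlinks are not counted, like "find -type f".
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(brick_root):
        if dirpath == brick_root:
            # Prune in place so os.walk does not descend into skipped dirs.
            dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += 1
    return total


if __name__ == "__main__":
    # Brick paths taken from the volume info in this report.
    for brick in ("/mnt/gluster/bricks/1", "/mnt/gluster/bricks/2"):
        if os.path.isdir(brick):
            print(brick, count_files(brick))
```

Only compare counts between bricks that replicate each other (the same brick path on the three servers here); bricks in different replica sets of a distributed volume are expected to differ.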

Actual results:
gluster> volume heal data info healed
operation failed
gluster> volume heal data info heal-failed
operation failed
gluster> volume heal data info split-brain
operation failed


Expected results:
Volume healed



Additional info:
from mnt-gluster-client-data log:

[2012-08-27 14:28:44.175907] I [client-handshake.c:1445:client_setvolume_cbk] 0-data-client-5: Server and Client lk-version numbers are not same, reopening the fds

Let me know what else you need to debug this.  I've tried restarting each gluster server without success.  Right now the servers are functioning as clients as well with a mount...they were all installed from the official 11.04 gluster debs from gluster.org.

Comment 1 Pranith Kumar K 2012-08-28 06:01:13 UTC
hi Rob,
I have a couple of questions:
1) "volume heal data info healed" is giving "operation failed". At this point, if you execute "gluster volume status data", does it still give "operation failed"?
2) Could you post the output of the following on each of the brick backends:
ls <brick-path>/.glusterfs/indices/xattrop/ | wc -l

Pranith.

Comment 2 Rob.Hendelman 2012-08-28 12:36:25 UTC
1) I'm also getting operation failed or "Unable to obtain volume status information"

gluster> volume status data
Unable to obtain volume status information.
gluster> quit
root@evprodglx01:/mnt/gluster/bricks/1# /etc/init.d/glusterd status
 * glusterd service is running with pid 747
root@evprodglx01:/mnt/gluster/bricks/1# gluster
gluster> volume status data
operation failed

2) 
root@evprodglx01:/# ls /mnt/gluster/bricks/1/.glusterfs/indices/xattrop/ | wc -l
27
root@evprodglx02:/# ls /mnt/gluster/bricks/1/.glusterfs/indices/xattrop/ | wc -l
2
root@drglx01:/# ls /mnt/gluster/bricks/1/.glusterfs/indices/xattrop/ | wc -l
26
root@evprodglx01:/# ls /mnt/gluster/bricks/2/.glusterfs/indices/xattrop/ | wc -l
0
root@evprodglx02:/# ls /mnt/gluster/bricks/2/.glusterfs/indices/xattrop/ | wc -l
0
root@drglx01:/# ls /mnt/gluster/bricks/2/.glusterfs/indices/xattrop/ | wc -l
0
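
The six counts above are easier to compare when grouped by brick path. The Python sketch below parses the pasted transcript into a per-brick table; the helper name index_counts is hypothetical, and the regular expression assumes exactly the prompt and command format shown in this comment.

```python
import re

# The transcript pasted above, verbatim.
TRANSCRIPT = """\
root@evprodglx01:/# ls /mnt/gluster/bricks/1/.glusterfs/indices/xattrop/ | wc -l
27
root@evprodglx02:/# ls /mnt/gluster/bricks/1/.glusterfs/indices/xattrop/ | wc -l
2
root@drglx01:/# ls /mnt/gluster/bricks/1/.glusterfs/indices/xattrop/ | wc -l
26
root@evprodglx01:/# ls /mnt/gluster/bricks/2/.glusterfs/indices/xattrop/ | wc -l
0
root@evprodglx02:/# ls /mnt/gluster/bricks/2/.glusterfs/indices/xattrop/ | wc -l
0
root@drglx01:/# ls /mnt/gluster/bricks/2/.glusterfs/indices/xattrop/ | wc -l
0
"""

# One prompt line followed by one line holding the count.
PATTERN = re.compile(
    r"^root@(?P<host>[^:]+):\S*# ls (?P<brick>\S+?)/\.glusterfs/indices/xattrop/"
    r" \| wc -l\n(?P<count>\d+)$",
    re.MULTILINE,
)


def index_counts(transcript):
    """Return {brick_path: {host: count}} from the transcript (hypothetical helper)."""
    counts = {}
    for match in PATTERN.finditer(transcript):
        per_host = counts.setdefault(match.group("brick"), {})
        per_host[match.group("host")] = int(match.group("count"))
    return counts


for brick, per_host in index_counts(TRANSCRIPT).items():
    print(brick, per_host)
```

Grouped this way, the non-zero counts all sit on /mnt/gluster/bricks/1, and the three copies of that brick disagree (27, 2 and 26), while /mnt/gluster/bricks/2 is at zero on every server.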

Comment 3 Pranith Kumar K 2012-08-28 13:19:53 UTC
hi Rob,
   Seems like the op-sm (the glusterd operation state machine) is stuck. Could you provide the output of
gluster system:: fsm log
on all the machines?

Pranith

Comment 4 Rob.Hendelman 2012-08-28 13:27:59 UTC
=================================
evprodglx01
=================================
number of transitions: 50
Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Stage op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_STAGE_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:25]

Old State: [Commit op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Commit op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Commit op sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_COMMIT_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Default]
New State: [Lock sent]
Event    : [GD_OP_EVENT_START_LOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Lock sent]
New State: [Lock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Lock sent]
New State: [Lock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Lock sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Stage op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_STAGE_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Brick op sent]
New State: [Brick op failed]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-27 15:09:37]

Old State: [Brick op failed]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:37]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Default]
New State: [Lock sent]
Event    : [GD_OP_EVENT_START_LOCK]
timestamp: [2012-08-27 15:09:49]

Old State: [Lock sent]
New State: [Lock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Lock sent]
New State: [Lock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Lock sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Stage op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_STAGE_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:49]

Old State: [Commit op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Commit op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Commit op sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_COMMIT_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Default]
New State: [Lock sent]
Event    : [GD_OP_EVENT_START_LOCK]
timestamp: [2012-08-28 07:32:55]

Old State: [Lock sent]
New State: [Ack drain]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Ack drain]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-28 07:32:55]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-28 07:33:11]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-28 07:33:11]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-28 07:33:11]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-28 07:33:11]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-28 07:33:11]

=================================
evprodglx02
=================================
number of transitions: 50
Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:18]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:18]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:18]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:18]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:18]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:06:40]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:40]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:40]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:40]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:40]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:40]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:06:52]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:52]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:53]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:53]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:53]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:53]

Old State: [Default]
New State: [Default]
Event    : [GD_OP_EVENT_LOCAL_UNLOCK_NO_RESP]
timestamp: [2012-08-27 15:08:13]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:20]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:20]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:20]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:20]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:20]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:25]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:25]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:25]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:25]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:37]

Old State: [Staged]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:49]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:49]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:49]

Old State: [Committed]
root@evprodglx02:/var/log/glusterfs# more /tmp/evprodglx02_system_fsm.log 
number of transitions: 50
Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:18]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:18]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:18]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:18]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:18]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:06:40]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:40]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:40]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:40]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:40]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:40]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:06:52]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:52]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:53]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:53]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:53]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:53]

Old State: [Default]
New State: [Default]
Event    : [GD_OP_EVENT_LOCAL_UNLOCK_NO_RESP]
timestamp: [2012-08-27 15:08:13]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:20]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:20]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:20]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:20]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:20]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:25]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:25]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:25]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:25]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:37]

Old State: [Staged]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:49]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:49]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:49]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:49]

Old State: [Default]
New State: [Lock sent]
Event    : [GD_OP_EVENT_START_LOCK]
timestamp: [2012-08-28 07:32:55]

Old State: [Lock sent]
New State: [Ack drain]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Ack drain]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-28 07:32:55]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-28 07:33:11]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-28 07:33:11]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-28 07:33:11]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-28 07:33:11]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-28 07:33:11]

=================================
drglx01
=================================

number of transitions: 50
Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:40]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:40]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:06:52]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:06:52]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:06:53]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:06:53]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:06:53]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:06:53]

Old State: [Default]
New State: [Default]
Event    : [GD_OP_EVENT_LOCAL_UNLOCK_NO_RESP]
timestamp: [2012-08-27 15:08:13]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:20]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:20]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:20]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:20]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:20]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:25]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:25]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:25]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:25]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:25]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:37]

Old State: [Staged]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:37]

Old State: [Default]
New State: [Locked]
Event    : [GD_OP_EVENT_LOCK]
timestamp: [2012-08-27 15:09:49]

Old State: [Locked]
New State: [Staged]
Event    : [GD_OP_EVENT_STAGE_OP]
timestamp: [2012-08-27 15:09:49]

Old State: [Staged]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_COMMIT_OP]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op Committed]
New State: [Brick op Committed]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-27 15:09:49]

Old State: [Brick op Committed]
New State: [Committed]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:49]

Old State: [Committed]
New State: [Default]
Event    : [GD_OP_EVENT_UNLOCK]
timestamp: [2012-08-27 15:09:49]

Old State: [Default]
New State: [Lock sent]
Event    : [GD_OP_EVENT_START_LOCK]
timestamp: [2012-08-28 07:32:55]

Old State: [Lock sent]
New State: [Ack drain]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Ack drain]
New State: [Ack drain]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Ack drain]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-28 07:32:55]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-28 07:32:55]

Old State: [Default]
New State: [Lock sent]
Event    : [GD_OP_EVENT_START_LOCK]
timestamp: [2012-08-28 07:33:11]

Old State: [Lock sent]
New State: [Lock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Lock sent]
New State: [Lock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Lock sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Stage op sent]
New State: [Stage op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Stage op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_STAGE_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Brick op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-28 07:33:11]

Old State: [Commit op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Commit op sent]
New State: [Commit op sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Commit op sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_COMMIT_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Unlock sent]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_RCVD_ACC]
timestamp: [2012-08-28 07:33:11]

Old State: [Unlock sent]
New State: [Default]
Event    : [GD_OP_EVENT_ALL_ACC]
timestamp: [2012-08-28 07:33:11]
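
With fifty transitions per server, the logs above are hard to scan by eye. The Python sketch below splits such a log into records and lists the transitions that look like rejections. The helper names are hypothetical, and treating "_RJT" events and "failed" states as the interesting ones is my assumption, not something the gluster tooling defines. The sample is a three-record excerpt from the evprodglx01 log above.

```python
import re

# One FSM record: four consecutive lines in the format shown above.
RECORD = re.compile(
    r"Old State: \[(?P<old>[^\]]*)\]\s*\n"
    r"New State: \[(?P<new>[^\]]*)\]\s*\n"
    r"Event\s*: \[(?P<event>[^\]]*)\]\s*\n"
    r"timestamp: \[(?P<ts>[^\]]*)\]"
)

SAMPLE = """\
Old State: [Stage op sent]
New State: [Brick op sent]
Event    : [GD_OP_EVENT_STAGE_ACC]
timestamp: [2012-08-27 15:09:37]

Old State: [Brick op sent]
New State: [Brick op failed]
Event    : [GD_OP_EVENT_RCVD_RJT]
timestamp: [2012-08-27 15:09:37]

Old State: [Brick op failed]
New State: [Unlock sent]
Event    : [GD_OP_EVENT_ALL_ACK]
timestamp: [2012-08-27 15:09:37]
"""


def parse_transitions(log_text):
    """Return one dict per complete record; truncated records are skipped."""
    return [match.groupdict() for match in RECORD.finditer(log_text)]


def rejected(transitions):
    """Keep transitions with a *_RJT event or a 'failed' target state (assumed filter)."""
    return [
        t for t in transitions
        if t["event"].endswith("_RJT") or "failed" in t["new"].lower()
    ]


for t in rejected(parse_transitions(SAMPLE)):
    print(f'{t["ts"]}  {t["old"]} -> {t["new"]}  ({t["event"]})')
```

Run over the full logs, the same filter would pick out the 2012-08-27 15:09:37 brick-op rejection on evprodglx01 and the 2012-08-28 07:32:55 lock rejections.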

Comment 5 Rob.Hendelman 2012-08-28 13:29:32 UTC
Sorry, looks like I posted portions of the evprodglx02 log multiple times.

I'm just going to include the output of this command as attachments.

Rob

Comment 6 Rob.Hendelman 2012-08-28 13:31:20 UTC
Created attachment 607512 [details]
evprodglx01# gluster system:: fsm log output

Comment 7 Rob.Hendelman 2012-08-28 13:31:43 UTC
Created attachment 607513 [details]
evprodglx02# gluster system:: fsm log output

Comment 8 Rob.Hendelman 2012-08-28 13:32:05 UTC
Created attachment 607514 [details]
drglx01# gluster system:: fsm log output

Comment 9 Pranith Kumar K 2012-08-28 13:44:06 UTC
Everything is fine according to the logs. Does it still happen now?

Comment 10 Rob.Hendelman 2012-08-28 14:00:45 UTC
evprodglx01
=====

gluster> volume heal data info healed
Heal operation on volume data has been successful

Brick evprodglx01:/mnt/gluster/bricks/1
Number of entries: 1023
at                    path on brick
-----------------------------------
3101-06-23 12:44:48 <gfid:82137185-3b41-42e9-b9a9-bcf36ac89482>
Segmentation fault (core dumped)

evprodglx02
=====
gluster> volume heal data info healed
Heal operation on volume data has been successful

Brick evprodglx01:/mnt/gluster/bricks/1
Number of entries: 1023
at                    path on brick
-----------------------------------
3101-06-23 12:44:48 <gfid:82137185-3b41-42e9-b9a9-bcf36ac89482>
Segmentation fault (core dumped)

drglx01
======

gluster> volume heal data info healed
Heal operation on volume data has been successful

Brick evprodglx01:/mnt/gluster/bricks/1
Number of entries: 1023
at                    path on brick
-----------------------------------
3101-06-23 12:44:48 <gfid:82137185-3b41-42e9-b9a9-bcf36ac89482>
Segmentation fault (core dumped)

Comment 11 Rob.Hendelman 2012-08-28 19:53:49 UTC
I wonder if this could be related:

http://thr3ads.net/gluster-users/2012/07/1968747-Segfault-in-gluster-volume-heal

Is that a date ("3101-06-23") under "at"?
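
A quick way to check the question above is to parse the "at" column as a timestamp and compare it with the time the command was run. This is only a sanity check in Python with hypothetical helper names; it assumes the column is always "YYYY-MM-DD HH:MM:SS" and it says nothing about why the value is wrong.

```python
from datetime import datetime

# Entry line copied from the heal output in Comment 10.
ENTRY = "3101-06-23 12:44:48 <gfid:82137185-3b41-42e9-b9a9-bcf36ac89482>"


def parse_entry(line):
    """Split a heal-info entry into (timestamp, path). Hypothetical helper."""
    stamp, path = line[:19], line[19:].strip()
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), path


def is_plausible(when, now):
    """A heal cannot have completed after the moment the report was printed."""
    return when <= now


when, path = parse_entry(ENTRY)
# Time Comment 10 was posted (UTC); the servers' clock and timezone are unknown.
reference = datetime(2012, 8, 28, 14, 0, 45)
verdict = "plausible" if is_plausible(when, reference) else "implausible heal time"
print(path, verdict)
```

The string does parse as a valid date, but one more than a thousand years in the future, so it cannot be a real heal time.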

Thanks,

Robert

Comment 12 Pranith Kumar K 2012-08-29 06:47:45 UTC
Yes, you are correct. The "Segmentation fault" bug is already fixed.
The "Operation failed" bug is the same as bug 829170, so I will mark this one as a duplicate.
It is working fine now because you restarted the glusterds.

*** This bug has been marked as a duplicate of bug 829170 ***

Comment 13 Rob.Hendelman 2012-08-29 12:29:39 UTC
What commit should I look for in git for the segmentation fault bug?  I would like to try out a newer/fixed version if possible.

Thanks for looking into this.

Rob

Comment 14 Pranith Kumar K 2012-08-31 06:21:41 UTC
Fix for the segfault:
http://review.gluster.com/#change,3550