Bug 1378300

Summary: Modifications to AFR Events
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Ravishankar N <ravishankar>
Component: replicate Assignee: Ravishankar N <ravishankar>
Status: CLOSED ERRATA QA Contact: Sweta Anandpara <sanandpa>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: rhgs-3.2 CC: amukherj, asrivast, bugs, rhinduja, rhs-bugs, storage-qa-internal
Target Milestone: --- Keywords: Triaged
Target Release: RHGS 3.2.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: glusterfs-3.8.4-2 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1378072 Environment:
Last Closed: 2017-03-23 05:48:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1378072, 1379028    
Bug Blocks: 1351528    

Description Ravishankar N 2016-09-22 04:20:40 UTC
+++ This bug was initially created as a clone of Bug #1378072 +++

regards
Aravinda

On Wednesday 21 September 2016 03:53 PM, Ravishankar N wrote:
> On 09/21/2016 03:34 PM, Aravinda wrote:
>> Hi,
>>
>> We have the following SPLIT_BRAIN events:
>>
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=file type mismatch;gfid=%s;ia_type-%d=%s;ia_type-%d=%s"
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=gfid mismatch. Skipping conservative merge.;file=<gfid:%s>/%s>;count=2;child-%d=%s;gfid-%d=%s;child-%d=%s;gfid-%d=%s",
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=file type mismatch. Skipping conservative merge;file=<gfid:%s>/%s>;count=2;child-%d=%s;type-%d=%s;child-%d=%s;type-%d=%s",
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=file type mismatch;file=<gfid:%s>/%s;count=2;child-%d=%s;type-%d=%s;child-%d=%s;type-%d=%s"
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=gfid mismatch;file=<gfid:%s>/%s;count=2;child-%d=%s;gfid-%d=%s;child-%d=%s;gfid-%d=%s"
>>
>> The message keys are not the same even though the split-brain type is the same. For example,
>>
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=file type mismatch;gfid=%s;ia_type-%d=%s;ia_type-%d=%s"
>> EVENT_AFR_SPLIT_BRAIN, "subvol=%s;msg=file type mismatch;file=<gfid:%s>/%s;count=2;child-%d=%s;type-%d=%s;child-%d=%s;type-%d=%s"
>>
>> We can split these events into two types. (The message is included in the EVENT name itself, so no separate msg is required.)
>>
>> EVENT_AFR_SPLIT_BRAIN_FILE_TYPE_MISMATCH, "subvol=%s;file=<gfid:%s>/%s;count=2;child-%d=%s;type-%d=%s;child-%d=%s;type-%d=%s"
>> EVENT_AFR_SPLIT_BRAIN_GFID_MISMATCH, "subvol=%s;file=<gfid:%s>/%s>;count=2;child-%d=%s;gfid-%d=%s;child-%d=%s;gfid-%d=%s"
>>
>> Let me know your thoughts.
>>
>
> I think it is better to retain one EVENT type for split-brains. Otherwise we would need to keep adding different EVENT types, for example when we need to propagate data split-brain or metadata split-brain, etc.
> The msg=<message> also gives us a way to specify a more verbose message that is immediately available to the consumer, should they decide to parse it. Also, all types of split-brain require a remedial action (i.e. resolving the split-brain) by the admin. I feel that having to monitor only one EVENT type for all split-brains would make this easier.
If it is one event type, then I would prefer "type" instead of msg (type can be gfid|file|data|meta etc). A descriptive message is not required since these APIs are consumed programmatically; free text and non-uniform key names break the parsing.
>
> -Ravi
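
To illustrate why a fixed "type" key with uniform key names is easier to consume than free-text msg values, here is a minimal Python sketch of a parser for the semicolon-separated key=value payloads quoted above. The function name parse_event_message is hypothetical, and the sample payload is an assumption built from the keys proposed in this thread with illustrative values; it is not taken from the glusterfs source.

```python
def parse_event_message(raw):
    """Split a 'key=value;key=value' event payload into a dict."""
    fields = {}
    for part in raw.split(";"):
        if not part:
            continue  # tolerate a trailing ';'
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = part.partition("=")
        if not sep:
            raise ValueError("field without '=': %r" % part)
        fields[key] = value
    return fields


# Hypothetical payload: key order and values are assumptions for illustration.
raw = (
    "subvol=ozone-replicate-0;type=gfid;"
    "file=<gfid:00000000-0000-0000-0000-000000000001>/a>;count=2;"
    "child-0=ozone-client-0;gfid-0=373c519b-c93f-4e5f-9203-817a00198ad7;"
    "child-1=ozone-client-1;gfid-1=4ad208ac-7558-45ba-a807-c09f5f7339a5"
)
fields = parse_event_message(raw)
print(fields["type"], fields["count"])
```

With a single "type" key, a consumer can branch on fields["type"] without matching free-text messages.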

Comment 2 Atin Mukherjee 2016-09-24 08:11:05 UTC
Upstream mainline patch : http://review.gluster.org/15550
Upstream 3.9 patch : http://review.gluster.org/15565

Comment 3 Ravishankar N 2016-09-26 05:33:13 UTC
Downstream patch: https://code.engineering.redhat.com/gerrit/#/c/85605/

Comment 6 Sweta Anandpara 2016-12-19 09:39:33 UTC
Tested and verified this on the build 3.8.4-8. 

Created different scenarios so that I hit AFR_SPLIT_BRAIN events with respect to all types - data/metadata/gfid/type-mismatch.

The 'type' field which is newly introduced in AFR_SPLIT_BRAIN correctly displays the value. Moving this blanket BZ to verified in 3.2

{u'message': {u'count': u'2', u'subvol': u'ozone-replicate-0', u'gfid-1': u'4ad208ac-7558-45ba-a807-c09f5f7339a5', u'gfid-0': u'373c519b-c93f-4e5f-9203-817a00198ad7', u'child-0': u'ozone-client-0', u'child-1': u'ozone-client-1', u'file': u'<gfid:00000000-0000-0000-0000-000000000001>/a>', u'type': u'gfid'}, u'event': u'AFR_SPLIT_BRAIN', u'ts': 1482139686, u'nodeid': u'8d1aaf3a-059e-41c2-871b-6c7f5c0dd90b'}

{u'message': {u'subvol': u'ozone-replicate-1', u'type': u'metadata', u'file': u'3012b1b1-4db6-4044-8455-d0ecd9f5e33f'}, u'event': u'AFR_SPLIT_BRAIN', u'ts': 1482139671, u'nodeid': u'95c24075-02aa-49c1-a1e4-c7e0775e7128'}

{u'message': {u'subvol': u'ozone-replicate-0', u'type': u'data', u'file': u'd3902718-cf0d-44b9-be72-c1c1a75bdde7'}, u'event': u'AFR_SPLIT_BRAIN', u'ts': 1482139664, u'nodeid': u'8d3ab9e3-c086-4474-9db2-677d5018d055'}
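
The events above can be handled with one code path keyed on the 'type' field. The following Python sketch assumes the consumer receives each event as a dict shaped like the output above; the helper name describe_split_brain and the summary format are hypothetical and not part of any Gluster API.

```python
def describe_split_brain(event):
    """Return a one-line summary for an AFR_SPLIT_BRAIN event, else None."""
    if event.get("event") != "AFR_SPLIT_BRAIN":
        return None
    msg = event.get("message", {})
    kind = msg.get("type", "unknown")  # e.g. data, metadata, gfid
    return "%s split-brain on %s: %s" % (
        kind, msg.get("subvol"), msg.get("file"))


# Sample event copied from the data split-brain output above.
sample = {
    "message": {
        "subvol": "ozone-replicate-0",
        "type": "data",
        "file": "d3902718-cf0d-44b9-be72-c1c1a75bdde7",
    },
    "event": "AFR_SPLIT_BRAIN",
    "ts": 1482139664,
    "nodeid": "8d3ab9e3-c086-4474-9db2-677d5018d055",
}
summary = describe_split_brain(sample)
print(summary)
```

Events of other kinds return None, so the same handler can be attached to a general event stream.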

Comment 8 errata-xmlrpc 2017-03-23 05:48:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2017-0486.html