
Bug 1276989

Summary:	ec-readdir.t is failing consistently
Product:	[Community] GlusterFS	Reporter:	Pranith Kumar K <pkarampu>
Component:	disperse	Assignee:	Pranith Kumar K <pkarampu>
Status:	CLOSED CURRENTRELEASE	QA Contact:
Severity:	unspecified	Docs Contact:
Priority:	unspecified
Version:	mainline	CC:	bugs, jahernan, rtalur
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	glusterfs-3.8rc2	Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:
Clones:	1278539 1278744		Environment:
Last Closed:	2016-06-16 13:42:20 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1278539, 1278744, 1280410

Description Pranith Kumar K 2015-11-02 02:29:36 UTC

Description of problem:
If ec-readdir.t is run in a loop, it fails. This happens only on mainline, not on 3.7, so a regression is the likely cause. For now, the test is being moved to bad tests.

Comment 1 Vijay Bellur 2015-11-02 02:35:01 UTC

REVIEW: http://review.gluster.org/12481 (tests: Move ec-readdir.t to bad tests) posted (#1) for review on master by Pranith Kumar Karampuri (pkarampu)

Comment 2 Vijay Bellur 2015-11-02 11:43:16 UTC

COMMIT: http://review.gluster.org/12481 committed in master by Raghavendra Talur (rtalur) 
------
commit 56ccc0d2f4a30af9304852effbf2b68694d9f587
Author: Pranith Kumar K <pkarampu>
Date:   Mon Nov 2 07:56:51 2015 +0530

    tests: Move ec-readdir.t to bad tests
    
    Change-Id: Ie7f6d25cbc617ff347aeb7d77fc0a60924c83f09
    BUG: 1276989
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/12481
    Tested-by: Raghavendra Talur <rtalur>
    Reviewed-by: Raghavendra Talur <rtalur>

Comment 3 Vijay Bellur 2015-11-11 13:48:10 UTC

COMMIT: http://review.gluster.org/12561 committed in master by Xavier Hernandez (xhernandez) 
------
commit 9a69ad2c8438b9fbdcb133404a5d205f809bbb5a
Author: Pranith Kumar K <pkarampu>
Date:   Tue Nov 10 09:06:54 2015 +0530

    cluster/ec: fix bug in update_good
    
    Problem:
    Bricks that didn't participate in the fops are considered to be good. This happens in two ways.
    Examples:
    Case-1:
    1) 2+1 volume. The 'd1' directory on brick-0 is bad.
    2) readdir takes locks and lock->good_mask is '7'.
    3) readdir does xattrop and fop->mask is '6'.
    4) Because fop->expected is '1', lock->good_mask remains '7'.
    
    Case-2:
    1) When all the bricks are up, the first operation does lock + xattrop
       before the op and determines that all the bricks are good.
    2) By the time the second operation starts, brick-0 is down. From then on,
       lock->good_mask keeps bit 0 set for as long as operations continue on
       it, because in "lock->good_mask &= ~fop->mask | fop->remaining"
       fop->mask does not have bit 0 set.
    3) When the final xattrop is performed in update_size_version, brick-0
       has come back online, so it is given the same version as the other
       bricks, as if it had participated in all the transactions until then,
       even though it did not.
    
    Fix:
    Case-1's fix: Update lock->good_mask in ec_prepare_update_cbk with the
    latest good/bad bricks.
    Case-2's fix: Consider a non-participating brick as bad.
    
    Change-Id: Ic01a733f8180131ded6a3cc784fcb1960758cf23
    BUG: 1276989
    Signed-off-by: Pranith Kumar K <pkarampu>
    Reviewed-on: http://review.gluster.org/12561
    Tested-by: NetBSD Build System <jenkins.org>
    Tested-by: Gluster Build System <jenkins.com>
    Reviewed-by: Xavier Hernandez <xhernandez>
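
The Case-2 behaviour described in the commit message above can be illustrated with a small bitmask model. This is only a sketch in Python, not GlusterFS code: `quoted_update` mirrors the expression quoted in the commit message in isolation, while `absent_is_bad_update` is a hypothetical variant written here to show the idea "consider non-participating brick as bad". It is not the actual patch, and all function names are made up for illustration.

```python
NUM_BRICKS = 3                       # 2+1 volume, as in the commit message
ALL_BRICKS = (1 << NUM_BRICKS) - 1   # 0b111 == 7


def quoted_update(good_mask, fop_mask, fop_remaining):
    """Expression quoted in the commit message, limited to NUM_BRICKS bits."""
    return good_mask & ((~fop_mask & ALL_BRICKS) | fop_remaining)


def absent_is_bad_update(good_mask, fop_mask, fop_remaining):
    """Hypothetical variant: a brick in neither fop_mask nor fop_remaining
    is dropped from the good mask."""
    return good_mask & (fop_mask | fop_remaining)


def bit_survives_everywhere(brick):
    """True if quoted_update never clears `brick` when the fop skipped it."""
    bit = 1 << brick
    for good in range(ALL_BRICKS + 1):
        for mask in range(ALL_BRICKS + 1):
            for remaining in range(ALL_BRICKS + 1):
                if (good & bit) and not (mask & bit):
                    if not (quoted_update(good, mask, remaining) & bit):
                        return False
    return True


# Case-2 numbers: all bricks believed good; brick-0 is down, so the fop
# is sent only to bricks 1 and 2 (fop_mask == 0b110).
good_mask, fop_mask, fop_remaining = 0b111, 0b110, 0b000
stale = quoted_update(good_mask, fop_mask, fop_remaining)
fixed = absent_is_bad_update(good_mask, fop_mask, fop_remaining)
print("quoted expression keeps brick-0:", bool(stale & 1))
print("hypothetical variant keeps brick-0:", bool(fixed & 1))
```

The exhaustive check in `bit_survives_everywhere` captures the point of Case-2: with the quoted expression alone, a brick whose bit is absent from the fop's mask can never be cleared from the good mask, however many operations it misses. The sketch makes no claim about how the other bits behave in the real code, since the expression is taken out of its surrounding function.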

Comment 4 Raghavendra Talur 2016-03-10 07:41:15 UTC

The "tests" component is for the tests framework only.
File a bug under the tests component if you find a bug in
1. any of the *.rc files under tests/
2. run-tests.sh


For everything else, the bug should be filed on
1. the component being tested by the .t file, if the .t file requires a fix.
2. the component that is causing a valid .t file to fail in regression.

I have used my best judgement here to move the bug to the right component.
In case of ambiguity, I have placed the blame on the .t file component.

Please consider test bugs under the same backlog list that tracks other bugs in your component.

Comment 5 Niels de Vos 2016-06-16 13:42:20 UTC

This bug is getting closed because a release has been made available that should address the reported issue. In case the problem is still not fixed with glusterfs-3.8.0, please open a new bug report.

glusterfs-3.8.0 has been announced on the Gluster mailing lists [1]; packages for several distributions should become available in the near future. Keep an eye on the Gluster Users mailing list [2] and the update infrastructure for your distribution.

[1] http://blog.gluster.org/2016/06/glusterfs-3-8-released/
[2] http://thread.gmane.org/gmane.comp.file-systems.gluster.user