Description of problem:

#!/bin/bash
#Test that parallel heal-info command execution doesn't result in spurious
#entries with locking-scheme granular

. $(dirname $0)/../../include.rc
. $(dirname $0)/../../volume.rc

cleanup;

function heal_info_to_file {
        while [ -f $M0/a.txt ]; do
                $CLI volume heal $V0 info | grep -i number | grep -v 0 >> $1
        done
}

function write_and_del_file {
        dd of=$M0/a.txt if=/dev/zero bs=1024k count=100
        rm -f $M0/a.txt
}

TEST glusterd
TEST pidof glusterd
TEST $CLI volume create $V0 replica 2 $H0:$B0/brick{0,1}
TEST $CLI volume set $V0 locking-scheme granular
TEST $CLI volume start $V0
TEST $GFS --volfile-id=$V0 --volfile-server=$H0 $M0;
TEST touch $M0/a.txt
write_and_del_file &
touch $B0/f1 $B0/f2
heal_info_to_file $B0/f1 &
heal_info_to_file $B0/f2 &
wait
EXPECT "^$" cat $B0/f1
EXPECT "^$" cat $B0/f2
cleanup;

This test failed on NetBSD twice. While debugging, it was found that if an unlink is in progress while the 'dirty' index is being checked for heal, the check gets ENOENT on one of the bricks but succeeds on the other. heal-info then assumes the file needs heal, and that spurious entry is what made the test fail.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
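As a rough illustration of the race, here is a minimal shell sketch: two ordinary directories stand in for the two bricks' index directories, and a stat stands in for the index lookup. The names brick0, brick1, and dirty-entry are invented for the example; the real check happens inside glusterfs's C code, so this is only an analogy.

#!/bin/bash
# Two directories model the per-brick indices of a replica-2 volume.
mkdir -p brick0 brick1
touch brick0/dirty-entry brick1/dirty-entry

# "Unlink in progress": the entry is already gone from brick1 but still
# present on brick0 at the moment heal-info inspects the indices.
rm -f brick1/dirty-entry

# heal-info queries both bricks: one lookup succeeds, the other gets ENOENT.
stat brick0/dirty-entry >/dev/null 2>&1 && echo "brick0: entry present"
stat brick1/dirty-entry >/dev/null 2>&1 || echo "brick1: ENOENT"

# The disagreement is misread as "this file needs heal" -- the spurious
# heal-info entry that the test above catches.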
Hi,

So I ran this continuously in a loop for about 24 hours on Linux and it didn't fail even once. For want of a NetBSD slave, I went through the most recent ~120 NetBSD runs on the Jenkins slaves, https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/17730/ through https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/17848/, and heal-info.t did not fail in any of those runs.

So I am closing this bug for now with the WORKSFORME resolution. Please reopen it if the failure occurs again, with a link to the "console" output and the patch against which the failure was seen.

-Krutika
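(For reference, a soak loop of roughly the following shape is a common way to repeat a single .t file until it fails. This is a sketch, not the exact command used in the comment above; it assumes a glusterfs source tree where the test lives at tests/basic/afr/heal-info.t and where prove is installed.)

# Re-run the test until it fails; prove exits non-zero on a failing run,
# which terminates the loop so the failure output is the last thing printed.
while prove -v tests/basic/afr/heal-info.t; do
    echo "PASS -- running again"
done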