Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1329871

Summary: tests/basic/afr/heal-info.t fails
Product: [Community] GlusterFS
Component: replicate
Version: mainline
Status: CLOSED WORKSFORME
Severity: unspecified
Priority: unspecified
Hardware: Unspecified
OS: Unspecified
Reporter: Pranith Kumar K <pkarampu>
Assignee: Pranith Kumar K <pkarampu>
CC: bugs, kdhananj
Doc Type: Bug Fix
Type: Bug
Last Closed: 2016-07-05 11:31:21 UTC

Description Pranith Kumar K 2016-04-24 01:39:10 UTC
Description of problem:
#!/bin/bash
#Test that parallel heal-info command execution doesn't result in spurious
#entries with locking-scheme granular

. $(dirname $0)/../../include.rc
. $(dirname $0)/../../volume.rc

cleanup;

function heal_info_to_file {
        # Poll heal-info for as long as the file exists; record any non-zero
        # "Number of entries" lines, which would indicate a spurious heal entry.
        while [ -f $M0/a.txt ]; do
                $CLI volume heal $V0 info | grep -i number | grep -v 0 >> $1
        done
}

function write_and_del_file {
        # Write 100MB to the file and then delete it, so that the unlink
        # races with the heal-info polls above.
        dd of=$M0/a.txt if=/dev/zero bs=1024k count=100
        rm -f $M0/a.txt
}

TEST glusterd
TEST pidof glusterd
TEST $CLI volume create $V0 replica 2 $H0:$B0/brick{0,1}
TEST $CLI volume set $V0 locking-scheme granular
TEST $CLI volume start $V0
TEST $GFS --volfile-id=$V0 --volfile-server=$H0 $M0;
TEST touch $M0/a.txt
write_and_del_file &
touch $B0/f1 $B0/f2
heal_info_to_file $B0/f1 &
heal_info_to_file $B0/f2 &
wait
EXPECT "^$" cat $B0/f1
EXPECT "^$" cat $B0/f2

cleanup;

This test failed on NetBSD twice. Debugging showed that if an unlink is in progress while the 'dirty' index is being checked for heal, the check gets ENOENT on one brick while it succeeds on the other. That mismatch is interpreted as the file needing heal, which produced the spurious entries and hence the test failure.
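The race can be illustrated with a minimal standalone sketch (this is not GlusterFS code; the directory layout, the gfid name "abc", and the check() helper are illustrative assumptions). Each temp directory stands in for a brick's dirty index, and the unlink lands between the two per-brick checks:

```shell
#!/bin/bash
# Hypothetical sketch of the heal-info race: a dirty-index entry is checked
# on both bricks while an unlink is in flight, so one check succeeds and the
# other gets ENOENT, and the mismatch is misread as "needs heal".

brick0=$(mktemp -d); brick1=$(mktemp -d)

# Both bricks initially hold the dirty-index entry for gfid "abc".
touch "$brick0/abc" "$brick1/abc"

# The unlink races with the heal-info scan: brick1 drops its entry first.
rm -f "$brick1/abc"

# heal-info style check: stat the entry on each brick.
check() { [ -e "$1/abc" ] && echo present || echo ENOENT; }
r0=$(check "$brick0")   # present
r1=$(check "$brick1")   # ENOENT

# Mixed results (success on one brick, ENOENT on the other) were being
# interpreted as the file needing heal, producing the spurious entry.
if [ "$r0" != "$r1" ]; then
        echo "spurious: abc reported as needing heal"
fi

rm -rf "$brick0" "$brick1"
```

In the real test, heal_info_to_file catches exactly this window: any non-zero "Number of entries" line it records corresponds to such a transient mismatch.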

Comment 1 Krutika Dhananjay 2016-07-05 11:31:21 UTC
Hi,

I ran this test continuously in a loop for about 24 hours on Linux and it did not fail even once.

Lacking a NetBSD slave to test on, I went through the most recent ~120 NetBSD regression runs on the Jenkins slaves:
https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/17730/ through https://build.gluster.org/job/rackspace-netbsd7-regression-triggered/17848/ 

and heal-info.t did not fail in any of these runs.

So I am closing this bug for now with the 'WORKSFORME' resolution. Please reopen it if the failure occurs again, with a link to the "console" output and the patch against which the failure was seen.

-Krutika