Bug 1533383 - tests/basic/ec/heal-info.t fails randomly
Summary: tests/basic/ec/heal-info.t fails randomly
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: GlusterFS
Classification: Community
Component: disperse
Version: mainline
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Ashish Pandey
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-01-11 08:52 UTC by Nithya Balachandran
Modified: 2018-08-29 03:53 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-29 03:53:35 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Nithya Balachandran 2018-01-11 08:52:12 UTC
Description of problem:


https://build.gluster.org/job/centos6-regression/8321/console


05:36:46 ./tests/basic/ec/heal-info.t: line 34:  6762 Killed                  dd status=none if=/dev/zero of=$M0/a bs=1M count=2048
05:37:35 ./tests/basic/ec/heal-info.t .. 
05:37:35 1..22
05:37:35 ok 1, LINENUM:18
05:37:35 ok 2, LINENUM:19
05:37:35 ok 3, LINENUM:20
05:37:35 ok 4, LINENUM:21
05:37:35 ok 5, LINENUM:22
05:37:35 ok 6, LINENUM:23
05:37:35 ok 7, LINENUM:24
05:37:35 ok 8, LINENUM:26
05:37:35 ok 9, LINENUM:32
05:37:35 not ok 10 Got "113" instead of "^0$", LINENUM:43
05:37:35 FAILED COMMAND: ^0$ echo 113
05:37:35 ok 11, LINENUM:47
05:37:35 ok 12, LINENUM:48
05:37:35 ok 13, LINENUM:49
05:37:35 ok 14, LINENUM:50
05:37:35 ok 15, LINENUM:51
05:37:35 ok 16, LINENUM:52
05:37:35 ok 17, LINENUM:55
05:37:35 not ok 18 Got "7" instead of "^1$", LINENUM:56
05:37:35 FAILED COMMAND: ^1$ get_pending_heal_count patchy
05:37:35 ok 19, LINENUM:57
05:37:35 ok 20, LINENUM:60
05:37:35 ok 21, LINENUM:61
05:37:35 not ok 22 Got "110" instead of "^105$", LINENUM:71
05:37:35 FAILED COMMAND: ^105$ get_pending_heal_count patchy
05:37:35 Failed 3/22 subtests 

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 1 Ashish Pandey 2018-01-11 12:00:18 UTC
I tried to reproduce this issue again, and it fails only once in a while.
It looks like when heal info inspects the files, some of the bricks do not respond. heal info treats this as "ENOTCONN" and, in that case, reports the file as needing heal.

This test runs heal info in a loop while IO is in progress on 1000 files of 1MB each. Whenever the test fails, it also takes longer to run.

My other observation, that heal was not happening, was wrong: this test starts only after heal has been disabled for the volume.

--
Ashish
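
For readers following along, the polling pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in heal-info.t: the function name accumulate_heal_count is hypothetical, and the canned readings passed as arguments stand in for real calls to get_pending_heal_count so the sketch runs without a Gluster volume. It only shows why a single transient non-zero reading is enough to break an "expect 0" check, because every reading is added to a running total.

```shell
# Hypothetical sketch: poll while a lock file exists and add up every reading.
# The readings are passed as arguments in place of real heal-info queries.
accumulate_heal_count() {
    lock_file=$1
    shift
    total=0
    while [ -f "$lock_file" ]; do
        # Stand-in for: heal_count=$(get_pending_heal_count patchy)
        heal_count=$1
        shift
        total=$((total + heal_count))
        # Simulate the IO job finishing once the canned readings run out.
        [ "$#" -gt 0 ] || rm -f "$lock_file"
    done
    echo "$total"
}

lock=$(mktemp)
# One clean reading followed by two readings of 6 gives a total of 12.
total_heal_count=$(accumulate_heal_count "$lock" 0 6 6)
echo "total_heal_count=$total_heal_count"
```

Because the total only ever grows, the check at the end of the loop cannot recover from one bad poll, which fits the observation that failures are intermittent.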

Comment 2 Nithya Balachandran 2018-01-12 03:16:07 UTC
From the tests on my system:


PASS:
=================


check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=0
+ total_heal_count=0
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=0
+ total_heal_count=0
check heal 1
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=0
+ total_heal_count=0
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
++ gluster volume heal patchy info
+ rm -f /mnt/glusterfs/0/lock
+ set +x
Done
+ heal_count=0
+ total_heal_count=0
check heal 1
+ echo 'check heal 1'
+ set +x
ok 10, LINENUM:51
ok 11, LINENUM:55


===================================================

FAIL:
===================================================

check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
+ heal_count=0
+ total_heal_count=0
+ echo 'check heal 1'
check heal 1
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=6                     <---------------Already non-zero
+ total_heal_count=6
check heal 1
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=12
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
check heal 1
+ heal_count=6
+ total_heal_count=18
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
++ gluster volume heal patchy info
+ heal_count=6
+ total_heal_count=24
+ echo 'check heal 1'
+ set +x
check heal 1
+ echo 'check heal'
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=6
+ total_heal_count=30
check heal 1
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
++ gluster volume heal patchy info
check heal 1
+ heal_count=6
+ total_heal_count=36
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
+ heal_count=6
check heal 1
+ total_heal_count=42
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=48
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=54
+ echo 'check heal 1'
+ set +x
+ echo 'check heal'
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ gluster volume heal patchy info
++ grep 'Number of entries'
++ awk '{ sum+=$4} END {print sum}'
check heal 1
+ heal_count=6
+ total_heal_count=60
+ echo 'check heal 1'
+ set +x
+ echo 'check heal'
check heal
++ get_pending_heal_count patchy
++ local vol=patchy
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
++ gluster volume heal patchy info
check heal 1
+ heal_count=6
+ total_heal_count=66
+ echo 'check heal 1'
+ set +x
check heal
+ echo 'check heal'
++ get_pending_heal_count patchy
++ local vol=patchy
++ awk '{ sum+=$4} END {print sum}'
++ grep 'Number of entries'
++ gluster volume heal patchy info
+ rm -f /mnt/glusterfs/0/lock
Done
+ set +x
+ heal_count=6
+ total_heal_count=72
check heal 1
+ echo 'check heal 1'
+ set +x
not ok 10 Got "72" instead of "^0$", LINENUM:51
FAILED COMMAND: ^0$ echo 72
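
The counting step visible in the trace above (grep for 'Number of entries', then sum the fourth field with awk) can be tried in isolation with the sketch below. The sample heal-info output, the brick names, and the six-brick layout are assumptions made up for illustration; they are not taken from this run. The helper name sum_heal_entries is also hypothetical, and the `sum + 0` is a small deviation from the traced pipeline so that empty input prints 0 instead of an empty string.

```shell
# Sum the fourth field of every "Number of entries" line, mirroring the
# grep | awk pipeline seen in the trace. `sum + 0` forces a numeric 0
# when no line matches.
sum_heal_entries() {
    grep 'Number of entries' | awk '{ sum += $4 } END { print sum + 0 }'
}

# Hypothetical heal-info output for a six-brick volume (names are invented).
sample_output() {
    for i in 1 2 3 4 5 6; do
        printf 'Brick host:/bricks/patchy%s\n' "$i"
        printf 'Status: Connected\n'
        printf 'Number of entries: 1\n\n'
    done
}

heal_count=$(sample_output | sum_heal_entries)
echo "heal_count=$heal_count"   # prints heal_count=6
```

Under these assumptions, one pending entry reported by each of six bricks would produce the heal_count=6 readings seen in the failing run, though the trace alone does not confirm that this is how the 6 was composed.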

Comment 3 Amar Tumballi 2018-08-29 03:53:35 UTC
It has been a long time since there was any activity on this bug. We have either fixed it already, or it is most likely no longer critical.

Please re-open the bug if the issue is still urgent for you, or if you want to take the bug to closure with fixes.

